Micro-batching through NiFi
I have a scenario where my Kafka messages (from the same topic) flow through a single enrichment pipeline and are written at the end to both HDFS and MongoDB. My Kafka consumer for HDFS will run on an hourly basis (for micro-batching). So I need to know the best possible way to route flowfiles to PutHDFS and PutMongo based on which consumer they came from (the consumer for HDFS or the consumer for MongoDB).
Or please suggest if there is any other way to achieve micro-batching through NiFi.
Thanks
apache-kafka apache-nifi
asked Mar 22 at 11:01 by Isha (edited Mar 22 at 11:08)
1 Answer
You could set NiFi up to use a Scheduling Strategy for the processors that upload data.
I would think you want the Kafka consumers to always read data, building a backlog of FlowFiles in NiFi, and then have the puts run on a less frequent basis.
This is similar to how Kafka Connect runs its HDFS Connector.
answered Mar 22 at 19:16 by cricket_007
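As a minimal sketch of that idea (assuming NiFi 1.x; exact property labels may differ slightly in your version), the consumers stay timer-driven so they keep pulling from Kafka, while the HDFS writer is switched to CRON-driven scheduling so it only fires at the top of each hour and drains whatever backlog has queued up:

    ConsumeKafka (both feeds)
        Scheduling Strategy : Timer driven
        Run Schedule        : 0 sec          (keep consuming continuously)

    PutHDFS (hourly micro-batch)
        Scheduling Strategy : CRON driven
        Run Schedule        : 0 0 * * * ?    (Quartz cron: minute 0 of every hour)

    PutMongo (speed layer)
        Scheduling Strategy : Timer driven
        Run Schedule        : 0 sec          (write as soon as data arrives)

Between runs the FlowFiles sit in the connection queue in front of PutHDFS, so you will likely also want to raise the back-pressure object and size thresholds on that connection so an hour's worth of backlog does not throttle the upstream processors.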
Yes, I have two Kafka consumer processors in NiFi: one serves the speed layer, which saves data in MongoDB, and the other the batch layer, which saves to HDFS. The second processor is scheduled on an hourly basis. But messages from both processors go through a single enrichment pipeline before being written to their respective databases. So my question is how I am going to differentiate between the messages and route them to the correct database.
– Isha
Mar 24 at 6:00
It's been a while since I used NiFi. I don't think you can have a single pipeline, not without copying the FlowFiles somehow before they are sent downstream. That being said, it might be better to use two separate Kafka consumer processors with different group IDs.
– cricket_007
Mar 25 at 19:14
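One way that suggestion could look in practice (a sketch only; the attribute name target.sink and the group IDs are made up for illustration, while ConsumeKafka, UpdateAttribute, RouteOnAttribute, PutHDFS and PutMongo are standard NiFi processors): tag each FlowFile right after its ConsumeKafka processor, run both streams through the shared enrichment pipeline, and split them again afterwards with RouteOnAttribute.

    ConsumeKafka (batch)                    ConsumeKafka (speed)
        Group ID : nifi-hdfs-batch              Group ID : nifi-mongo-speed
              |                                       |
    UpdateAttribute                         UpdateAttribute
        target.sink = hdfs                      target.sink = mongo
              \_______________________________________/
                                |
                   shared enrichment pipeline
                                |
                       RouteOnAttribute
                           to_hdfs  : ${target.sink:equals('hdfs')}
                           to_mongo : ${target.sink:equals('mongo')}
                            |                    |
                        PutHDFS              PutMongo

Because each ConsumeKafka processor uses its own consumer group ID, both consumers independently receive every message from the topic, so the hourly HDFS branch and the continuous Mongo branch never take records away from each other.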