How to run foreachRDD with arguments inside a for loop
I have a for loop that creates Kafka DStreams by iterating over a list of topic names. I then want to apply a function to each RDD, and I need to pass the topic name into that function. Inside the function, however, only the last topic value (topic3) is available. I understand that's because foreachRDD is lazily executed. Is there any way to pass in the topic name? I have to accomplish this without using structured streaming.
    import json
    import time

    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils

    topics = ["topic1", "topic2", "topic3"]
    ssc = StreamingContext(spark_context, 5)  # spark_context is created elsewhere; 5-second batches

    def process_topic(rdd, topic_value):
        if not rdd.isEmpty():
            print("topic : " + topic_value)
            rdd.toDF().show()

    for topic in topics:
        directKafkaStream = KafkaUtils.createDirectStream(
            ssc, [topic],
            {'metadata.broker.list': 'broker_url', 'auto.offset.reset': 'smallest'})
        lines = directKafkaStream.map(lambda v: json.loads(v[1]))
        lines.foreachRDD(lambda x: process_topic(x, topic))

    ssc.start()
    time.sleep(10)
    ssc.stop()
The output looks like this:
"topic3"
(df for topic1)
"topic3"
(df for topic2)
"topic3"
(df for topic3)
I would like to see:
"topic1"
(df for topic1)
"topic2"
(df for topic2)
"topic3"
(df for topic3)
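The symptom described above is consistent with Python's late-binding closures: a lambda looks up `topic` when it is called, not when it is defined, and by then the loop has finished. The sketch below reproduces this without Spark and shows one possible way around it. `FakeStream` and `make_handler` are hypothetical names introduced only for illustration; `FakeStream` merely stores a callback and runs it later, it is not part of any Spark API.

```python
topics = ["topic1", "topic2", "topic3"]


class FakeStream(object):
    """Hypothetical stand-in for a DStream: registers a callback, runs it later."""

    def __init__(self):
        self.callback = None

    def foreachRDD(self, func):
        # Only store the function here; it is called after the loop has ended.
        self.callback = func

    def run_batch(self, rdd):
        return self.callback(rdd)


def process_topic(rdd, topic_value):
    # Returns the message instead of printing so the result can be inspected.
    return "topic : " + topic_value


def make_handler(topic_value):
    # Hypothetical factory: topic_value is fixed at the time of this call,
    # so each handler keeps its own topic.
    def handler(rdd):
        return process_topic(rdd, topic_value)
    return handler


late_streams, bound_streams = [], []
for topic in topics:
    late = FakeStream()
    # Same pattern as in the question: `topic` is looked up when the lambda runs.
    late.foreachRDD(lambda rdd: process_topic(rdd, topic))
    late_streams.append(late)

    bound = FakeStream()
    # The factory captures the current value of `topic`.
    bound.foreachRDD(make_handler(topic))
    bound_streams.append(bound)

late_result = [s.run_batch("rdd") for s in late_streams]
bound_result = [s.run_batch("rdd") for s in bound_streams]

print(late_result)   # ['topic : topic3', 'topic : topic3', 'topic : topic3']
print(bound_result)  # ['topic : topic1', 'topic : topic2', 'topic : topic3']
```

Under these assumptions, the last line of the loop in the question would become `lines.foreachRDD(make_handler(topic))`. A factory is used here rather than the common `lambda x, topic=topic: ...` idiom because the default-argument form gives the function two parameters, and foreachRDD, as far as I know, also accepts a two-argument `(time, rdd)` form, so a one-argument handler seems the safer choice.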
python apache-kafka
asked Mar 23 at 1:41
EVS