How to get all the available parameters for env.readTextFile()?
I am trying to read multiple .gz files from HDFS using the DataSet API (env.readTextFile()), but the file sizes vary a lot, which makes it hard to improve efficiency by increasing parallelism. Are there parameters that can help cope with this data skew, or do I have to make the input files similar in size?
Below is the code I am using right now, copied from the Flink DataSet API Programming Guide:
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.configuration.Configuration;

// enable recursive enumeration of nested input files
ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

// create a configuration object
Configuration parameters = new Configuration();

// set the recursive enumeration parameter
parameters.setBoolean("recursive.file.enumeration", true);

// pass the configuration to the data source
DataSet<String> logs = env.readTextFile("file:///path/with.nested/files")
    .withParameters(parameters);
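A minimal sketch of the skew described above, using only the JDK and no Flink classes. It assumes, as the Flink documentation on reading compressed files indicates, that a .gz file is not split and is therefore read whole by a single reader. The class name, the method name, the file sizes, and the greedy "largest file to the least-loaded reader" assignment are all hypothetical simplifications for illustration; this is not Flink's actual split assignment logic.

```java
import java.util.Arrays;

/** Toy model: every unsplittable file is read whole by exactly one reader. */
class GzipSkewSketch {

    /**
     * Returns the load (in MB) of the busiest reader when whole files are
     * handed out greedily: largest file first, always to the reader that
     * currently has the least data assigned.
     */
    static long busiestReaderLoad(long[] fileSizesMb, int parallelism) {
        long[] sorted = fileSizesMb.clone();
        Arrays.sort(sorted); // ascending order

        long[] load = new long[parallelism];
        for (int i = sorted.length - 1; i >= 0; i--) { // walk from largest to smallest
            int min = 0;
            for (int r = 1; r < parallelism; r++) {
                if (load[r] < load[min]) {
                    min = r;
                }
            }
            load[min] += sorted[i];
        }

        long max = 0;
        for (long l : load) {
            max = Math.max(max, l);
        }
        return max;
    }

    public static void main(String[] args) {
        // Hypothetical file sizes in MB: one large file and several small ones.
        long[] sizes = {900, 40, 30, 20, 10};
        for (int p : new int[] {1, 2, 4, 8}) {
            System.out.println("parallelism " + p
                    + " -> busiest reader: " + busiestReaderLoad(sizes, p) + " MB");
        }
    }
}
```

With these made-up sizes the busiest reader carries 1000 MB at parallelism 1 and 900 MB at parallelism 2, 4, and 8: under the one-file-per-reader assumption, the largest file sets a floor that extra parallelism cannot lower.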
java apache-flink
asked Mar 27 at 6:46 by iamabug