Use Spark SQL JDBC Server/Beeline or spark-sql
In Spark SQL, there are two ways to submit SQL:
- spark-sql: each SQL submission kicks off a new Spark application.
- Spark JDBC Server and Beeline: the JDBC Server is actually a long-running standalone Spark application, and the queries submitted to it share its resources.

We have about 30 big SQL queries. Each needs about 200 cores and 800 GB of memory to finish in a reasonable time (30 minutes).

Between spark-sql and the JDBC Server/Beeline, which option is better for my case?

I would like to use spark-sql, and I have no idea how many resources the JDBC Server should be given for my queries to finish in a reasonable time.

If I submit the 30 queries to the JDBC Server, how many resources (cores/memory) should it be given? 5000+ cores and 10 TB+ of memory?
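To make the two submission paths concrete, here is a minimal sketch that only builds the command lines and does not run anything. It assumes a standalone cluster manager (where `--total-executor-cores` applies), one core per executor to match 200 executors on 200 cores, and the Thrift JDBC server listening on its default port 10000. The master URL, host names, and query file names are hypothetical placeholders.

```python
from typing import List


def spark_sql_command(query_file: str, master: str,
                      total_cores: int = 200,
                      executor_memory: str = "4G") -> List[str]:
    """Build a spark-sql invocation; each one starts its own Spark application."""
    return [
        "spark-sql",
        "--master", master,                          # hypothetical master URL
        "--total-executor-cores", str(total_cores),  # standalone/Mesos only
        "--executor-cores", "1",                     # assumption: 1 core per executor
        "--executor-memory", executor_memory,
        "-f", query_file,                            # run the SQL in this file
    ]


def beeline_command(query_file: str,
                    jdbc_url: str = "jdbc:hive2://localhost:10000") -> List[str]:
    """Build a Beeline invocation; all queries share the JDBC server's resources."""
    return ["beeline", "-u", jdbc_url, "-f", query_file]


if __name__ == "__main__":
    # Hypothetical file name and master host, for illustration only.
    print(" ".join(spark_sql_command("query_01.sql", "spark://master-host:7077")))
    print(" ".join(beeline_command("query_01.sql")))
```

With spark-sql the resource cap is stated per query on the command line, while with Beeline the cap is whatever the JDBC Server was started with, which is why sizing the server is the harder question here.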
apache-spark
Does each query work on different tables/RDDs? Is the 800 GB of memory used for caching tables?
– Ajay Srivastava
Mar 23 at 2:13
@AjaySrivastava, some of the tables are cached, since we are using 200 executors, each given 4 GB of memory.
– Tom
Mar 23 at 3:01
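As a rough check on the numbers in this thread, the arithmetic can be sketched as below. It assumes the worst case where queries run fully concurrently and share nothing; cached tables shared inside one long-running application may lower the real memory need, so treat the result as an upper bound rather than a sizing recommendation.

```python
# Figures taken from the question and the comment above.
QUERIES = 30
CORES_PER_QUERY = 200
EXECUTORS_PER_QUERY = 200
MEMORY_PER_EXECUTOR_GB = 4

# 200 executors x 4 GB each gives the 800 GB per query quoted in the question.
MEMORY_PER_QUERY_GB = EXECUTORS_PER_QUERY * MEMORY_PER_EXECUTOR_GB


def peak_demand(concurrent_queries: int) -> tuple:
    """Return (cores, memory in GB) if this many queries run at the same time."""
    return (concurrent_queries * CORES_PER_QUERY,
            concurrent_queries * MEMORY_PER_QUERY_GB)


if __name__ == "__main__":
    cores, memory_gb = peak_demand(QUERIES)
    print(f"{QUERIES} concurrent queries: {cores} cores, {memory_gb} GB")
    # → 30 concurrent queries: 6000 cores, 24000 GB
```

Under these assumptions, running all 30 queries at once needs about 6000 cores and 24,000 GB of memory, whichever submission path is used; running them in smaller batches scales the demand down linearly.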
asked Mar 22 at 8:39 by Tom (edited Mar 23 at 1:28)