Use Spark SQL JDBC Server/Beeline or spark-sql

In Spark SQL, there are two ways to submit SQL:

  1. spark-sql: each query submitted this way kicks off a new Spark application.

  2. Spark JDBC Server and Beeline: the JDBC Server is a long-running standalone Spark application, and the queries submitted to it share its resources.

We have about 30 big SQL queries, each of which needs about 200 cores and 800 GB of memory to finish in a reasonable time (30 minutes).

Between spark-sql and the JDBC Server/Beeline, which option is better for my case?
I would prefer spark-sql, because I have no idea how many resources the JDBC Server should be given for my queries to finish in a reasonable time.

If I submit the 30 queries to the JDBC Server, how many resources (cores/memory) should it be given (5000+ cores and 10 TB+ of memory)?
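
For reference, here is a rough upper bound on the totals, assuming all 30 queries run at the same time and share nothing. This is only a back-of-the-envelope sketch using the figures above; the variable names are made up for illustration, and the concurrency assumption may not match how the queries would actually be scheduled.

```python
# Rough upper bound, ASSUMING all queries run concurrently and share
# no cores, memory, or cached tables. Figures are from the question.
NUM_QUERIES = 30
CORES_PER_QUERY = 200
MEMORY_GB_PER_QUERY = 800

total_cores = NUM_QUERIES * CORES_PER_QUERY
total_memory_gb = NUM_QUERIES * MEMORY_GB_PER_QUERY

print(f"cores: {total_cores}, memory: {total_memory_gb} GB")
# prints "cores: 6000, memory: 24000 GB"
```

Running fewer queries at a time would lower these totals proportionally.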


































  • Does each query work on different tables/RDDs? Is the 800 GB of memory used for caching tables?

    – Ajay Srivastava
    Mar 23 at 2:13











  • @AjaySrivastava, some of the tables are cached, since we are using 200 executors, each given 4 GB of memory.

    – Tom
    Mar 23 at 3:01

















apache-spark














asked Mar 22 at 8:39 by Tom
edited Mar 23 at 1:28 by Tom























