Spark Scheduler pool jobs are not running parallel as I expected























I am trying to run two Spark actions as shown below, and I expect them to run in parallel because they use different pools. Does scheduling with pools mean that different, independent actions will run in parallel? That is, if I have 200 cores, will pool1 use 100 cores and pool2 use 100 cores while both actions are processed?
In my case, the action on the second dataframe starts only after the action on the first dataframe has completed in pool1.



spark.setLocalProperty("spark.scheduler.pool","pool1")
dataframe.show(100,false)

spark.setLocalProperty("spark.scheduler.pool","pool2")
dataframe2.show(100,false)


My pool configuration XML:



<?xml version="1.0"?>

<allocations>
<pool name="pool1">
<schedulingMode>FAIR</schedulingMode>
<weight>1</weight>
</pool>
<pool name="pool2">
<schedulingMode>FAIR</schedulingMode>
<weight>1</weight>
</pool>
</allocations>































  • Have you set the conf property spark.scheduler.allocation.file to point to your pool configuration XML? conf.set("spark.scheduler.allocation.file", "/path/to/file")

    – Anurag Sharma
    Mar 22 at 9:20
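
For reference, the property named in this comment can be set when the session is created. The following is a minimal sketch, assuming a SparkSession entry point; the object name, method name, and application name are hypothetical, and "/path/to/file" is the placeholder from the comment, not a real path. Note that spark.scheduler.mode defaults to FIFO, so the sketch sets it to FAIR as well.

```scala
import org.apache.spark.sql.SparkSession

object FairSchedulerConfig {
  // Hypothetical helper: builds a session with fair scheduling enabled.
  def buildSession(): SparkSession =
    SparkSession.builder()
      .appName("scheduler-pools-example") // hypothetical application name
      // Switch the in-application scheduler from the default FIFO to FAIR.
      .config("spark.scheduler.mode", "FAIR")
      // Placeholder path from the comment; point it at the pool XML file.
      .config("spark.scheduler.allocation.file", "/path/to/file")
      .getOrCreate()
}
```

Pools that are used in code but not declared in the allocation file fall back to default settings, so a wrong path can go unnoticed.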


















apache-spark job-scheduling
















asked Mar 22 at 3:39









user7481861

1 Answer














Based on the details given, your jobs should run in parallel with this Spark configuration, but there are a few things to check:



  1. Is YARN your cluster manager? If so, have you configured the pools in the YARN configuration?


  2. I can see you are using the FAIR scheduler, which means the default scheduler is being overridden. Have you configured the same in YARN?


To configure the FAIR scheduler, please see the link below, which covers everything in detail:
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html
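
Separately from the cluster-manager settings, it may be worth checking how the two actions are submitted. show is a blocking call, so when both actions are issued from the same thread, the second job is only submitted after the first one returns, whatever pool it is assigned to. Spark can run jobs concurrently when they are submitted from separate threads, and spark.scheduler.pool is a per-thread local property. Below is a minimal sketch of that idea using Scala Futures; the object ParallelPools and the helpers inPool and showBoth are hypothetical names, and it assumes spark is a SparkSession already configured with FAIR scheduling and that dataframe and dataframe2 are the two dataframes from the question.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration
import org.apache.spark.sql.{DataFrame, SparkSession}

object ParallelPools {
  // Hypothetical helper: runs `action` on a pool thread with the given
  // scheduler pool set for that thread only.
  def inPool[T](spark: SparkSession, pool: String)(action: => T): Future[T] = Future {
    val sc = spark.sparkContext
    // The property is set inside the Future body so it applies to the
    // thread that actually submits the job.
    sc.setLocalProperty("spark.scheduler.pool", pool)
    try action
    // Clear the property, since threads in the execution context are reused.
    finally sc.setLocalProperty("spark.scheduler.pool", null)
  }

  // Hypothetical helper: submits both actions before waiting on either one.
  def showBoth(spark: SparkSession, dataframe: DataFrame, dataframe2: DataFrame): Unit = {
    val job1 = inPool(spark, "pool1") { dataframe.show(100, false) }
    val job2 = inPool(spark, "pool2") { dataframe2.show(100, false) }
    // Block the calling thread until both jobs have finished.
    Await.result(job1.zip(job2), Duration.Inf)
  }
}
```

With this arrangement both jobs are in flight at the same time, so the fair scheduler has something to divide between pool1 and pool2; the two tables may print in either order.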






        answered Mar 22 at 5:43









        Dhrub Thakur
