sbatch sends compute node to 'drained' status

On newly installed and configured compute nodes in our small cluster, I cannot submit Slurm jobs with a batch script and the 'sbatch' command. After submission, the requested node switches to the 'drained' state. However, I can run the same command interactively with 'srun'.



Works:
srun -p debug --ntasks=1 --nodes=1 --job-name=test --nodelist=node6 -l echo 'test'



Does not work:
sbatch test.slurm

with test.slurm:



#!/bin/sh
#SBATCH --job-name=test
#SBATCH --ntasks=1
#SBATCH --nodes=1
#SBATCH --nodelist=node6
#SBATCH --partition=debug

echo 'test'


Afterwards the node shows up as drained (sinfo output):

PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
debug        up    1:00:00      1  drain node6


and I have to resume the node.



All nodes run Debian 9.8 and use InfiniBand and NIS.
I have made sure that all nodes have the same configuration, the same package versions, and the same daemons running, so I don't see what I am missing.
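For reference, the resume step mentioned above is typically done with scontrol. A minimal sketch, assuming the node name node6 from the question and that it is run with Slurm administrator privileges; the function name resume_node is only illustrative:

```shell
# Hypothetical helper: return a drained node to service.
# Relies on the standard `scontrol update` syntax.
resume_node() {
    node="$1"
    scontrol update NodeName="$node" State=RESUME
}

# Example (on the cluster): resume_node node6
```

Resuming only clears the drain flag. If the underlying cause is still present, the next sbatch job will most likely drain the node again.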










Tags: slurm, sbatch






asked Mar 22 at 15:49
Iomsn







  • You can see why the node was drained with scontrol show node node6 | grep Reason, or in the Slurm controller log files.

    – Keldorn
    Mar 26 at 3:02












  • Thanks, Keldorn. That provided some useful information. We finally fixed the issue.

    – Iomsn
    Mar 27 at 9:48
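
The check suggested in the comments can be wrapped in a small filter. This is only a sketch, assuming scontrol prints the drain reason on a line beginning with Reason=, as its node listing normally does; the name drain_reason is illustrative:

```shell
# Hypothetical filter: print only the Reason= value from
# `scontrol show node <name>` output supplied on stdin.
drain_reason() {
    sed -n 's/^[[:space:]]*Reason=//p'
}

# Example (on the cluster): scontrol show node node6 | drain_reason
```

Reading from stdin keeps the helper independent of the node name, so the same filter can be pointed at saved scontrol output as well.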












1 Answer

The issue appears to have been related to NIS. Adding this line to the end of /etc/passwd fixed it:

+::::::

followed by a restart of slurmd on the node:



/etc/init.d/slurmd restart
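
To confirm the situation before and after the change, one could check whether the compat entry is present and whether the submitting user resolves on the node. A sketch, assuming /etc/nsswitch.conf uses compat mode for passwd (as far as I know the `+::::::` line only takes effect in that mode); the function names and the user name are placeholders:

```shell
# Hypothetical check: succeed if the passwd file (default /etc/passwd)
# contains the NIS compat entry "+::::::" on a line of its own.
has_nis_compat_entry() {
    grep -qxF '+::::::' "${1:-/etc/passwd}"
}

# Hypothetical check: succeed if the given user can be resolved through
# the name service switch on this machine.
user_resolves() {
    getent passwd "$1" >/dev/null
}

# Example (on the compute node):
#   has_nis_compat_entry || echo "compat entry missing"
#   user_resolves someuser || echo "someuser not resolvable"
```

If the user does not resolve on the compute node, that would be consistent with the NIS explanation given here, although the thread does not state the drain reason that was actually reported.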





answered Mar 27 at 9:52
Iomsn




























