mpirun performance analysis


I'm running mpirun (Open MPI) with 86 processes on 12 CPUs and 2 GPUs on Ubuntu 18.04. The application being run trains neural networks.

After a day or so of training, the iterations slow down dramatically. The code works fine on a single thread, network traffic (file reads) is well within spec, and the CPUs and GPUs show no excessive load.

So I think the problem is with mpirun.

Are there non-intrusive tools available to show the performance of the MPI runs? I've been looking at Performance Co-Pilot, but I don't see any MPI profiling in the software itself.










  • Did you check the memory usage? If there is a memory leak, your nodes will start swapping after a while and become very slow.

    – Gilles Gouaillardet
    Mar 26 at 0:00

  • @GillesGouaillardet Yes, but there's no swapping going on, and memory usage is between 60% and 80%.

    – Kevin Johnsrude
    Mar 26 at 15:52
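
For the memory check discussed in these comments, a minimal sketch might look like the following. It assumes a Linux node, where `/proc/meminfo` reports values in kB; the function names `parse_meminfo` and `memory_summary` are hypothetical. Logging these two numbers periodically alongside the training run would show a slow leak as a trend rather than as a single reading.

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def memory_summary(fields):
    """Return (percent of RAM in use, swap in use in kB)."""
    total = fields["MemTotal"]
    # MemAvailable is the kernel's estimate of memory usable without swapping;
    # fall back to MemFree on kernels that do not report it.
    available = fields.get("MemAvailable", fields.get("MemFree", 0))
    used_pct = 100.0 * (total - available) / total
    swap_used_kb = fields.get("SwapTotal", 0) - fields.get("SwapFree", 0)
    return used_pct, swap_used_kb


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():  # only present on Linux
        used_pct, swap_used_kb = memory_summary(parse_meminfo(meminfo.read_text()))
        print(f"RAM in use: {used_pct:.1f}%  swap in use: {swap_used_kb} kB")
```

A nonzero and growing swap figure would point toward the leak hypothesis; a flat one, as reported above, points elsewhere.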

















performance ubuntu mpi






asked Mar 25 at 21:41

Kevin Johnsrude


1 Answer
Callgrind and KCachegrind might be useful. The Open MPI FAQ section on parallel debuggers [1] may help as well.

[1] https://www.open-mpi.org/faq/?category=debugging#parallel-debuggers
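
As a rough illustration of the Callgrind suggestion, the sketch below builds a command line that starts every MPI rank under Valgrind's Callgrind tool. The helper name `callgrind_mpirun_command` and the `./train` program are hypothetical placeholders; the `%p` in the output file name is expanded by Valgrind to the process ID, so each rank writes its own profile.

```python
import shlex


def callgrind_mpirun_command(app_argv, num_procs, out_pattern="callgrind.out.%p"):
    """Build an argv list that runs each MPI rank under Valgrind's Callgrind tool."""
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    return [
        "mpirun", "-np", str(num_procs),
        "valgrind", "--tool=callgrind",
        # One output file per process: Valgrind replaces %p with the PID.
        f"--callgrind-out-file={out_pattern}",
        *app_argv,
    ]


if __name__ == "__main__":
    # "./train" and its flag are placeholders for the real training program.
    cmd = callgrind_mpirun_command(["./train", "--epochs", "1"], num_procs=4)
    print(shlex.join(cmd))
```

Each resulting profile file can then be opened in KCachegrind. Callgrind slows the program considerably, so this is better suited to a short run with few processes than to the full day-long job.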






answered Mar 25 at 23:06

Adrian Negru
