get dataframe row count based on conditions


I want to get the count of dataframe rows based on conditional selection. I tried the following code.



print df[(df.IP == head.idxmax()) & (df.Method == 'HEAD') & (df.Referrer == '"-"')].count()


output:



IP 57
Time 57
Method 57
Resource 57
Status 57
Bytes 57
Referrer 57
Agent 57
dtype: int64


The output shows the count for each and every column in the dataframe. Instead, I need a single count of the rows where all of the above conditions are satisfied. How can I do this? If you need more explanation about my dataframe, please let me know.










      python pandas






      asked Jun 26 '13 at 13:56









      Nilani Algiriyage

          2 Answers
            You want the number of rows for which all of the conditions are true,
            so the len of the filtered frame is the answer, unless I misunderstand what you are asking.



            In [17]: from pandas import DataFrame
                     from numpy.random import randn
                     df = DataFrame(randn(20,4),columns=list('ABCD'))

            In [18]: df[(df['A']>0) & (df['B']>0) & (df['C']>0)]
            Out[18]:
                       A         B         C         D
            12  0.491683  0.137766  0.859753 -1.041487
            13  0.376200  0.575667  1.534179  1.247358
            14  0.428739  1.539973  1.057848 -1.254489

            In [19]: df[(df['A']>0) & (df['B']>0) & (df['C']>0)].count()
            Out[19]:
            A    3
            B    3
            C    3
            D    3
            dtype: int64

            In [20]: len(df[(df['A']>0) & (df['B']>0) & (df['C']>0)])
            Out[20]: 3
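
To tie this back to the columns in the question, here is a minimal sketch. The rows are invented for illustration (only the column names IP, Method, Referrer and Bytes come from the question), and the literal IP string stands in for head.idxmax(). It also shows why count() is not the right tool here: it counts non-null values per column, so a column containing NaN reports a smaller number than the row count.

```python
import numpy as np
import pandas as pd

# Hypothetical access-log rows; the column names follow the question,
# the values are made up for this example.
df = pd.DataFrame({
    "IP": ["10.0.0.1", "10.0.0.1", "10.0.0.2", "10.0.0.1"],
    "Method": ["HEAD", "HEAD", "GET", "HEAD"],
    "Referrer": ['"-"', '"-"', '"-"', '"http://example.com"'],
    "Bytes": [120.0, np.nan, 300.0, 80.0],
})

# Combine the conditions into one boolean mask. The parentheses are
# required because & binds more tightly than ==.
mask = (df.IP == "10.0.0.1") & (df.Method == "HEAD") & (df.Referrer == '"-"')

matched = df[mask]

row_count = len(matched)        # number of rows in the filtered frame
same_count = matched.shape[0]   # equivalent: first element of .shape

# count() returns a Series of non-null counts, one entry per column.
per_column = matched.count()

print(row_count)            # 2
print(per_column["Bytes"])  # 1, because one matching row has NaN in Bytes
```

If only the number is needed, len(df[mask]) or df[mask].shape[0] gives a single integer regardless of missing values.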






            answered Jun 26 '13 at 14:14









            Jeff

            • Yes! That is what I wanted :) Thanks very much!

              – Nilani Algiriyage
              Jun 26 '13 at 14:39






            • Which one is faster? len(df[(df['A']>0)]) or sum(df['A']>0)?

              – Leandro Lima
              Dec 25 '17 at 17:08


















                For better performance, you do not need to index the dataframe with your predicate just to count rows. You can sum the boolean result of the predicate directly, as illustrated below:



                In [1]: import pandas as pd
                import numpy as np
                df = pd.DataFrame(np.random.randn(20,4),columns=list('ABCD'))


                In [2]: df.head()
                Out[2]:
                          A         B         C         D
                0 -2.019868  1.227246 -0.489257  0.149053
                1  0.223285 -0.087784 -0.053048 -0.108584
                2 -0.140556 -0.299735 -1.765956  0.517803
                3 -0.589489  0.400487  0.107856  0.194890
                4  1.309088 -0.596996 -0.623519  0.020400

                In [3]: %time sum((df['A']>0) & (df['B']>0))
                CPU times: user 1.11 ms, sys: 53 µs, total: 1.16 ms
                Wall time: 1.12 ms
                Out[3]: 4

                In [4]: %time len(df[(df['A']>0) & (df['B']>0)])
                CPU times: user 1.38 ms, sys: 78 µs, total: 1.46 ms
                Wall time: 1.42 ms
                Out[4]: 4


                Keep in mind that this technique only works for counting the rows that satisfy your predicate; it does not give you the matching rows themselves.
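
A single %time run on a 20-row frame is too small to separate the approaches reliably, so here is a rough sketch for comparing them on a larger frame. The frame size, the seed and the helper function names are my own assumptions, not part of the timing above. Note that Python's builtin sum iterates over the Series element by element, while the Series method .sum() and np.count_nonzero reduce the boolean mask in one vectorized call. Timings depend on the machine and the pandas version, so none are quoted here.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed size and seed, chosen only to make the comparison repeatable.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((100_000, 4)), columns=list("ABCD"))


def count_with_len():
    # Builds the filtered frame, then takes its length.
    return len(df[(df["A"] > 0) & (df["B"] > 0)])


def count_with_builtin_sum():
    # Python's builtin sum walks the boolean Series one element at a time.
    return int(sum((df["A"] > 0) & (df["B"] > 0)))


def count_with_series_sum():
    # Series.sum() adds up the True values without building a new frame.
    return int(((df["A"] > 0) & (df["B"] > 0)).sum())


def count_with_numpy():
    # np.count_nonzero counts the True entries of the underlying array.
    mask = (df["A"] > 0) & (df["B"] > 0)
    return int(np.count_nonzero(mask.to_numpy()))


candidates = (count_with_len, count_with_builtin_sum,
              count_with_series_sum, count_with_numpy)

# All four approaches must agree on the count.
counts = [f() for f in candidates]

for f in candidates:
    seconds = timeit.timeit(f, number=5)
    print(f"{f.__name__}: {seconds:.4f} s for 5 runs")
```

Whichever variant turns out fastest on your data, they all return the same integer, so the choice only matters when the frame is large or the count sits in a hot loop.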







                answered Jun 27 '18 at 10:27









                Enias Cailliau
