Count number of rows containing string per index using pandas


I have a data set like this:

index  sentence
1      bobby went to the gym
1      sally the bad
1      days are good
2      sunny side up
2      the weird

I want to count how many times 'the' appears in the 'sentence' column, grouped by index:

index  count_the
1      2
2      1

How would I do this in pandas?


Tags: python-3.x, pandas

asked Mar 27 at 18:17 by song0089

          4 Answers

import pandas as pd

df = pd.DataFrame({'index': [1, 1, 1, 2, 2], 'sentence': ['bobby went to the gym', 'sally the bad', 'days are good', 'sunny side up', 'the weird']})
# Count occurrences of 'the' in each row, then sum the counts per index
df['counts'] = df['sentence'].str.count('the')
print(df.groupby('index')['counts'].sum())

answered Mar 27 at 18:30 by Akhilesh

• Thank you! This worked well. – song0089, Mar 27 at 19:33
• Welcome @song0089 – Akhilesh, Mar 28 at 2:47
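The title asks for rows containing the string, while `str.count` counts every occurrence of the substring, including ones inside longer words. For the sample data both give the same numbers, so the difference is easy to miss. A minimal sketch of the two counts, assuming one hypothetical extra row (index 3, 'the other theme') that is not part of the original data and is added only to make the counts diverge:

```python
import pandas as pd

# Sample data from the question, plus one hypothetical row (index 3)
# added only to show where the two counts differ.
df = pd.DataFrame({
    'index': [1, 1, 1, 2, 2, 3],
    'sentence': ['bobby went to the gym', 'sally the bad', 'days are good',
                 'sunny side up', 'the weird', 'the other theme'],
})

# Every occurrence of the substring 'the', summed per index.
occurrences = df['sentence'].str.count('the').groupby(df['index']).sum()

# Number of rows per index that contain 'the' as a whole word.
rows_with_word = (
    df['sentence'].str.contains(r'\bthe\b', regex=True)
    .astype(int)
    .groupby(df['index'])
    .sum()
)

print(occurrences)
print(rows_with_word)
```

For the hypothetical row, the substring count is 3 ('the', 'other', 'theme') while the whole-word row count is 1. Which of the two is wanted depends on how the question is read.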

















First use GroupBy.apply to join the sentences of each index into one string, then use Series.str.count:

df = df.groupby('index').sentence.apply(' '.join).reset_index()

print(df)
   index                                           sentence
0      1  bobby went to the gym sally the bad days are good
1      2                            sunny side up the weird

df['count_the'] = df.sentence.str.count('the')

print(df.drop(['sentence'], axis=1))
   index  count_the
0      1          2
1      2          1

answered Mar 27 at 18:28 by Erfan
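The join-then-count snippet reassigns `df`, so the original rows are no longer available afterwards. A minimal sketch of the same idea that leaves the input untouched, assuming the sample data from the question (the names `joined` and `result` are illustrative, not from the answer):

```python
import pandas as pd

df = pd.DataFrame({
    'index': [1, 1, 1, 2, 2],
    'sentence': ['bobby went to the gym', 'sally the bad', 'days are good',
                 'sunny side up', 'the weird'],
})

# Join all sentences of a group into one string, separated by spaces.
joined = df.groupby('index')['sentence'].agg(' '.join)

# Count occurrences of 'the' in each joined string and name the result column.
result = joined.str.count('the').rename('count_the').reset_index()
print(result)
```

Because the separator is a space and the pattern contains none, a match cannot span two sentences here. With an empty separator, or a pattern that itself contains spaces, the joined count could differ from the sum of per-row counts.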
























One way is to use findall. Note that I treat the 'index' column as the DataFrame index here:

# assumes df = df.set_index('index')
df.sentence.str.findall(r'\bthe\b').str.len().groupby(level=0).sum()
Out[363]:
index
1    2
2    1
Name: sentence, dtype: int64

answered Mar 27 at 18:41 by WeNYoBen
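The findall approach depends on 'index' being the DataFrame index rather than a regular column. A self-contained sketch of that setup, assuming the sample data from the question; it aggregates with `groupby(level=0).sum()`, since the older `Series.sum(level=0)` spelling was removed in pandas 2.0 (the names `indexed`, `matches_per_row` and `count_the` are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'index': [1, 1, 1, 2, 2],
    'sentence': ['bobby went to the gym', 'sally the bad', 'days are good',
                 'sunny side up', 'the weird'],
})

# Move the 'index' column into the DataFrame index so it becomes level 0.
indexed = df.set_index('index')

# findall returns a list of whole-word matches per row; str.len() counts them.
matches_per_row = indexed['sentence'].str.findall(r'\bthe\b').str.len()

# Sum the per-row counts for each value of the index level.
count_the = matches_per_row.groupby(level=0).sum()
print(count_the)
```

The `\b` word boundaries mean that words such as 'other' or 'theme' would not be counted, unlike a plain substring search.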
























You can also use groupby() + apply():

df.groupby('index').apply(lambda x: x['sentence'].str.contains(r'.*the').sum()).reset_index(name='count_the')

or groupby() + agg():

df.groupby('index')['sentence'].agg(lambda x: x.str.contains(r'.*the').sum()).reset_index(name='count_the')

answered Mar 27 at 19:57 by Loochie, edited Mar 27 at 20:24
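The same per-group row count can also be written with named aggregation, which sets the output column name directly instead of passing it to `reset_index`. A minimal sketch, assuming the sample data from the question and pandas 0.25 or later; the plain pattern 'the' is used here because `str.contains` already searches anywhere in the string, so a leading `.*` is not needed:

```python
import pandas as pd

df = pd.DataFrame({
    'index': [1, 1, 1, 2, 2],
    'sentence': ['bobby went to the gym', 'sally the bad', 'days are good',
                 'sunny side up', 'the weird'],
})

# Named aggregation: output column 'count_the' is computed from 'sentence'.
# str.contains gives one boolean per row, so the sum is a count of rows.
result = (
    df.groupby('index')
      .agg(count_the=('sentence', lambda s: int(s.str.contains('the').sum())))
      .reset_index()
)
print(result)
```

The result is a DataFrame with the columns 'index' and 'count_the', matching the layout asked for in the question.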





























