


Python Pandas Ratio of values in group to group total for each group

















I want to find the ratio of each count in a group to that group's total count, while keeping the other columns. I used a groupby to transform my data into a data frame like the example below: I grouped by injury time and then by incident type to get the count of each incident type per month.



Instead of the raw count, though, I want each value to be the count divided by the total count of incidents for that month.



For example, suppose the data frame looks like this:

Injury_Time  Incident_Type  Count
2017-01      Slip               4
2017-01      Concussion        12
2017-01      Struck by         19
2017-01      Exposure           5
2017-02      Slip              28
2017-02      Concussion        10
2017-02      Struck by          2
2017-02      Exposure          10
...          ...              ...


Instead, I want the data frame to look like this:

Injury_Time  Incident_Type  Count
2017-01      Slip           0.1
2017-01      Concussion     0.3
2017-01      Struck by      0.475
2017-01      Exposure       0.125
2017-02      Slip           0.56
2017-02      Concussion     0.2
2017-02      Struck by      0.04
2017-02      Exposure       0.2
...          ...            ...


For example, the Slip row for 2017-01 is calculated as 4/40 = 0.1, since that group's total is 4 + 12 + 19 + 5 = 40. The Slip row in the second group (2017-02) is 28/50 = 0.56, since that group's total is 28 + 10 + 2 + 10 = 50. The same is done for every value in every group.



Is there a good method of doing this for each group in the data frame?



Here is the code for creating the example data frame.



import pandas as pd

# One row per (month, incident type) with its count
df = pd.DataFrame(
    [
        ["2017-01", "Slip", 4],
        ["2017-01", "Concussion", 12],
        ["2017-01", "Struck by", 19],
        ["2017-01", "Exposure", 5],
        ["2017-02", "Slip", 28],
        ["2017-02", "Concussion", 10],
        ["2017-02", "Struck by", 2],
        ["2017-02", "Exposure", 10],
    ],
    columns=["Injury_Time", "Incident_Type", "Count"],
)


Please let me know if you have any questions.



Thank you for your help.










Tags: python, pandas, numpy, aggregate, pandas-groupby

asked Mar 27 at 23:12 by mrsquid

























          1 Answer















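You can use transform here. Unlike an aggregation, transform("sum") returns the group total repeated on every row of that group, so the result keeps the original index and can be divided into the Count column directly:

In [11]: df.groupby("Injury_Time")["Count"].transform("sum")
Out[11]:
0    40
1    40
2    40
3    40
4    50
5    50
6    50
7    50
Name: Count, dtype: int64

In [12]: df["Count"] / df.groupby("Injury_Time")["Count"].transform("sum")
Out[12]:
0    0.100
1    0.300
2    0.475
3    0.125
4    0.560
5    0.200
6    0.040
7    0.200
Name: Count, dtype: float64

See the split-apply-combine (group by) section of the pandas docs.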
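The division above yields a Series, while the question asks for a data frame that keeps the other columns. One way to get there, as a minimal sketch rather than the only approach, is to attach the result as a column. The column name `Ratio` and the variable names `group_total` and `result` are assumptions made for this example and do not come from the original post.

```python
import pandas as pd

# Example data from the question
df = pd.DataFrame(
    [
        ["2017-01", "Slip", 4],
        ["2017-01", "Concussion", 12],
        ["2017-01", "Struck by", 19],
        ["2017-01", "Exposure", 5],
        ["2017-02", "Slip", 28],
        ["2017-02", "Concussion", 10],
        ["2017-02", "Struck by", 2],
        ["2017-02", "Exposure", 10],
    ],
    columns=["Injury_Time", "Incident_Type", "Count"],
)

# Monthly total, repeated on every row of that month (same index as df)
group_total = df.groupby("Injury_Time")["Count"].transform("sum")

# Keep the original columns and add the share of the monthly total.
# "Ratio" is a hypothetical column name chosen for this sketch.
result = df.assign(Ratio=df["Count"] / group_total)

print(result)
```

To match the desired output in the question exactly, where the `Count` column itself holds the ratio, the same expression could be assigned back with `df["Count"] = df["Count"] / group_total` instead of creating a new column.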
















answered Mar 27 at 23:22 by Andy Hayden




















