count rolling changes in singular column by ID and Week Number


I've been trying to work on this problem for the last few hours with no luck.



I have a dataframe as follows:



import pandas as pd

id = [1,1,1,2,2,2]
weeks = [1,2,3,1,2,3]
contr = [16,16,22,37,37,16]

df = pd.DataFrame({'ID': id,
                   'Week': weeks,
                   'Contract': contr})

print(df)
   ID  Week  Contract
0   1     1        16
1   1     2        16
2   1     3        22
3   2     1        37
4   2     2        37
5   2     3        16


Now, what I'm trying to do is count the number of changes in Contract for each ID in a given week (the sample here is small; my real df is around 1.8 million rows).



So what I thought could work was a rolling count of values that aren't equal to the one above, which I've tried by playing around with this code:



df['count'] = df['ID'].groupby((df['Contract'] != df['Contract'].shift(-1)).cumsum()).cumcount()


but this is not giving me the desired result.
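For reference, the same line broken into steps, with notes on what each part is actually doing:

key = (df['Contract'] != df['Contract'].shift(-1)).cumsum()
# - shift(-1) compares each row with the NEXT row, so the run boundary lands
#   one row before the actual change
# - the key keeps growing across the whole frame and never resets per ID,
#   so a single run can span two different IDs

df['count'] = df['ID'].groupby(key).cumcount()
# - cumcount() numbers the rows inside each run (0, 1, 2, ...) rather than
#   flagging the week in which the change happened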



What I'm after is something like the following:



 print(df)
   ID  Week  Contract  count
0   1     1        16      0   # first row for this ID, so ignored
1   1     2        16      0   # no change, so 0
2   1     3        22      1   # change here, so 1
3   2     1        37      0
4   2     2        37      0
5   2     3        16      1
6   2     4        16      0   # should be 0, as the change was in the previous week


(If this doesn't meet the requirements for a minimal question, please let me know.)










python pandas

edited Mar 26 at 14:57
asked Mar 26 at 14:30 by Datanovice
          2 Answers






I think we can use diff to flag whether the value changed or not, then we need another groupby to cumsum by ID:



# True whenever Contract differs from the previous row within the same ID
# (diff() is NaN on the first row of each ID, and NaN != 0 is True, so the
#  first row of each ID is flagged as well)
s = df.groupby('ID').Contract.diff().ne(0)

s.groupby(df['ID']).cumsum() - 1
Out[33]:
0    0.0
1    0.0
2    1.0
3    0.0
4    0.0
5    1.0
Name: Contract, dtype: float64

df['Count'] = s.groupby(df['ID']).cumsum() - 1





answered Mar 26 at 14:33 by WeNYoBen
• Awesome, this works! The only downside (this is my fault, nothing to do with you) is that the consecutive changes are also listed with a 1; would there be a way to list these as a 0 with this line of code?
– Datanovice, Mar 26 at 14:53











• @Datanovice would you like to show us your sample data and your expected output? I think I can fix it.
– WeNYoBen, Mar 26 at 14:55












• Made a slight edit to the expected output; with the code above I get a 1 where in the final output I'd like a 0. If not, that's fine, I can work around it.
– Datanovice, Mar 26 at 14:58











• @Datanovice I made a mistake and pasted the wrong one; it should be (s.groupby(df['ID']).cumsum()-1).mask(df.duplicated(['ID','Contract']), 0)
– WeNYoBen, Mar 26 at 15:10












• You are an absolute genius. Thank you Wen!
– Datanovice, Mar 26 at 15:26
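Putting the answer and the mask fix from the comments together, a minimal end-to-end sketch on the sample data might look like this:

import pandas as pd

df = pd.DataFrame({'ID': [1, 1, 1, 2, 2, 2],
                   'Week': [1, 2, 3, 1, 2, 3],
                   'Contract': [16, 16, 22, 37, 37, 16]})

# True where Contract differs from the previous row within the same ID
s = df.groupby('ID').Contract.diff().ne(0)

# Running count of changes per ID, then zero out any row whose (ID, Contract)
# pair has already appeared, so the weeks after a change drop back to 0
df['Count'] = (s.groupby(df['ID']).cumsum() - 1).mask(df.duplicated(['ID', 'Contract']), 0)

print(df)

Note that the mask zeroes out every week whose (ID, Contract) pair has been seen before, which matches the expected output above; a contract that later reverts to an earlier value would also be zeroed, which may or may not be what you want.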


















Using apply:

# Within each ID, flag the first occurrence of each Contract value and count
# how many distinct values have appeared so far (0-based)
df['Count'] = df.groupby('ID')['Contract'].apply(lambda x: (~x.duplicated()).cumsum() - 1)
# or: df.groupby('ID')['Contract'].transform(lambda x: pd.factorize(x)[0])
print(df)

   ID  Week  Contract  Count
0   1     1        16      0
1   1     2        16      0
2   1     3        22      1
3   2     1        37      0
4   2     2        37      0
5   2     3        16      1
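As a quick check on the commented-out factorize alternative: pd.factorize assigns codes 0, 1, 2, ... in order of first appearance, which matches the (~x.duplicated()).cumsum() - 1 result on this sample. A small standalone illustration (the two only diverge if a contract later reverts to an earlier value):

import pandas as pd

values = [16, 16, 22, 16]                # a revert to 16 at the end, for illustration

codes, uniques = pd.factorize(values)
print(codes)                             # [0 0 1 0]  -> label of the current value
print(uniques)                           # [16 22]

s = pd.Series(values)
print((~s.duplicated()).cumsum() - 1)    # 0, 0, 1, 1 -> distinct values seen so far, minus 1

Unlike the diff-based answer, neither of these counts a revert to an earlier contract as a new change (the diff-based count would tick up again).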





answered Mar 26 at 14:48 by anky_91
• Thanks Anky, tested this and it works the same as Wen's!
– Datanovice, Mar 26 at 14:58
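Since the real frame is around 1.8 million rows, a fully vectorized variant of the same duplicated() idea, avoiding the per-group Python lambda, may be worth benchmarking; a rough sketch:

# Mark the first time each (ID, Contract) pair appears (the rows are assumed
# to be ordered by Week within each ID, as in the sample)
new_value = ~df.duplicated(['ID', 'Contract'])

# A single groupby cumsum instead of a Python lambda per group
df['Count'] = new_value.groupby(df['ID']).cumsum() - 1

This should give the same result as the apply version on the sample above, with the per-group work done in one cumsum call.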












