Subtract rows varying one column but keeping others fixed
















I have an experiment in which I need to subtract the values of two different treatments from the Control (baseline). The subtraction must be done within matching values of the other columns, Block and Year (the year sampled).



Dummy data frame:



    df <- data.frame(Treatment = c("Control", "Treat1", "Treat2"),
                     Block = rep(1:3, each = 3),
                     Year = rep(2011:2013, each = 3),
                     Value = c(6, 12, 4, 3, 9, 5, 6, 3, 1))
    df

      Treatment Block Year Value
    1   Control     1 2011     6
    2    Treat1     1 2011    12
    3    Treat2     1 2011     4
    4   Control     2 2012     3
    5    Treat1     2 2012     9
    6    Treat2     2 2012     5
    7   Control     3 2013     6
    8    Treat1     3 2013     3
    9    Treat2     3 2013     1


Desired output:



           Treatment Block Year Value
    1 Control-Treat1     1 2011    -6
    2 Control-Treat2     1 2011     2
    3 Control-Treat1     2 2012    -6
    4 Control-Treat2     2 2012    -2
    5 Control-Treat1     3 2013     3
    6 Control-Treat2     3 2013     5


Any suggestions, preferably using dplyr?



I have found similar questions but none addressing this specific issue.










      r dataframe dplyr














edited Mar 22 at 12:37 by Ronak Shah










asked Mar 22 at 12:00 by Lucas






















          5 Answers














We can use dplyr: group_by Block, subtract each Value from the Value of the "Control" row in the same group, and then remove the "Control" rows.



    library(dplyr)

    df %>%
      group_by(Block) %>%
      # which.max() returns the position of the first "Control" row in the group
      mutate(Value = Value[which.max(Treatment == "Control")] - Value) %>%
      filter(Treatment != "Control")

    #   Treatment Block  Year Value
    #   <fct>     <int> <int> <dbl>
    # 1 Treat1        1  2011    -6
    # 2 Treat2        1  2011     2
    # 3 Treat1        2  2012    -6
    # 4 Treat2        2  2012    -2
    # 5 Treat1        3  2013     3
    # 6 Treat2        3  2013     5


I am not sure whether the Treatment values in the expected output (Control-Treat1, Control-Treat2) are shown only to illustrate the calculation or whether the OP really wants them in the output. If they are needed, we can use



    df %>%
      group_by(Block) %>%
      mutate(Value = Value[which.max(Treatment == "Control")] - Value,
             Treatment = paste0("Control-", Treatment)) %>%
      filter(Treatment != "Control-Control")

    #   Treatment      Block  Year Value
    #   <chr>          <int> <int> <dbl>
    # 1 Control-Treat1     1  2011    -6
    # 2 Control-Treat2     1  2011     2
    # 3 Control-Treat1     2  2012    -6
    # 4 Control-Treat2     2  2012    -2
    # 5 Control-Treat1     3  2013     3
    # 6 Control-Treat2     3  2013     5
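The answer above groups by Block only, which is enough for the sample data because each Block has a single Year. Below is a minimal sketch of the same idea that groups by both Block and Year, as the question describes, and first checks that every group has exactly one Control row. The names control_counts and res are only illustrative, and the `.groups` argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

df <- data.frame(Treatment = c("Control", "Treat1", "Treat2"),
                 Block = rep(1:3, each = 3),
                 Year = rep(2011:2013, each = 3),
                 Value = c(6, 12, 4, 3, 9, 5, 6, 3, 1))

# Guard (an addition, not part of the answer): each Block/Year group
# should contain exactly one Control row, otherwise the subtraction is ambiguous.
control_counts <- df %>%
  group_by(Block, Year) %>%
  summarise(n_control = sum(Treatment == "Control"), .groups = "drop")
stopifnot(all(control_counts$n_control == 1))

res <- df %>%
  group_by(Block, Year) %>%
  # the single Control value is recycled across the rows of its group
  mutate(Value = Value[Treatment == "Control"] - Value) %>%
  ungroup() %>%
  filter(Treatment != "Control") %>%
  mutate(Treatment = paste0("Control-", Treatment))

res
```

If a group had no Control row or more than one, the guard would stop with an error instead of letting mutate() fail later or pick an arbitrary baseline.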





          • Exactly what I was looking for, thank you very much!

            – Lucas
            Mar 22 at 14:43
































A somewhat different tidyverse possibility:

    library(dplyr)
    library(tidyr)

    df %>%
      spread(Treatment, Value) %>%
      gather(var, val, -c(Block, Year, Control)) %>%
      mutate(Value = Control - val,
             Treatment = paste("Control", var, sep = " - ")) %>%
      select(Treatment, Block, Year, Value) %>%
      arrange(Block)

             Treatment Block Year Value
    1 Control - Treat1     1 2011    -6
    2 Control - Treat2     1 2011     2
    3 Control - Treat1     2 2012    -6
    4 Control - Treat2     2 2012    -2
    5 Control - Treat1     3 2013     3
    6 Control - Treat2     3 2013     5
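spread() and gather() still work but are superseded in current tidyr. Here is a sketch of the same reshaping with pivot_wider() and pivot_longer(), assuming tidyr 1.0.0 or later; the result name res is illustrative.

```r
library(dplyr)
library(tidyr)

df <- data.frame(Treatment = c("Control", "Treat1", "Treat2"),
                 Block = rep(1:3, each = 3),
                 Year = rep(2011:2013, each = 3),
                 Value = c(6, 12, 4, 3, 9, 5, 6, 3, 1))

res <- df %>%
  # one column per treatment: Control, Treat1, Treat2
  pivot_wider(names_from = Treatment, values_from = Value) %>%
  # back to long form, keeping Control as its own column
  pivot_longer(cols = -c(Block, Year, Control),
               names_to = "var", values_to = "val") %>%
  mutate(Value = Control - val,
         Treatment = paste("Control", var, sep = " - ")) %>%
  select(Treatment, Block, Year, Value) %>%
  arrange(Block)

res
```

The wide intermediate makes the baseline explicit as a Control column, so any number of treatment columns can be subtracted from it without naming them.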



















            This can be done with an SQL self join like this:



    library(sqldf)
    sqldf("select a.Treatment || '-' || b.Treatment as Treatment,
                  a.Block,
                  a.Year,
                  a.Value - b.Value as Value
           from df a
           join df b on a.block = b.block and
                        a.Treatment = 'Control' and
                        b.Treatment != 'Control'")

giving:

           Treatment Block Year Value
    1 Control-Treat1     1 2011    -6
    2 Control-Treat2     1 2011     2
    3 Control-Treat1     2 2012    -6
    4 Control-Treat2     2 2012    -2
    5 Control-Treat1     3 2013     3
    6 Control-Treat2     3 2013     5
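For readers who prefer to avoid SQL, the same self join can be sketched in base R with merge(). This variant joins on both Block and Year, which is an assumption taken from the question rather than from the query above, and the names control, treated, joined and res are illustrative.

```r
df <- data.frame(Treatment = c("Control", "Treat1", "Treat2"),
                 Block = rep(1:3, each = 3),
                 Year = rep(2011:2013, each = 3),
                 Value = c(6, 12, 4, 3, 9, 5, 6, 3, 1))

# Split the table into its two sides, as the aliases a and b do in the SQL
control <- df[df$Treatment == "Control", ]
treated <- df[df$Treatment != "Control", ]

# Inner join on the keys; non-key columns get the given suffixes
joined <- merge(control, treated, by = c("Block", "Year"),
                suffixes = c(".ctrl", ".trt"))

res <- data.frame(
  Treatment = paste(joined$Treatment.ctrl, joined$Treatment.trt, sep = "-"),
  Block = joined$Block,
  Year = joined$Year,
  Value = joined$Value.ctrl - joined$Value.trt
)

# Make the row order explicit instead of relying on merge()
res <- res[order(res$Block, res$Treatment), ]
rownames(res) <- NULL
res
```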



















Another dplyr/tidyr approach. Unwanted columns can be removed afterwards with select:

    library(tidyr)
    library(dplyr)

    df %>%
      spread(Treatment, Value) %>%
      gather(key, value, Treat1:Treat2) %>%
      group_by(Block, Year, key) %>%
      mutate(Val = Control - value)

    # A tibble: 6 x 6
    # Groups:   Block, Year, key [6]
    #   Block  Year Control key    value   Val
    #   <int> <int>   <dbl> <chr>  <dbl> <dbl>
    # 1     1  2011       6 Treat1    12    -6
    # 2     2  2012       3 Treat1     9    -6
    # 3     3  2013       6 Treat1     3     3
    # 4     1  2011       6 Treat2     4     2
    # 5     2  2012       3 Treat2     5    -2
    # 6     3  2013       6 Treat2     1     5


To get closer to the requested output (the Control column is still kept and the difference is named Val):

    df %>%
      spread(Treatment, Value) %>%
      gather(key, value, Treat1:Treat2) %>%
      mutate(Treatment = paste0("Control-", key)) %>%
      group_by(Block, Year, Treatment) %>%
      mutate(Val = Control - value) %>%
      select(Treatment, everything(), -value, -key) %>%
      arrange(Year)


              Result:



    # A tibble: 6 x 5
    # Groups:   Block, Year, Treatment [6]
    #   Treatment      Block  Year Control   Val
    #   <chr>          <int> <int>   <dbl> <dbl>
    # 1 Control-Treat1     1  2011       6    -6
    # 2 Control-Treat2     1  2011       6     2
    # 3 Control-Treat1     2  2012       3    -6
    # 4 Control-Treat2     2  2012       3    -2
    # 5 Control-Treat1     3  2013       6     3
    # 6 Control-Treat2     3  2013       6     5



















Another tidyverse solution. We can use filter to split the "Control" rows and the treatment rows into separate data frames, use left_join to combine them by Block and Year, and then compute the difference and tidy up the columns.



    library(tidyverse)

    df2 <- df %>%
      filter(!Treatment %in% "Control") %>%
      left_join(df %>% filter(Treatment %in% "Control"),
                .,
                by = c("Block", "Year")) %>%
      mutate(Value = Value.x - Value.y) %>%
      unite(Treatment, Treatment.x, Treatment.y, sep = "-") %>%
      select(names(df))
    #        Treatment Block Year Value
    # 1 Control-Treat1     1 2011    -6
    # 2 Control-Treat2     1 2011     2
    # 3 Control-Treat1     2 2012    -6
    # 4 Control-Treat2     2 2012    -2
    # 5 Control-Treat1     3 2013     3
    # 6 Control-Treat2     3 2013     5
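One thing to keep in mind with the join-based approach is that treatment rows whose Block/Year has no Control row drop out of the result silently. The sketch below uses a hypothetical extra row that is not in the original data to show how anti_join() can list such rows before the subtraction. The names df_extra, controls, treated and unmatched are illustrative.

```r
library(dplyr)

df <- data.frame(Treatment = c("Control", "Treat1", "Treat2"),
                 Block = rep(1:3, each = 3),
                 Year = rep(2011:2013, each = 3),
                 Value = c(6, 12, 4, 3, 9, 5, 6, 3, 1))

# Hypothetical row with no Control in its Block/Year (not in the original data)
df_extra <- rbind(df,
                  data.frame(Treatment = "Treat1", Block = 4L,
                             Year = 2014L, Value = 7))

controls <- df_extra %>% filter(Treatment == "Control")
treated  <- df_extra %>% filter(Treatment != "Control")

# Treatment rows that have no matching Control row on the join keys
unmatched <- anti_join(treated, controls, by = c("Block", "Year"))
unmatched
```

An empty result means every treatment row has a baseline to be compared against.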





The group_by/mutate answer was posted by Ronak Shah (answered Mar 22 at 12:04, edited Mar 22 at 12:11).












The spread/gather answer was posted by tmfmnk (answered Mar 22 at 12:14).





















answered Mar 22 at 13:25 by G. Grothendieck (edited Mar 25 at 13:12)



































Another dplyr-tidyr approach (unwanted columns can be removed with select):

    library(tidyr)
    library(dplyr)

    dummy_df %>%
      spread(Treatment, Value) %>%            # one column per treatment level
      gather(key, value, Treat1:Treat2) %>%   # back to long form, keeping Control as a column
      group_by(Block, Year, key) %>%
      mutate(Val = Control - value)
    # A tibble: 6 x 6
    # Groups:   Block, Year, key [6]
      Block  Year Control key    value   Val
      <int> <int>   <dbl> <chr>  <dbl> <dbl>
    1     1  2011       6 Treat1    12    -6
    2     2  2012       3 Treat1     9    -6
    3     3  2013       6 Treat1     3     3
    4     1  2011       6 Treat2     4     2
    5     2  2012       3 Treat2     5    -2
    6     3  2013       6 Treat2     1     5

To get closer to the requested layout:

    dummy_df %>%
      spread(Treatment, Value) %>%
      gather(key, value, Treat1:Treat2) %>%
      mutate(Treatment = paste0("Control-", key)) %>%
      group_by(Block, Year, Treatment) %>%
      mutate(Val = Control - value) %>%
      select(Treatment, everything(), -value, -key) %>%
      arrange(Year)

Result:

    # A tibble: 6 x 5
    # Groups:   Block, Year, Treatment [6]
      Treatment      Block  Year Control   Val
      <chr>          <int> <int>   <dbl> <dbl>
    1 Control-Treat1     1  2011       6    -6
    2 Control-Treat2     1  2011       6     2
    3 Control-Treat1     2  2012       3    -6
    4 Control-Treat2     2  2012       3    -2
    5 Control-Treat1     3  2013       6     3
    6 Control-Treat2     3  2013       6     5
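In tidyr 1.0.0 and later, spread() and gather() are superseded by pivot_wider() and pivot_longer(). A minimal sketch of the same reshape with the newer functions follows. The sample data is reconstructed from the values printed above (column types are assumptions), and the final select() drops the Control column, which the pipeline above keeps.

```r
library(dplyr)
library(tidyr)

# Sample data reconstructed from the values shown in this answer
# (the name `dummy_df` follows the answer; column types are assumptions).
dummy_df <- data.frame(
  Treatment = rep(c("Control", "Treat1", "Treat2"), each = 3),
  Block     = rep(1:3, times = 3),
  Year      = rep(2011:2013, times = 3),
  Value     = c(6, 3, 6, 12, 9, 3, 4, 5, 1),
  stringsAsFactors = FALSE
)

res <- dummy_df %>%
  # One row per Block/Year, one column per treatment level.
  pivot_wider(names_from = Treatment, values_from = Value) %>%
  # Back to long form for every column except the keys and Control.
  pivot_longer(cols = -c(Block, Year, Control),
               names_to = "key", values_to = "value") %>%
  mutate(Treatment = paste0("Control-", key),
         Value     = Control - value) %>%
  select(Treatment, Block, Year, Value) %>%
  arrange(Block, Treatment)

res
```

Selecting the columns to lengthen by exclusion, rather than as Treat1:Treat2, means the sketch does not depend on how many treatment levels there are or what they are called.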





answered Mar 22 at 12:07 by NelsonGon (edited Mar 22 at 12:19)



































Another tidyverse solution. We can use filter to split the "Control" rows and the treatment rows into separate data frames, use left_join to combine them by Block and Year, and then compute the difference and rebuild the Treatment label.

    library(tidyverse)

    df2 <- df %>%
      filter(!Treatment %in% "Control") %>%
      # Control rows are the left table (.x); the piped treatment rows are the right table (.y)
      left_join(df %>% filter(Treatment %in% "Control"),
                .,
                by = c("Block", "Year")) %>%
      mutate(Value = Value.x - Value.y) %>%
      unite(Treatment, Treatment.x, Treatment.y, sep = "-") %>%
      select(names(df))
    #        Treatment Block Year Value
    # 1 Control-Treat1     1 2011    -6
    # 2 Control-Treat2     1 2011     2
    # 3 Control-Treat1     2 2012    -6
    # 4 Control-Treat2     2 2012    -2
    # 5 Control-Treat1     3 2013     3
    # 6 Control-Treat2     3 2013     5





answered Mar 22 at 13:59 by www


























