R/tidyverse: calculating standard deviation across rows
Say I have the following data:

    colA <- c("SampA", "SampB", "SampC")
    colB <- c(21, 20, 30)
    colC <- c(15, 14, 12)
    colD <- c(10, 22, 18)
    df <- data.frame(colA, colB, colC, colD)
    df
    #    colA colB colC colD
    # 1 SampA   21   15   10
    # 2 SampB   20   14   22
    # 3 SampC   30   12   18

I want to get the row means and standard deviations for the values in columns B-D.

I can calculate the rowMeans as follows:

    library(dplyr)
    df %>% select(., matches("colB|colC|colD")) %>% mutate(rmeans = rowMeans(.))
    #   colB colC colD   rmeans
    # 1   21   15   10 15.33333
    # 2   20   14   22 18.66667
    # 3   30   12   18 20.00000

But when I try to calculate the standard deviation using sd(), it throws an error.

    df %>% select(., matches("colB|colC|colD")) %>% mutate(rsds = sapply(., sd(.)))
    Error in is.data.frame(x) :
      (list) object cannot be coerced to type 'double'

So my question is: how do I calculate the standard deviations here?

Edit: I tried sapply() with sd() having read the first answer here.

Additional edit: not necessarily looking for a 'tidy' solution (base R also works just fine).

Tags: r, dplyr, statistics
asked Mar 24 at 18:29 by Dunois, edited Mar 24 at 19:20
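As a reading aid for the error in the question, here is a minimal base R sketch (an editorial illustration, not part of the original post). It assumes the same `df` as above; the names `num`, `whole_df_error`, `col_sds` and `row_sds` are made up for this example. The likely cause is that `sd(.)` is evaluated on the whole data frame before `sapply()` ever runs, and that even a corrected `sapply(., sd)` would iterate over columns rather than rows.

```r
df <- data.frame(
  colA = c("SampA", "SampB", "SampC"),
  colB = c(21, 20, 30),
  colC = c(15, 14, 12),
  colD = c(10, 22, 18)
)
num <- df[c("colB", "colC", "colD")]  # hypothetical helper: numeric columns only

# sd() on a whole data frame fails; capture the message instead of stopping
whole_df_error <- tryCatch(sd(num), error = function(e) conditionMessage(e))

# sapply() visits columns, so this yields one value per column, not per row
col_sds <- sapply(num, sd)

# apply() with MARGIN = 1 hands each row to sd() as a numeric vector
row_sds <- apply(num, 1, sd)
df$rsds <- row_sds
```

For a small table like this one, `apply()` is probably the shortest base R route; the answers below show package-based alternatives.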
5 Answers
Try this, using rowSds from the matrixStats package:

    library(dplyr)
    library(matrixStats)

    columns <- c('colB', 'colC', 'colD')

    df %>%
      mutate(Mean = rowMeans(.[columns]), stdev = rowSds(as.matrix(.[columns])))

Returns

       colA colB colC colD     Mean    stdev
    1 SampA   21   15   10 15.33333 5.507571
    2 SampB   20   14   22 18.66667 4.163332
    3 SampC   30   12   18 20.00000 9.165151

Your data

    colA <- c("SampA", "SampB", "SampC")
    colB <- c(21, 20, 30)
    colC <- c(15, 14, 12)
    colD <- c(10, 22, 18)
    df <- data.frame(colA, colB, colC, colD)
    df
answered Mar 24 at 18:40 by Hector Haffenden
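If adding matrixStats is not an option, the same row-wise statistic can be sketched with vectorised base R functions only. This is an editorial illustration rather than part of the answer; `m` and `row_sd` are hypothetical names, and the formula is the usual sample standard deviation with an n - 1 denominator, which is what sd() computes.

```r
df <- data.frame(
  colA = c("SampA", "SampB", "SampC"),
  colB = c(21, 20, 30),
  colC = c(15, 14, 12),
  colD = c(10, 22, 18)
)
columns <- c("colB", "colC", "colD")
m <- as.matrix(df[columns])

# Subtracting rowMeans(m) recycles down the columns, so every cell has its
# own row mean removed; the squared deviations are then summed per row.
row_sd <- sqrt(rowSums((m - rowMeans(m))^2) / (ncol(m) - 1))

df$Mean <- rowMeans(m)
df$stdev <- row_sd
```

The result should match `apply(m, 1, sd)` while avoiding a per-row function call, which may matter for wide or long tables.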
Here is another way, using pmap to get the rowwise mean and sd:

    library(purrr)
    library(dplyr)
    library(tidyr)

    f1 <- function(x) tibble(Mean = mean(x), SD = sd(x))

    df %>%
      # select the numeric columns
      select_if(is.numeric) %>%
      # apply f1 rowwise to get the mean and sd in transmute
      transmute(out = pmap(., ~ f1(c(...)))) %>%
      # unnest the list column
      unnest %>%
      # bind with the original dataset
      bind_cols(df, .)
    #    colA colB colC colD     Mean       SD
    # 1 SampA   21   15   10 15.33333 5.507571
    # 2 SampB   20   14   22 18.66667 4.163332
    # 3 SampC   30   12   18 20.00000 9.165151
answered Mar 24 at 21:31 by akrun
I'm sure this has probably been asked somewhere (and I can't seem to get an answer from a quick Google search), but what is the significance of c(...)? – Dunois Mar 24 at 21:42

@Dunois We are capturing all the row elements with ... and concatenating (c) them into a vector. – akrun Mar 25 at 3:54
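The `c(...)` idiom from the comment above can be looked at in isolation. The sketch below is an editorial illustration: it uses an explicit `function(...)` in place of the `~` shorthand so the role of `...` is visible, and `num`, `row_vectors` and `row_sds` are hypothetical names.

```r
library(purrr)

df <- data.frame(
  colA = c("SampA", "SampB", "SampC"),
  colB = c(21, 20, 30),
  colC = c(15, 14, 12),
  colD = c(10, 22, 18)
)
num <- df[c("colB", "colC", "colD")]

# pmap() calls the function once per row and passes that row's cells as
# separate arguments; `...` collects them and c() joins them into one vector.
row_vectors <- pmap(num, function(...) c(...))
row_vectors[[1]]  # the first row as a numeric vector holding 21, 15, 10

# pmap_dbl() does the same but returns a plain numeric vector of results
row_sds <- pmap_dbl(num, function(...) sd(c(...)))
```

Using `pmap_dbl()` rather than `pmap()` keeps the result a numeric vector instead of a list, which tends to be easier to work with inside `mutate()`.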
Package magrittr pipes (%>%) are not a good way to process by rows.
Maybe the following is what you want.

    library(dplyr)

    df %>%
      select(-colA) %>%
      t() %>% as.data.frame() %>%
      summarise_all(sd)
    #         V1       V2       V3
    # 1 5.507571 4.163332 9.165151
answered Mar 24 at 18:40 by Rui Barradas
Thank you for pointing that out. I am never sure when to attempt the tidyverse approach and when to stick to base R. I should have probably mentioned in the OP that I wasn't necessarily looking for a piped solution? – Dunois Mar 24 at 18:49

@Dunois Maybe yes, but the question is tagged tidyverse and pipes are a really nice way to process data. I mentioned it mostly because I tried rowwise() and couldn't get it to work and so resorted to t() %>% as.data.frame(). – Rui Barradas Mar 24 at 19:02

Here's a way to make rowwise work: df %>% rowwise() %>% summarize(sd = sd(c(colB,colC,colD))) – Moody_Mudskipper Mar 25 at 14:34

@Moody_Mudskipper You should post it as an answer. – Rui Barradas Mar 25 at 15:10
A different tidyverse approach could be:

    library(tidyverse)

    df %>%
      rowid_to_column() %>%
      gather(var, val, -c(colA, rowid)) %>%
      group_by(rowid) %>%
      summarise(rsds = sd(val)) %>%
      left_join(df %>%
                  rowid_to_column(), by = c("rowid" = "rowid")) %>%
      select(-rowid)

       rsds colA   colB  colC  colD
      <dbl> <fct> <dbl> <dbl> <dbl>
    1  5.51 SampA    21    15    10
    2  4.16 SampB    20    14    22
    3  9.17 SampC    30    12    18

First, it creates a row ID. Second, it performs a wide-to-long transformation of every column except colA and the row ID. Third, it groups by row ID and calculates the standard deviation. Finally, it joins the result back to the original df on row ID.
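The four steps just described can also be written with pivot_longer(), which superseded gather() in tidyr 1.0.0. The following is an editorial sketch under that version assumption, not the answer author's code; `row_sds` and `result` are hypothetical names.

```r
library(dplyr)
library(tidyr)
library(tibble)

df <- data.frame(
  colA = c("SampA", "SampB", "SampC"),
  colB = c(21, 20, 30),
  colC = c(15, 14, 12),
  colD = c(10, 22, 18)
)

# Steps 1-3: add a row ID, reshape wide-to-long, summarise per row ID
row_sds <- df %>%
  rowid_to_column() %>%
  pivot_longer(cols = c(colB, colC, colD), names_to = "var", values_to = "val") %>%
  group_by(rowid) %>%
  summarise(rsds = sd(val))

# Step 4: join the per-row result back onto the original data by row ID
result <- df %>%
  rowid_to_column() %>%
  left_join(row_sds, by = "rowid") %>%
  select(-rowid)
```

Joining onto `df` rather than onto the summary keeps the original column order, with rsds appended last.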
Or alternatively, using rowwise() and do():

    df %>%
      rowwise() %>%
      do(data.frame(., rsds = sd(unlist(.[2:length(.)]))))

      colA   colB  colC  colD  rsds
    * <fct> <dbl> <dbl> <dbl> <dbl>
    1 SampA    21    15    10  5.51
    2 SampB    20    14    22  4.16
    3 SampC    30    12    18  9.17
answered Mar 24 at 21:24 by tmfmnk, edited Mar 24 at 21:40
You can use pmap, or rowwise (or group by colA) along with mutate:

    library(tidyverse)

    df %>% mutate(sd = pmap(.[-1], ~sd(c(...)))) # same as transform(df, sd = apply(df[-1],1,sd))
    #>    colA colB colC colD       sd
    #> 1 SampA   21   15   10 5.507571
    #> 2 SampB   20   14   22 4.163332
    #> 3 SampC   30   12   18 9.165151

    df %>% rowwise() %>% mutate(sd = sd(c(colB,colC,colD)))
    #> Source: local data frame [3 x 5]
    #> Groups: <by row>
    #>
    #> # A tibble: 3 x 5
    #>   colA   colB  colC  colD    sd
    #>   <fct> <dbl> <dbl> <dbl> <dbl>
    #> 1 SampA    21    15    10  5.51
    #> 2 SampB    20    14    22  4.16
    #> 3 SampC    30    12    18  9.17

    df %>% group_by(colA) %>% mutate(sd = sd(c(colB,colC,colD)))
    #> # A tibble: 3 x 5
    #> # Groups:   colA [3]
    #>   colA   colB  colC  colD    sd
    #>   <fct> <dbl> <dbl> <dbl> <dbl>
    #> 1 SampA    21    15    10  5.51
    #> 2 SampB    20    14    22  4.16
    #> 3 SampC    30    12    18  9.17
answered Mar 25 at 17:29 by Moody_Mudskipper, edited Mar 25 at 17:34
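With dplyr 1.0.0 or later (an assumption; that release postdates this thread), the rowwise() variant can select columns by range with c_across() instead of naming each one. This is an editorial sketch, and `result` is a hypothetical name.

```r
library(dplyr)

df <- data.frame(
  colA = c("SampA", "SampB", "SampC"),
  colB = c(21, 20, 30),
  colC = c(15, 14, 12),
  colD = c(10, 22, 18)
)

result <- df %>%
  rowwise() %>%
  # c_across() gathers the selected columns of the current row into one vector
  mutate(sd = sd(c_across(colB:colD))) %>%
  # drop the row-wise grouping so later verbs operate on whole columns again
  ungroup()
```

The trailing ungroup() is worth keeping, since a tibble left in row-wise mode would make a later summarise() or mutate() run once per row.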