Group by one variable, but summary() over all other variables (mean) in R
I know that there are already some threads about it, but I haven't found one yet about this specific problem.
The dependent variable in my dataset is Y, and I have 144 independent variables (X). Y and each X can only take the values 1 or 0. The data looks like this:
         Y  A469  T593  K022K  A835  Z935  U83F  W5326  ...
Person1  1  1     1     1      0     0     0     0
Person2  1  0     1     0      1     1     0     0
Person3  0  0     0     1      0     0     1     1
...
summary(dataset)
just provides descriptive statistics over all observations. What I want is (in pseudo-code):
summary(all variables if Y == 1 and Y == 0)
It would be great if I could see how often a certain X occurs for each value of Y. For example, mean(X4) = 0.04 and count = 6 if Y = 1.
r group-by mean
Please provide a more complete data set to work with. You can and should use dput to provide sample data.
– NelsonGon
Mar 27 at 15:33
@NelsonGon Bold assertion after it's already been nearly answered. I'm all for reproducible examples, and of course dput() is nicer, but this is plenty clear.
– Gregor
Mar 27 at 15:38
@Gregor it seemed to me that lack of data was making it hard to find the "ideal" solution. My apologies!
– NelsonGon
Mar 27 at 15:40
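For anyone who has not used dput() before, here is a minimal sketch of what the comments above are asking for. The data frame is only a hypothetical reconstruction of the three rows printed in the question (the real data has 144 X columns), and the name `dataset` is taken from the question.

```r
# Hypothetical reconstruction of the three sample rows shown in the question
dataset <- data.frame(
  Y     = c(1, 1, 0),
  A469  = c(1, 0, 0),
  T593  = c(1, 1, 0),
  K022K = c(1, 0, 1),
  A835  = c(0, 1, 0),
  Z935  = c(0, 1, 0),
  U83F  = c(0, 0, 1),
  W5326 = c(0, 0, 1),
  row.names = c("Person1", "Person2", "Person3")
)

# dput() prints R code that recreates the object, so it can be pasted
# straight into a question
dput(dataset)

# For a large data set, sharing only the first rows is usually enough
dput(head(dataset, 2))
```

Pasting the printed structure(...) call into a fresh R session recreates the same data frame, which is why it is preferred over a printed table.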
asked Mar 27 at 14:43 by Textime, edited Mar 27 at 20:22
1 Answer
EDIT 2: After akrun's and Gregor's comments, here is the solution:
library(dplyr)
# Group by Y, record each group's size as n, then average every other column
data_summary <- dataset %>%
  group_by(Y) %>%
  mutate(n = n()) %>%
  summarise_all(mean)
If you want to see more columns than fit on your screen, you can try, e.g.,
- print(data_summary, width = Inf)
- View(data_summary)
- select(data_summary, <<particular columns you want to see>>)
- ...
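As a cross-check that does not depend on dplyr, the same per-group summary can be sketched in base R with aggregate(). This is an alternative to the answer's method, not a restatement of it, and the three-row data frame is a hypothetical reconstruction of the sample in the question. Because every X is 0/1, the group mean is the share of ones and the group sum is the number of ones, which seems to be what the "count" in the question refers to.

```r
# Hypothetical reconstruction of the three sample rows shown in the question
dataset <- data.frame(
  Y     = c(1, 1, 0),
  A469  = c(1, 0, 0),
  T593  = c(1, 1, 0),
  K022K = c(1, 0, 1),
  A835  = c(0, 1, 0),
  Z935  = c(0, 1, 0),
  U83F  = c(0, 0, 1),
  W5326 = c(0, 0, 1)
)

# Mean of every other column within each value of Y
# (for 0/1 data: the share of ones in that group)
means <- aggregate(. ~ Y, data = dataset, FUN = mean)

# Number of ones per column within each value of Y
ones <- aggregate(. ~ Y, data = dataset, FUN = sum)

# Number of rows in each Y group
sizes <- table(Y = dataset$Y)

print(means)
print(ones)
print(sizes)
```

Note that the formula interface of aggregate() drops rows containing NA by default, so with missing values the results can differ from a plain mean().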
I'm getting this error message: Error in UseMethod("groups") : no applicable method for 'groups' applied to an object of class "c('double', 'numeric')"
– Textime
Mar 27 at 14:52
Can you please provide a sample of your data?
– Cettt
Mar 27 at 15:03
I think the count is the issue. You may need n()
– akrun
Mar 27 at 15:07
Try n instead of n()
– Cettt
Mar 27 at 15:20
You're right, it only shows the first 10 variables. They are all there, they just aren't printed. Save the result as data_summary or something and look at it with something other than the default print method for tibbles: print.data.frame(data_summary), View(data_summary), write.csv(data_summary), print(data_summary, width = Inf), etc.
– Gregor
Mar 27 at 15:34
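With 144 variables, another way to read the result is to transpose it so that each X becomes a row and each value of Y becomes a column. The sketch below is only an illustration in base R: the data is simulated, and the names X1..X144 and mean_Y0/mean_Y1 are made up for the example.

```r
# Simulated stand-in for the real data: 20 rows, Y plus 144 binary columns
set.seed(1)
n_rows <- 20
x <- matrix(
  rbinom(n_rows * 144, size = 1, prob = 0.5),
  nrow = n_rows,
  dimnames = list(NULL, paste0("X", 1:144))
)
dataset <- data.frame(Y = rep(c(0, 1), each = 10), x)

# Wide summary: one row per value of Y, one column per X
means <- aggregate(. ~ Y, data = dataset, FUN = mean)

# Transposed summary: one row per X, one column per value of Y
long <- t(means[, -1])
colnames(long) <- paste0("mean_Y", means$Y)

head(long)
```

The transposed matrix prints in full with the default print method, so nothing is hidden the way it is with a wide tibble.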
answered Mar 27 at 14:51 by Cettt, edited Mar 27 at 15:39 by Gregor