How to make density plot correctly show area near the limits?
When I plot densities with ggplot, they seem to be very wrong around the limits. I see that geom_density and other functions allow specifying various density kernels, but none of them seem to fix the issue.
How do you correctly plot densities around the limits with ggplot?
As an example, let's plot the Chi-square distribution with 2 degrees of freedom. Using the builtin probability densities:
library(ggplot2)
u = seq(0, 2, by=0.01)
v = dchisq(u, df=2)
df = data.frame(x=u, p=v)
p = ggplot(df) +
  geom_line(aes(x=x, y=p), size=1) +
  theme_classic() +
  coord_cartesian(xlim=c(0, 2), ylim=c(0, 0.5))
show(p)
We get the expected plot:
Now let's try simulating it and plotting the empirical distribution:
library(ggplot2)
u = rchisq(10000, df=2)
df = data.frame(x=u)
p = ggplot(df) +
  geom_density(aes(x=x)) +
  theme_classic() +
  coord_cartesian(xlim=c(0, 2))
show(p)
We get an incorrect plot:
We can try to visualize the actual distribution:
library(ggplot2)  # library() attaches one package per call; only ggplot2 is used here
u = rchisq(10000, df=2)
df = data.frame(x=u)
p = ggplot(df) +
  geom_point(aes(x=x, y=0.5), position=position_jitter(height=0.2), shape='.', alpha=1) +
  theme_classic() +
  coord_cartesian(xlim=c(0, 2), ylim=c(0, 1))
show(p)
And it seems to look correct, contrary to the density plot:
It seems like the problem has to do with kernels, and geom_density does allow using different kernels. But they don't really correct the limit problem. For example, the geom_density code above with kernel='triangular' looks about the same:
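A quick numeric check of this, as a sketch using base R's density() rather than ggplot2: a symmetric kernel centred at 0 puts about half of its weight on negative x, where there is no data, so the estimate at the boundary comes out at roughly half of the true value whichever kernel is used. The seed and the list of kernels are arbitrary choices for illustration.

```r
# Sketch: compare the kernel density estimate at x = 0 with the true density.
set.seed(1)                      # arbitrary seed, for reproducibility only
x <- rchisq(10000, df = 2)

kernels <- c("gaussian", "epanechnikov", "rectangular", "triangular")

# Estimated density at the boundary for each kernel
# (with from = 0, the first grid point of the estimate is x = 0)
est_at_zero <- sapply(kernels, function(k) {
  d <- density(x, kernel = k, from = 0, to = 2)
  d$y[1]
})

true_at_zero <- dchisq(0, df = 2)   # true density at the boundary

print(round(est_at_zero, 3))
print(true_at_zero)
```

If every entry of est_at_zero sits well below true_at_zero, the kernel shape is not the culprit; the missing mass to the left of 0 is.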
Here's an idea of what I'm expecting to see (of course, I want a density, not a histogram):
library(ggplot2)
u = rchisq(10000, df=2)
df = data.frame(x=u)
p = ggplot(df) +
  geom_histogram(aes(x=x), center=0.1, binwidth=0.2, fill='white', color='black') +
  theme_classic() +
  coord_cartesian(xlim=c(0, 2))
show(p)
r ggplot2 kernel-density
I'm a bit confused by your use of geom_violin - it's usually used where you would use a boxplot, e.g. showing the distribution across multiple discrete categories. When I run the code I also get something that looks different to the image you posted.
– Marius
Mar 24 at 23:15
@Marius I just pasted the wrong code by mistake, man. No need for the stats lecture.
– Wassinger
Apr 10 at 22:21
asked Mar 24 at 23:08, edited Apr 10 at 23:11
Wassinger
1 Answer
The usual kernel density methods have trouble when the support is constrained, as in this case, where the density is positive only above zero. The usual recommendation for handling this has been to use the logspline package:
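install.packages("logspline")  # only needed once
library(logspline)
fit <- logspline(rchisq(10000, 3))  # sample from a chi-square with 3 degrees of freedom
png()       # open a PNG device (default file name Rplot001.png)
plot(fit)   # draw the fitted density
dev.off()   # close the device so the file is written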
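The fit above does not tell logspline where the support starts. As far as I can tell from the package documentation, logspline() accepts an lbound argument for exactly the situation in the question, a density that is zero to the left of 0 and jumps at 0. The following is a sketch for the 2-degrees-of-freedom case; the names x2, fit0 and grid are made up for illustration, and how close the fit gets will depend on the sample and on the knots logspline selects.

```r
library(logspline)

set.seed(1)                          # arbitrary seed, for reproducibility only
x2 <- rchisq(10000, df = 2)

# lbound = 0 declares that the density is zero to the left of 0
fit0 <- logspline(x2, lbound = 0)

# Compare the fitted density with the true one on a grid over [0, 2]
grid <- seq(0, 2, by = 0.01)
max_abs_err <- max(abs(dlogspline(grid, fit0) - dchisq(grid, df = 2)))
print(max_abs_err)
```

A small max_abs_err, in particular near 0, would indicate that the boundary is being handled; plot(fit0) gives the visual version of the same check.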
If this needs to be done in ggplot2, there is a dlogspline function that evaluates the fitted density:
library(ggplot2)
xs <- seq(0, 12, length.out = 1000)                    # grid on which to evaluate the fit
densdf <- data.frame(x = xs, y = dlogspline(xs, fit))
ggplot(densdf, aes(x = x, y = y)) + geom_line()
Perhaps you were insisting on one with 2 degrees of freedom?
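For the 2-degrees-of-freedom case, one alternative that stays with kernel estimates is boundary reflection. This is a different technique from logspline, sketched here under my own assumptions: mirror the sample around 0, estimate the density of the mirrored sample, keep only x >= 0 and double the result so that it still integrates to about 1. The bandwidth is taken from the original sample with bw.nrd0(), which is just one reasonable choice.

```r
library(ggplot2)

set.seed(1)                               # arbitrary seed, for reproducibility only
x <- rchisq(10000, df = 2)

# Reflect the sample around 0 and estimate on the symmetric sample.
bw <- bw.nrd0(x)                          # bandwidth chosen from the original sample
d  <- density(c(x, -x), bw = bw, from = 0, to = 2, n = 201)

# Keep x >= 0 and double the estimate; add the true density for comparison.
refl <- data.frame(x = d$x,
                   y = 2 * d$y,
                   truth = dchisq(d$x, df = 2))

p <- ggplot(refl, aes(x = x)) +
  geom_line(aes(y = y)) +                          # reflected kernel estimate
  geom_line(aes(y = truth), linetype = "dashed") + # true chi-square density
  theme_classic() +
  coord_cartesian(xlim = c(0, 2), ylim = c(0, 0.6))
print(p)
```

Reflection removes most of the drop at 0 but not all of it, because the mirrored density has a kink there. Recent ggplot2 releases (3.4.0 and later, as far as I know) expose the same kind of correction directly as geom_density(aes(x = x), bounds = c(0, Inf)); check the documentation of your installed version before relying on it.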
This looks even less accurate than the density plot in my question.
– Wassinger
Apr 10 at 22:30
I plotted a distribution with three degrees of freedom. You plotted one with 2.
– 42-
Apr 11 at 5:04
answered Mar 24 at 23:52, edited Apr 11 at 5:06
42-