Re: Plotting Distributions with Density plots

Ben Bond-Lamberty

Nov 19, 2012, 11:01:08 AM
to ggp...@googlegroups.com
Hi Sarita,

You didn't include any data with your post, so I don't know exactly why,
but basically the density plots are spread from -1 to 1 because the
underlying data are. I would suggest running commands like summary(z)
or print(subset(z,value<0)) to see what values <0 there are, and to
figure out what's going on.
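
A minimal sketch of that check, using base R only. The data frame here is a made-up stand-in for the melted `z` (the real data were not posted), so the column contents are an assumption:

```r
# Hypothetical stand-in for the melted data frame `z`:
# values drawn from [0, 1], with some NAs from the padded columns.
set.seed(1)
z <- data.frame(
  variable = factor(rep(c("X1", "X2", "X3"), each = 100)),
  value    = c(runif(100), runif(100), runif(60), rep(NA, 40))
)

summary(z$value)                         # min, quartiles, max, and NA count
value_range <- range(z$value, na.rm = TRUE)  # observed range, ignoring NAs
neg <- subset(z, value < 0)              # rows with negative values, if any
nrow(neg)                                # number of such rows
```

If `nrow(neg)` is zero and the range stays inside 0 to 1, the negative part of the plot is not coming from the data itself.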

Side note: instead of
> pdf("files_all.pdf")
> dev.off()

you can simply say "ggsave('files_all.pdf')" after plotting.
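
A short sketch of the ggsave route, assuming ggplot2 is installed. The toy data, the `p` object, and the temporary output path are illustrative choices, not part of the original script:

```r
library(ggplot2)

# Toy data standing in for the real files (assumption).
set.seed(1)
df <- data.frame(
  variable = rep(c("X1", "X2", "X3"), each = 200),
  value    = runif(600)
)

p <- ggplot(df, aes(x = value, fill = variable)) +
  geom_density(alpha = 0.9)

# Write to a temporary directory here; in a real script this would be
# something like "files_all.pdf".
out <- file.path(tempdir(), "files_all.pdf")

# Passing the plot explicitly avoids relying on last_plot().
ggsave(out, plot = p, width = 7, height = 5)
```

The device is picked from the file extension, so no pdf()/dev.off() pair is needed.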

Ben

On Mon, Nov 19, 2012 at 10:52 AM, Sarita Paranjpe <heys...@gmail.com> wrote:
> Hello,
> I am trying to plot the distributions of three data sets, where the values
> are between 0 and 1. The distributions are spread over negative coordinates
> as well on the x-axis. Is it because of smoothing? If so, how can I avoid
> that? Please find the code below and the graph output attached.
>
> a = read.table("file1.txt", sep="\t", header=T)
> b = read.table("file2.txt", sep="\t", header=T)
> c = read.table("file3.txt", sep="\t", header=T)
> a1 = as.matrix(a[,4])
> b1 = as.matrix(b[,4])
> c1 = as.matrix(c[,4])
> cbind.fill<-function(...){
> nm <- list(...)
> nm<-lapply(nm, as.matrix)
> n <- max(sapply(nm, nrow))
> do.call(cbind, lapply(nm, function (x)
> rbind(x, matrix(, n-nrow(x), ncol(x)))))
> }
> bound = cbind.fill(a1,b1,c1)
> y = data.frame(bound)
> z = melt(y)
> pdf("files_all.pdf")
> s = ggplot(z, aes(x=value, fill=variable)) + geom_density(alpha=0.9) +
> xlim(c(-1,1))
> s + scale_fill_manual(values = c("#FF7F00", "#E41A1C", "#984EA3"))
> dev.off()
>
> Regards,
> SP

Ben Bond-Lamberty

Nov 19, 2012, 1:01:20 PM
to ggp...@googlegroups.com
Thanks for sending the data, Sarita. You're right, there are no values
<0...but you're applying an xlim, I see, which probably isn't doing
what you want. That's tossing out data not in the (-1,1) range;
compare to

> ggplot(z, aes(x=value, fill=variable)) + geom_density(alpha=0.9) + coord_cartesian(xlim=c(-1,1))

So: yes, there's definitely an artifact (negative x values) being
produced when an xlim is applied; I don't know if this is an expected
consequence of the stat_density algorithm, or a quirk in the
geom_density display code. But you probably want to be using
coord_cartesian, which works fine. If you *do* want to subset your
data, I'd do it before the call to ggplot in this case.
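
One way to see the difference is to inspect the x values at which each plot evaluates the density. This is a sketch on made-up data (the `df` below is an assumption, not the attached files). My understanding is that, without trimming, the density is evaluated over the range of the x scale, so widening the scale with xlim widens the estimate, while coord_cartesian only changes the visible window:

```r
library(ggplot2)

# Toy data strictly inside (0, 1) (assumption; not the original data).
set.seed(42)
df <- data.frame(value = rbeta(500, 2, 5))

# Scale limits set with xlim(): affects what the stat sees.
p_xlim <- ggplot(df, aes(x = value)) +
  geom_density() +
  xlim(c(-1, 1))

# Limits set on the coordinate system: a pure zoom.
p_coord <- ggplot(df, aes(x = value)) +
  geom_density() +
  coord_cartesian(xlim = c(-1, 1))

# layer_data() returns the data computed by the stat for a layer.
xr_xlim  <- range(layer_data(p_xlim)$x)
xr_coord <- range(layer_data(p_coord)$x)

xr_xlim   # x values where the density is evaluated under xlim()
xr_coord  # x values where the density is evaluated under coord_cartesian()
```

If the first range reaches below zero and the second does not, the negative tail is produced by the scale limits rather than by the data.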

Hope this is useful,

Ben


On Mon, Nov 19, 2012 at 12:24 PM, SP <heys...@gmail.com> wrote:
> Hi Ben,
> I am sure the values I want to plot are between 0 and 1. Please find the
> data attached and summary below :
>> summary(z)
>  variable       value
>  X1:20049   Min.   :0.000
>  X2:20049   1st Qu.:0.000
>  X3:20049   Median :0.176
>             Mean   :0.280
>             3rd Qu.:0.534
>             Max.   :1.005
>             NA's   :28556
> Thanks,
> SP

Winston Chang

Nov 19, 2012, 2:39:43 PM
to SP, ggplot2
If you use geom_density(trim=TRUE), it should limit the density estimate to the range of the data. (This is mentioned in the stat_density help page.)
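
A small sketch of the trim option on made-up data (the `df` below is an assumption), keeping the xlim from the original code so the effect is visible:

```r
library(ggplot2)

# Toy data strictly inside (0, 1) (assumption; not the original data).
set.seed(42)
df <- data.frame(value = rbeta(500, 2, 5))

# trim = TRUE is passed through geom_density() to stat_density().
p_trim <- ggplot(df, aes(x = value)) +
  geom_density(trim = TRUE) +
  xlim(c(-1, 1))

# Range of x values at which the density is evaluated.
xr_trim    <- range(layer_data(p_trim)$x)
data_range <- range(df$value)

xr_trim
data_range
```

With trimming, the evaluated range should match the range of the data even though the axis still runs from -1 to 1. Note that the curve then stops abruptly at the data limits instead of tapering to zero.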

-Winston
