--
You received this message because you are subscribed to the ggplot2 mailing list.
Please provide a reproducible example: http://gist.github.com/270442
To post: email ggp...@googlegroups.com
To unsubscribe: email ggplot2+u...@googlegroups.com
More options: http://groups.google.com/group/ggplot2
facet_grid is deleting duplicate data before the stats are run (I
think). Consider these much simpler examples:
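library(ggplot2)
# 3 x 3 grid of factor levels (9 rows); cbind() recycles it to 27 rows to match n
DF <- expand.grid(alpha = letters[1:3],
                  beta = LETTERS[1:3])
DF <- cbind(DF, n = 1:27)
p <- ggplot(DF, aes(alpha, n)) + geom_boxplot()
pg <- p + facet_grid(. ~ beta)
pw <- p + facet_wrap(~ beta, ncol = 3)
# DF2 is DF with row 1 repeated 10 times, so it contains 9 exact duplicate rows
DF2 <- DF[c(rep(1, 10), 2:27), ]
p2 <- ggplot(DF2, aes(alpha, n)) + geom_boxplot()
p2g <- p2 + facet_grid(. ~ beta)
p2w <- p2 + facet_wrap(~ beta, ncol = 3)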
pg and pw look the same; p2g looks like pg and pw; p2w does not. p2g
SHOULD look like p2w, not pg/pw.
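As a rough cross-check outside ggplot2, the group medians that the boxplots ought to show can be computed in base R. This is only a sketch: the data frame is rebuilt explicitly with data.frame() (an assumption meant to reproduce the same 27-row layout as the example), and the names med and med2 are made up for illustration.

```r
# Rebuild the example data explicitly (assumed to match the 27-row DF above)
DF <- data.frame(
  alpha = rep(letters[1:3], times = 9),
  beta  = rep(rep(LETTERS[1:3], each = 3), times = 3),
  n     = 1:27
)
# Same data with row 1 repeated 10 times
DF2 <- DF[c(rep(1, 10), 2:27), ]

# Median of n for every alpha/beta combination
med  <- tapply(DF$n,  list(DF$alpha,  DF$beta),  median)
med2 <- tapply(DF2$n, list(DF2$alpha, DF2$beta), median)

med["a", "A"]   # 10: the group holds n = 1, 10, 19
med2["a", "A"]  # 1: the group holds n = 1 ten times, plus 10 and 19
```

If the duplicated rows reach the stat, the box for alpha "a" in panel "A" should therefore sit at a median of 1 rather than 10, and the other eight groups should be unchanged.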
I got a hint of what was happening by looking at the output of all.equal
on the results of ggplot_build for the original diamond plots. The
results of the stats were different, but what really tipped me off was
that the lists of outlier points were different lengths: some points
were listed multiple times for wrap, while there were no duplicates
for grid. From that I built this example, which shows the difference
dramatically.
I've not looked at the code base to see where this is happening.
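A comparison of this kind can be sketched with layer_data(), which returns the computed data for one layer of a plot. This is an assumption-laden sketch rather than the exact commands used above: the data frame is rebuilt explicitly, built_grid and built_wrap are illustrative names, and the column names are those of current ggplot2 releases. In a version of ggplot2 without the problem, the two results should agree.

```r
library(ggplot2)

# Rebuild the example data explicitly (assumed to match the 27-row DF above)
DF <- data.frame(
  alpha = rep(letters[1:3], times = 9),
  beta  = rep(rep(LETTERS[1:3], each = 3), times = 3),
  n     = 1:27
)
DF2 <- DF[c(rep(1, 10), 2:27), ]

p2 <- ggplot(DF2, aes(alpha, n)) + geom_boxplot()

# Computed boxplot statistics for layer 1 under each faceting
built_grid <- layer_data(p2 + facet_grid(. ~ beta), 1)
built_wrap <- layer_data(p2 + facet_wrap(~ beta, ncol = 3), 1)

# One row per box; compare the medians ("middle") panel by panel
cols <- c("PANEL", "x", "middle")
built_grid[, cols]
built_wrap[, cols]
all.equal(built_grid$middle, built_wrap$middle)
```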
--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University
More evidence that it is in the facet and not something in boxplot:
ggplot(DF, aes(alpha, n)) +
  stat_summary(aes(colour=beta), fun.data="mean_cl_normal",
               position=position_dodge(width=0.2))
ggplot(DF2, aes(alpha, n)) +
  stat_summary(aes(colour=beta), fun.data="mean_cl_normal",
               position=position_dodge(width=0.2))
ggplot(DF, aes(alpha, n)) + stat_summary(fun.data="mean_cl_normal") +
  facet_grid(.~beta)
ggplot(DF, aes(alpha, n)) + stat_summary(fun.data="mean_cl_normal") +
  facet_wrap(~beta, ncol=3)
ggplot(DF2, aes(alpha, n)) + stat_summary(fun.data="mean_cl_normal") +
  facet_grid(.~beta)
ggplot(DF2, aes(alpha, n)) + stat_summary(fun.data="mean_cl_normal") +
  facet_wrap(~beta, ncol=3)
With facet_grid, the plots of the data with duplicated rows (the DF2 sets)
look just like those without duplicates, while facet_wrap with the
duplicated data shows the correct summary (which agrees with what is drawn
when the groups are distinguished only by colour and not by facet).
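The hypothesis that duplicates are dropped before the stat runs can be mimicked in base R by calling unique() before summarising. This is a sketch under assumptions: the data frame is rebuilt explicitly to match the example, group_means() is a made-up helper, and unique() only imitates the suspected behaviour; it is not taken from the ggplot2 code.

```r
# Rebuild the example data explicitly (assumed to match the 27-row DF above)
DF <- data.frame(
  alpha = rep(letters[1:3], times = 9),
  beta  = rep(rep(LETTERS[1:3], each = 3), times = 3),
  n     = 1:27
)
DF2 <- DF[c(rep(1, 10), 2:27), ]

# Hypothetical helper: mean of n for every alpha/beta combination
group_means <- function(d) tapply(d$n, list(d$alpha, d$beta), mean)

m_orig    <- group_means(DF)           # original data
m_full    <- group_means(DF2)          # duplicates kept
m_deduped <- group_means(unique(DF2))  # duplicates dropped first

m_full["a", "A"]     # 3.25 = (10 * 1 + 10 + 19) / 12
m_deduped["a", "A"]  # 10, the same as for DF
isTRUE(all.equal(m_deduped, m_orig))
```

Dropping duplicates first makes the DF2 summary indistinguishable from the DF summary, which matches what the facet_grid plots show.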
>> On Mon, Mar 12, 2012 at 2:15 PM,
>> Ben<langbnj-gM/Ye1E23mwN+BqQ9rBEUg-X...@public.gmane.org>
This is consistent with what I've found (see my other posts in this thread):
> median(subset(diamonds[!duplicated(diamonds),], cut=="Premium" &
    clarity=="VVS2")$carat)
[1] 0.45
> median(subset(diamonds, cut=="Premium" & clarity=="VVS2")$carat)
[1] 0.455
Also, you can get these numbers out of the plot itself (not just by
back-reading from the PDF):
> ggplot_build(ggplot(diamonds, aes(clarity, carat)) + geom_boxplot() +
    facet_grid(.~cut))[[1]][[1]][30,3]
[1] 0.45
> ggplot_build(ggplot(diamonds, aes(clarity, carat)) + geom_boxplot() +
    facet_wrap(~cut, ncol=3))[[1]][[1]][30,3]
[1] 0.455
(I found row 30, column 3 by experimenting; the whole data set is too
large to print out and I didn't want to flood the list. You can
explore it further yourself.)
Cheers,
Ben
--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University
IIRC, this is related to a previous discussion about facet_grid
removing duplicates, though I cannot find a pointer to it...
Here is a workaround:
DF2 <- DF[c(rep(1,10),2:27),]
# a unique id per row means no two rows are exact duplicates any more
DF2$id <- 1:nrow(DF2)
p2 <- ggplot(DF2, aes(alpha, n)) + geom_boxplot()
p2g <- p2 + facet_grid(.~beta)
p2w <- p2 + facet_wrap(~beta, ncol=3)
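Why the id column helps can be checked with duplicated(). This is a minimal sketch, with the data frame rebuilt explicitly (assumed to match the example) and n_dup_before / n_dup_after as illustrative names.

```r
# Rebuild the example data explicitly (assumed to match the 27-row DF above)
DF <- data.frame(
  alpha = rep(letters[1:3], times = 9),
  beta  = rep(rep(LETTERS[1:3], each = 3), times = 3),
  n     = 1:27
)
DF2 <- DF[c(rep(1, 10), 2:27), ]

# Row 1 appears 10 times, so 9 rows are exact duplicates of an earlier row
n_dup_before <- sum(duplicated(DF2))

# A unique id makes every row distinct
DF2$id <- seq_len(nrow(DF2))
n_dup_after <- sum(duplicated(DF2))

n_dup_before  # 9
n_dup_after   # 0
```

With no exact duplicate rows left, anything that drops duplicates has nothing to remove, so all 36 rows should reach the stat.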
kohske
On March 13, 2012 at 9:03, Benjamin Lang <lan...@googlemail.com> wrote:
> --
> Benjamin Lang
>
> MRC Laboratory of Molecular Biology
> Regulatory Genomics & Systems Biology (Dr. M. Madan Babu)
> Herchel Smith Research Student, University of Cambridge
>
> bl...@mrc-lmb.cam.ac.uk
--
Kohske Takahashi <takahash...@gmail.com>
Research Center for Advanced Science and Technology,
The University of Tokyo, Japan.
http://www.fennel.rcast.u-tokyo.ac.jp/profilee_ktakahashi.html
https://groups.google.com/group/ggplot2/browse_thread/thread/3728b94f783d5963?pli=1
kohske
On March 13, 2012 at 11:06, Kohske Takahashi <takahash...@gmail.com> wrote:
On March 13, 2012 at 11:11, Kohske Takahashi <takahash...@gmail.com> wrote: