Hi:
This type of graphic, which has come up in this group several times in the past, illustrates one of the limitations of the grammar of graphics. In this example, boxplots are dodged by color within cut by default [the geom_boxplot() help page documents that mapped aesthetics, when factors, are 'automatically dodged' (see its examples section)]. The desire is to dodge the points in the same way *and* to jitter them. The problem is that dodging and jittering are both position adjustments, and a layer accepts only one. Getting the two to work in concert is therefore a nontrivial balancing act that depends on the configuration of the graph: what works in one example may not carry over to another.
In a fairly simple case, you can get away with using position_jitter(); here's an example from the 0.9.0 transition guide that should work in 0.8.9 as well:
ggplot(mtcars, aes(x = factor(vs), y = mpg)) +
    geom_boxplot(aes(fill = factor(vs)), alpha = 0.3,
                 outlier.colour = NA) +
    geom_point(position = position_jitter(width = 0.05),
               colour = "blue", fill = "blue") +
    labs(x = "vs", y = "mpg", fill = "vs")
However, if you translate this to the present example, it doesn't work because the points are jittered relative to the levels of cut, not to the levels of color nested within cut. The 'automatic' dodging of the levels of color within cut that takes place in geom_boxplot() does not transfer to geom_point(). A (single) position adjustment of points is allowed; one that works (with a little tweaking of the width) is the following:
set.seed(1410)
dsmall <- diamonds[sample(nrow(diamonds), 100), ]
# keep colors D and E within the three best cuts
dia.DE <- dsmall[dsmall$color %in% c("D", "E") &
                 dsmall$cut %in% c("Ideal", "Premium", "Very Good"), ]
ggplot(dia.DE, aes(y = price, x = cut, fill = color)) +
    geom_point(aes(colour = color), position = position_dodge(width = 0.75)) +
    geom_boxplot(alpha = 0.2, outlier.colour = NA)
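If you replace geom_point() with geom_jitter() here, the same plot should result: geom_jitter() is essentially geom_point() with jitter as its default position, so supplying position = position_dodge() replaces the jitter rather than adding to it. You can dodge or jitter, but you can't dodge _and_ jitter with a single position adjustment.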
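One workaround is to compute the dodged and jittered x positions by hand and plot the points at those numeric positions. What follows is only a sketch under stated assumptions, not something from the transition guide. The helper name dodge_jitter_x(), the demo data frame and the jitter amount are made up for illustration, and 0.75 is simply the dodge width used above. It also assumes every subgroup occurs in every group, and that ggplot2 accepts numeric x values on a discrete x scale once a discrete layer has been added first, which may not hold in every version.

```r
# Sketch: dodge + jitter done by hand (all names here are illustrative).
dodge_jitter_x <- function(group, subgroup, dodge_width = 0.75, jitter = 0.05) {
  # drop unused levels so positions match what a discrete scale would show
  group    <- droplevels(as.factor(group))
  subgroup <- droplevels(as.factor(subgroup))
  n_sub <- nlevels(subgroup)
  # width of one subgroup's slot inside the dodge band
  slot <- dodge_width / n_sub
  # offset of each slot's centre from the group's centre
  offset <- (as.integer(subgroup) - (n_sub + 1) / 2) * slot
  # group centre + dodge offset + uniform jitter
  as.integer(group) + offset + runif(length(group), -jitter, jitter)
}

# Small made-up data set, for illustration only
set.seed(1)
demo <- data.frame(
  cut   = factor(rep(c("Ideal", "Premium"), each = 6)),
  color = factor(rep(c("D", "E"), times = 6)),
  price = round(runif(12, 500, 5000))
)
demo$x_pos <- dodge_jitter_x(demo$cut, demo$color)

# Optional plot: the boxplot layer comes first so the x scale is discrete,
# then the points are placed at the precomputed numeric positions.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(demo, aes(y = price)) +
    geom_boxplot(aes(x = cut, fill = color), alpha = 0.2, outlier.colour = NA) +
    geom_point(aes(x = x_pos, colour = color))
  # print(p) draws the plot
}
```

With two colors and a dodge width of 0.75, the points for each color are centred 0.1875 to either side of the cut position and then scattered by at most 0.05. The drawback is that the dodge width has to be kept in sync with the boxplot layer by hand.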
In 0.9.0, geom_dotplot() has a dodge = argument that allows one to specify the dodging variable; see the examples in section 3.3 of the transition guide for some illustrations as well as the geom_dotplot() help page:
http://cloud.github.com/downloads/hadley/ggplot2/guide-col.pdf
One approach to this problem using geom_dotplot() is the following:
# 0.9.0+ only:
ggplot(dia.DE, aes(y = price, x = cut, fill = color)) +
    geom_boxplot(alpha = 0.2, outlier.colour = NA) +
    geom_dotplot(aes(colour = color), binaxis = 'y',
                 stackdir = 'centerwhole', position = 'dodge')
Unfortunately, the centers of the point stacks don't align perfectly with the whiskers of the box plots in this example. It's possible that the default dodging width differs between geom_boxplot() and geom_dotplot(). Winston or someone else might be able to provide a better selection of arguments that gets the alignment right - I haven't been able to figure it out yet, despite several guesses.
If some enterprising soul is looking for a geom to create, this would be a good candidate :)
Dennis