Hello,
I am new to this list, so please forgive me if a similar question has already been asked. I am the creator of the R package EnvStats (https://cran.r-project.org/web/packages/EnvStats/EnvStats.pdf). There is a function I use quite often called stripChart. I am just starting to learn ggplot2, and have spent the past several days poring over Hadley's book, Winston’s book, StackOverflow, and other resources in an attempt to create a geom that approximates what stripChart does. I am unable to figure out how to put summary statistics below the x-axis tick marks and also at the top of the plot (outside the plotting region). Here is a simple example using the built-in dataset mtcars:
library(EnvStats)
stripChart(mpg ~ cyl, data = mtcars, col = 1:3,
    xlab = "Number of Cylinders", ylab = "Miles per Gallon", p.value = TRUE)
Here is an early draft of a geom to try to reproduce most of the functionality of stripChart:
geom_stripchart <-
function(..., x.nudge = 0.3,
    jitter.params = list(width = 0.3, height = 0),
    mean.params = list(size = 2, position = position_nudge(x = x.nudge)),
    errorbar.params = list(size = 1, width = 0.1, position = position_nudge(x = x.nudge)),
    n.text = TRUE, mean.sd.text = TRUE, p.value = FALSE) {

    params <- list(...)
    jitter.params <- modifyList(params, jitter.params)
    mean.params <- modifyList(params, mean.params)
    errorbar.params <- modifyList(params, errorbar.params)

    jitter <- do.call("geom_jitter", jitter.params)
    mean <- do.call("stat_summary", modifyList(
        list(fun.y = "mean", geom = "point"),
        mean.params)
    )
    errorbar <- do.call("stat_summary", modifyList(
        list(fun.data = "mean_cl_normal", geom = "errorbar"),
        errorbar.params)
    )

    stripchart.list <- list(
        jitter,
        theme(legend.position = "none"),
        mean,
        errorbar
    )

    if(n.text || mean.sd.text) {
        # Compute summary statistics (sample size, mean, SD) here?
        if(n.text) {
            # Add information to stripchart.list to
            # compute sample size per group and add text below x-axis
        }
        if(mean.sd.text) {
            # Add information to stripchart.list to
            # compute mean and SD and add text above top of plotting region
        }
    }

    if(p.value) {
        # Add information to stripchart.list to
        # compute p-value (and 95% CI for difference if only 2 groups)
        # and add text above top of plotting region
    }

    stripchart.list
}
library(ggplot2)
dev.new()
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg, color = factor(cyl)))
p + geom_stripchart() +
    xlab("Number of Cylinders") +
    ylab("Miles per Gallon")
You can see that the plots are pretty much the same. The problem I’m having is figuring out how to add the sample size below each group, and to add the means and standard deviations at the top, along with the result of the ANOVA test (ignoring the issue of unequal variances at this point). I know it is straightforward to compute summary statistics and then plot them as points or text *within* the plotting area, but I don’t want to do that.
Would appreciate any help or direction anyone can give me. Thanks!
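One possible direction for the sample-size text, sketched here under assumptions rather than offered as the way stripChart does it: let stat_summary() build the label from each group's y values, pin it to the bottom edge of the panel with y = -Inf, and turn clipping off so the text can be pushed below the axis. The helper name n_label is made up for illustration, the vjust and margin values are eyeballed, and coord_cartesian(clip = "off") assumes ggplot2 3.0.0 or later.

```r
library(ggplot2)

# Hypothetical helper: stat_summary() calls it once per group with that
# group's y values. y = -Inf pins the label to the bottom panel edge.
n_label <- function(y) {
  data.frame(y = -Inf, label = paste0("n=", length(y)))
}

p_n <- ggplot(mtcars, aes(x = factor(cyl), y = mpg, color = factor(cyl))) +
  geom_jitter(width = 0.3, height = 0) +
  # vjust > 1 moves the text downward, past the tick labels (eyeballed value)
  stat_summary(fun.data = n_label, geom = "text", vjust = 3.5,
               show.legend = FALSE) +
  # Without clip = "off" anything drawn outside the panel is cut away
  coord_cartesian(clip = "off") +
  theme(legend.position = "none",
        # extra room so the axis title does not overlap the n= labels
        axis.title.x = element_text(margin = margin(t = 20))) +
  xlab("Number of Cylinders") +
  ylab("Miles per Gallon")

print(p_n)
```

Because the label is computed inside the layer, nothing has to be known about the data in advance; the same layer could be appended to stripchart.list when n.text is TRUE.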
I had already found examples showing how to place text outside the plot (e.g., using annotation_custom():
http://stackoverflow.com/questions/31079210/how-can-i-add-annotations-below-the-x-axis-in-ggplot2). The problem is that the examples show how to do this where the user has pre-defined what the annotation is. My problem is that within geom_stripchart, I have to compute summary statistics and test results based on the data that was defined in the call to ggplot(), and then pass those results to annotation_custom(). I don’t know how to get at the x and y variables that are defined in the call to ggplot().
--Steve Millard
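On getting at the x and y variables: a layer never needs the variable names, because by the time a Stat's compute function runs, whatever was mapped in aes() arrives as columns named x and y. A custom Stat is therefore one place where the mean and SD could be computed from the data given to ggplot(). The following is a minimal sketch, not a finished design; StatMeanSdText, stat_mean_sd_text and the digits argument are hypothetical names, and the vjust and margin values are guesses.

```r
library(ggplot2)

# Hypothetical Stat: compute_group() gets one data frame per group,
# with the mapped aesthetics already available as columns x and y.
StatMeanSdText <- ggproto("StatMeanSdText", Stat,
  required_aes = c("x", "y"),
  compute_group = function(data, scales, digits = 1) {
    fmt <- function(v) formatC(v, format = "f", digits = digits)
    data.frame(
      x = data$x[1],
      y = Inf,  # pin the label to the top edge of the panel
      label = paste0("Mean=", fmt(mean(data$y)), "\nSD=", fmt(sd(data$y)))
    )
  }
)

# Hypothetical layer constructor wrapping the Stat above
stat_mean_sd_text <- function(mapping = NULL, data = NULL, geom = "text",
                              position = "identity", ..., digits = 1,
                              na.rm = FALSE, show.legend = FALSE,
                              inherit.aes = TRUE) {
  layer(stat = StatMeanSdText, data = data, mapping = mapping, geom = geom,
        position = position, show.legend = show.legend,
        inherit.aes = inherit.aes,
        params = list(digits = digits, na.rm = na.rm, ...))
}

p_ms <- ggplot(mtcars, aes(x = factor(cyl), y = mpg, color = factor(cyl))) +
  geom_jitter(width = 0.3, height = 0) +
  stat_mean_sd_text(vjust = -0.5) +   # negative vjust lifts text above the panel
  coord_cartesian(clip = "off") +
  theme(legend.position = "none",
        plot.margin = margin(t = 40, r = 5.5, b = 5.5, l = 5.5))

print(p_ms)
```

The layer returned by stat_mean_sd_text() is an ordinary ggplot2 layer, so it could be added to stripchart.list like the jitter and errorbar layers.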
--
You received this message because you are subscribed to the ggplot2 mailing list.
Please provide a reproducible example: https://github.com/hadley/devtools/wiki/Reproducibility
To post: email ggp...@googlegroups.com
To unsubscribe: email ggplot2+unsubscribe@googlegroups.com
More options: http://groups.google.com/group/ggplot2
1. I want the user to be able to use facet_wrap() or facet_grid() in addition to the geom_stripchart() function. For example:
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg, color = factor(cyl)))
p + geom_stripchart() + facet_wrap(~am) +
    xlab("Number of Cylinders") +
    ylab("Miles per Gallon")
So I don’t want to mix the mean and SD text in with the values of the factor used for faceting, because that could confuse the reader. Separating the mean and SD far enough from the level of the faceting variable might work, but I am not sure I could call a labeller function inside geom_stripchart() and still have a separate call to facet_wrap() work outside of geom_stripchart().
2. Even if I could figure out a way to use facets and labeller functions to put the information on top of the plot, I would still have the problem of computing the summary statistics in the first place, since I don’t know how to get at the x and y variables defined in the aes() mapping passed to ggplot().
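Both points might be handled inside a Stat instead of a labeller. Stats are run separately for each facet panel, so a compute_panel() method sees only that panel's x and y columns, and the test result can be computed per facet without touching the strip labels. The sketch below is an assumption-laden illustration: StatAnovaP and stat_anova_p are hypothetical names, the equal-variance F test comes from stats::oneway.test(), and moving the strips to the bottom is just one way to keep them clear of text drawn above the panel.

```r
library(ggplot2)

# Hypothetical Stat: compute_panel() receives all groups of one facet panel.
StatAnovaP <- ggproto("StatAnovaP", Stat,
  required_aes = c("x", "y"),
  compute_panel = function(data, scales) {
    d <- data.frame(y = data$y, g = factor(as.numeric(data$x)))
    # Classical one-way ANOVA F test (equal variances assumed)
    p <- stats::oneway.test(y ~ g, data = d, var.equal = TRUE)$p.value
    data.frame(x = -Inf, y = Inf,  # top-left corner of the panel
               label = paste0("ANOVA p = ", format.pval(p, digits = 2)))
  }
)

# Hypothetical layer constructor wrapping the Stat above
stat_anova_p <- function(mapping = NULL, data = NULL, geom = "text",
                         position = "identity", ..., na.rm = FALSE,
                         show.legend = FALSE, inherit.aes = TRUE) {
  layer(stat = StatAnovaP, data = data, mapping = mapping, geom = geom,
        position = position, show.legend = show.legend,
        inherit.aes = inherit.aes, params = list(na.rm = na.rm, ...))
}

p_facet <- ggplot(mtcars, aes(x = factor(cyl), y = mpg, color = factor(cyl))) +
  geom_jitter(width = 0.3, height = 0) +
  stat_anova_p(hjust = 0, vjust = -0.5) +
  # strips moved below the panels so they do not collide with the text on top
  facet_wrap(~am, strip.position = "bottom") +
  coord_cartesian(clip = "off") +
  theme(legend.position = "none",
        plot.margin = margin(t = 25, r = 5.5, b = 5.5, l = 5.5))

print(p_facet)
```

Each panel gets its own p-value because the Stat only ever sees the rows belonging to that panel; the facet_wrap() call stays outside the layer and needs no labeller.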