geom_boxplot with min,q1,med,q3,max specified

Melissa Gray

Jun 27, 2014, 1:31:01 PM
to ggp...@googlegroups.com
I'm attempting to follow this working example from the ggplot2 documentation to make a boxplot by specifying the five numbers for a box and whisker plot:

# Using precomputed statistics
# generate sample data (adply() comes from the plyr package)
library(plyr)
abc <- adply(matrix(rnorm(100), ncol = 5), 2, quantile, c(0, .25, .5, .75, 1))
b <- ggplot(abc, aes(x = X1, ymin = `0%`, lower = `25%`, middle = `50%`, upper = `75%`, ymax = `100%`))
b + geom_boxplot(stat = "identity")

I want to do the same thing with sample data of the form in the attached text file (sampledata.txt). My data doesn't have an X1 column; it has row.names instead. Ideally I'd like to take the label for each boxplot from the row.names, but I'm not sure how. To get any boxes to show up at all, I tried binding a column similar to the example's X1 column to my data, but it still didn't work. Below I've included my code and the warnings I'm getting. The plot that shows up is attached as a pdf.


> library(ggplot2)
> boxin <- read.table("sampledata.txt", sep="", header = TRUE)
> x1 <- c(1,2,3)
> boxin <- cbind(boxin,x1)
> b <- ggplot(boxin, aes(x = x1, ymin = 'min', lower = 'q1', middle = 'med', upper = q3, ymax = 'max'))
> b + geom_boxplot(stat = "identity")
Warning messages:
1: In Ops.factor(x, from[1]) : - not meaningful for factors
2: In Ops.factor(x, from[1]) : - not meaningful for factors
3: In Ops.factor(x, from[1]) : - not meaningful for factors
4: In Ops.factor(x, from[1]) : - not meaningful for factors


I must be misunderstanding the parameters used in the example or something, as this seems simple enough. Any help would be greatly appreciated!

Melissa
sampledata.txt
tryingtoplot.pdf

Ben Bond-Lamberty

Jun 27, 2014, 1:49:25 PM
to Melissa Gray, ggplot2
You shouldn't put quotes around min, q1, med, or max in the aes() call: a quoted name is treated as a constant string rather than as a column of the data frame.
Also, x1 should be explicitly defined as a factor. So:

boxin$x1 <- factor(boxin$x1)
b <- ggplot(boxin, aes(x = x1, ymin = min, lower = q1, middle = med,
                       upper = q3, ymax = max))
b + geom_boxplot(stat = "identity")
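
On the row.names part of the question, one option is to copy the row names into a factor column and map that to x. This is only a sketch: the values and the row names (siteA, siteB, siteC) below are made up, since sampledata.txt isn't reproduced in the thread, and the column name `label` is just an illustrative choice.

```r
library(ggplot2)

# Hypothetical stand-in for sampledata.txt: three rows of precomputed
# five-number summaries, identified only by their row names.
boxin <- data.frame(
  min = c(1, 2, 3),
  q1  = c(2, 3, 4),
  med = c(3, 4, 5),
  q3  = c(4, 5, 6),
  max = c(5, 6, 7),
  row.names = c("siteA", "siteB", "siteC")
)

# Copy the row names into a factor column; fixing the levels keeps the
# boxes in the same order as the rows of the data frame.
boxin$label <- factor(rownames(boxin), levels = rownames(boxin))

# Unquoted column names in aes(), as above, with the new column on x.
p <- ggplot(boxin, aes(x = label, ymin = min, lower = q1, middle = med,
                       upper = q3, ymax = max)) +
  geom_boxplot(stat = "identity")
print(p)
```

With real data, boxin would come from read.table() as in the original post, and only the `label` line and the aes() call would be needed.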

Ben

eipi10

Jul 16, 2014, 1:54:39 PM
to ggp...@googlegroups.com
You can also calculate and plot percentiles "on the fly" directly from your original data frame using stat_summary plus a function to calculate the percentiles. Here's a small self-contained example:

# Function to calculate boxplot percentiles.
bp.pctiles <- function(x) {
  r <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1), na.rm=TRUE)
  names(r) <- c("ymin", "lower", "middle", "upper", "ymax")
  r
}

# Fake data
dat = data.frame(y=rnorm(1000), group=rep(LETTERS[1:4], 250))

# Boxplot with max, min, 25th, 50th, and 75th percentiles, by group
ggplot(dat, aes(x=group, y=y)) +
  stat_summary(fun.data=bp.pctiles, geom="boxplot")
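
To see what stat_summary passes on to the boxplot geom, it can help to call the helper on a small vector first. This is a quick check under the same definition as above; the input 1:5 is just an illustrative choice.

```r
# Same helper as above, repeated so this snippet runs on its own.
bp.pctiles <- function(x) {
  r <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1), na.rm = TRUE)
  names(r) <- c("ymin", "lower", "middle", "upper", "ymax")
  r
}

# The result is a named vector whose names match the aesthetics that
# geom_boxplot expects when the statistics are supplied directly.
r <- bp.pctiles(1:5)
print(r)
#   ymin  lower middle  upper   ymax
#      1      2      3      4      5
```

The names on the returned vector are what let geom = "boxplot" pick up each value, which is why the function renames the quantiles.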

Joel