help with boxplot outliers/ggplot magic

Ben Bolker

Feb 5, 2011, 9:14:35 PM
to ggplot2

I have tried to take up the challenge of changing the colour of
boxplot outliers, mentioned previously on this list.

Rather than making outlier.colour another aesthetic that could
possibly be set, it seemed to me to make sense (and to be easier to
hack) to set the default behavior that the outliers would be the same
colour as the colour mapped to the boxplot itself. This could then be
overridden to make the outlier colours a single fixed colour. (It would
seem pretty baroque to map the outlier colours to something other than
the same colour as the corresponding box ...)

I tried to implement this as follows in geom-boxplot.r:

draw <- function(., data, ..., outlier.colour,
                 outlier.shape = 16, outlier.size = 2) {

  outlier.colour <- if (missing(outlier.colour)) with(data, colour) else
    I(outlier.colour)

  ...

Later in the function, I inserted a debugging statement near the place
where the outliers actually get drawn:

cat("outlier colour: ", with(data, outlier.colour), "\n")
outliers_grob <- with(data,
  GeomPoint$draw(data.frame(
    y = outliers[[1]], x = x[rep(1, length(outliers[[1]]))],
    colour = outlier.colour, shape = outlier.shape, alpha = 1,
    size = outlier.size, fill = NA), ...)
)

According to the cat() statement, the outlier colour is getting set
the way I want it to -- but the plot is not coming out as I expect.

The file read in below contains the hacked version of GeomBoxplot
along with a statement

assignInNamespace("GeomBoxplot",GeomBoxplot,"ggplot2")


set.seed(1002)
z <- rcauchy(25)
library(ggplot2)
source(url("http://www.math.mcmaster.ca/bolker/R/misc/geom-boxplot.r"))
qplot(1,z,geom="boxplot")
qplot(1,z,geom="boxplot",colour="red")
qplot(1,z,geom="boxplot",colour="red",outlier.colour="blue")

I searched everywhere else within the code for "outlier.colour" to try
to figure out why this wasn't working -- no luck so far.

Any hints would be greatly appreciated.

Ben Bolker

Hadley Wickham

Feb 5, 2011, 10:00:02 PM
to Ben Bolker, ggplot2
Hi Ben,

A few points:

* you're better off using outlier.colour = NULL, then checking with
is.null instead of missing - this makes it easier to see what's going
on regardless of how draw is called

* using with makes it confusing to understand where variables are
being pulled from - the next version of ggplot will make much less use
of with
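
To illustrate the first point, here is a minimal base-R sketch (not ggplot2 code). The functions draw_missing and draw_null and the params list are hypothetical stand-ins; the assumption is that draw may be called by other code that always passes outlier.colour along, in which case missing() never fires but is.null() still does.

```r
# Variant A: relies on missing(); the fallback only applies when the
# argument is literally absent from the call.
draw_missing <- function(outlier.colour) {
  if (missing(outlier.colour)) "box colour" else outlier.colour
}

# Variant B: NULL default checked with is.null(); the fallback applies
# whether the argument is absent or explicitly NULL.
draw_null <- function(outlier.colour = NULL) {
  if (is.null(outlier.colour)) "box colour" else outlier.colour
}

# Hypothetical caller that stores parameters in a list and always
# forwards them, so the argument is never "missing" inside draw.
params <- list(outlier.colour = NULL)

res_missing <- do.call(draw_missing, params)  # argument supplied, value NULL
res_null    <- do.call(draw_null, params)     # falls back to the box colour

print(res_missing)
print(res_null)
```

With the NULL default, the function behaves the same whether it is called directly or through such a caller, which is what makes it easier to see what is going on.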

So all in all, I'd probably do something like this:


if (!is.null(data$outliers) && length(data$outliers[[1]]) >= 1) {
  outlier_data <- data.frame(
    y = data$outliers[[1]],
    x = data$x[1],
    colour = outlier.colour %||% data$colour[1],
    shape = outlier.shape %||% data$shape[1],
    size = outlier.size %||% data$size[1],
    fill = NA,
    alpha = 1,
    stringsAsFactors = FALSE)
  outliers_grob <- GeomPoint$draw(outlier_data, ...)
} else {
  outliers_grob <- NULL
}

But I don't have time to try it out right now.
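
For readers who want to see the fallback logic in isolation, the following is a rough base-R sketch that runs without ggplot2. It assumes %||% behaves as the usual null-default operator (return the left side unless it is NULL) and defines a local copy for that purpose; make_outlier_data and the box data frame are made up for illustration, and the GeomPoint$draw step is left out.

```r
# Local definition of the null-default operator (assumed behaviour).
`%||%` <- function(a, b) if (is.null(a)) b else a

# Hypothetical helper, not part of ggplot2: builds the outlier data frame
# for a single box, falling back to the box's own aesthetics.
make_outlier_data <- function(data, outlier.colour = NULL,
                              outlier.shape = NULL, outlier.size = NULL) {
  outliers <- data$outliers[[1]]
  if (is.null(outliers) || length(outliers) < 1) return(NULL)
  data.frame(
    y = outliers,
    x = data$x[1],
    colour = outlier.colour %||% data$colour[1],
    shape = outlier.shape %||% data$shape[1],
    size = outlier.size %||% data$size[1],
    stringsAsFactors = FALSE
  )
}

# Made-up one-row summary for a single red box with two outliers.
box <- data.frame(x = 1, colour = "red", shape = 16, size = 2,
                  stringsAsFactors = FALSE)
box$outliers <- list(c(-7.5, 12.3))

inherited <- make_outlier_data(box)                           # uses box colour
fixed     <- make_outlier_data(box, outlier.colour = "blue")  # override

print(inherited)
print(fixed)
```

Leaving outlier.colour as NULL makes the outliers follow the box colour, while passing a value overrides it for all outliers.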

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

Ben Bolker

Feb 5, 2011, 10:34:39 PM
to Hadley Wickham, ggplot2
Sweet! Works as desired.

Posted a revised version to
<http://www.math.mcmaster.ca/bolker/R/misc/geom-boxplot.r>; it contains
the "assignInNamespace" call at the end.

A more sensible thing to do for anyone who wants to use this in the
interim (perhaps it will get rolled into a future version of ggplot?) is
to download the file, remove the "assignInNamespace" call, download the
ggplot2 source, unzip it, replace R/geom-boxplot.r, and install from
source. But in a pinch you could source() the file as is.

This patch *does* change the default behavior, but it seems much
easier to have the default be "match the box colour" and be able to
override to make all the outliers black than vice versa ...

Updated test code:

set.seed(1002)
z <- rcauchy(50)
d <- data.frame(z,f=factor(rep(1:2,each=25)))
library(ggplot2)
## source(url("http://www.math.mcmaster.ca/bolker/R/misc/geom-boxplot.r"))
## source("geom-boxplot.r")


qplot(1,z,geom="boxplot")
qplot(1,z,geom="boxplot",colour="red")
qplot(1,z,geom="boxplot",colour="red",outlier.colour="blue")

qplot(1,z,data=d,geom="boxplot",colour=f,outlier.colour="blue")
qplot(1,z,data=d,geom="boxplot",colour=f)


Hadley Wickham

Feb 5, 2011, 10:59:35 PM
to Ben Bolker, ggplot2
Thanks Ben - it'll be in the next version of ggplot2.
Hadley
