Convex hulls in scatter plots


KDB

Mar 12, 2011, 11:47:24 AM
to ggplot2
Hi everybody,

I was wondering if there is an elegant way to produce convex hulls for
several groups in scatterplots created with the ggplot2 package. I
found this post by learnr very helpful:
http://learnr.wordpress.com/2009/04/23/ggplot2-dont-try-this-with-excel/
Using geom_path() or geom_polygon() seems the logical way to
accomplish this, but the solution proposed in that blog doesn't seem
to work in my case. I have quite a large dataset, and sometimes points
are connected in the order in which they appear in the dataset. I know
of several functions that calculate convex hulls for x and y data
(e.g. chull() in package grDevices). Is there some easy way to
integrate their use with ggplot? I guess that selecting the points
from the main dataset that lie on the convex hull, as calculated by
chull() or a similar function, and connecting them with lines
(geom_path) or polygons (geom_polygon) should do the trick.

I am a beginner in R and found ggplot quite straightforward and easy
to use, but I don't know how to do this.
Your help would be greatly appreciated.

Kenneth



Hadley Wickham

Mar 13, 2011, 1:11:03 PM
to KDB, ggplot2
Hi Kenneth,

The basic idea is something like:

library(ggplot2)

df <- data.frame(x = rnorm(100), y = rnorm(100))
ggplot(df, aes(x, y)) + geom_point()

# chull() returns the row indices of the points on the convex hull
hull <- df[chull(df$x, df$y), ]
ggplot(df, aes(x, y)) +
  geom_point() +
  geom_polygon(data = hull, fill = NA, colour = "grey50")
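
A small sketch of why the indexing step fixes the "connected in dataset order" problem, using made-up toy points (the data frame `pts` is hypothetical, not from the thread). chull() returns the indices of the hull vertices in order around the hull, so subsetting with them both drops interior points and reorders the rows:

```r
# Toy data (hypothetical): four corners of a unit square plus one interior point
pts <- data.frame(x = c(0, 1, 1, 0, 0.5),
                  y = c(0, 0, 1, 1, 0.5))

# chull() returns row indices of the hull vertices, ordered around the hull
idx <- chull(pts$x, pts$y)

# Indexing with idx drops the interior point (row 5) and reorders the rest,
# so geom_polygon() or geom_path() can join the rows in sequence
hull <- pts[idx, ]
```

Because geom_polygon() and geom_path() join rows in the order they appear, passing the reordered `hull` rows rather than the raw data is what keeps the outline from zigzagging.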

If you have multiple groups, you'd need to use ddply or similar to
compute the convex hull for each group.

Hadley

> --
> You received this message because you are subscribed to the ggplot2 mailing list.
> Please provide a reproducible example: http://gist.github.com/270442
>
> To post: email ggp...@googlegroups.com
> To unsubscribe: email ggplot2+u...@googlegroups.com
> More options: http://groups.google.com/group/ggplot2
>

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

KDB

Mar 14, 2011, 8:33:33 AM
to ggplot2
Hi Hadley,

Thanks for your help; I am now able to draw convex hulls for the
entire dataset.
I still haven't managed to create them per group. Would it be possible
to give an example of the use of ddply in this context?

With friendly greetings,

Kenneth

Hadley Wickham

Mar 14, 2011, 9:10:05 AM
to KDB, ggplot2
Hi Kenneth,

Something like this:

library(ggplot2)
library(plyr)  # provides ddply()

df <- data.frame(x = rnorm(100), y = rnorm(100),
  z = sample(letters[1:5], 100, replace = TRUE))
ggplot(df, aes(x, y, colour = z)) + geom_point()

# Return only the rows of a data frame that lie on its convex hull
find_hull <- function(df) df[chull(df$x, df$y), ]
# Apply find_hull separately to each level of z
hulls <- ddply(df, "z", find_hull)

ggplot(df, aes(x, y, colour = z)) +
  geom_point() +
  geom_polygon(data = hulls, fill = NA)


But be careful with the use of hulls. They tend to make clusters look
more distinct than they really are (due to the gestalt of
connectedness/containedness):

df$cl <- factor(kmeans(df[c("x", "y")], 3)$cluster)
hulls <- ddply(df, "cl", find_hull)

# No obvious clusters
ggplot(df, aes(x, y, colour = cl)) +
  geom_point()

# Now clusters are really obvious!!
ggplot(df, aes(x, y, colour = cl)) +
  geom_point() +
  geom_polygon(data = hulls, fill = NA)

Hadley


KDB

Mar 15, 2011, 4:54:50 AM
to ggplot2
Thanks Hadley, it works like a charm.

One additional question: I seem to run into trouble when I want the
shape of the points in my scatterplot to reflect one parameter and the
convex hulls that group the points to reflect another.
How can this be avoided?
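
No code was posted for this case, so the following is only a guess at the cause. By default ggplot2 groups rows by the interaction of all discrete variables mapped in the plot, so a shape mapping set in the top-level aes() may also split each hull into pieces. One possible workaround, sketched here with a hypothetical shape column `w` and with base R split()/lapply() standing in for ddply, is to map shape only in geom_point() and set the hull group explicitly:

```r
library(ggplot2)

set.seed(1)
df <- data.frame(
  x = rnorm(100), y = rnorm(100),
  z = sample(letters[1:5], 100, replace = TRUE),  # variable defining the hulls
  w = sample(c("p", "q"), 100, replace = TRUE)    # hypothetical variable for point shape
)

# Rows on the convex hull of one group
find_hull <- function(d) d[chull(d$x, d$y), ]
# One hull per level of z (base R stand-in for ddply(df, "z", find_hull))
hulls <- do.call(rbind, lapply(split(df, df$z), find_hull))

p <- ggplot(df, aes(x, y)) +
  # shape is mapped only in the point layer, so the polygons never see it
  geom_point(aes(colour = z, shape = w)) +
  # group = z keeps each hull as a single closed polygon
  geom_polygon(data = hulls, aes(colour = z, group = z), fill = NA)
print(p)
```

Keeping only x and y in the top-level aes() means each layer opts in to the aesthetics it needs, rather than every layer inheriting all of them.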

On Mar 14, 2:10 pm, Hadley Wickham <had...@rice.edu> wrote:
> Hi Kenneth,
>