Superpose two layers of points with different colour aesthetics

JiHO

Sep 7, 2011, 12:18:31 PM
to ggplot2
Hi all,

I have a dataset which results from a clustering analysis and which contains:
- x,y: coordinates of points
- group: number of the cluster assigned to the point (2 clusters total)
- prob: probability of belonging to cluster 1 (since there are two clusters,
the probability of belonging to cluster 2 is one minus this)

I want to make a plot with one layer of large points which are
coloured according to the number of the cluster (column group,
discrete scale) and one layer, above, with slightly smaller points,
which are coloured according to the probability of belonging to cluster 1
(column prob, continuous scale).

The expected result is a plot with points coloured with a gradient
according to prob and with an "outline" telling which group was
eventually chosen. It allows me to show both the final result of the
clustering (group) and the fact that things are actually a bit more
continuous and smooth.

In code:

d <- expand.grid(x=1:5, y=1:5)
d$prob <- (d$x/5 + d$y/5)/2
d$group <- d$prob > 0.5

library("ggplot2")
ggplot(d) + geom_point(aes(x=x, y=y, colour=group), size=6)
ggplot(d) + geom_point(aes(x=x, y=y, colour=prob), size=4)
# OK

ggplot(d) + geom_point(aes(x=x, y=y, colour=group), size=6) +
geom_point(aes(x=x, y=y, colour=prob), size=4)
# I get: Error: Continuous variable () supplied to discrete scale_hue.

I realize this is because there are two colour scales, but there are
also two layers, which could even come from different datasets. In
addition, one colour scale is discrete while the other is continuous
so there is no confusion possible and they could have separate
legends. Why is this not possible?

An alternative would be to map both colour and fill in geom_point, as
with most other geoms. But while this affects the legend, it changes
nothing on the plot.

ggplot(d) + geom_point(aes(x=x, y=y, colour=group, fill=prob), size=4)

Can anybody see a workaround? Thanks in advance!

JiHO
---
http://maururu.net

Ben Bolker

Sep 7, 2011, 12:31:54 PM
to ggp...@googlegroups.com

You need to use a point type which allows separate specification of
edge and fill colours:

ggplot(d) + geom_point(aes(x=x, y=y, colour=group, fill=prob), size=6,
shape=21)

I would like to increase the width of the border but it's not
immediately obvious how.
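
On the border width: a minimal sketch, assuming a ggplot2 release recent enough (2.0.0 or later) for geom_point to accept `stroke`, which was not available when this thread was written. It rebuilds the example data d from the first message; the stroke value of 2 is arbitrary.

```r
library(ggplot2)

# Example data from the first message
d <- expand.grid(x = 1:5, y = 1:5)
d$prob <- (d$x / 5 + d$y / 5) / 2
d$group <- d$prob > 0.5

# Shape 21 is a filled circle: `colour` draws the border, `fill` the interior.
# `stroke` sets the border width (assumption: ggplot2 >= 2.0.0).
p <- ggplot(d) +
  geom_point(aes(x = x, y = y, colour = group, fill = prob),
             size = 6, shape = 21, stroke = 2)
p
```

A thicker border makes the group outline easier to read against the fill gradient, at the cost of shrinking the visible interior of each point.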



Dennis Murphy

Sep 7, 2011, 3:05:59 PM
to JiHO, ggplot2
Hi:

Since you only have two groups, why don't you just give them different
shapes and then use the probability scale to color them? This is just
a variant on Ben's idea, but the distinction between groups is more
apparent.

ggplot(d) + geom_point(aes(x=x, y=y, shape=group, colour=prob), size=6)

HTH,
Dennis


JiHO

Sep 8, 2011, 8:12:04 AM
to Dennis Murphy, ggplot2
Thanks a lot for both answers. I had the image of the graph I wanted
in my mind so I did not look for alternatives but of course the
shape/colour combination works.
Thanks to Ben I also understand now that colour and fill work with
geom_point as they do for the other geoms.

However, it seems to me that what I intended to do should be possible,
since there are two separate colour scales. So maybe it is more a
question for Hadley: is it by design? because it is too complicated to
implement? an oversight?

Thanks again,

JiHO

jrandall

Sep 13, 2011, 4:25:42 AM
to ggplot2
I think this ought to be very close to what you asked for:

ggplot(d) +
  geom_point(aes(x=x, y=y, fill=group), colour=alpha("black", 0), size=6, shape=21) +
  geom_point(aes(x=x, y=y, colour=prob), size=4) +
  scale_fill_manual(values=c("FALSE"="#3B4FB8", "TRUE"="#B71B1A")) +
  scale_colour_gradient(low="#3B4FB8", high="#B71B1A")

I believe you can only have one scale_colour and one scale_fill in a
single ggplot2 plot, so you can't have multiple colour scales which is
why your first attempt didn't work. Each aesthetic in a ggplot2 plot
must have exactly one scale associated with it and all layers need to
share the same scale for each aesthetic.

You should be able to change the colours using the parameters to
scale_fill and scale_colour (I've set the TRUE/FALSE colours in the
above example to be the ends of the continuous scale so they blend
together, but that may not be what you want). I've also made the
border of the outer symbol disappear using the alpha channel, but you
can change that to another static color if you want a border.
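
For what it's worth, the originally requested plot (two layers, each with its own colour scale and legend) can be sketched with the third-party ggnewscale package. This is an assumption layered on top of the thread, not something ggplot2 itself offered at the time: it presumes ggnewscale is installed and that its new_scale_colour() behaves as documented. The hex colours for the groups are the ones from the example above; the white-to-black gradient is an arbitrary choice.

```r
library(ggplot2)
library(ggnewscale)  # third-party package, assumed to be installed

# Example data from the first message
d <- expand.grid(x = 1:5, y = 1:5)
d$prob <- (d$x / 5 + d$y / 5) / 2
d$group <- d$prob > 0.5

p <- ggplot(d, aes(x = x, y = y)) +
  # Outer layer: large points on a discrete colour scale
  geom_point(aes(colour = group), size = 6) +
  scale_colour_manual(values = c("FALSE" = "#3B4FB8", "TRUE" = "#B71B1A")) +
  # Start a fresh colour scale for the layers added after this point
  new_scale_colour() +
  # Inner layer: smaller points on a continuous colour scale
  geom_point(aes(colour = prob), size = 4) +
  scale_colour_gradient(low = "white", high = "black")
p
```

The order matters here: each scale has to be added before new_scale_colour() so that it attaches to the layers above that call rather than to the ones after it.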

Josh.

Kohske Takahashi

Sep 13, 2011, 4:54:51 AM
to jrandall, ggplot2
Hi,

Here is a workaround, though not an elegant one:

cr <- colorRamp(c("blue", "red"))
d$pc <- rgb(cr(d$prob), maxColorValue=255)
d$pg <- ifelse(d$group, "black", "white")

library("ggplot2")
ggplot(d) +
  geom_point(aes(x=x, y=y, colour=pg), size=8) +
  geom_point(aes(x=x, y=y, colour=pc), size=4) +
  # legend = FALSE was the ggplot2 0.8.x spelling; later releases take guide = "none"
  scale_color_identity(guide = "none")


--
Kohske Takahashi <takahash...@gmail.com>

Research Center for Advanced Science and Technology,
The University of  Tokyo, Japan.
http://www.fennel.rcast.u-tokyo.ac.jp/profilee_ktakahashi.html

Casey

Sep 13, 2011, 11:22:14 AM
to ggplot2
Hello All,

This wouldn't work with more than two clusters, but I think the logical
d$group can be recoded as a numeric d$group2 (TRUE becomes 1.0, FALSE becomes 0).

Then both colour scales are continuous.

d <- expand.grid(x=1:5, y=1:5)
d$prob <- (d$x/5 + d$y/5)/2
d$group <- d$prob > 0.5
library("ggplot2")
# Works for the current example, assuming that prob > 0.5 is what
# determines the cluster.
d$group2 <- ifelse(d$prob > 0.5, 1, 0)
# Probably better if the clustering uses a more complicated algorithm,
# or a threshold other than 0.5.
d$group2 <- ifelse(d$group, 1, 0)
ggplot(d) + geom_point(aes(x=x, y=y, colour=group2), size=6) +
geom_point(aes(x=x, y=y, colour=prob), size=4)


Thanks,
Casey