overlaying two scatterplots

Anatole

Mar 23, 2010, 8:04:00 PM
to ggplot2
Hi,

I was wondering how I could overlay two scatter plots in one plot. The
reason I ask is I have two different data sets and I would like to
overlay them on top of each other. Is it possible to do that? I tried
to create two ggplot objects ('p' and 'd') and then do

p + d + geom_point()

but I am not sure if that's the right way to do this. Another problem
with this approach is that I am not sure how to use the
scale_colour_brewer function for each one separately.

thanx
Anatole

Ista Zahn

Mar 23, 2010, 9:30:55 PM
to Anatole, ggplot2
Hi Anatole,

I think it works like this: given that dat1 and dat2 are data frames with
the same variables X and Y,

p <- ggplot(dat1, aes(x=X, y=Y)) + geom_point()
p + geom_point(data=dat2)

But it's almost always better to put the data in the same data.frame:

dat <- rbind(dat1, dat2)
dat$dataset <- factor(c(rep("dat1", dim(dat1)[1]), rep("dat2", dim(dat2)[1])))
ggplot(dat, aes(x=X, y=Y, shape=dataset)) + geom_point()
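
Both approaches above can be tried end to end with made-up data. This is only a sketch: the contents of dat1 and dat2 are invented here for illustration, and the plotting part is skipped if ggplot2 is not installed.

```r
# Made-up data standing in for the two data sets (an assumption, not from the thread)
set.seed(1)
dat1 <- data.frame(X = rnorm(20), Y = rnorm(20))
dat2 <- data.frame(X = rnorm(10, mean = 2), Y = rnorm(10, mean = 2))

# Stack the two data frames and record which data set each row came from
dat <- rbind(dat1, dat2)
dat$dataset <- factor(rep(c("dat1", "dat2"), times = c(nrow(dat1), nrow(dat2))))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Approach 1: one layer per data set
  p1 <- ggplot(dat1, aes(x = X, y = Y)) +
    geom_point() +
    geom_point(data = dat2, colour = "red")
  # Approach 2: a single data frame, with the data set mapped to an aesthetic
  p2 <- ggplot(dat, aes(x = X, y = Y, shape = dataset)) +
    geom_point()
}
```

Typing p1 or p2 at the console draws the plot. The second form gets a legend for free, which is one reason a single data frame tends to be more convenient.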

-Ista

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

Anatole

Mar 23, 2010, 9:38:07 PM
to ggplot2
Thank you Ista. I do have them in the same table now and I am more
than happy to use them as they are, but my main issue is that I would like
to specify two different coloring schemes, one for each: let's say light to
dark blue for dataset1 and light to dark red for dataset2. That is
really the main reason I am trying to split them up, but if you (or
anyone else) know how to do that while keeping them in the same table
I would be very thankful.

thanx again,
Anatole

Ista Zahn

Mar 23, 2010, 11:10:25 PM
to Anatole, ggplot2
Well that's the thing, it's actually easier to set the color if they
are in the same table. I used shape to distinguish them in my example,
but it's just as easy to use color -- just change "shape" to "color":

ggplot(dat, aes(x=X, y=Y, color=dataset)) + geom_point()

-Ista

Anatole

Mar 24, 2010, 4:02:21 AM
to ggplot2
Thanx Ista, but this doesn't solve the problem, because it would only
give the two sets two different colors. What I am trying to do is,
once I assign a color to each dataset, make a gradient of colors
within each. Think of it as a mix of two different groups (Americans
and Hispanics), where within each group people have different body
weights. I want to make Americans blue and Hispanics red, but I also
want Americans to have a different shade of blue based on their body
weight (light Americans will be light blue and heavy Americans will
be darker blue). The same goes for Hispanics, except that for this
group I will use different shades of red instead of different shades
of blue.

The code you wrote for me will only make Americans blue and Hispanics
red, but it won't solve the shades of color within each group. Any idea
what I can do?

thanx
anatole

Xie Chao

Mar 24, 2010, 4:22:24 AM
to Anatole, ggplot2
how about:

qplot(x, y, data=df, colour = american.or.hispanic, alpha = body.weight)

perhaps you also want to look at scale_colour_gradient2 (by encoding the
body weights of the two groups as two gradients)?
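
A rough sketch of the colour-plus-alpha idea with ggplot() rather than qplot() might look like the following. The data frame and its column names (group, weight) are invented for illustration; scale_alpha(range = ...) is used so that the lightest points do not become too transparent.

```r
# Invented example data: two groups and a continuous weight (assumptions)
set.seed(2)
df <- data.frame(
  x = runif(40),
  y = runif(40),
  group = factor(rep(c("a", "b"), each = 20)),
  weight = runif(40, min = 50, max = 100)
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = x, y = y, colour = group, alpha = weight)) +
    geom_point(size = 3) +
    # One base colour per group
    scale_colour_manual(values = c(a = "blue", b = "red")) +
    # Keep the lower end of the alpha range away from fully transparent
    scale_alpha(range = c(0.3, 1))
}
```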

Xie Chao

Anatole

Mar 24, 2010, 11:49:48 AM
to ggplot2
Thanx Xie,
I tried that and it gives me what I needed, except the colors are
really faded, but I think I can make it work if I try hard
enough. It's just surprising to me that I am not able to use
ggplot2 to customize a plot where you have a mix of two factors
(quantitative and qualitative). I always heard that the strength of
this package was its ability and flexibility to visualize multivariate
data in a friendlier manner than the lattice package offers. Thanx so
much for your reply.

best,
Anatole


hadley wickham

Mar 24, 2010, 12:06:56 PM
to Anatole, ggplot2
Yes, but you are asking for a bizarre plot which will make it
difficult to compare the weights of different races. You have not said
why you want both race and weight to be mapped to colour instead of
using size and colour or shape and colour, either of which would be
easier to read.

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

baptiste auguie

Mar 24, 2010, 12:15:49 PM
to hadley wickham, Anatole, ggplot2
It seems to me that the hcl colour scale (issue #26
http://github.com/hadley/ggplot2/issuesearch?state=open&q=scale#issue/26
) could help for this particular problem. One variable would be mapped
to hue, the other one to chroma for example. I wonder if this concept
could be extended to other colour scales such as scale_colour_gradientn,
following different paths in the colour space.
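
The hue/chroma idea can be approximated by hand with the base hcl() function from grDevices, without any dedicated scale. This is only a sketch: the data, the hue angles (260 and 10) and the chroma/luminance ranges are arbitrary choices made for illustration.

```r
# Invented example data: a two-level group and a value z in [0, 1] (assumptions)
set.seed(3)
df <- data.frame(
  x = runif(40),
  y = runif(40),
  group = factor(rep(c("a", "b"), each = 20)),
  z = runif(40)
)

# The group picks the hue (260 is bluish, 10 is reddish)
hue <- unname(c(a = 260, b = 10)[as.character(df$group)])
# z drives chroma up and luminance down, so larger z gives a darker, stronger colour
df$col <- hcl(h = hue, c = 30 + 70 * df$z, l = 85 - 45 * df$z)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # scale_colour_identity() uses the colour strings in df$col as they are
  p <- ggplot(df, aes(x = x, y = y, colour = col)) +
    geom_point(size = 3) +
    scale_colour_identity()
}
```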

Best,

baptiste

Anatole

Mar 24, 2010, 12:22:19 PM
to ggplot2
Well, I didn't mean to provoke anyone here, certainly not by expressing
my opinion, and if I did I am sorry. The reason is that I am plotting this
for a publication in which we report a genome-wide association study for
two different phenotypes. One of the phenotypes has an overwhelmingly
large number of associations (10K) and the other has only 600. Each
association signal in either dataset is a quantitative trait (or
at least the -log of its p-value is). I have tried different ways of
differentiating the two on the same plot, but I think the best
contrast comes out when I combine the following three: 1) Size (show
the smaller set with larger symbols), 2) Shape, 3) Color
(give the smaller set a brighter color than the bigger set).
However, the Color option is a bit tricky, because each dataset
has a spectrum of weak and strong associations in different parts of
the genome. So I decided to give each dataset a different color and
create a gradient within each color. This can all be done using the good
old-fashioned "plot" and "points" functions in the graphics package, but I
thought I would use ggplot2 because it gives you more flexibility.

I hope that explains why I was asking for this earlier.

Best,
Anatole

Anatole

Mar 24, 2010, 12:27:03 PM
to ggplot2
baptiste, I clicked on the link and I got a blank page...how do I get
hcl colour scale?

takahashi kohske

Mar 24, 2010, 1:26:14 PM
to ggplot2
Hi,

I think you can customize this if you want.

But this seems to be an old-fashioned way,
and you may not be taking advantage of the user-friendliness of ggplot2.


dat<-data.frame(g=gl(2,25),x=runif(50),y=runif(50),z=runif(50),n=factor(1:50))

b<-colorRamp(c("#0000FF","#DDDDFF"), space="rgb", interpolate="linear")
g<-colorRamp(c("#00FF00","#DDFFDD"), space="rgb", interpolate="linear")

col.fun<-list(b,g)
col<-c()
for(g. in 1:2){
col<-c(col,nice_ramp(col.fun[[g.]],subset(dat,g==g.)$z))
}
dat$col<-col

ggplot(dat,aes(x=x,y=y,colour=n,shape=g))+geom_point()+scale_colour_manual(value=col)+opts(legend.position="none")
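
The code above relies on the ggplot2 of that time: opts() has since been replaced by theme(), and nice_ramp() does not appear to be exported by current versions. A rough equivalent for a recent ggplot2, using only base colorRamp() and rgb() plus scale_colour_identity(), might look like this. It is a sketch, not the original author's code; red is used for the second group instead of green to match the blue/red request earlier in the thread, and z is assumed to lie in [0, 1].

```r
set.seed(4)
dat <- data.frame(g = gl(2, 25), x = runif(50), y = runif(50), z = runif(50))

# One ramp per group; colorRamp() maps values in [0, 1] to rows of RGB values (0-255)
ramps <- list(
  colorRamp(c("#DDDDFF", "#0000FF")),  # group 1: light to dark blue
  colorRamp(c("#FFDDDD", "#FF0000"))   # group 2: light to dark red
)

# Compute one colour string per row, using the ramp of that row's group
dat$col <- NA_character_
for (i in seq_along(ramps)) {
  rows <- as.integer(dat$g) == i
  dat$col[rows] <- rgb(ramps[[i]](dat$z[rows]), maxColorValue = 255)
}

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # The colour column is used as is, so no per-point factor is needed
  p <- ggplot(dat, aes(x = x, y = y, colour = col, shape = g)) +
    geom_point() +
    scale_colour_identity() +
    theme(legend.position = "none")
}
```

If z is not already in [0, 1], it would need rescaling (for example within each group) before being passed to the ramp.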


HTH.


Anatole

Mar 24, 2010, 1:33:43 PM
to ggplot2
oh wow. this worked perfectly. thanx a lot.

baptiste auguie

Mar 24, 2010, 2:34:45 PM
to Anatole, ggplot2
On Wed, Mar 24, 2010 at 5:27 PM, Anatole <agha...@ucla.edu> wrote:
> baptiste, I clicked on the link and I got a blank page...how do I get
> hcl colour scale?
>

weird, it should take you to the bug (feature) report page. At any
rate, the hcl colour scale is not yet implemented (as far as I know).
I tried to write one but I must be missing something basic as I can't
get it to work with more than one variable :(

Xie Chao

Mar 24, 2010, 9:51:45 PM
to Anatole, ggplot2
On Wed, Mar 24, 2010 at 11:49 PM, Anatole <agha...@ucla.edu> wrote:
> Thanx Xie,
> I tried that and it is giving me what I needed except the colors are
> really faded but I think I would be able to make it work if I try hard
> enough on that....It's just surprising to me that I am not able to use
> ggplot2 to customize a plot where you have a mix of two factors
> (quantitative and qualitative)...I always heard that the strength of
> this package was the ability and flexibility to visualize multivariate

In fact, ggplot2 supports quite a number of colour gradient scales, for
example scale_colour_gradient2. With proper encoding, you can easily
implement your plot without creating manual colours:

df <- data.frame(g=gl(2,25),x=runif(50),y=runif(50),z=runif(50))
df <- transform(df, gz = (as.numeric(g)-1.5) * 2 * (z + 0.5))
qplot(x, y, data=df, colour=gz) + scale_colour_gradient2()

Then you can set the gradient breaks and re-label them with
un-transformed/decoded value + grouping.
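
A sketch of that relabelling step, under the same encoding, could look like the following. The break positions, the "g1"/"g2" label text, and the blue/white/red colours are illustrative choices, not something specified in the thread.

```r
set.seed(5)
df <- data.frame(g = gl(2, 25), x = runif(50), y = runif(50), z = runif(50))
# Group 1 is encoded as negative values, group 2 as positive values
df$gz <- (as.numeric(df$g) - 1.5) * 2 * (df$z + 0.5)

# Break positions on the encoded scale
brks <- c(-1.5, -1, -0.5, 0.5, 1, 1.5)
# Decode each break back to its group and its original z value
labs <- paste0(ifelse(brks < 0, "g1: z=", "g2: z="), abs(brks) - 0.5)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = x, y = y, colour = gz)) +
    geom_point() +
    scale_colour_gradient2(low = "blue", mid = "white", high = "red",
                           midpoint = 0, breaks = brks, labels = labs)
}
```

With this encoding the stronger values sit at the saturated ends of the scale and the weaker ones fade towards the white midpoint.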

Hope this can change your impression on ggplot2.

Xie Chao

Anatole

Mar 25, 2010, 1:25:08 AM
to ggplot2
Hi Xie,

I think I have come across incorrectly. As an R user I am a huge fan
of ggplot2 and I just want to tell the developers how much I
appreciate it.

Regarding the code you sent me, I had a question: basically you're
suggesting coding the two datasets as 1 and -1 and multiplying these
values by the quantitative trait which I want to use for the gradient
coloring. Makes perfect sense, and I think it's simple but very clever.

What I don't understand is why you are adding 0.5 to z in the following
code:

gz = (as.numeric(g)-1.5) * 2 * (z + 0.5)

Instead, wouldn't it suffice to say

gz = (as.numeric(g)-1.5) * 2 * z

I think this will do the job too, unless you had a reason to add the
0.5 increment to z.

thanx
anatole


Xie Chao

Mar 25, 2010, 1:35:02 AM
to Anatole, ggplot2
The 0.5 is used to avoid using the middle color.

Without the 0.5 (or any other non-zero number), you cannot distinguish
a value of 0 in group a from a value of 0 in group b. For example:

(1 - 1.5) * 2 * 0 = 0
(2 - 1.5) * 2 * 0 = 0

Then both values are represented by the middle colour, or white.

But after adding the 0.5:

(1 - 1.5) * 2 * (0 + 0.5) = - 0.5
(2 - 1.5) * 2 * (0 + 0.5) = +0.5

Of course, if you know there are no zero values in your data, it is
simpler not to add the 0.5.
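
The arithmetic above can be checked directly in R. The helper name encode and its offset argument are made up for this illustration.

```r
# Hypothetical helper: g is the group index (1 or 2), z the value,
# offset the amount added to z before the sign is applied
encode <- function(g, z, offset = 0) (g - 1.5) * 2 * (z + offset)

# Without an offset, z = 0 maps to the same value in both groups
no_offset <- encode(1:2, 0)

# With an offset of 0.5, the two groups end up on opposite sides of the midpoint
with_offset <- encode(1:2, 0, offset = 0.5)

print(no_offset)
print(with_offset)
```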

(and sorry for misunderstanding your message..)

Xie Chao

Anatole

Mar 25, 2010, 1:42:10 AM
to ggplot2
Hi Xie,

Yes, you're right: there are no zeros in the data; in fact they all
start at 5 or up. But thanx for clarifying it.
