plotting RDA object with ggplot2


Guillaume T.R.

Sep 29, 2010, 12:21:06 PM
to ggplot2
Hi list!

I'm trying to plot a redundancy analysis (RDA) object in ggplot2.
I've been able to plot the response and the environmental variables.
However, I'd like to add segments only for the environmental variables.
The problem with my code is that I don't know how to draw segments
for only one type of value, in this case the environmental variables
(called Biplot in the data I use). Biplots usually show arrows only for
the environmental variables.

For the example, I've used datasets from "vegan", a multivariate
analysis package. vegan has its own plotting functions for this
kind of object, but I would like to use ggplot2 to create nicer-looking
graphics.

Hope someone can help me out with this.

Thanks!

Guillaume

Guillaume Théroux Rancourt, agr., M.Sc.
Ph.D. Student - Plant Biology
Université Laval, Québec, Canada


++++

install.packages("vegan")
library(vegan)
library(ggplot2)
data(dune)
data(dune.env)
dune.Manure <- rda(dune ~ Manure, dune.env)
plot(dune.Manure, display=c("sp", "cn", "bp"), scaling=2) ### vegan type plot

### The next step is to isolate the values of interest from the RDA object
scor = scores(dune.Manure, display=c("sp", "cn", "bp"), scaling=2) ### a list of matrices
dat.scor = cbind(data.frame(rbind(scor$species, (scor$biplot/2))),
                 c(rep("Species", nrow(scor$species)), rep("Biplot", nrow(scor$biplot))))
colnames(dat.scor) = c("RDA1","RDA2","Type")
dat.scor

## Type: Species = response variables, Biplot = environmental variables

p = ggplot(dat.scor, aes(RDA1, RDA2, label=rownames(dat.scor),
                         colour=factor(Type))) +
  geom_vline(xintercept=0, colour="grey50") +
  geom_hline(yintercept=0, colour="grey50") +
  geom_text(angle=45, size=3) +
  geom_segment(aes(x=0, y=0, xend=RDA1, yend=RDA2), size=0.5) +
  theme_bw()
p

Brandon Hurr

Sep 29, 2010, 12:53:42 PM
to Guillaume T.R., ggplot2
Guillaume, 

I thought you would do it by subsetting your data and giving only that to geom_segment(), but it seems that doesn't work. 

ggplot(dat.scor, aes(RDA1, RDA2, label=rownames(dat.scor),
                     colour=factor(Type))) +
  geom_vline(xintercept=0, colour="grey50") +
  geom_hline(yintercept=0, colour="grey50") +
  geom_text(angle=45, size=3) +
  geom_segment(data=arrows, aes(x=0, y=0, xend=RDA1, yend=RDA2, size=0.5)) +
  theme_bw()
Error in data.frame(x = 0, y = 0, xend = c(-0.39979597994717, 0.271873626953409,  : 
  arguments imply differing number of rows: 1, 4, 34

Which I don't really understand since... 
> arrows<-dat.scor[dat.scor$Type == 'Biplot',]
> arrows
                RDA1       RDA2   Type
Manure.L -0.39979598  0.2737701 Biplot
Manure.Q  0.27187363  0.2360109 Biplot
Manure.C -0.27461016 -0.1835809 Biplot
Manure^4 -0.03557615 -0.2526819 Biplot
> str(arrows)
'data.frame': 4 obs. of  3 variables:
 $ RDA1: num  -0.3998 0.2719 -0.2746 -0.0356
 $ RDA2: num  0.274 0.236 -0.184 -0.253
 $ Type: Factor w/ 2 levels "Biplot","Species": 1 1 1 1

No idea where the hell it's getting 34 from... 

Sorry I couldn't get it to work, but hopefully someone can explain where we're going wrong. 

Brandon



--
You received this message because you are subscribed to the ggplot2 mailing list.
Please provide a reproducible example: http://gist.github.com/270442

To post: email ggp...@googlegroups.com
To unsubscribe: email ggplot2+u...@googlegroups.com
More options: http://groups.google.com/group/ggplot2

Guillaume T.R.

Sep 29, 2010, 1:24:47 PM
to ggplot2
Hi Brandon (and list),

Thanks for the idea. I modified the code by passing data to each
geom. However, the column names of the two datasets need to be
different. If there's a more elegant way of doing this, I would sure
like to know about it.

arrows<-dat.scor[dat.scor$Type == 'Biplot',]
colnames(arrows) = c("RDA1.","RDA2.","Type.")

ggplot() +
  geom_vline(xintercept=0, colour="grey50") +
  geom_hline(yintercept=0, colour="grey50") +
  geom_text(data=dat.scor, aes(RDA1, RDA2, label=rownames(dat.scor),
            colour=factor(Type)), angle=45, size=2) +
  geom_segment(data=arrows, aes(x=0, y=0, xend=RDA1., yend=RDA2.),
               size=0.5) +
  theme_bw()

guillaume

Dennis Murphy

Sep 29, 2010, 2:15:01 PM
to Guillaume T.R., ggplot2
Hi:

As long as the interior of the ggplot() call is empty, there shouldn't be a name conflict if you use different data frames with similarly named variables. The conflict comes when defining the infrastructure of the plot with one data frame and then using another data frame with the same variable names in a geom appended to the originally defined aesthetics. *That* causes trouble, and I experienced that while playing with your problem.  For example,

g <- ggplot(dat.scor, aes(x = RDA1, y = RDA2, ....))
g + ... + geom_segment(data = arrows, aes(x = RDA1, y = RDA2...))     # this goes kaboom

However, if you define   g <- ggplot()   and then define the data to be used inside a geom, then that ought to work even if two different data frames have the same variable names; it's kind of like with() in base R.
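
A hedged sketch of another way around the error, for anyone who wants to keep the aesthetics in ggplot(): as far as I can tell, the 34 in Brandon's error is the length of rownames(dat.scor) (30 species plus 4 biplot rows), which the segment layer inherits as its label even though its own data has only 4 rows. Assuming that reading is right, inherit.aes = FALSE on that layer should be enough. The name dat.arrows is my own (hypothetical), chosen so that it does not mask the base arrows() function.

```r
library(vegan)
library(ggplot2)

data(dune)
data(dune.env)
dune.Manure <- rda(dune ~ Manure, dune.env)
scor <- scores(dune.Manure, display = c("sp", "cn", "bp"), scaling = 2)

# Same layout as dat.scor in the thread: species first, then halved biplot scores
dat.scor <- data.frame(
  rbind(scor$species, scor$biplot / 2),
  Type = rep(c("Species", "Biplot"),
             c(nrow(scor$species), nrow(scor$biplot)))
)

# Hypothetical name for the biplot subset
dat.arrows <- dat.scor[dat.scor$Type == "Biplot", ]

p <- ggplot(dat.scor, aes(RDA1, RDA2, label = rownames(dat.scor),
                          colour = factor(Type))) +
  geom_vline(xintercept = 0, colour = "grey50") +
  geom_hline(yintercept = 0, colour = "grey50") +
  geom_text(angle = 45, size = 3) +
  # inherit.aes = FALSE: this layer ignores the plot-level label and colour
  geom_segment(data = dat.arrows, inherit.aes = FALSE,
               aes(x = 0, y = 0, xend = RDA1, yend = RDA2)) +
  theme_bw()
p
```

The column names RDA1 and RDA2 are shared by both data frames here, so under this assumption no renaming is needed.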

This is what I did, which is very similar to yours:

# Subset into two data sets, get rid of the redundant Type:
dat.scorspec <- subset(dat.scor, Type == 'Species')[, -3]
dat.scorbipl <- subset(dat.scor, Type == 'Biplot')[, -3]

p <- ggplot()
p + geom_vline(xintercept=0,colour="grey50") +
    geom_hline(yintercept=0,colour="grey50") +
    geom_text(data = dat.scorspec, aes(x = RDA1, y = RDA2,
              label=rownames(dat.scorspec)), angle=45, size=3,
              colour = 'blue') +
    geom_segment(data = dat.scorbipl, aes(x = 0, y = 0,
                 xend = RDA1, yend = RDA2), size = 0.5, colour = 'red') +
    geom_text(data = dat.scorbipl, aes(x = RDA1, y = RDA2,
              label = rownames(dat.scorbipl)), size = 5, angle = 45,
              vjust = 1, colour = 'violet') +
    theme_bw()

I used two geom_text() calls, one for the species labels and one for the biplot axis labels. Notice that I used RDA1 and RDA2 as the variable names in all three geoms that process data, even though the three layers do not all use the same data frame.
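
Since biplots usually show arrows rather than plain segments, here is a minimal sketch of adding arrowheads through the arrow argument of geom_segment(), using arrow() and unit() from the grid package. The four rows are the biplot scores printed earlier in the thread; the 0.2 cm arrowhead length and the object name p.arrows are arbitrary choices of mine.

```r
library(ggplot2)

# Biplot scores as printed earlier in the thread (already divided by 2)
dat.scorbipl <- data.frame(
  RDA1 = c(-0.39979598, 0.27187363, -0.27461016, -0.03557615),
  RDA2 = c(0.2737701, 0.2360109, -0.1835809, -0.2526819),
  row.names = c("Manure.L", "Manure.Q", "Manure.C", "Manure^4")
)

p.arrows <- ggplot() +
  geom_vline(xintercept = 0, colour = "grey50") +
  geom_hline(yintercept = 0, colour = "grey50") +
  # arrow = ... draws an arrowhead at the (xend, yend) end of each segment
  geom_segment(data = dat.scorbipl,
               aes(x = 0, y = 0, xend = RDA1, yend = RDA2),
               arrow = grid::arrow(length = grid::unit(0.2, "cm")),
               colour = "red") +
  geom_text(data = dat.scorbipl,
            aes(x = RDA1, y = RDA2, label = rownames(dat.scorbipl)),
            vjust = 1, colour = "violet") +
  theme_bw()
p.arrows
```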

If separating some overlapping names is meaningful, I'd either systematically change some of the RDA1-RDA2 coordinates slightly (put them in new variables) or consider some jitter, which might be a problem for those points near the origin.
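
A rough sketch of both options, again using the four biplot rows printed earlier in the thread. The columns lab.x, lab.y and Label, the 10% outward shift, and the jitter widths are all hypothetical choices for illustration, not tuned values.

```r
library(ggplot2)

dat.scorbipl <- data.frame(
  RDA1 = c(-0.39979598, 0.27187363, -0.27461016, -0.03557615),
  RDA2 = c(0.2737701, 0.2360109, -0.1835809, -0.2526819),
  row.names = c("Manure.L", "Manure.Q", "Manure.C", "Manure^4")
)
dat.scorbipl$Label <- rownames(dat.scorbipl)

# Option 1: systematic shift stored in new variables,
# here each label is pushed 10% further out along its own segment
dat.scorbipl$lab.x <- dat.scorbipl$RDA1 * 1.1
dat.scorbipl$lab.y <- dat.scorbipl$RDA2 * 1.1

p.shift <- ggplot(dat.scorbipl) +
  geom_segment(aes(x = 0, y = 0, xend = RDA1, yend = RDA2), colour = "red") +
  geom_text(aes(x = lab.x, y = lab.y, label = Label), colour = "violet") +
  theme_bw()

# Option 2: random jitter on the label positions (seed fixed for repeatability)
p.jitter <- ggplot(dat.scorbipl) +
  geom_segment(aes(x = 0, y = 0, xend = RDA1, yend = RDA2), colour = "red") +
  geom_text(aes(x = RDA1, y = RDA2, label = Label), colour = "violet",
            position = position_jitter(width = 0.02, height = 0.02, seed = 1)) +
  theme_bw()

p.shift
p.jitter
```

The shift in option 1 scales with the distance from the origin, so labels near (0, 0) barely move; a fixed additive offset may be needed for those.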

HTH,
Dennis


