ggpairs for categorical data

xi

Mar 22, 2011, 1:52:38 PM
to ggplot2
Dear list,

I'm trying to use ggpairs in GGally to plot some categorical data and
examine the pairwise association among those factors. I have 2
questions and would appreciate any hints.

1. What does discrete = "ratio" actually plot in the ggpairs function?
It seems to produce mosaic plots via ggally_ratio through
ggfluctuation(). Why does the plot for the following count table
completely miss the cell count at the x = 0, y = 1 level?
   y
x   0 1
  0 4 3
  1 4 9

If it doesn't plot the cell counts directly, what does it plot?

#-------------------------------
library(GGally)
temp = data.frame("x" = c(0,0,0, rep(1,5), 0,0,1,0,rep(1,7), 0), "y" =
c(1, 0,0,1,0,1,0,1,0,1,0, rep(1,5), 0, 1,1,0))
ggally_ratio(temp); table(temp)
# or
ggpairs(apply(temp,2,factor), diag = list(discrete = "bar"), lower =
list(discrete = "ratio"))
#-------------------------------


2. In the default off-diagonal plots that ggpairs draws for categorical
data, the lower part shows the cell counts. What is in the upper part?
#-------------------------------
ggpairs(apply(temp,2,factor))
#-------------------------------
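
A side note on the apply(temp, 2, factor) calls above: in base R, apply() returns a matrix, so the result is not a data frame with factor columns. Whether GGally 0.2.3 converted it back internally is not something confirmed in this thread. A minimal sketch of a column-wise conversion that keeps the data frame, using a small made-up data set rather than the one above:

```r
# Small illustrative data set (not the data from this thread).
temp <- data.frame(x = c(0, 0, 1, 1), y = c(1, 0, 0, 1))

# apply() works on a matrix and returns a matrix, so the columns
# do not come back as factors.
as_matrix <- apply(temp, 2, factor)
is.matrix(as_matrix)      # TRUE
is.data.frame(as_matrix)  # FALSE

# lapply() visits each column; assigning into temp_f[] keeps the
# data.frame structure and stores real factor columns.
temp_f <- temp
temp_f[] <- lapply(temp, factor)
str(temp_f)
```

The temp_f object can then be passed to ggpairs() in place of the apply() result.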

Thanks!
Xi


#-------------------------------

> sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] C

attached base packages:
[1] grid splines stats graphics grDevices utils datasets methods base

other attached packages:
 [1] vcd_1.2-9 colorspace_1.0-1 MASS_7.3-11 Design_2.3-0 Hmisc_3.8-3 GGally_0.2.3 stringr_0.4 ggplot2_0.8.8 proto_0.3-8
[10] reshape_0.8.3 plyr_1.2.1 survival_2.36-5 Biobase_2.10.0

loaded via a namespace (and not attached):
[1] cluster_1.13.3 digest_0.4.2 lattice_0.19-17 tools_2.12.2

Barret

Mar 22, 2011, 6:48:53 PM
to ggplot2
Hi Xi,

I found two bugs in ggplot2 from this post.

> #-------------------------------
> library(GGally)
> temp = data.frame("x" = c(0,0,0, rep(1,5), 0,0,1,0,rep(1,7), 0), "y" =
> c(1, 0,0,1,0,1,0,1,0,1,0, rep(1,5), 0, 1,1,0))
> ggally_ratio(temp); table(temp)
> # or
> ggpairs(apply(temp,2,factor), diag = list(discrete = "bar"), lower =
> list(discrete = "ratio"))
> #-------------------------------

I made a workaround for the first one. ggplot2 doesn't appreciate it
when you draw a rectangle with points outside of the limits. ggpairs
was placing some exactly on the limit, and ggplot2 still didn't like
it. I moved the limit down by 0.0001 and everything shows up just fine.
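
The limit problem can be sketched roughly as follows. This is an illustration against a current ggplot2, not the actual GGally fix, and the behaviour of ggplot2 0.8.8 may differ; the data frame and plot names are made up for the example. With the default out-of-bounds handling, a rectangle edge that lies outside the scale limits is censored to NA and the rectangle is not drawn, while nudging the lower limit just below the edge keeps it.

```r
library(ggplot2)

# Two adjacent rectangles; the first one starts exactly at x = 0.
rects <- data.frame(xmin = c(0, 1), xmax = c(1, 2),
                    ymin = c(0, 0), ymax = c(1, 1))

base_plot <- ggplot(rects) +
  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax),
            fill = "grey60", colour = "black")

# Lower limit above the first rectangle's left edge: that xmin is
# outside the limits, becomes NA, and the rectangle is dropped.
clipped <- base_plot + scale_x_continuous(limits = c(0.5, 2))

# Lower limit nudged slightly below the edge, in the spirit of the
# 0.0001 shift described above: both rectangles survive.
padded <- base_plot + scale_x_continuous(limits = c(0 - 1e-4, 2))

# Inspect the data each layer would draw.
clipped_data <- layer_data(clipped)
padded_data <- layer_data(padded)
```

If the goal is only to zoom, coord_cartesian(xlim = ...) changes the visible range without discarding anything.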

> 2. The default off-diagonal plots in the ggpairs for categorical data,
> the lower part is the cell counts plot, what's in the upper part?

This is another quirk of ggplot2. If at all possible, never use
column names called "x" or "y". After changing the column names of
the dataset to "V1" and "V2", everything works fine. ggplot2 does some
amazing eval voodoo, but it comes at the cost of column names like "x"
and "y".

colnames(temp) <- c("V1", "V2")
ggally_facetbar(temp, aes(x = V2, y = V1))
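
For data sets with more columns, the same renaming can be done programmatically. A minimal sketch, assuming a hypothetical helper rename_reserved() that is not part of GGally or ggplot2; it does not guard against a column that is already named "V1" or "V2":

```r
# Hypothetical helper: rename any column whose name collides with a
# position aesthetic ("x" or "y" by default).
rename_reserved <- function(df, reserved = c("x", "y")) {
  clash <- names(df) %in% reserved
  # Clashing columns become V<position>, e.g. the first column -> "V1".
  names(df)[clash] <- paste0("V", which(clash))
  df
}

# Small illustrative data set (not the data from this thread).
example <- data.frame(x = c(0, 1, 1), y = c(1, 0, 1),
                      group = c("a", "b", "a"))
safe <- rename_reserved(example)
names(safe)
```

Columns that do not clash, such as group here, keep their names.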

The latest version of GGally can be found on www.github.com/ggobi/ggally.

Best,
Barret