How to weight by percentage using 'geom_bar' and 'facet_grid', and how to edit facet labels.


sloehr

Feb 2, 2010, 1:39:27 AM
to ggplot2
Hi ggplot users,

Hopefully someone can point me in the right direction. I very recently
started using R and ggplot2. I am interested in plotting some data using
the 'geom_bar' and 'facet_grid' functions ggplot2 provides.
As you can see from the example code I pasted below I am plotting
'Roundness' (factor with 5 levels) using geom_bar, and faceting vs.
the factors 'Depth' and 'Size' using facet_grid.
While I'm very happy with the output, there are two changes I would
like to make and have not been able to figure out.

1) As you can see in the figure (http://dl.dropbox.com/u/4312640/Facetedcounts.pdf)
from the code below, the fourth plot in the third
column has a lower total count (sum) than most of the others (22 vs 50
in most others, if I remember correctly). How can I force all the
plots to show percentage of counts in each Roundness class rather than
absolute counts, i.e. if there are 22 records shown in this plot the
percentage of these 22 that is within each factor level? Do I use the
weighting parameter, and if so how do I apply it?

2) How can I add more detail to the facet labels? E.g., instead of
'2-4' and '4-8' I would like the first two labels to be '2-4 mm' and
'4-8 mm', and instead of '0-15' and '15-55' I would like them to be
'0-15 cm' and '15-55 cm'. Can I do this within ggplot, or do I have to
make changes to the actual values in the dataframe?

I've provided a direct link to a subset of my data in the code below,
so you should be able to just run it.

>cmorph <- read.csv('http://dl.dropbox.com/u/4312640/Concretion_Morphology_R_test.csv')

#define order of factors
>cmorph$Roundness <- factor(cmorph$Roundness, levels=c("A","SA","SR","R","WR"))
>cmorph$Size <- factor(cmorph$Size, levels=c("2-4","4-8","8-16",">16"))
>cmorph$Depth <- factor(cmorph$Depth, levels=c("0-15","15-55","55-75","85-115","140-175","175-210","210-230"))

#plot Roundness as facet_grid, using Depth and Size.
>ggplot(cmorph, aes(Roundness, fill=Roundness)) + geom_bar(position="dodge") + facet_grid(Depth ~ Size) + scale_x_discrete("Roundness Class") + scale_y_continuous("Count")

Thanks,

Stefan

Luciano Selzer

Feb 2, 2010, 8:32:41 AM
to sloehr, ggplot2

Luciano


2010/2/2 sloehr <stefancar...@gmail.com>

Hi ggplot users,

Hopefully someone can point me in the right direction. I very recently
started using R and ggplot2. I am interested in plotting up some using
the 'geom_bar' and 'facet_grid' functions ggplot2 provides.
As you can see from the example code I pasted below I am plotting
'Roundness' (factor with 5 levels) using geom_bar, and faceting vs.
the factors 'Depth' and 'Size' using facet_grid.
While I'm very happy with the output, there are two changes I would
like to make and have not been able to figure out.

1) As you can see in the figure (http://dl.dropbox.com/u/4312640/Facetedcounts.pdf)
from the code below, the fourth plot in the third
column has a lower total count (sum) than most of the others (22 vs 50
in most others, if I remember correctly). How can I force all the
plots to show percentage of counts in each Roundness class rather than
absolute counts, i.e. if there are 22 records shown in this plot the
percentage of these 22 that is within each factor level? Do I use the
weighting parameter, and if so how do I apply it?
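I would summarize the data first; I find it easier:

library(plyr)     # provides ddply()
library(reshape)  # provides melt(); the variable_name argument is reshape's, not reshape2's
sumarizeddata <- melt(ddply(cmorph, .(Size, Depth), function(df) table(df$Roundness) / length(df$Roundness)), variable_name = "Roundness")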

 

2)How can I add more detail to the facet labels? E.g., instead of
'2-4' and '4-8' I would like the first two labels to be '2-4 mm' and
'4-8 mm', and instead of '0-15' and '15-55' I would like them to be
'0-15 cm' and '15-55 cm'. Can I do this within ggplot, or do I have to
make changes to the actual values in the dataframe?
I don't think it can be done in ggplot2 yet. I would rename the labels of the factor. 

I've provided a direct link to a subset of my data in the code below,
so you should be able to just run it.

>cmorph <- read.csv('http://dl.dropbox.com/u/4312640/Concretion_Morphology_R_test.csv')

#define order of factors
>cmorph$Roundness <- factor(cmorph$Roundness, levels=c("A","SA","SR","R","WR"))
>cmorph$Size <- factor(cmorph$Size, levels=c("2-4","4-8","8-16",">16"))
>cmorph$Depth <- factor(cmorph$Depth, levels=c("0-15","15-55","55-75","85-115","140-175","175-210","210-230"))

#plot Roundness as facet_grid, using Depth and Size.
>ggplot(cmorph, aes(Roundness, fill=Roundness)) + geom_bar(position="dodge") + facet_grid(Depth ~ Size) + scale_x_discrete("Roundness Class") + scale_y_continuous("Count")
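Use this instead:
# stat = "identity" draws the precomputed proportions in 'value' as bar heights instead of counting rows
ggplot(sumarizeddata, aes(Roundness, value, fill = Roundness)) + geom_bar(stat = "identity", position = "dodge") + facet_grid(Depth ~ Size) + scale_x_discrete("Roundness Class") + scale_y_continuous("Proportion")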
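
For readers who want to see what the summarised table looks like, here is a minimal base-R sketch of the same idea on made-up data. The `toy` data frame and its values are assumptions for illustration only, and `prop.table()` stands in for the `ddply()`/`melt()` combination; it is not the code from the message above.

```r
# Toy data standing in for cmorph (values invented for illustration)
toy <- data.frame(
  Depth     = c("0-15", "0-15", "0-15", "0-15", "15-55", "15-55"),
  Size      = c("2-4",  "2-4",  "2-4",  "2-4",  "2-4",   "2-4"),
  Roundness = factor(c("A", "A", "SA", "R", "SR", "SR"),
                     levels = c("A", "SA", "SR", "R", "WR"))
)

# Count rows per Size x Depth x Roundness, then divide by each panel's total
counts <- table(toy$Size, toy$Depth, toy$Roundness,
                dnn = c("Size", "Depth", "Roundness"))
props  <- prop.table(counts, margin = c(1, 2))

# Long format: one row per panel and Roundness class, proportion in 'value'
summarised <- as.data.frame(props, responseName = "value")
print(summarised)
```

Each Size x Depth panel contributes one row per Roundness level, including levels with a proportion of zero, so every panel shows the same set of bars.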

Thanks,

Stefan


hadley wickham

Feb 2, 2010, 8:49:01 AM
to Luciano Selzer, sloehr, ggplot2
>> 1)As you can see in the figure (http://dl.dropbox.com/u/4312640/
>> Facetedcounts.pdf) from the code below, the fourth plot in the third
>> column has a lower total count (sum) than most of the others (22 vs 50
>> in most others, if I remember correctly). How can I force all the
>> plots to show percentage of counts in each Roundness class rather than
>> absolute counts, i.e. if there are 22 records shown in this plot the
>> percentage of these 22 that is within each factor level? Do I use the
>> weighting parameter, and if so how do I apply it?
>
> I would sumarize the data first, I find it easier
> sumarizeddata<-melt(ddply(cmorph, .(Size, Depth), function(df)
> table(df$Roundness)/length(df$Roundness)), variable_name="Roundness")

It's not much harder to do it without summarising the data:

cmorph <- ddply(cmorph, c("Depth", "Size"), transform, wt = 1 / length(Depth))

ggplot(cmorph, aes(Roundness, fill=Roundness, weight = wt)) +
geom_bar() +
facet_grid(Depth ~ Size)
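
Why the weight works: the bar statistic sums the weight aesthetic within each bar rather than counting rows, so giving every row a weight of 1/n (n being the number of rows in its Depth x Size panel) makes the bars in each panel add up to 1. A minimal base-R sketch of that arithmetic follows. The `toy` data frame and its values are invented for illustration, and `ave()` is used here only as a stand-in for the `ddply()` call above.

```r
# Toy data standing in for cmorph (values invented for illustration)
toy <- data.frame(
  Depth     = c("0-15", "0-15", "0-15", "0-15", "15-55", "15-55"),
  Size      = c("2-4",  "2-4",  "2-4",  "2-4",  "2-4",   "2-4"),
  Roundness = c("A",    "A",    "SA",   "R",    "SR",    "SR")
)

# Each row gets weight 1/n, where n is the number of rows in its Depth x Size panel
toy$wt <- ave(rep(1, nrow(toy)), toy$Depth, toy$Size,
              FUN = function(x) 1 / length(x))

# Summing the weights per panel and Roundness class gives the bar heights
heights <- aggregate(wt ~ Depth + Size + Roundness, data = toy, FUN = sum)
print(heights)

# Within every panel the bar heights add up to 1
panel_totals <- aggregate(wt ~ Depth + Size, data = toy, FUN = sum)
print(panel_totals)
```

Using 100 / n instead of 1 / n as the weight would put the bars on a 0-100 percentage scale rather than a 0-1 proportion scale.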

>> 2)How can I add more detail to the facet labels? E.g., instead of
>> '2-4' and '4-8' I would like the first two labels to be '2-4 mm' and
>> '4-8 mm', and instead of '0-15' and '15-55' I would like them to be
>> '0-15 cm' and '15-55 cm'. Can I do this within ggplot, or do I have to
>> make changes to the actual values in the dataframe?
>
> I don't think it can be done in ggplot2 yet. I would rename the labels of
> the factor.

Right, that's the easiest way.
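
Renaming the factor labels can be sketched as follows. The short `size` and `depth` vectors are made-up stand-ins for `cmorph$Size` and `cmorph$Depth`; only the `factor()` call with `levels` and `labels` is the point here.

```r
# Toy vectors standing in for cmorph$Size and cmorph$Depth (values invented)
size  <- c("2-4", "4-8", "8-16", ">16", "2-4")
depth <- c("0-15", "15-55", "0-15", "55-75", "15-55")

size_levels  <- c("2-4", "4-8", "8-16", ">16")
depth_levels <- c("0-15", "15-55", "55-75")

# 'levels' fixes the order; 'labels' supplies the text shown in the facet strips
size_f  <- factor(size,  levels = size_levels,  labels = paste(size_levels,  "mm"))
depth_f <- factor(depth, levels = depth_levels, labels = paste(depth_levels, "cm"))

print(levels(size_f))
print(levels(depth_f))
```

Because `facet_grid` takes its strip text from the factor levels, applying the same pattern to the `Size` and `Depth` columns changes the facet labels without touching the underlying CSV.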

Thanks for the conveniently packaged reproducible example! (although
in future, please leave off the > at the start of each line so it's
easier to copy and paste)

Hadley

--
http://had.co.nz/

Stefan Carlos Loehr

Feb 2, 2010, 10:13:35 PM
to ggplot2
I ended up going with Hadley's suggestion as that meant I didn't have to melt and summarise my dataset, and it worked exactly the way I wanted. It was also helpful to see the example of ddply in action.

Thanks also to Dennis and Luciano.