Bubble plot, size function for levels within category


Selene

Nov 18, 2014, 7:27:38 PM
to ggp...@googlegroups.com
Dear all,

I'm new-ish to R and thrilled with ggplot.

I want to make a bubble chart with a numeric variable on y and a categorical variable on x. I would like to make the graph with code along these lines:

ggplot(Ants_data, aes(x=`date of count`, y=`number of eggs`, colour=`type of ant`, size=`number of type of ants`)) +
   geom_point(alpha=0.8, guide="none") +
   scale_size_area(breaks=c(10, 50, 100, 200, 300), max_size=20) +
   scale_x_continuous(name="Date observation", limits=c(0,12)) +
   scale_y_continuous(name="Egg count", limits=c(0,1250)) +
   geom_text(size=4) +
   theme_bw()

I think the main problem is that I would have to write a function for the size of the bubble, as I would like it to represent the number of ants of each type observed (e.g. queen, worker, reproductive male, etc.). My data set has a column "type of ant" with the types as different levels, and each ant is one data point or observation.
I don't know how to write a function that would return the length(), i.e. the total number of observations, for a particular type of ant or level.
Could anyone help me with this?

Many thanks!

PS: I guess the code is far from done, but I guess the first step would be inputting that piece of data.

Ben Bond-Lamberty

Nov 18, 2014, 7:48:25 PM
to Selene, ggplot2
So you'd like the size of the bubble to correspond to how many
individuals were observed for that particular type, over the entire
data set? Personally I would do this by summarizing the data set,
merging the summary with the main data set, and then plotting. For
example:

d <- data.frame(type=c('a','a','a','b','c','c'), date=LETTERS[1:6], eggs=1:6)
d_summary <- as.data.frame(table(d$type, dnn="type"))
d <- merge(d, d_summary)
qplot(date,eggs,data=d,size=Freq)

Hope this is useful. If not, please provide a short reproducible example.

Ben

Selene

Nov 19, 2014, 5:55:38 AM
to ggp...@googlegroups.com
Hi Ben!
Thanks for your answer, but I think I couldn't quite make it work. This is a simplified example of my data (mine has many more variables and observations), but this is the format I have it in.
Again, thank you very much for your input!


structure(list(Ant.ID = structure(1:13, .Label = c("a", "b", 
"c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m"), class = "factor"), 
    Tunel = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 
    3L, 4L, 4L), .Label = c("tunel A", "tunel B", "tunel C", 
    "tunel D"), class = "factor"), Type = structure(c(4L, 1L, 
    2L, 3L, 1L, 2L, 3L, 4L, 1L, 2L, 4L, 1L, 2L), .Label = c("Queen", 
    "RF", "RM", "Worker"), class = "factor"), Date = structure(c(4L, 
    4L, 5L, 1L, 1L, 6L, 6L, 6L, 2L, 2L, 2L, 3L, 3L), .Label = c("april", 
    "august", "dec", "feb", "march", "may"), class = "factor"), 
    Egg = c(4L, 5L, 7L, 2L, 5L, 9L, 4L, 3L, 2L, 6L, 8L, 9L, 2L
    )), .Names = c("Ant.ID", "Tunel", "Type", "Date", "Egg"), class = "data.frame", row.names = c(NA, 
-13L))

Selene

Nov 19, 2014, 6:20:41 AM
to ggp...@googlegroups.com
By the way, it shows an error that the length is not the same :(

So far I've used this code:
sps<-length(ants$Type)
spn<-rep(1:sps,each=nrow(ants)/sps) 
ANTS<-cbind(spn,ants)

I was able to make a bubble plot, but it doesn't summarise the "ant type" as one big bubble (per number of eggs and date); it just shows lots of big bubbles and some little ones.


Again, many thanks!



Ben Bond-Lamberty

Nov 19, 2014, 7:14:27 AM
to Selene, ggplot2
Hi Selene, thanks for the extra information. I'm still not sure this
is exactly what you want, but following up on my previous example (I
don't see any length errors, nor the point of that extra code
snippet in your last email, so am leaving that out):

ants <- ... the dput output you sent ...
ants_summary <- as.data.frame(table(ants$Type, dnn="Type"))
ants <- merge(ants, ants_summary)

This adds a new column "Freq" to your data frame showing the overall
number of each type of ant. Then simply:

ggplot(ants, aes(Date, Egg, colour=Type, size=Freq)) + geom_point()

Ben
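The count-and-merge step described above can be sketched end to end on the sample data. This is only an illustration: the data frame is retyped by hand from the dput() output posted earlier, the name ants_freq is made up for this sketch, and the plot is built only if ggplot2 happens to be installed.

```r
# Sample data rebuilt from the dput() output posted earlier in the thread.
ants <- data.frame(
  Ant.ID = letters[1:13],
  Tunel  = rep(c("tunel A", "tunel B", "tunel C", "tunel D"), times = c(5, 3, 3, 2)),
  Type   = c("Worker", "Queen", "RF", "RM", "Queen", "RF", "RM", "Worker",
             "Queen", "RF", "Worker", "Queen", "RF"),
  Date   = rep(c("feb", "march", "april", "may", "august", "dec"),
               times = c(2, 1, 2, 3, 3, 2)),
  Egg    = c(4, 5, 7, 2, 5, 9, 4, 3, 2, 6, 8, 9, 2)
)

# One row per ant type, with the number of ants of that type in column Freq.
ants_summary <- as.data.frame(table(ants$Type, dnn = "Type"))

# merge() joins on the shared column "Type", so every ant row gets its
# type's overall count. (ants_freq is a hypothetical name for this sketch.)
ants_freq <- merge(ants, ants_summary)

# Build the bubble plot only when ggplot2 is available; print(p) draws it.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(ants_freq, aes(Date, Egg, colour = Type, size = Freq)) +
    geom_point()
}
```

Note that merge() sorts the result by the join column, so the rows of ants_freq come back ordered by Type rather than in their original order.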

Selene

Nov 19, 2014, 9:22:36 AM
to ggp...@googlegroups.com
Thanks Ben!! You are right, the error was my mistake all along. This solved the issue and I now see bubbles of different sizes :D but there are too many of them. I will test your patience and ask a bit more! The plot is basically a scatter plot with every point for each type of ant drawn at a different size, so it is pretty noisy. What I have been trying to produce is one bubble per type, but I guess that would be more like a bar chart or a boxplot, with the bar width varying according to Freq. Is this possible with a bubble chart (maybe just using the mean egg count)? Is it possible with a barplot?

Again, many thanks for all your help!!
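One way to read "one bubble per type" is to collapse the data to a single row per ant type before plotting, with the mean egg count as the y value and the number of ants as the bubble size. The sketch below assumes that reading; the sample data is retyped from the dput() output, and per_type, mean_egg and n are names invented for this illustration.

```r
# Sample data rebuilt from the dput() output posted earlier in the thread
# (only the columns needed here).
ants <- data.frame(
  Type = c("Worker", "Queen", "RF", "RM", "Queen", "RF", "RM", "Worker",
           "Queen", "RF", "Worker", "Queen", "RF"),
  Egg  = c(4, 5, 7, 2, 5, 9, 4, 3, 2, 6, 8, 9, 2)
)

# Mean egg count per type: one row per type.
per_type <- aggregate(Egg ~ Type, data = ants, FUN = mean)
names(per_type)[names(per_type) == "Egg"] <- "mean_egg"

# Number of ants per type, looked up by type name.
counts <- table(ants$Type)
per_type$n <- as.vector(counts[as.character(per_type$Type)])

# One bubble per type, with area proportional to the number of ants.
# Built only when ggplot2 is available; print(p) draws it.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(per_type, aes(Type, mean_egg, colour = Type, size = n)) +
    geom_point() +
    scale_size_area(max_size = 20)
}
```

Adding Date to the formula (Egg ~ Type + Date) would give one bubble per type and date instead, at the cost of fewer ants behind each mean.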



Ben Bond-Lamberty

Nov 19, 2014, 2:18:13 PM
to Selene, ggplot2
I guess I'm confused on what, exactly, you want to show. Is it the
distribution of eggs by ant type, across dates? The classic solution
to this is a stacked bar graph:

ggplot(ants,aes(Date,Egg,fill=Type))+geom_bar(stat='identity')

Could also use facets:

ggplot(ants,aes(Date,Egg))+geom_bar(stat='identity')+facet_grid(Type~.)

You could also display these data, particularly if the full data set
is a lot bigger, using a heatmap:

ggplot(ants,aes(Date,Type,fill=Egg))+geom_tile()

Does one of these options work for you?

Ben

Selene

Nov 21, 2014, 9:28:56 AM
to ggp...@googlegroups.com
Hi Ben!

What I would like to show is something like a barplot with confidence intervals, with x=Date, y=number of eggs per type, fill=Type and width of the bars=Freq.
That's why I thought of a bubble chart, as maybe I could add the confidence interval to the mean number of eggs?

I tried the ones you sent me, but they don't quite work. The facet one is good because I can add the variable Tunel, but the y axis apparently shows the number of ants and not the real number of eggs? That's kind of weird, or at least not what I want to show.


Again, thank you so much!
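A possible sketch of the mean-with-confidence-interval plot asked about above, under several assumptions: the interval is a plain t-based 95% interval per ant type, the sample data is retyped from the dput() output, and summarise_eggs and per_type are names invented here. A likely reason for the odd y axis is that geom_bar(stat='identity') stacks every row sharing an x value, so the bar height becomes a total over ants rather than a mean; summarising first avoids that.

```r
# Sample data rebuilt from the dput() output posted earlier in the thread
# (only the columns needed here).
ants <- data.frame(
  Type = c("Worker", "Queen", "RF", "RM", "Queen", "RF", "RM", "Worker",
           "Queen", "RF", "Worker", "Queen", "RF"),
  Egg  = c(4, 5, 7, 2, 5, 9, 4, 3, 2, 6, 8, 9, 2)
)

# Hypothetical helper: mean, count and t-based 95% interval for one group.
summarise_eggs <- function(eggs) {
  n <- length(eggs)
  half_width <- qt(0.975, df = n - 1) * sd(eggs) / sqrt(n)
  c(mean_egg = mean(eggs), n = n,
    ci_low = mean(eggs) - half_width, ci_high = mean(eggs) + half_width)
}

# Apply the helper to the egg counts of each type and bind into one table.
stats <- do.call(rbind, lapply(split(ants$Egg, ants$Type), summarise_eggs))
per_type <- data.frame(Type = rownames(stats), stats, row.names = NULL)

# Bars show the mean, error bars the interval, and the point size the
# number of ants. Built only when ggplot2 is available; print(p) draws it.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(per_type, aes(Type, mean_egg, fill = Type)) +
    geom_bar(stat = "identity") +
    geom_errorbar(aes(ymin = ci_low, ymax = ci_high), width = 0.2) +
    geom_point(aes(size = n), show.legend = TRUE)
}
```

With this small sample the intervals are very wide, and splitting further by Date or Tunel leaves a single ant in most groups, where no interval can be computed; the full data set may behave better.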

