geom_smooth: fit two curves with different formula


Elena Guijarro Garcia

Feb 19, 2013, 4:41:14 PM
to ggp...@googlegroups.com
Dear all,

I need to plot the fit for a length-weight curve for males and females in the same plot, and cannot find the way to do it. Please find the data attached. My best shot so far is as follows:

# subset the df by sex (there are <1000 males but >1000 females):

ml<-subset(df,sex=="m")
fml<-subset(df,sex=="f")

# add the formula & parameters for each sex :

x<- 0.2127616*(ml$lng^2.689955)
z<-0.1946441*(fml$lng^2.720762)

# do the plot:

ggplot(df,aes(lng,wgt)) + geom_point(aes(colour = sex),shape=1) + theme_bw()+
  scale_colour_manual(values=c("orange","black"), breaks=c("m","f")) +
  stat_smooth(data=subset(df,sex=="m"),method="loess", formula= y~x,se = FALSE,aes(colour= sex),size=0.6)+
 #  stat_smooth(data=subset(df,sex=="f"),method="loess", formula= y~z,se = FALSE,aes(colour= sex),size=0.6)+
  scale_y_continuous(breaks=seq(0,max(sort(unique(ceiling(df$wgt/1000)*1000))),by=1000),labels=c("0","1000","2000","3000","4000","5000"))+
  scale_x_continuous(breaks=seq(0,max(sort(unique(ceiling(df$lng/10)*10))),by=5),labels=c("0","","10","","20","","30","","40"))+
  theme(axis.title.y = element_text(vjust=0.2, size=10),axis.text.y=element_text(size=8),axis.title.x = element_text(vjust=0.1, size=10),axis.text.x=element_text(size=8))+
  labs(x="Talla / Length (cm)", y="Peso individual / Individual weight (gr)")+
  theme(legend.key.size=unit(0.7,"lines"),legend.key.width = unit(0.4, "lines"),legend.text=element_text(size=8),legend.title=element_text(size=8,face="bold"),legend.position=c(0.85,0.2))+
  theme(plot.title=element_text(size=10))


I tried to add separately stat_smooth for each sex but it didn't work. With this script I can only fit one of the sexes.

Any ideas?

Thanks!

Elena



 




 
df

Ito, Kaori (Groton)

Feb 19, 2013, 5:13:27 PM
to Elena Guijarro Garcia, ggp...@googlegroups.com

I'm not sure exactly what you want to do. Is it this?

 

ggplot(df,aes(x=lng,y=wgt,color=sex)) + geom_point(shape=1) + geom_smooth()

 

geom_smooth draws a smooth line for each sex group. To see just the smooths, try:

 

ggplot(df,aes(x=lng,y=wgt,color=sex))+ geom_smooth()

 

Let me know if you are thinking of something else.


Dennis Murphy

Feb 20, 2013, 1:33:58 AM
to Elena Guijarro Garcia, ggp...@googlegroups.com
Hi:

I just posted a response to the ggplot2 list on a related query, so
you may want to check that out for the regression lines part. (Good
news: your problem is much simpler than the one to which I responded,
but you need to make some cosmetic changes to the functions.
Basically, you don't have to worry about faceting.) More inline.

On Tue, Feb 19, 2013 at 1:41 PM, Elena Guijarro Garcia
<elena.guij...@gmail.com> wrote:
> Dear all,
>
> I need to plot the fit for a length-weight curve for males and females in
> the same plot, and cannot find the way to do it. Please find the data
> attached. My best shot so far is as follows:
>
> # subset the df by sex (there are <1000 males but >1000 females)
>
> ml<-subset(df,sex=="m")
> fml<-subset(df,sex=="f")

This is unnecessary in ggplot2. You can use the entire data set for this task.
>
> # add the formula & parameters for each sex :
>
> x<- 0.2127616*(ml$lng^2.689955)
> z<-0.1946441*(fml$lng^2.720762)
>
> # do the plot:
>
> ggplot(df,aes(lng,wgt)) + geom_point(aes(colour = sex),shape=1) +
> theme_bw()+
> scale_colour_manual(values=c("orange","black"), breaks=c("m","f")) +
> stat_smooth(data=subset(df,sex=="m"),method="loess", formula= y~x,se =
> FALSE,aes(colour= sex),size=0.6)+
> # stat_smooth(data=subset(df,sex=="f"),method="loess", formula= y~z,se =
> FALSE,aes(colour= sex),size=0.6)+

This is the wrong way to go about it - the code you should have used
is much simpler. See below.
>
> scale_y_continuous(breaks=seq(0,max(sort(unique(ceiling(df$wgt/1000)*1000))),by=1000),labels=c("0","1000","2000","3000","4000","5000"))+
>
> scale_x_continuous(breaks=seq(0,max(sort(unique(ceiling(df$lng/10)*10))),by=5),labels=c("0","","10","","20","","30","","40"))+
> theme(axis.title.y = element_text(vjust=0.2,
> size=10),axis.text.y=element_text(size=8),axis.title.x =
> element_text(vjust=0.1, size=10),axis.text.x=element_text(size=8))+
> labs(x="Talla / Length (cm)", y="Peso individual / Individual weight
> (gr)")+
> theme(legend.key.size=unit(0.7,"lines"),legend.key.width = unit(0.4,
> "lines"),legend.text=element_text(size=8),legend.title=element_text(size=8,face="bold"),legend.position=c(0.85,0.2))+
> theme(plot.title=element_text(size=10))
>
>
> I tried to add separately stat_smooth for each sex but it didn't work. With
> this script I can only fit one of the sexes.

Here's an edited version of your code with smooths for both sexes,
using DF for the data frame name instead:

library(ggplot2)
library(grid)
ggplot(DF, aes(lng, wgt, colour = sex)) +
    geom_point(shape = 1) +
    theme_bw(base_size = 10) +
    scale_colour_manual(values = c("orange", "black"), breaks = c("m", "f")) +
    stat_smooth(se = FALSE, size = 0.6) +
    scale_y_continuous(breaks = seq(0,
                           max(sort(unique(ceiling(DF$wgt/1000)*1000))), by = 1000),
                       labels = seq(0, 5000, by = 1000)) +
    scale_x_continuous(breaks = seq(0,
                           max(sort(unique(ceiling(DF$lng/10)*10))), by = 5),
                       labels = c("0","","10","","20","","30","","40")) +
    labs(x = "Talla / Length (cm)",
         y = "Peso individual / Individual weight (gr)",
         title = "Some main title...fix me!!") +
    theme(axis.title.y = element_text(vjust = 0.2),
          axis.title.x = element_text(vjust = 0.1),
          legend.key.size = unit(0.7, "lines"),
          legend.key.width = unit(0.4, "lines"),
          legend.title = element_text(size = rel(0.8), face = "bold"),
          legend.position = c(0.85, 0.2))

Notice where the colour aesthetic is placed - it belongs in the base
layer because you use it in all subsequent layers (point and smooth).
I made some cosmetic changes to the scale label code, and a bit more
in theme(). The new theming system (introduced in 0.9.2) permits
inheritance of theme element properties, of which you could take great
advantage in this situation.

The key step is to define a base font size in theme_bw(). For the
legend title, I used the rel() function, which performs relative sizing
(vis-à-vis the base_size). This way, if you choose to change the base
font size at some point, everything defined with rel() remains
proportionally the same unless you specifically change it. Because of
the inheritance properties, the amount of code you need is much
smaller if you know how to use it. Take a look at the output of
theme_bw() and study it if necessary.
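
To illustrate the relative sizing described above, here is a minimal sketch. The helper name legend_title_size is made up for this example; calc_element() is the ggplot2 function that resolves an element against a theme. It shows that a size given with rel() follows the base font size:

```r
library(ggplot2)

# Hypothetical helper: resolve the legend title size for a given base font size
legend_title_size <- function(base_size) {
  th <- theme_bw(base_size = base_size) +
    theme(legend.title = element_text(size = rel(0.8), face = "bold"))
  # calc_element() walks up the inheritance tree and applies rel() to the parent size
  calc_element("legend.title", th)$size
}

small <- legend_title_size(10)
large <- legend_title_size(20)
large / small   # doubling base_size doubles the resolved legend title size
```

Had the legend title been given an absolute size (for example size = 8), it would stay the same whatever base_size is chosen.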

Inheritance example in the new theming system:

text = sets the default properties for all theme elements that use
element_text();
axis.text inherits from text, as does legend.text or any other theme
element ending in .text. axis.text.y inherits from axis.text. In
elements such as axis.text.y, many of the default properties are set
to NULL because they are inherited from properties further up the
tree. You can use this hierarchy to reduce the number of changes you
make with theme().
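
As a small sketch of that hierarchy (the size and colour values are arbitrary choices for illustration), setting axis.text once is enough for both axis.text.x and axis.text.y to pick it up:

```r
library(ggplot2)

# Set the parent element only; the x and y children are left untouched
th <- theme_bw(base_size = 10) +
  theme(axis.text = element_text(size = 8, colour = "grey20"))

# calc_element() resolves each child against its parents in the theme tree
x_size <- calc_element("axis.text.x", th)$size
y_size <- calc_element("axis.text.y", th)$size
c(x = x_size, y = y_size)   # both children inherit the size set on axis.text
```

This is why the edited theme() call needs fewer entries than the original one, which set the size of axis.text.x and axis.text.y separately.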

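Re geom_smooth(), y ~ x is the default formula and loess is the
default method for small data sets, so you need not have specified
either. Note, however, that the automatic choice switches from loess
to a GAM smooth once the largest group has 1000 or more observations
(as your females do), so keep method = "loess" if you specifically
want loess for both sexes.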
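
If the aim is to draw the length-weight power curves themselves (wgt = a * lng^b, with the coefficients quoted in the original post) rather than a loess smooth, one possible approach is to compute the predicted weights for each sex and add them as a line layer. This is only a sketch: the attached data are not reproduced here, so DF below is simulated, and the names coefs, grid and p are made up for the example.

```r
library(ggplot2)

# Coefficients quoted in the original post, one row per sex: wgt = a * lng^b
coefs <- data.frame(
  sex = c("m", "f"),
  a   = c(0.2127616, 0.1946441),
  b   = c(2.689955, 2.720762)
)

# Simulated stand-in for the attached data (hypothetical; same column names)
set.seed(1)
DF <- merge(data.frame(sex = rep(c("m", "f"), each = 100),
                       lng = runif(200, 5, 40)),
            coefs, by = "sex")
DF$wgt <- DF$a * DF$lng^DF$b * exp(rnorm(nrow(DF), sd = 0.1))

# Prediction grid: every length value crossed with every sex (by = NULL gives
# the Cartesian product), then the predicted weight from that sex's formula
grid <- merge(data.frame(lng = seq(min(DF$lng), max(DF$lng), length.out = 100)),
              coefs, by = NULL)
grid$wgt <- grid$a * grid$lng^grid$b

# colour = sex in the base layer applies to both the points and the curves
p <- ggplot(DF, aes(lng, wgt, colour = sex)) +
  geom_point(shape = 1) +
  geom_line(data = grid) +
  scale_colour_manual(values = c(m = "orange", f = "black"),
                      breaks = c("m", "f")) +
  theme_bw()
```

Calling print(p) draws the plot. Because the curves come from a separate data frame, each sex can use its own coefficients (or a different formula altogether) while sharing the colour scale with the points.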

Dennis

ANTONIOSPARS GITONGA

Feb 20, 2013, 6:15:51 AM
to Dennis Murphy, Elena Guijarro Garcia, ggp...@googlegroups.com
> names(test)
[1] "ObjektID"     "tallstubbyta"

> is.factor(test$ObjektID); is.factor(test$tallstubbyta)
[1] TRUE
[1] FALSE

> is.numeric(test$ObjektID); is.numeric(test$tallstubbyta)
[1] FALSE
[1] TRUE

> mean(test$tallstubbyta)
[1] 0.01142584

> aggregate(test, list(test$ObjektID), length)[, c("Group.1", "tallstubbyta")]
  Group.1 tallstubbyta
1     S.1           19
2    S.10            4

> aggregate(test, list(test$ObjektID), mean)[, c("Group.1", "tallstubbyta")]
  Group.1 tallstubbyta
1     S.1   0.01383128
2    S.10   0.00000000
Warning messages: 
1: argument is not numeric or logical: returning NA in:
mean.default(X[[1]], ...) 
2: argument is not numeric or logical: returning NA in:
mean.default(X[[2]], ...) 

> aggregate(test, list(test$ObjektID), sum)[, c("Group.1", "tallstubbyta")]
Error in Summary.factor(..., na.rm = na.rm) : 
        "sum" not meaningful for factors

> aggregate
function (x, ...) 
UseMethod("aggregate")

> test
   ObjektID tallstubbyta
1       S.1 0.0000000000
2       S.1 0.0000000000
3       S.1 0.0000000000
4       S.1 0.0000000000
5       S.1 0.0000000000
6       S.1 0.1320254313
8       S.1 0.0003141593
9       S.1 0.0000000000
10      S.1 0.0003141593
11      S.1 0.0003141593
12      S.1 0.0530929158
13      S.1 0.0000000000
14      S.1 0.0000000000
15      S.1 0.0003141593
16      S.1 0.0000000000
17      S.1 0.0226980069
18      S.1 0.0003141593
19      S.1 0.0003141593
20      S.1 0.0530929158
21     S.10 0.0000000000
22     S.10 0.0000000000
26     S.10 0.0000000000
27     S.10 0.0000000000