I have the following setup: three variables, where b and c are measured in similar units. For example:
a <- c(1,3,5,2,5,3,1,6,7,2,3,2,6)
b <- c(12,15,18,10,18,22,9,7,9,23,12,17,13)
c <- c(22,26,32,33,32,28,29,37,34,29,30,32,29)
data <- data.frame(a,b,c)
I'd like to create a simple scatterplot where a is on
the x axis and b and c (as they are measured in similar units) share
the y axis. Additionally, I'd like to add a regression line
from the lm function for each variable. I am very new
to this topic and my ggplot2 book hasn't arrived yet, so I don't
really know how to do it. What I tried so far is:
coef1 <- coef(lm(a~b))
coef2 <- coef(lm(a~c))
ggplot()+
geom_point(aes(a,b,data=data))
geom_abline(aes(x=coef1)
geom_point(aes(a,c,data=data))
geom_abline(aes(x=coef2)
But that isn't working. I think this is a very common question, so it is probably quite easy for you to help me.
An additional question came up: How do you display log-transformed data and its regression line? Assume I had to log-log transform my data to perform a regression (actually they were log(x+1) transformed). I'd like to display the original data but with log scales, and I want to add the regression line from the transformed data. How do I do this? In particular, is there any special issue because the data were log(x+1) transformed and not log(x)?
Thank you very much
Johannes
Here's one way to do the 'easy part':
library(ggplot2)
library(reshape2)  # provides melt()
a <- c(1,3,5,2,5,3,1,6,7,2,3,2,6)
b <- c(12,15,18,10,18,22,9,7,9,23,12,17,13)
c <- c(22,26,32,33,32,28,29,37,34,29,30,32,29)
dat <- data.frame(a,b,c)
# Long format: one row per (a, variable, value), so b and c share the y axis
md <- melt(dat, id = 'a')
ggplot(md, aes(x = a)) +
    geom_point(aes(y = value, colour = variable), size = 3) +
    geom_smooth(aes(y = value, colour = variable), method = 'lm',
                se = FALSE, size = 1)
On Tue, Sep 6, 2011 at 3:06 AM, Johannes Radinger <JRad...@gmx.at> wrote:
> An additional question came up: How do you display log-transformed data and its regression line? Assume I had to log-log transform my data to perform a regression (actually they were log(x+1) transformed). I'd like to display the original data but with log scales, and I want to add the regression line from the transformed data. How do I do this? In particular, is there any special issue because the data were log(x+1) transformed and not log(x)?
As for the log-scale part of your question, see scale_continuous() to
find out how to plot on a log scale. As for an offset log-log scale, I
don't believe there is anything built in, but you may be able to
construct one. One option may be to play with the new scales package in
conjunction with the development version of ggplot2. Alternatively, you
may be able to do it by defining a formatting function and using the
formatter = argument of scale_continuous(). I'd try that first before
venturing into the development version to find a solution. There should
be a few examples of user-written formats in the ggplot2 list archives
to give you an idea of how they work.
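As a rough illustration of the "construct one" idea, here is a minimal sketch that assumes a released version of the scales package, where trans_new() is available. The data frame d and the name log1p_offset are made up for this example and are not from the thread.

```r
library(ggplot2)
library(scales)

# Hypothetical example data (not from the thread); note the zero in x
d <- data.frame(x = c(0, 0.5, 2, 10, 50, 200),
                y = c(0.1, 1, 3, 20, 40, 300))

# Offset log transformation: log(x + 1) forward, exp(x) - 1 back
log1p_offset <- trans_new("log1p_offset", transform = log1p, inverse = expm1)

# Axis labels stay in original units; spacing follows log(x + 1).
# Newer ggplot2 versions prefer `transform =` over `trans =`.
p <- ggplot(d, aes(x, y)) +
  geom_point() +
  scale_x_continuous(trans = log1p_offset, breaks = c(0, 1, 10, 100)) +
  scale_y_continuous(trans = log1p_offset, breaks = c(0, 1, 10, 100))
```

Printing p draws the plot. Unlike a plain log axis, this scale can place a value of 0, because log(0 + 1) = 0.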
HTH,
Dennis
> --
> You received this message because you are subscribed to the ggplot2 mailing list.
> Please provide a reproducible example: http://gist.github.com/270442
>
> To post: email ggp...@googlegroups.com
> To unsubscribe: email ggplot2+u...@googlegroups.com
> More options: http://groups.google.com/group/ggplot2
>
I just came up with a slightly different approach: using cbind to
fit regressions for multiple Ys, and concatenating the vectors with
a grouping variable. But there is still a problem: somehow my regression lines are in the wrong position. I think it's a very simple thing, but I can't find the reason for the problem :(
a <- c(1,3,5,2,5,3,1,6,7,2,3,2,6)
b <- c(12,15,18,10,18,22,9,7,9,23,12,17,13)
c <- c(22,26,32,33,32,28,29,37,34,29,30,32,29)
X <-c(a,a)
Y <-c(b,c)
groupcode <-rep(c("b","c"),c(length(b),length(c)))
data <- data.frame(X,Y,groupcode)
model <-lm(cbind(b,c)~a,data=data)
coef <- data.frame(coef(model))
ggplot()+
geom_point(aes(X,Y,data=data,colour=groupcode))+
geom_abline(aes(intercept=coef[1,],slope=coef[2,],colour=groupcode),data=coef)
I think this idea can work quite well (except for the stupid problem ;)). Maybe you know what to do...
Concerning the log problem, I think I'll create an example and write a separate post once this one is solved.
cheers
Johannes
-------- Original Message --------
> Date: Tue, 6 Sep 2011 08:35:14 -0700
> From: Dennis Murphy <djm...@gmail.com>
> To: Johannes Radinger <JRad...@gmx.at>
> CC: ggplot2 <ggp...@googlegroups.com>
> Subject: Re: two types of points in one plot, abline and log-scale
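One possible explanation for the misplaced lines in the cbind() approach: coef() on a multi-response lm returns a matrix with one column per response, whereas geom_abline() expects a data frame with one row per line. The sketch below reshapes the coefficients accordingly. The names pts, cf and lines_df are made up for this example, and this is only one way to do it.

```r
library(ggplot2)

a <- c(1,3,5,2,5,3,1,6,7,2,3,2,6)
b <- c(12,15,18,10,18,22,9,7,9,23,12,17,13)
c <- c(22,26,32,33,32,28,29,37,34,29,30,32,29)

# Long-format points: one row per observation and group
pts <- data.frame(X = c(a, a), Y = c(b, c),
                  groupcode = rep(c("b", "c"), each = length(a)))

# Fit both regressions at once
model <- lm(cbind(b, c) ~ a)
cf <- coef(model)  # 2 x 2 matrix: rows "(Intercept)" and "a", columns "b" and "c"

# One row per regression line, with matching group labels
lines_df <- data.frame(groupcode = colnames(cf),
                       intercept = cf["(Intercept)", ],
                       slope     = cf["a", ])

p <- ggplot(pts, aes(X, Y, colour = groupcode)) +
  geom_point() +
  geom_abline(data = lines_df,
              aes(intercept = intercept, slope = slope, colour = groupcode))
```

Because the groupcode column in lines_df uses the same labels as the points, each line takes the colour of its own group.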
I have some questions concerning log transformations and plotting regression lines. As far as I know, it is a problem to log-transform values smaller than 1 (0-1). In my statistics lecture I was told to use a log(x+1) transformation in such cases. So here is a small example to explain my questions:
# Some example data for testing
a1 <-c(0.2,1.9,0.1,0.2,0.8,22,111.3,19.9,23.9,138,42.3,54.2,0.9)
b1 <-c(1.8,28.2,0.3,12.4,3.2,81.1,122.1,2.9,37.2,98.9,21,28.7,1.8)
data1 <- data.frame(a1,b1)
model <- lm(log(a1+1)~log(b1+1))
Because of values less than one, I used the log(x+1) transformation when running lm. Is that correct so far? (Just to mention: these are example data, so I haven't checked whether they need a transformation at all.)
Then some questions arise when it comes to plotting the data. As usual, I'd like to plot the original data (not log-transformed) but on a log scale.
I tried two approaches: the standard plot function and ggplot.
# Plot with ggplot
ggplot()+
geom_point(aes(b1,a1,data=data1))+
geom_abline(aes(intercept=coef(model)[1],slope=coef(model)[2]))+
scale_y_log()+
scale_x_log()
# Plot with standard plot
plot(b1,a1,log="xy")
abline(model,untf=T)
abline(model,untf=F)
1) The regression lines differ between plot and ggplot (transformed or untransformed). So which line is actually correct?
2) The regression line was calculated on the basis of log(x+1), but the log scale on my axes is a plain log (without the +1). How are such cases usually treated? I thought about subtracting 1 from the intercept.
So my simple question: what is the best way to display such data with a regression line?
Thank you
/Johannes
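One way to approach this, offered as a sketch rather than a definitive answer: fit the model on the log(x+1) scale, predict over a grid of b1 values, back-transform the predictions with expm1(), and draw that curve on ordinary log10 axes. On plain log axes the back-transformed fit is generally a curve rather than a straight line, which may be why geom_abline() and abline() give different pictures. The object name grid is made up for this example.

```r
library(ggplot2)

a1 <- c(0.2,1.9,0.1,0.2,0.8,22,111.3,19.9,23.9,138,42.3,54.2,0.9)
b1 <- c(1.8,28.2,0.3,12.4,3.2,81.1,122.1,2.9,37.2,98.9,21,28.7,1.8)
data1 <- data.frame(a1, b1)

# Fit on the log(x + 1) scale; log1p(x) computes log(x + 1)
model <- lm(log1p(a1) ~ log1p(b1), data = data1)

# Grid of b1 values, evenly spaced on the log scale
grid <- data.frame(b1 = exp(seq(log(min(b1)), log(max(b1)), length.out = 100)))

# Predict on the log(x + 1) scale, then undo the transformation
grid$a1 <- expm1(predict(model, newdata = grid))

# Back-transformed predictions can be <= 0 near the lower end;
# those cannot be drawn on a log axis, so drop them
grid <- grid[grid$a1 > 0, ]

p <- ggplot(data1, aes(b1, a1)) +
  geom_point() +
  geom_line(data = grid) +
  scale_x_log10() +
  scale_y_log10()
```

Subtracting 1 from the intercept would not be equivalent, since the +1 sits inside the logarithm on both axes. Back-transforming the predictions avoids having to adjust the coefficients by hand.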