ggplot2 error: arguments imply differing number of rows


Debbie Smith

Apr 6, 2012, 9:51:44 AM
to ggplot2
This simple example is from "The R Book" by Michael J. Crawley.

d=read.table("http://www.bio.ic.ac.uk/research/mjcraw/therbook/data/diminish.txt",header=TRUE)
p=qplot(xv,yv,data=d); p
m1=lm(yv~xv,data=d)
p1=p + geom_abline(intercept=coefficients(m1)[1],slope=coefficients(m1)[2]); p1
m2=lm(yv~xv + I(xv^2),data=d)
x=seq(min(d$xv),max(d$xv),length=100)
p1 + geom_line(aes(x=x,y=predict(m2,list(xv=x))), color="red")

I can run the above code without any problems in an older version of
R (R 2.10.1). But if I run it in a newer version of R (R 2.14 or
R 2.15), I get this error:
> p1 + geom_line(aes(x=x,y=predict(m2,list(xv=x))), color="red")
Error in data.frame(evaled, PANEL = data$PANEL) :
arguments imply differing number of rows: 100, 18

I am confused about this error. Could someone help me? Thank you so
much!

Brian Diggs

Apr 6, 2012, 4:53:47 PM
to ggplot2

aes should be used to tell ggplot which variables in a data frame are to
be displayed as which aesthetics; vectors should not be given to them.
What is happening is that it is picking up x from the data d (not the
variable) and y as, well, I'm not exactly sure what. The last line
should be:

p1 + geom_line(aes(x,y), data=data.frame(x,y=predict(m2,list(xv=x))),
colour="red")
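
For readers who cannot reach the data URL, here is a self-contained sketch of the same fix. The data frame d below is simulated (18 rows, to match the error message) and is an assumption, not Crawley's data; the name curve_df is likewise made up for illustration, and the sketch uses ggplot() rather than qplot(). The point is only that the line layer gets its own data frame, so its 100 rows never have to line up with the 18 rows of d.

```r
library(ggplot2)

# Simulated stand-in for diminish.txt (assumption: 18 rows, columns xv and yv)
set.seed(1)
d <- data.frame(xv = seq(5, 90, length.out = 18))
d$yv <- 25 + 0.5 * d$xv - 0.003 * d$xv^2 + rnorm(18, sd = 1)

m2 <- lm(yv ~ xv + I(xv^2), data = d)

# Put the prediction grid and the fitted values in their own data frame
x <- seq(min(d$xv), max(d$xv), length.out = 100)
curve_df <- data.frame(x = x, y = predict(m2, newdata = data.frame(xv = x)))

plt <- ggplot(d, aes(xv, yv)) +
  geom_point() +
  geom_line(aes(x, y), data = curve_df, colour = "red")

# Computing the layer data forces the plot to be built, which is the
# step where the original call failed
point_layer <- layer_data(plt, 1)
curve_layer <- layer_data(plt, 2)
```

Printing plt draws the points with the red quadratic curve on top.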


--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University

Sam Albers

Apr 6, 2012, 4:54:07 PM
to Debbie Smith, ggplot2
Hi Debbie,

It isn't totally clear to me what exactly you are trying to do. But if
you are trying to plot a line based on a linear model fit then I think
this is what you are after:

library(ggplot2)

d=read.table("http://www.bio.ic.ac.uk/research/mjcraw/therbook/data/diminish.txt",header=TRUE)

ggplot(d, aes(x=xv, y=yv)) +
  geom_point() +
  stat_smooth(method="lm", se=FALSE)

Does that help?

Sam

> --
> You received this message because you are subscribed to the ggplot2 mailing list.
> Please provide a reproducible example: http://gist.github.com/270442
>
> To post: email ggp...@googlegroups.com
> To unsubscribe: email ggplot2+u...@googlegroups.com
> More options: http://groups.google.com/group/ggplot2

Dennis Murphy

Apr 6, 2012, 5:51:17 PM
to Debbie Smith, ggplot2
Hi:

geom_smooth() makes this very easy:

qplot(xv, yv, data = d) +
  geom_smooth(method = 'lm', color = 'blue', se = FALSE) +
  geom_smooth(method = 'lm', formula = y ~ poly(x, 2), color = 'red',
              se = FALSE)

geom_smooth() expects a generic y and x in the formula - it will pick
up the variables corresponding to x and y from qplot(). BTW,
geom_smooth() actually does something very similar internally to what
your code is attempting to produce, except that it fits the function
to 80 points between the min and max rather than 100.
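
The 80-point remark can be checked by looking at the data the smoother layer computes. This is a sketch on simulated data (the data frame d here is an assumption standing in for diminish.txt, and the plot names are made up). layer_data() returns the computed data for one layer, and n is the stat_smooth() argument that sets the number of evaluation points.

```r
library(ggplot2)

# Simulated stand-in for diminish.txt (assumption: 18 rows, columns xv and yv)
set.seed(1)
d <- data.frame(xv = seq(5, 90, length.out = 18))
d$yv <- 25 + 0.5 * d$xv - 0.003 * d$xv^2 + rnorm(18, sd = 1)

base <- ggplot(d, aes(xv, yv)) + geom_point()

# Quadratic fit evaluated on the default grid
p_default <- base +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2), se = FALSE,
              colour = "red")

# Same fit evaluated on 100 points, matching the hand-built grid in the question
p_100 <- base +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2), se = FALSE,
              colour = "red", n = 100)

n_default <- nrow(layer_data(p_default, 2))  # rows computed with the default n
n_custom  <- nrow(layer_data(p_100, 2))      # rows computed with n = 100
```

Note that xv is stored as a double here; the grid is built differently when x is an integer column.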

To more or less reproduce the plot you're trying to construct, I'd
proceed as follows:

# This is the input to the newdata = argument of predict(),
# which expects a data frame:
px <- data.frame(xv = seq(min(d$xv), max(d$xv), length = 100))

# Create a new data frame with the prediction points, the
# linear and quadratic fitted values, respectively
pd <- data.frame(px, yhat1 = predict(m1, newdata = px),
                 yhat2 = predict(m2, newdata = px))

p + geom_line(data = pd, aes(x = xv, y = yhat1), color = 'blue') +
  geom_line(data = pd, aes(x = xv, y = yhat2), color = 'red')

## This plot ^ will not produce a legend to distinguish the
## colors of the curves and should closely resemble the first plot.

# Do this to set up a legend for the fits:

# First, melt the data using xv as the 'id' variable
pm <- reshape2::melt(pd, id = 'xv')
# Next, create a new factor named model with labels 'linear'
# and 'quadratic', and use this to create the color legend:
pm$model <- pm$variable
pm$model <- factor(pm$model, labels = c('linear', 'quadratic'))

# Do the plot and notice the legend title and labels:
p + geom_line(data = pm, aes(x = xv, y = value, color = model)) +
  labs(y = 'yv')
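
The prediction and reshaping steps do not depend on ggplot2 and can be checked on their own. Below is a minimal sketch with simulated data (an assumption, standing in for diminish.txt). It swaps reshape2::melt() for base R's stack(), which is a substitute technique rather than the approach used in the code above; the resulting column names (values, ind) therefore differ from melt's (value, variable).

```r
# Simulated stand-in for diminish.txt (assumption: 18 rows, columns xv and yv)
set.seed(1)
d <- data.frame(xv = seq(5, 90, length.out = 18))
d$yv <- 25 + 0.5 * d$xv - 0.003 * d$xv^2 + rnorm(18, sd = 1)

m1 <- lm(yv ~ xv, data = d)
m2 <- lm(yv ~ xv + I(xv^2), data = d)

# The column name in newdata must match the predictor name in the formula (xv)
px <- data.frame(xv = seq(min(d$xv), max(d$xv), length.out = 100))
pd <- data.frame(px,
                 yhat1 = predict(m1, newdata = px),
                 yhat2 = predict(m2, newdata = px))

# Long format: one row per (prediction point, model) pair
pm <- data.frame(xv = rep(pd$xv, 2), stack(pd[c("yhat1", "yhat2")]))
pm$model <- factor(pm$ind, labels = c("linear", "quadratic"))

str(pm)
```

With this layout the legend plot would map y = values instead of y = value.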

ggplot2 is being actively developed, so code that is about four years
old has a fairly high probability of breaking in recent versions of
ggplot2, and possibly in recent versions of R. If you're trying to
follow Crawley using ggplot2, I'd suggest consulting the on-line help
pages at http://had.co.nz/ggplot2/ (scroll toward the bottom of the
page to find them) for code specific to ggplot2.

HTH,
Dennis


Jean-Olivier Irisson

Apr 7, 2012, 3:21:34 AM
to Brian Diggs, ggplot2
On 2012-Apr-06, at 22:53 , Brian Diggs wrote:
>
> aes should be used to tell ggplot which variables in a data frame are to be displayed as which aesthetics; vectors should not be given to them. What is happening is that it is picking up x from the data d (not the variable) and y as, well, I'm not exactly sure what. The last line should be:
>
> p1 + geom_line(aes(x,y), data=data.frame(x,y=predict(m2,list(xv=x))), colour="red")

Actually, x does not exist in the data frame d, only in the global environment. When a variable does not exist in the data source, ggplot looks it up in the environment, so it should work (the result of predict has 100 elements, as does x, so there is no size mismatch there). Apparently it does not work when all aesthetics come from the environment while a data frame is still inherited from earlier in the plot. Here the data is inherited from the first qplot call. Indeed, this works:

ggplot() + geom_point(aes(xv, yv), data=d) + geom_abline(intercept=coefficients(m1)[1],slope=coefficients(m1)[2]) + geom_line(aes(x=x,y=predict(m2,list(xv=x))), color="red")

but this does not

ggplot(d) + geom_point(aes(xv, yv), data=d) + geom_abline(intercept=coefficients(m1)[1],slope=coefficients(m1)[2]) + geom_line(aes(x=x,y=predict(m2,list(xv=x))), color="red")

and produces the same error

Error in data.frame(evaled, PANEL = data$PANEL) :
arguments imply differing number of rows: 100, 18

I don't know whether this is worth a bug report. In any case, it shows that, for maximum flexibility, you should always use the ggplot() + geom_…(…) syntax, with a full specification in each geom.
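
The error message quoted above is raised by base R's data.frame(), which refuses to combine columns whose lengths cannot be recycled to a common number of rows. Here is a minimal sketch, assuming (from the error text) that the evaluated aesthetics have 100 rows and the inherited data contributes a PANEL column of length 18. The names evaled and PANEL are taken from the error message; the values are made up and this is not ggplot2's actual internal code.

```r
# 100 evaluated aesthetic values, as produced by the 100-point grid
evaled <- data.frame(x = seq(0, 1, length.out = 100))

# One entry per row of the inherited data frame d (18 rows)
panel <- rep(1L, 18)

# 100 is not a multiple of 18, so the columns cannot be recycled together
msg <- tryCatch(
  data.frame(evaled, PANEL = panel),
  error = function(e) conditionMessage(e)
)
print(msg)
```

This also explains why the fix is to give the layer its own 100-row data frame: the per-row bookkeeping then has 100 entries as well.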

Jean-Olivier Irisson
---
Observatoire Océanologique
Station Zoologique, B.P. 28, Chemin du Lazaret
06230 Villefranche-sur-Mer
Tel: +33 04 93 76 38 04
Mob: +33 06 21 05 19 90
http://jo.irisson.com/

Jean-Louis Abitbol

Mar 31, 2013, 7:30:30 PM
to ggp...@googlegroups.com
This works in 2.15:

d=read.table("http://www.bio.ic.ac.uk/research/mjcraw/therbook/data/diminish.txt",header=TRUE)
p=qplot(xv,yv,data=d); p
m1=lm(yv~xv,data=d)
p1=p + geom_abline(intercept=coefficients(m1)[1],slope=coefficients(m1)[2]); p1
m2=lm(yv~xv + I(xv^2),data=d)
x=seq(min(d$xv),max(d$xv),length=100)
y=predict(m2,list(xv=x))
df <- data.frame(x,y)
p1 + geom_line(aes(x=x, y = y), data =df, color="red")

JL

On Sun, Mar 31, 2013, at 07:23 PM, jessie abbate wrote:
> Debbie -
> I am also getting the same error with a script that ran beautifully before
> I upgraded to 2.15.
> It is not the only problem I have had with the new version of R.
> My advice is re-install the older version of R!!
>
> Jessie

Jessie Abbate

Apr 1, 2013, 11:12:22 AM
to Jean-Louis Abitbol, ggp...@googlegroups.com
OK, but I'm having a different problem and getting the same error. It took less time to reinstall 2.13.2 than it would have to work through the problem in this thread. Thanks though!
Jessie



--
--------------
JL Abbate
Post-doctoral Researcher / Scientifique
Centre d'Ecologie Fonctionelle et Evolutive (CEFE)
CNRS - UMR 5175
1919, route de Mende
34293 Cedex 5
Montpellier, France
+1.804.212.2321 (skype/voicemail)
+33 6 95 64 21 15 (cell)
jl...@virginia.edu,
jessie...@gmail.com,
jessica...@cefe.cnrs.fr