geom_smooth and stat_smooth behaviour on facetted plots


Giovanni Marco Dall'Olio

Mar 2, 2010, 10:18:04 AM
to ggp...@googlegroups.com
Hi,
geom_smooth and stat_smooth fail to produce any plot when they are used on a faceted dataset in which even one of the facets would be empty.

Maybe I can explain this better with an example.

Let's create a plot with facets and stat_smooth on the diamonds dataset:
>>> qplot(data=diamonds, carat, price, facets=~clarity) + stat_smooth(method='lm')

The plot is created and looks nice: it has 8 smaller panels, one per clarity level.
The problem is that any algorithm used by stat_smooth produces an error if it is applied to a dataset with too few points, and this is more likely to happen once the data is split into facets.

Let's add a new row to diamonds, with a new value for 'clarity' and no price:
>>> newdata <- data.frame(carat=0.4, cut='Ideal', color='E', clarity='NEW', depth=33.2, table=33, price=NA, x=2.1, y=2.2, z=2.2)
>>> d2 <- rbind(diamonds, newdata)

Now, if you apply the same call to d2, you get an error and no plot is shown at all:
>>> qplot(data=d2, carat, price, facets=~clarity) + stat_smooth(method='lm')
Error in `[<-.data.frame`(`*tmp*`, var, value = list(`NA` = NULL)) :
  missing values are not allowed in subscripted assignments of data frames
In addition: Warning message:
Removed 1 rows containing missing values (stat_smooth).

I understand why an error occurs, since I am trying to apply a smoothing/spline function to a facet with too few points. However, I would like a plot that shows every facet: with the smoothing line where the smoothing function can be calculated, and without it where the function returns an error.
Is it possible to do this with ggplot2? How can I do it? Maybe by rewriting the lm or loess function so that it returns no value when the input data is too short?
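
One possible workaround, sketched under assumptions rather than taken from ggplot2 itself: instead of rewriting lm or loess, give the smoothing layer only the rows from facets that have enough complete points. The helper name `smoothable_rows` and the `min_n = 3` threshold are hypothetical choices made for illustration.

```r
# Hypothetical helper (not part of ggplot2): keep only the rows that belong
# to facets with at least `min_n` complete (x, y) pairs.
# `min_n = 3` is an arbitrary assumption; adjust it to the smoother in use.
smoothable_rows <- function(df, facet, x, y, min_n = 3) {
  ok <- complete.cases(df[, c(x, y)])        # rows with no missing x or y
  grp <- as.character(df[[facet]])           # facet label of every row
  n_ok <- tapply(ok, grp, sum)               # usable rows per facet
  keep <- names(n_ok)[n_ok >= min_n]         # facets that can be smoothed
  df[ok & grp %in% keep, , drop = FALSE]
}

# Toy data: facet "a" has four complete points, facet "b" has one row
# whose y value is missing.
toy <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2, 4, 6, 8, NA),
  g = c("a", "a", "a", "a", "b")
)
fit_data <- smoothable_rows(toy, "g", "x", "y")

# Possible usage with the data from this post (requires ggplot2 and d2):
# ggplot(d2, aes(carat, price)) + geom_point() + facet_wrap(~clarity) +
#   stat_smooth(data = smoothable_rows(d2, "clarity", "carat", "price"),
#               method = "lm")
```

Because only the smoothing layer receives the filtered data, the other layers still see the full dataset, so the sparse facet should still be drawn, just without a fitted line.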


--
Giovanni Dall'Olio, phd student
Department of Biologia Evolutiva at CEXS-UPF (Barcelona, Spain)

My blog on bioinformatics: http://bioinfoblog.it

hadley wickham

Mar 13, 2010, 10:04:56 AM
to dallo...@gmail.com, ggp...@googlegroups.com
Hi Giovanni,

This turned out to be a really simple/dumb bug and is fixed in the
development version -
http://github.com/hadley/ggplot2/commit/537b01b55c1bdbc96f36f3f4ae1c25263969c65a

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

Giovanni Marco Dall'Olio

Mar 13, 2010, 10:24:59 AM
to hadley wickham, ggp...@googlegroups.com
On Sat, Mar 13, 2010 at 4:04 PM, hadley wickham <h.wi...@gmail.com> wrote:
> Hi Giovanni,
>
> This turned out to be a really simple/dumb bug and is fixed in the
> development version -
> http://github.com/hadley/ggplot2/commit/537b01b55c1bdbc96f36f3f4ae1c25263969c65a

Hail to you!! :-)

I was looking at the code to see if I could fix the error myself, but my knowledge of R is still too basic for that. Thank you, and let's see if I can catch the next bug myself :-)

 