Plotting geom_line and geom_point for two different variables


Jahan

Aug 6, 2012, 6:57:39 PM
to ggp...@googlegroups.com
Hello all,
I'm having a problem that I know is very basic but I can't find a good solution online.  I have a data set of dates, weights, and the moving averages of the weights.  Here is a link to a sample data set:

I want to create a plot with the weights as points and the moving average as a line, and add a geom_smooth for the raw weights only. I've successfully done it in base R but I can't get it right with ggplot2. Here's the base R code so you can see what I'm trying to do:

plot(date,weight)
lines(date,moving_avg,col=2)
lines(lowess(date,weight),col=3)
legend('bottomright',legend=c('Weighted Moving Average (5 Days)','Lowess'),col=2:3,lty=1)

Even after I melted the data set I still can't figure out how to do this in ggplot2.  Can anyone show me a straightforward way of doing this?  Thanks in advance!




Ista Zahn

Aug 7, 2012, 9:34:12 AM
to Jahan, ggp...@googlegroups.com
Hi Jahan,

Here is one way:

library(ggplot2)
library(reshape2)  # provides melt()

# Reshape to long format: one row per (date, variable) pair
weights.m <- melt(weights, id.vars = c("X", "date"))

ggplot(mapping = aes(x = date, y = value)) +
  # raw weights as points
  geom_point(data = subset(weights.m, variable == "weight")) +
  # moving average as a line
  geom_line(aes(color = variable),
            data = subset(weights.m, variable == "moving_avg")) +
  # smoother fitted to the raw weights only
  geom_smooth(aes(color = variable),
              data = subset(weights.m, variable == "weight")) +
  scale_color_discrete("", breaks = c("moving_avg", "weight"),
                       labels = c("Weighted Moving Average (5 Days)", "Lowess"))

Hope that helps,

Ista

Dennis Murphy

Aug 7, 2012, 5:02:21 PM
to Jahan, ggp...@googlegroups.com
Hi:

Here's another approach that doesn't require melting. There is also a
difference between lowess smoothing and loess smoothing, the latter of
which is the default smoother in ggplot2. The game is to create a
factor variable on the fly following the approach shown on p. 106 of
the ggplot2 book.
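
To make the lowess/loess distinction concrete, here is a minimal sketch in base R. The data are made up for illustration and are not the weights from this thread. It relies only on the documented defaults: lowess() uses f = 2/3 with 3 robustifying iterations and local linear fits, while loess() uses span = 0.75 with local quadratic fits and no robustness iterations, so the two curves generally differ.

```r
# Made-up data for illustration only (not the weights from this thread)
set.seed(42)
x <- 1:60
y <- 180 - 0.05 * x + rnorm(60, sd = 0.5)

# lowess(): returns a list with components x and y
# defaults: f = 2/3, iter = 3 (robust), locally linear
fit_lowess <- lowess(x, y)

# loess(): returns a model object; fitted values come from predict()
# defaults: span = 0.75, degree = 2, family = "gaussian" (not robust)
fit_loess <- loess(y ~ x)
loess_y <- predict(fit_loess)

# Largest pointwise gap between the two smooths
max(abs(fit_lowess$y - loess_y))
```
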

The data were read into an object named d. The location of the file is
irrelevant but the arguments matter; in particular, it is easier to
convert a character string to a Date object than a factor, which is
why the second argument is present. Then, turn date into a Date object
and add a new variable lowess_y which extracts the y component from
the lowess fit object.

d <- read.csv("../Downloads/weights.csv", header = TRUE,
              stringsAsFactors = FALSE)
d$date <- as.Date(d$date, format = '%Y-%m-%d')
d$lowess_y <- with(d, lowess(date, weight))$y

library('ggplot2')

# For each call to geom_line, map colour to a label;
# in this case, "ma" = moving average, "lo" = lowess
# Then use scale_colour_manual() to assign colors
# and legend labels. Note that the "Lowess" label
# comes first because level "lo" precedes level "ma"
# by default when constructing a factor variable, by
# lexicographic ordering.

ggplot(d, aes(x = date)) +
  geom_point(aes(y = weight)) +
  geom_line(aes(y = moving_avg, colour = "ma"), size = 1) +
  geom_line(aes(y = lowess_y, colour = "lo"), size = 1) +
  scale_colour_manual("Type",
                      values = c("ma" = "blue", "lo" = "orange"),
                      labels = c("Lowess", "Weighted Moving\nAverage (5 Days)"))

# Same as above, except substituting the default loess fit
# from geom_smooth()

ggplot(d, aes(x = date)) +
  geom_point(aes(y = weight)) +
  geom_line(aes(y = moving_avg, colour = "ma"), size = 1) +
  geom_smooth(aes(y = weight, colour = "lo"), size = 1, se = FALSE) +
  scale_colour_manual("Type",
                      values = c("ma" = "blue", "lo" = "orange"),
                      labels = c("Loess", "Weighted Moving\nAverage (5 Days)"))

HTH,
Dennis

Jahan

Aug 13, 2012, 1:01:56 PM
to ggp...@googlegroups.com, Jahan
Hey Dennis,
Your code works perfectly until I add in the scale_colour_manual term:

> ggplot(weights, aes(x = date)) + geom_point(aes(y = weight)) + 
+     geom_line(aes(y = moving_avg, colour = "ma"), size = 1) + 
+     geom_smooth(aes(y = weight, color = "lo"), size = 1, se = FALSE)+
+ scale_colour_manual("Type", 
+                         values = c("ma" = "blue", "lo" = "orange"), 
+           labels = c("Loess", "Weighted Moving\nAverage (5 Days)")) 
Error: Labels can only be specified in conjunction with breaks

I tried specifying arbitrary breaks of c(3,5), and it plots the graph but the legend doesn't include the colored lines corresponding to the moving average and loess.  Do I need to use specific breaks to get the legend right?

Jahan

Aug 14, 2012, 12:00:28 AM
to ggp...@googlegroups.com, Jahan
OK, never mind, I figured out how to use breaks. I had never used them before, but with a little searching I realized the breaks just need to be the names of what I'm plotting ("lo" and "ma"). Thanks for your help!
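
For anyone who hits the same error, here is a minimal sketch of the fix described above: pass breaks that match the strings mapped to colour ("lo" and "ma"), in the same order as the labels. The data frame below is a made-up stand-in for the weights data, and the simple 5-day trailing mean is only a placeholder assumption for the weighted moving average in the original file.

```r
library(ggplot2)

# Hypothetical stand-in for the weights data from the thread (made-up values)
set.seed(1)
weights <- data.frame(
  date   = seq(as.Date("2012-06-01"), by = "day", length.out = 60),
  weight = 180 + cumsum(rnorm(60, mean = -0.05, sd = 0.4))
)

# Placeholder: plain 5-day trailing mean (the first 4 values are NA)
weights$moving_avg <- as.numeric(
  stats::filter(weights$weight, rep(1 / 5, 5), sides = 1)
)

p <- ggplot(weights, aes(x = date)) +
  geom_point(aes(y = weight)) +
  geom_line(aes(y = moving_avg, colour = "ma"), na.rm = TRUE) +
  geom_smooth(aes(y = weight, colour = "lo"),
              method = "loess", formula = y ~ x, se = FALSE) +
  scale_colour_manual("Type",
                      values = c("ma" = "blue", "lo" = "orange"),
                      # breaks name the mapped strings; labels follow the same order
                      breaks = c("lo", "ma"),
                      labels = c("Loess", "Weighted Moving\nAverage (5 Days)"))

p
```

Numeric breaks such as c(3, 5) do not match any value of the colour aesthetic, which is presumably why the legend showed no coloured lines in the earlier attempt.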