ggplot density plots overlay points

Timothy Lau

Oct 12, 2015, 5:12:28 PM

to ggplot2

Any suggestions for how to go about plotting something like this in ggplot?

Dennis Murphy

Oct 13, 2015, 12:30:35 AM

to Timothy Lau, ggplot2

Hi:

I can get you part of the way there, but I got stuck trying to repeat the normal distribution grob at each predicted value. Since, as usual, there is no reproducible example to work with, I had to create one based on your plot.

library(grid)
library(ggplot2)

## Step 1: Create a rotated standard normal distribution

# Create a sequence of x-values at which to apply the dnorm() function
DF0 <- data.frame(x = seq(-4, 4, by = 0.05))

# Create the normal distribution plot, stripping everything except the graph
p <- ggplot(DF0, aes(x = x)) +
  theme_minimal() +
  stat_function(fun = dnorm, color = "red", size = 1) +
  geom_hline(yintercept = 0, color = "red", size = 1) +
  coord_flip() +
  theme(panel.grid.major = element_blank(),
        axis.ticks = element_blank(),
        axis.text = element_blank(),
        panel.background = element_rect(fill = "transparent",
                                        colour = "transparent")) +
  labs(x = NULL, y = NULL) +
  theme(plot.margin = grid::unit(c(0, 0, 0, 0), "lines"))

# Convert the above into a grob, which we'll need below.
pr <- ggplotGrob(p)

# Attempt to replicate your input data
DF <- data.frame(age = seq(6.5, 16.5),
                 observed = c(18, 22, 27, 30, 31, 35, 38, 41, 40, 46, 43))

# Since you didn't provide a function to estimate the predicted values,
# I produced a loess smooth and fit a spline function through it
# as a hack solution. The point of the exercise is to produce
# predicted values at the observed ages.
f <- with(DF, splinefun(loess.smooth(age, observed)))
DF$predicted <- f(DF$age)

# Produce the scatterplot. A factor for the observed and predicted
# means is generated on the fly to allow for a legend. I chose to create
# separate legends for point color and linetype. Although both colors are
# set to black in scale_color_manual(), one is modified in the guides() call.
pp <- ggplot(DF, aes(x = age)) +
  theme_bw() +
  geom_point(aes(y = observed, color = "Observed mean",
                 linetype = "Observed mean")) +
  geom_line(aes(y = predicted, color = "Predicted mean",
                linetype = "Predicted mean")) +
  theme(legend.position = c(1, 0),
        legend.justification = c("right", "bottom")) +
  scale_color_manual(values = c("black", "black")) +
  scale_linetype_manual(values = c("blank", "solid")) +
  scale_x_continuous(breaks = seq(6.5, 16.5)) +
  guides(color = guide_legend(override.aes = list(shape = 21,
                                fill = c("black", "transparent")))) +
  labs(x = "Age, in years", y = "Raw score", color = "", linetype = "") +
  ylim(0, 50)

# This is my attempt to map the distribution grob at each predicted value.
# It works the first time but fails afterward, so there's probably something
# obvious I'm missing here...
for (i in 1:nrow(DF)) {  # also tried seq_along(DF$age) with same results
  pp <- pp + annotation_custom(grob = pr,
                               xmin = DF$age[i] - 0.2, xmax = DF$age[i] + 0.4,
                               ymin = DF$predicted[i] - 5,
                               ymax = DF$predicted[i] + 5)
}

# You really shouldn't be using R^2 for nonlinear models (Hint: what is the
# null model against which you're comparing the present fit?), but if you think
# of it as the squared correlation between observed and predicted values, I
# guess this will suffice to add its annotation:
r2 <- with(DF, cor(observed, predicted))^2

pp + annotate("text", x = 6.5, y = 45, hjust = 0,
              label = paste("R^2 ==", round(r2, 3)), parse = TRUE)
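
A note on the predicted values in the script: loess.smooth() returns the smooth evaluated on an evenly spaced grid (50 points by default), not at the observed ages, which is why a spline is run through it afterward. A minimal sketch of the two steps taken apart, using the same made-up data; the object names sm and predicted are only for illustration:

```r
# Same made-up data as in the script
age      <- seq(6.5, 16.5)
observed <- c(18, 22, 27, 30, 31, 35, 38, 41, 40, 46, 43)

# Step 1: loess.smooth() evaluates the smooth on an evenly spaced grid
# spanning the range of age (50 points by default), not at the data points
sm <- loess.smooth(age, observed)
str(sm)  # a list with components x and y

# Step 2: splinefun() interpolates through the grid and returns a function,
# which can then be evaluated at the observed ages
f <- splinefun(sm$x, sm$y)
predicted <- f(age)

data.frame(age, observed, predicted)
```

If you have an actual model for the predicted mean, its fitted values would replace both steps.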

Someone else will have to figure out how to get the loop to "work". This requires Baptiste's magic, but I don't know if he still monitors this group.
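
In the meantime, one way to sidestep the grob question is to skip annotation_custom() and draw each curve as ordinary data with geom_path(). This is a different technique from the loop above, not a fix for it, and only a sketch: curves, half_height and max_width are names I made up, and the scaling is chosen to roughly match the xmax and ymin/ymax offsets in the loop.

```r
library(ggplot2)

# Same made-up data and predicted values as in the script
DF <- data.frame(age = seq(6.5, 16.5),
                 observed = c(18, 22, 27, 30, 31, 35, 38, 41, 40, 46, 43))
DF$predicted <- with(DF, splinefun(loess.smooth(age, observed)))(DF$age)

z           <- seq(-4, 4, by = 0.05)  # grid on the standard normal scale
half_height <- 5    # each curve spans predicted +/- 5 (assumed, as in the loop)
max_width   <- 0.4  # peak of each curve sits 0.4 years right of its age

# One rotated normal curve per row of DF: density runs along the x-axis,
# the z-values run along the y-axis
curves <- do.call(rbind, lapply(seq_len(nrow(DF)), function(i) {
  data.frame(age = DF$age[i],
             x   = DF$age[i] + max_width * dnorm(z) / dnorm(0),
             y   = DF$predicted[i] + half_height * z / max(z))
}))

ggplot(DF, aes(x = age)) +
  theme_bw() +
  geom_path(data = curves, aes(x = x, y = y, group = age), color = "red") +
  geom_segment(aes(xend = age, y = predicted - half_height,
                   yend = predicted + half_height), color = "red") +
  geom_point(aes(y = observed)) +
  geom_line(aes(y = predicted)) +
  labs(x = "Age, in years", y = "Raw score")
```

Because the curves are plain data, they follow the panel scales automatically, and the legend and R^2 annotation from the script can be added on top in the usual way.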

Re the comment about R^2 above, if you don't know the answer to the hint, you shouldn't be using it. Seriously. Moreover, what is the point of "adjusted" R^2 when you only have one covariate in the model?
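
One way to make the hint concrete, as a sketch that assumes the null model is the intercept-only (mean) model: compute the squared correlation alongside 1 - SSE/SST. For a linear model with an intercept fit by least squares the two agree; for a smoother or a nonlinear fit they generally do not. The names r2_cor and r2_null are only for illustration.

```r
# Same made-up data and predicted values as in the script
age       <- seq(6.5, 16.5)
observed  <- c(18, 22, 27, 30, 31, 35, 38, 41, 40, 46, 43)
predicted <- splinefun(loess.smooth(age, observed))(age)

# Squared correlation between observed and predicted values
r2_cor <- cor(observed, predicted)^2

# 1 - SSE/SST: compares the fit against the intercept-only (mean) model
sse     <- sum((observed - predicted)^2)
sst     <- sum((observed - mean(observed))^2)
r2_null <- 1 - sse / sst

c(squared_correlation = r2_cor, one_minus_sse_sst = r2_null)
```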

Dennis
