I can get you part of the way there, but I got stuck trying to repeat the normal distribution grob at each predicted value. Since, as usual, there is no reproducible example to work with, I had to create one based on your plot.
library(ggplot2)
## Step 1: Create a rotated standard normal distribution
# Create a sequence of x-values at which to apply the dnorm() function
DF0 <- data.frame(x = seq(-4, 4, by = 0.05))
# Create the normal distribution plot, stripping everything except the graph
p <- ggplot(DF0, aes(x = x)) +
    theme_minimal() +
    stat_function(fun = dnorm, color = "red", size = 1) +
    geom_hline(yintercept = 0, color = "red", size = 1) +
    coord_flip() +
    theme(panel.grid.major = element_blank(),
          axis.ticks = element_blank(),
          axis.text = element_blank(),
          panel.background = element_rect(fill = "transparent",
                                          colour = "transparent")) +
    labs(x = NULL, y = NULL) +
    theme(plot.margin = grid::unit(c(0, 0, 0, 0), "lines"))
# Convert the above into a grob, which we'll need below.
pr <- ggplotGrob(p)
# Attempt to replicate your input data
DF <- data.frame(age = seq(6.5, 16.5),
                 observed = c(18, 22, 27, 30, 31, 35, 38, 41, 40, 46, 43))
# Since you didn't provide a function to estimate the predicted values,
# I produced a loess smooth and fit a spline function through it
# as a hack solution. The point of the exercise is to produce
# predicted values at the observed ages.
f <- with(DF, splinefun(loess.smooth(age, observed)))
DF$predicted <- f(DF$age)
# Produce the scatterplot. A factor for the observed and predicted
# means is generated on the fly to allow for a legend. I chose to create
# separate legends for point color and linetype. Although both colors are
# set to black in scale_color_manual(), one is modified in the guides() call.
pp <- ggplot(DF, aes(x = age)) +
    theme_bw() +
    geom_point(aes(y = observed, color = "Observed mean",
                   linetype = "Observed mean")) +
    geom_line(aes(y = predicted, color = "Predicted mean",
                  linetype = "Predicted mean")) +
    theme(legend.position = c(1, 0),
          legend.justification = c("right", "bottom")) +
    scale_color_manual(values = c("black", "black")) +
    scale_linetype_manual(values = c("blank", "solid")) +
    scale_x_continuous(breaks = seq(6.5, 16.5)) +
    guides(color = guide_legend(override.aes = list(shape = 21,
                                    fill = c("black", "transparent")))) +
    labs(x = "Age, in years", y = "Raw score", color = "", linetype = "") +
    ylim(0, 50)
# This is my attempt to map the distribution grob at each predicted value.
# It works the first time but fails afterward, so there's probably something
# obvious I'm missing here...
for (i in seq_len(nrow(DF))) {  # also tried seq_along(DF$age) with same results
    pp <- pp + annotation_custom(grob = pr,
                                 xmin = DF$age[i] - 0.2, xmax = DF$age[i] + 0.4,
                                 ymin = DF$predicted[i] - 5,
                                 ymax = DF$predicted[i] + 5)
}
# You really shouldn't be using R^2 for nonlinear models (Hint: what is the
# null model against which you're comparing the present fit?), but if you think
# of it as the squared correlation between observed and predicted values, I
# guess this will suffice to add its annotation:
r2 <- with(DF, cor(observed, predicted))^2
pp + annotate("text", x = 6.5, y = 45, hjust = 0,
label = paste("R^2 ==", round(r2, 3)), parse = TRUE)
Someone else will have to figure out how to get the loop to "work". This requires Baptiste's magic, but I don't know if he still monitors this group.
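One possible explanation, which I have not confirmed for your ggplot2 version: grid identifies grobs by name, and the loop hands the very same grob (same name) to annotation_custom() every time, so later copies may collide with the first. The sketch below is a guess at a workaround, not a verified fix. It gives each copy its own name with grid::editGrob() and adds the annotations as a list of layers. The objects `dens`, `pr_demo`, `pts` and `base` are stand-ins invented for this example so that it runs on its own.

```r
library(ggplot2)

# Stand-in for the rotated density grob built earlier (hypothetical names)
dens <- ggplot(data.frame(x = c(-4, 4)), aes(x = x)) +
    stat_function(fun = dnorm, colour = "red") +
    coord_flip() +
    theme_void()
pr_demo <- ggplotGrob(dens)

# Stand-in data: three ages with made-up predicted values
pts <- data.frame(age = c(7, 9, 11), predicted = c(20, 28, 35))

base <- ggplot(pts, aes(x = age, y = predicted)) +
    geom_line() +
    theme_bw()

# Build one annotation layer per row; each gets a uniquely named copy
# of the grob so that no two annotations share a grid name
layers <- lapply(seq_len(nrow(pts)), function(i) {
    annotation_custom(
        grob = grid::editGrob(pr_demo, name = paste0("density_", i)),
        xmin = pts$age[i] - 0.2, xmax = pts$age[i] + 0.4,
        ymin = pts$predicted[i] - 5, ymax = pts$predicted[i] + 5
    )
})

# Adding a list of layers to a plot adds each element in turn
out <- base + layers
print(out)
```

If the plain loop already draws every curve on your installation, the renaming is harmless but unnecessary.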
Re the comment about R^2 above, if you don't know the answer to the hint, you shouldn't be using it. Seriously. Moreover, what is the point of "adjusted" R^2 when you only have one covariate in the model?
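To make the hint concrete, here is a small sketch using the same made-up data as above. It compares the squared correlation with the "proportion of variation explained" version of R^2, where the null model is assumed to be the mean-only (intercept-only) model. That choice of null model is my assumption for illustration; the variable names `ss_res`, `ss_tot`, `r2_null` and `r2_cor` are likewise invented here.

```r
# Same stand-in data and hack predictions as in the code above
DF <- data.frame(age = seq(6.5, 16.5),
                 observed = c(18, 22, 27, 30, 31, 35, 38, 41, 40, 46, 43))
f <- with(DF, splinefun(loess.smooth(age, observed)))
DF$predicted <- f(DF$age)

# Residual sum of squares of the fitted curve
ss_res <- sum((DF$observed - DF$predicted)^2)

# Sum of squares of the assumed null model: predict the overall mean everywhere
ss_tot <- sum((DF$observed - mean(DF$observed))^2)

# R^2 relative to the mean-only model, and the squared correlation
r2_null <- 1 - ss_res / ss_tot
r2_cor  <- cor(DF$observed, DF$predicted)^2

c(vs_mean_only = r2_null, squared_cor = r2_cor)
```

The two numbers need not agree for a nonlinear fit. The squared correlation can never be the smaller of the two, because it is the R^2 you would get after linearly recalibrating the predictions, so it is the more flattering figure to report.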