Wrong R-squared in logit second-stage moderated mediation

kleines Peh

Mar 27, 2018, 1:23:13 PM

to lavaan

Hi there

I am investigating a second-stage moderated mediation (Model 14 in the Hayes templates), so the path between the mediator (M) and the dependent variable (Y) is moderated by a variable V (which I do not mean-center).

I have a dichotomous independent variable (X) and dependent variable (Y), and a continuous mediator (M) and moderator (V). I am now trying to replicate in lavaan what I found in SPSS.

Now I encounter one major difference/problem:

When I run the model in lavaan, I get an R-squared for the Y model that is through the roof (0.975 vs. 0.2725 in SPSS). I tried both link = "logit" and the default probit.

I am not sure what's going on....

My lavaan code looks like this:

# create interaction of M and V
data$IA_MV <- data$M * data$V

# values of V (used below in the indirect effects conditional on the moderator)
mean(data$V)               # 69.37594
mean(data$V) - sd(data$V)  # 65.66968
mean(data$V) + sd(data$V)  # 73.0822

model14 <- '
# regressions
M ~ a1*X
Y ~ b1*M
Y ~ b2*V
Y ~ b3*IA_MV
Y ~ cdash*X

# index of moderated mediation
index := a1*b3

# indirect effects conditional on moderators: a1*(b1 + b3*V)
SDbelow := a1*b1 + a1*b3*65.66968
average := a1*b1 + a1*b3*69.37594
SDabove := a1*b1 + a1*b3*73.0822
'

# fit model
f.model14 <- sem(model = model14,
                 data = data,
                 se = "bootstrap",
                 bootstrap = 1000,
                 link = "logit")

# fit measures
summary(f.model14,
        fit.measures = TRUE,
        rsquare = TRUE)
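
The moderator values in the model syntax above are typed in by hand. A minimal sketch of one way to avoid that is to compute them and paste them into the syntax string with sprintf(). The data frame here is a simulated stand-in (not the poster's data), and the regressions are collapsed onto one line for brevity; only the string is built, nothing is fitted.

```r
# Hypothetical stand-in for the poster's data; only V is needed here
set.seed(1)
data <- data.frame(V = rnorm(133, mean = 69.4, sd = 3.7))

# Probe values of the moderator: mean - 1 SD, mean, mean + 1 SD
v_probe <- mean(data$V) + c(-1, 0, 1) * sd(data$V)

# Paste the probe values into the lavaan model syntax
model14 <- sprintf('
  # regressions
  M ~ a1*X
  Y ~ b1*M + b2*V + b3*IA_MV + cdash*X

  # index of moderated mediation
  index := a1*b3

  # indirect effects conditional on the moderator: a1*(b1 + b3*V)
  SDbelow := a1*(b1 + b3*%.5f)
  average := a1*(b1 + b3*%.5f)
  SDabove := a1*(b1 + b3*%.5f)
', v_probe[1], v_probe[2], v_probe[3])

cat(model14)
```

The resulting string can be passed to sem() in the same way as the hand-written one; the labels a1, b1, b3 and the defined parameters are the same as in the post.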

On a side note: is there a way to add a constant to the model?

Thank you so much, I really appreciate your support!

Terrence Jorgensen

Mar 28, 2018, 7:40:50 AM

to lavaan

> when I run the model in lavaan, I get a Rsquared for the Y-model that is through the roof (0.975 vs. 0.2725 in SPSS).

Which pseudo-R-squared does SPSS report for a logistic regression? Numerous versions have been proposed:

https://stats.idre.ucla.edu/other/mult-pkg/faq/general/faq-what-are-pseudo-r-squareds/

I also like Tjur's, not shown in the link above:

https://statisticalhorizons.com/r2logistic

lavaan runs probit regression with DWLS estimation, which starts by estimating polychoric correlations among the variables (treating your interaction as a separate variable, by the way). So lavaan's R-squared is the proportion of latent-response variance that is explained by the predictors. That could be quite different than pseudo-R-squared based on proportions or on log-likelihoods of target vs. empty models.
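
To illustrate how far apart these two kinds of measures can be, here is a rough sketch on simulated data using glm() rather than lavaan. It is not lavaan's computation (lavaan works from polychoric correlations with DWLS); it only uses the standard latent-response definition for a probit model, where the residual variance of the latent response is 1. All variable names and the data-generating values are made up for the example.

```r
set.seed(123)
n <- 2000
x <- rnorm(n)

# Latent response y* = 1.5*x + e with e ~ N(0, 1); observed y = 1 when y* > 0.
# The population latent-response R-squared is 2.25 / (2.25 + 1), about 0.69.
ystar <- 1.5 * x + rnorm(n)
y <- as.integer(ystar > 0)

fit  <- glm(y ~ x, family = binomial(link = "probit"))
null <- glm(y ~ 1, family = binomial(link = "probit"))

# Latent-response R-squared (McKelvey & Zavoina): variance of the linear
# predictor over total latent variance (probit residual variance is 1)
eta <- predict(fit, type = "link")
r2_latent <- var(eta) / (var(eta) + 1)

# McFadden's pseudo-R-squared: 1 - logLik(model) / logLik(empty model)
r2_mcfadden <- 1 - as.numeric(logLik(fit)) / as.numeric(logLik(null))

round(c(latent = r2_latent, McFadden = r2_mcfadden), 3)
```

The same fitted model yields a noticeably larger latent-response R-squared than McFadden's value, so a gap between lavaan and SPSS does not by itself indicate a coding error.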

> I tried both link = "logit" and the default probit.
> I am not sure what's going on....

Did you notice a warning saying that it switched the link back to probit? That's the only method available, even when you request MML estimation in the latest development version:

sem('grade ~ x1 + x2', data = HolzingerSwineford1939, ordered = c("grade"), link = "logit", estimator = "MML")

Error in lav_model_gradient_mml(lavmodel = lavmodel, GLIST = GLIST, THETA = THETA[[g]], :
  logit link not implemented yet; use probit

Not sure under what circumstances the experimental logit link would actually work yet.

Terrence D. Jorgensen

Postdoctoral Researcher, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam

UvA web page: http://www.uva.nl/profile/t.d.jorgensen

kleines Peh

Mar 28, 2018, 8:12:00 AM

to lav...@googlegroups.com

Thanks for your quick reply and the pointers.

The pseudo-R-squared above is McFadden's, but Cox-Snell (.3108) and Nagelkerke (.4172) are also way below the one I get from lavaan.

I was worried that I did something wrong in my lavaan code?

I didn't get a warning message when running with the logit link. Thanks for the clarification, it seems to be ignored as the output is actually exactly the same.

--
You received this message because you are subscribed to the Google Groups "lavaan" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lavaan+un...@googlegroups.com.
To post to this group, send email to lav...@googlegroups.com.
Visit this group at https://groups.google.com/group/lavaan.
For more options, visit https://groups.google.com/d/optout.

Terrence Jorgensen

Mar 28, 2018, 8:57:24 AM

to lavaan

> I didn't get a warning message when running with the logit link. Thanks for the clarification, it seems to be ignored as the output is actually exactly the same.

You can check whether it was ignored:

lavInspect(fit, "options")$link

kleines Peh

Apr 5, 2018, 5:48:30 AM

to lav...@googlegroups.com

Hi Terrence

I really like Tjur's pseudo-R-squared - thanks for the recommendation!

Do I understand correctly, however, that I cannot use lavPredict to get the mean of the predicted probabilities? (I am referring to the following sentence in the lavPredict documentation: "Note that this function can not be used to ‘predict’ values of dependent variables, given the values of independent values (in the regression sense).")

Is there another way to get the mean of the predicted probabilities for the two events from my model?

Thanks again

Terrence Jorgensen

Apr 5, 2018, 8:26:23 AM

to lavaan

> Is there another way to get the mean of the predicted probabilities for the two events from my model?

At this point, you would need to write out your regression equation and calculate the predicted probabilities yourself. But you do not have any latent common factors in your model; it is just a path analysis. So you can run the separate regression models in glm() and use the predict() function to save your predicted probabilities:

?predict.glm
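
A minimal sketch of that suggestion, combined with Tjur's measure (the difference between the mean predicted probabilities of the two observed outcome groups). The data are simulated stand-ins with the same variable names as in the thread; the coefficients used to generate them are arbitrary assumptions, not estimates from the poster's data.

```r
set.seed(42)
n <- 500

# Simulated stand-in for the poster's variables (not the real data)
dat <- data.frame(X = rbinom(n, 1, 0.5),
                  V = rnorm(n, mean = 70, sd = 4))
dat$M     <- 6.5 - 0.9 * dat$X + rnorm(n)
dat$IA_MV <- dat$M * dat$V
dat$Y     <- rbinom(n, 1, pnorm(-0.5 * (dat$M - 6) +
                                 0.05 * (dat$V - 70) +
                                 0.3 * dat$X))

# Separate regressions for the two equations of the path model
M.model <- lm(M ~ X, data = dat)
Y.model <- glm(Y ~ X + M + V + IA_MV, data = dat,
               family = binomial(link = "probit"))

# Predicted probabilities of Y = 1 for every case
p_hat <- predict(Y.model, type = "response")

# Mean predicted probability within each observed outcome group
mean_p <- tapply(p_hat, dat$Y, mean)

# Tjur's R-squared: mean(p_hat | Y = 1) - mean(p_hat | Y = 0)
tjur_r2 <- unname(mean_p["1"] - mean_p["0"])

mean_p
tjur_r2
```

With type = "response" and no se.fit argument, predict() returns a plain numeric vector, which keeps the group means easy to compute.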

kleines Peh

Apr 5, 2018, 8:33:19 AM

to lav...@googlegroups.com

Thank you so much for your quick and helpful reply!

kleines Peh

May 27, 2018, 11:40:24 AM

to lavaan

Hi Terrence

I did what you suggested and I found some differences in the coefficients of the second stage, resulting in different predicted probabilities.

Here is what I do with lavaan:

# create interaction of M and V
data$IA_MV <- data$M * data$V

model14 <- '
# regressions
M ~ 1 + a1*X
Y ~ b1*M
Y ~ b2*V
Y ~ b3*IA_MV
Y ~ 1 + cdash*X
'

# fit model
f.model14 <- sem(model = model14,
                 data = data,
                 se = "bootstrap",
                 bootstrap = 1000,
                 link = "probit")

These are the coefficients:

M~1 = 6.526; a1 = -0.877; b1 = -0.962; b2 = -0.094; b3 = 0.014; cdash = 0.123; Y~1 = 6.521

This is how I did it with the separate regression models:

M.model <- lm(M ~ X, data = data)
Y.model <- glm(Y ~ X + M + V + IA_MV, data = data, family = binomial(link = "probit"))

These are the coefficients:

M~1 = 6.526; a1 = -0.877; b1 = -4.132; b2 = -0.401; b3 = 0.060; cdash = 0.454; Y~1 = 26.075

When I now predict the choice probabilities:

lavaan: pred_lavaan <- pnorm(6.521 + 0.123*X - 0.962*M - 0.094*V + 0.014*IA_MV)

regression model: pred_regression <- predict(Y.model, data, type = "response", se.fit = TRUE)

If I binarise with .5 as the decision boundary (ifelse(pred_* < 0.499, 0, 1)), I get different choices:

True choice: 42.86 % yes
lavaan choice: 91.73 % yes
regression choice: 39.85 % yes

Why are they different? Am I doing something wrong?

Thanks

Terrence Jorgensen

Jun 2, 2018, 10:10:39 AM

to lavaan

> Why are they different?

Because your regression estimates are different. I suppose they are different because in the SEM, V (and the interaction?) is related to M, but you are only controlling for its effect on Y.
