Errors with fitting latent growth curve model plus mediation analyses

xueying qin

Sep 24, 2020, 4:09:36 AM
to lavaan
Dear Professors, 

I am running a mediation analysis within a latent growth curve model, and I keep getting an error message that I don't know how to deal with. Let me briefly describe my analysis and data first. The aim is to examine how variable "m" relates to a distal outcome, and whether that effect is direct or indirect, mediated by variable "r". The study has a longitudinal design, so "m" and "r" were each measured repeatedly at 5 time points. In addition, there are several time-invariant covariates, such as sex, and the distal outcome is a binary variable, coded 0 for no outcome and 1 for having the outcome. The data have a cluster structure (the "PAIR" variable identifies the cluster) and also contain missing values.

Below are my lavaan code and the path diagram of my hypotheses:

model.mediation='  
# random intercepts and slopes for variables "m" and "r"
m.i=~1*m1+1*m2+1*m3+1*m4+1*m5 
m.s=~0*m1+1*m2+2*m3+3*m4+4*m5

r.i=~1*r1+1*r2+1*r3+1*r4+1*r5
r.s=~0*r1+1*r2+2*r3+3*r4+4*r5

# covariance and variance of latent intercepts and slopes 
m.i~~m.i
m.s~~m.s
m.i~~m.s
r.i~~r.i
r.s~~r.s
r.i~~r.s

# create structured residuals 
m1~~covm*m1
m2~~covm*m2
m3~~covm*m3
m4~~covm*m4
m5~~covm*m5


r1~~covr*r1 
r2~~covr*r2
r3~~covr*r3
r4~~covr*r4
r5~~covr*r5

# direct effect 
outcome~x1*m.i+x2*m.s+sex  # red lines 

# mediator effect 
r.i~x3*m.i+sex      # purple lines 
r.s~x5*m.i+x6*m.s+sex      # purple lines  
outcome~m1*r.i+m2*r.s    # blue lines 

# indirect effect 
miri:=x3*m1
mirs:=x5*m2 
ms:=x6*m2 
sum_mi:=x3*m1+x5*m2

# total effect: 
total.mi:=x1+x3*m1+x5*m2
total.ms:=x2+x6*m2  
'
 fit.model=growth(model = model.mediation, data = pheno, cluster = "PAIR", ordered = "outcome", estimator='WLSMV',  link = "probit", missing="pairwise")

The message I got after fitting this model is: 

Error in th.start.idx[i]:th.end.idx[i] : NA/NaN argument
In addition: Warning messages:
1: In lav_options_set(opt) :
  lavaan WARNING: information will be set to “expected” for estimator = “DWLS”
2: In lav_data_full(data = data, group = group, cluster = cluster,  :
  lavaan WARNING: due to missing values, some pairwise combinations have less than 10% coverage

Here are my questions: 

1. What does this error message mean? Is it caused by the missing values? How should I deal with it? (Also, I don't know whether the error is due to updating the lavaan package. Just before the update, I could get fitted results.)

2. Am I correct to use the "growth" function when fitting the model? If I would like to use the general lavaan function, what arguments should I consider in this model? Actually, I have tried both functions on the same data before, and they seemed to give different parameter estimates.

3. To fit a pure growth curve model (that is, with no "outcome" and no mediation analysis), I regress the latent intercepts and slopes on covariates, for example m.i~1+sex, m.s~1+sex, r.i~1+sex, r.s~1+sex. But in this combined model (latent growth curve plus mediation analysis), I only regress r.i and r.s on the covariate. Is that correct?

Thank you very much. 

Regards

Xueying  
mediation q.jpg


Terrence Jorgensen

Sep 24, 2020, 7:56:40 AM
to lavaan
> Error in th.start.idx[i]:th.end.idx[i] : NA/NaN argument
> In addition: Warning messages:
> 1: In lav_options_set(opt) :
>   lavaan WARNING: information will be set to “expected” for estimator = “DWLS”
> 2: In lav_data_full(data = data, group = group, cluster = cluster,  :
>
> Here are my questions:
>
> 1. What does this error message mean?  Error in th.start.idx[i]:th.end.idx[i] : NA/NaN argument

Looks like the outcome has a missing (or "not a number") threshold.  What is your table(pheno$outcome) output?
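
A minimal sketch of that check, assuming a hypothetical toy vector in place of pheno$outcome (the values are invented for illustration). Adding useNA = "ifany" makes missing values show up in the table instead of being dropped silently:

```r
# Hypothetical outcome vector standing in for pheno$outcome.
outcome <- c(0, 1, 1, NA, 0, 1, NA, 0)

# Count each category and list missing values explicitly.
counts <- table(outcome, useNA = "ifany")
print(counts)

# A binary ordered outcome should have exactly two observed categories.
n.categories <- length(unique(outcome[!is.na(outcome)]))
print(n.categories)
```

If the table shows only one observed category, or categories you did not expect, that would be the first thing to fix before fitting the model.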

> is it because of the missing values? How to deal with the error message?
>   lavaan WARNING: due to missing values, some pairwise combinations have less than 10% coverage

It is a warning, not an error, but yes, it means your observations are quite sparse.  If you have a model with these variables that converges (e.g., saturated using lavCor() with output="lavaan" and other arguments), you could see the coverage and patterns using lavInspect(fit, "coverage") and lavInspect(fit, "patterns").
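
Pairwise coverage can also be inspected without any fitted model. The following is a rough base-R sketch, not lavaan's own computation; the data frame "toy" and its values are hypothetical stand-ins for the repeated measures in pheno:

```r
# Hypothetical toy data standing in for some repeated measures in `pheno`.
toy <- data.frame(
  m1 = c(0.15, 0.37, NA,   0.26, 0.40),
  m2 = c(0.24, NA,   0.30, 0.30, NA),
  m3 = c(NA,   0.45, 0.25, NA,   0.10)
)

# 1 where a value was observed, 0 where it is missing.
observed <- 1 * !is.na(toy)

# Entry [j, k] is the proportion of rows in which both variables are observed.
coverage <- crossprod(observed) / nrow(toy)
print(round(coverage, 2))

# Row/column positions of variable pairs whose joint coverage is below 10%.
low <- which(coverage < 0.10, arr.ind = TRUE)
print(low)
```

Pairs listed in "low" are the ones the coverage warning is about; with real data you would pass the columns used in the model instead of "toy".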

> (Also, I don't know whether the error is due to updating the lavaan package. Just before the update, I could get fitted results.)

What version were you using before?

> 2. Am I correct to use the "growth" function when fitting the model?

You can, but don't have to, if you know every nonzero parameter you want to specify.

> If I would like to use the general lavaan function, what arguments should I consider in this model? Actually, I have tried both functions on the same data before, and they seemed to give different parameter estimates.

Then they didn't produce the same calls.  growth() is a wrapper around lavaan() with these options:

lavaan(..., model.type = "growth",
       int.ov.free = FALSE, int.lv.free = TRUE, auto.fix.first = TRUE,
       auto.fix.single = TRUE, auto.var = TRUE, auto.cov.lv.x = TRUE,
       auto.cov.y = TRUE, auto.th = TRUE, auto.delta = TRUE, auto.efa = TRUE)
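
To illustrate the equivalence, here is a small sketch that fits the same model both ways. It uses lavaan's built-in Demo.growth example data and a plain linear growth model rather than the mediation model from this thread, and the object names (demo.model, fit.growth, fit.lavaan) are made up for the example:

```r
library(lavaan)

# Linear growth model for the built-in Demo.growth data
# (four repeated measures t1..t4); not the model from this thread.
demo.model <- '
  i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
  s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
'

# Fit once with the growth() wrapper ...
fit.growth <- growth(demo.model, data = Demo.growth)

# ... and once with lavaan(), passing the wrapper's options explicitly.
fit.lavaan <- lavaan(demo.model, data = Demo.growth,
                     model.type = "growth",
                     int.ov.free = FALSE, int.lv.free = TRUE,
                     auto.fix.first = TRUE, auto.fix.single = TRUE,
                     auto.var = TRUE, auto.cov.lv.x = TRUE,
                     auto.cov.y = TRUE, auto.th = TRUE,
                     auto.delta = TRUE, auto.efa = TRUE)

# With identical options, the parameter estimates should agree.
print(all.equal(coef(fit.growth), coef(fit.lavaan)))
```

If the two calls give different estimates on your own data, comparing the two parameter tables with parTable() should show which parameters are free in one fit but fixed in the other.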

 
> 3. To fit a pure growth curve model (that is, with no "outcome" and no mediation analysis), I regress the latent intercepts and slopes on covariates, for example m.i~1+sex, m.s~1+sex, r.i~1+sex, r.s~1+sex. But in this combined model (latent growth curve plus mediation analysis), I only regress r.i and r.s on the covariate. Is that correct?

It's your theory, you specify what you think represents the real data-generating process most faithfully.  Your diagram indicates you think the effects of m's growth factors (and of the covariates) on the outcome are mediated by r's growth factors, which would be full rather than partial mediation.  If your model fits poorly, my first thought would be to examine partial mediation.

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Xueying Qin

Sep 24, 2020, 8:43:33 AM
to lav...@googlegroups.com
Hi Terrence, 

Thank you for your quick response. I have added my follow-up questions inline below, and I hope to get some more help from you.

Regards 

Xueying 

>> Error in th.start.idx[i]:th.end.idx[i] : NA/NaN argument
>> In addition: Warning messages:
>> 1: In lav_options_set(opt) :
>>   lavaan WARNING: information will be set to “expected” for estimator = “DWLS”
>> 2: In lav_data_full(data = data, group = group, cluster = cluster,  :
>>
>> Here are my questions:
>>
>> 1. What does this error message mean?  Error in th.start.idx[i]:th.end.idx[i] : NA/NaN argument
>
> Looks like the outcome has a missing (or "not a number") threshold.  What is your table(pheno$outcome) output?

I just checked my data with table(pheno$outcome): the total sample size is 523, of which 355 are coded 0 on the outcome variable and 168 are coded 1.

>> is it because of the missing values? How to deal with the error message?
>>   lavaan WARNING: due to missing values, some pairwise combinations have less than 10% coverage
>
> It is a warning, not an error, but yes, it means your observations are quite sparse.  If you have a model with these variables that converges (e.g., saturated using lavCor() with output="lavaan" and other arguments), you could see the coverage and patterns using lavInspect(fit, "coverage") and lavInspect(fit, "patterns").

I meant the error message "Error in th.start.idx[i]:th.end.idx[i] : NA/NaN argument": how should I deal with it? Because this error prevents me from fitting the model, I cannot use the functions you suggested, such as lavInspect().

>> (Also, I don't know whether the error is due to updating the lavaan package. Just before the update, I could get fitted results.)
>
> What version were you using before?

Sorry, I forgot which version I used before. I installed the package from CRAN two months ago. I updated lavaan yesterday afternoon to version 0.6-7, and updated it again this morning to version 0.6-8, but both give the same error message, "Error in th.start.idx[i]:th.end.idx[i] : NA/NaN argument". However, I did change the model between the runs before and after updating the package: the previous model included the "m.s→r.i" path and gave fitted results, but I removed this path from the current model because I think the slope of the m variable would not determine the intercept of the r variable. This change does not affect the growth curve part of the model; it only affects the mediation part. I don't think this change would cause such an error message. Am I correct? In fact, when I fit the previous model with the updated lavaan, it gives the same error message.

>> 2. Am I correct to use the "growth" function when fitting the model?
>
> You can, but don't have to, if you know every nonzero parameter you want to specify.
>
>> If I would like to use the general lavaan function, what arguments should I consider in this model? Actually, I have tried both functions on the same data before, and they seemed to give different parameter estimates.
>
> Then they didn't produce the same calls.  growth() is a wrapper around lavaan() with these options:
>
> lavaan(..., model.type = "growth",
>        int.ov.free = FALSE, int.lv.free = TRUE, auto.fix.first = TRUE,
>        auto.fix.single = TRUE, auto.var = TRUE, auto.cov.lv.x = TRUE,
>        auto.cov.y = TRUE, auto.th = TRUE, auto.delta = TRUE, auto.efa = TRUE)

Sorry, do you mean I should use the lavaan() function? If I use lavaan(), should I pass some of these arguments to the function, or write out the full model, for example by adding all variances and covariances of the latent and manifest variables to the model syntax?

>> 3. To fit a pure growth curve model (that is, with no "outcome" and no mediation analysis), I regress the latent intercepts and slopes on covariates, for example m.i~1+sex, m.s~1+sex, r.i~1+sex, r.s~1+sex. But in this combined model (latent growth curve plus mediation analysis), I only regress r.i and r.s on the covariate. Is that correct?
>
> It's your theory, you specify what you think represents the real data-generating process most faithfully.  Your diagram indicates you think the effects of m's growth factors (and of the covariates) on the outcome are mediated by r's growth factors, which would be full rather than partial mediation.  If your model fits poorly, my first thought would be to examine partial mediation.

Actually, I analysed the cross-lagged effects between the m and r variables using the "autoregressive latent trajectory" model introduced by Curran et al. In that analysis, I fit two latent growth curve models, for the "m" and "r" variables separately, and regressed the latent intercepts and slopes of m and r on the covariates. I found that the "m" variable has a cross-lagged effect on the "r" variable. To examine whether the "m" variable also has an indirect effect mediated by the "r" variable, I fit the current mediation model shown in the path diagram. The previous model, which gave results before I updated lavaan, fit well.


--
You received this message because you are subscribed to a topic in the Google Groups "lavaan" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/lavaan/ViNDUKAEdak/unsubscribe.
To unsubscribe from this group and all its topics, send an email to lavaan+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lavaan/50827cb8-b697-4c96-a9d4-13094489b9c7o%40googlegroups.com.

Yves Rosseel

Sep 24, 2020, 9:44:01 AM
to lav...@googlegroups.com
> Error in th.start.idx[i]:th.end.idx[i] : NA/NaN argument

It is difficult to say what is going on here without seeing the data.
Could you send me your data (or a snippet of the data), together with a
minimal script? Just enough to generate this error message. You may
email it to Yves dot Rosseel at UGent dot be

Yves.

Xueying Qin

Sep 24, 2020, 5:52:12 PM
to lav...@googlegroups.com
Hi Yves, 

Sorry, it is a bit difficult and may take a long time to send you the data. I am only working in my current position temporarily and do not have the right to transfer the data; even for a snippet, I need to consult my professors and other collaborators and get their permission before sending anything out. I hope you can understand the situation.

Anyway, after receiving your and Terrence's responses, I re-ran the code, checked my data frame very carefully, and tried every variation of the code I could think of. I found that with the following call,

fit.model=growth(model = model.mediation, data = pheno, cluster = "PAIR", ordered = "outcome", estimator='WLSMV',  link = "probit", missing="pairwise")

if I drop only cluster="PAIR" and change nothing else, the model fits and gives results. This holds in several situations: I have several variables similar to the initial "m" variable shown in the path diagram, and when I swapped in these different manifest variables, every model could be fitted. My data are family-based, so "PAIR" is the family number, and each family contains only two siblings. The structure is therefore observations nested in persons and persons nested in families, with only two persons per family. Besides the family data, the database also contains some singletons who do not form a complete family.

So my questions are: (1) Can I not use the cluster option in this mediation model? (2) Could you please check whether I have written the correct syntax for this analysis according to my path diagram? To make checking easier, I attach the code again below:

model.mediation='  
# random intercepts and slopes for variables "m" and "r"
m.i=~1*m1+1*m2+1*m3+1*m4+1*m5 
m.s=~0*m1+1*m2+2*m3+3*m4+4*m5

r.i=~1*r1+1*r2+1*r3+1*r4+1*r5
r.s=~0*r1+1*r2+2*r3+3*r4+4*r5

# covariance and variance of latent intercepts and slopes   ## is it necessary to include these under the growth function? 
m.i~~m.i
m.s~~m.s
m.i~~m.s
r.i~~r.i
r.s~~r.s
r.i~~r.s

# create structured residuals   ## is it necessary to include these under the growth function? 
m1~~covm*m1
m2~~covm*m2
m3~~covm*m3
m4~~covm*m4
m5~~covm*m5


r1~~covr*r1 
r2~~covr*r2
r3~~covr*r3
r4~~covr*r4
r5~~covr*r5

# direct effect 
outcome~x1*m.i+x2*m.s+sex  # red lines 

# mediator effect 
r.i~x3*m.i+sex      # purple lines 
r.s~x5*m.i+x6*m.s+sex      # purple lines  
outcome~m1*r.i+m2*r.s    # blue lines 

# indirect effect 
miri:=x3*m1
mirs:=x5*m2 
ms:=x6*m2 
sum_mi:=x3*m1+x5*m2

# total effect: 
total.mi:=x1+x3*m1+x5*m2
total.ms:=x2+x6*m2  
'
fit.model=growth(model = model.mediation, data = pheno, ordered = "outcome", estimator='WLSMV',  link = "probit", missing="pairwise")   ## could I use the lavaan() function here? If so, what other arguments should I add?

Thank you so much. 

Xueying 




Yves Rosseel

Sep 25, 2020, 2:16:25 AM
to lav...@googlegroups.com
> So my questions are, (1) I cannot use the cluster option in this
> mediation model?

The cluster option is only available for continuous data, not for
categorical data. That also explains the error message.

If you have family data, the number of observations (per family) is
usually small. I would suggest reshaping your data to 'wide-format', so
that you have a single line per family. That will allow you to use
categorical data, and estimators like WLSMV.

A paper about this approach can be found here:

https://doi.org/10.1080/10705511.2019.1689366
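
As a rough illustration of "a single line per family", here is a base-R sketch using reshape(). The toy data frame, the helper column "sib", and the "_1"/"_2" suffixes are assumptions made for this example only; real data would carry all repeated measures, and singletons would get missing values for the second sibling:

```r
# Toy person-level data: one row per sibling, two siblings per family (PAIR).
pheno <- data.frame(
  ID      = c("A1", "A2", "B1", "B2"),
  PAIR    = c("A", "A", "B", "B"),
  sex     = c(1, 1, 0, 0),
  outcome = c(1, 1, 1, 0),
  r1      = c(0.42, 0.69, 0.18, 0.50),
  m1      = c(0.15, 0.37, 0.02, 0.26)
)

# Number the siblings within each family (hypothetical helper column "sib").
pheno$sib <- ave(seq_along(pheno$PAIR), pheno$PAIR, FUN = seq_along)

# One row per family: every person-level variable gets a _1 / _2 suffix.
family.wide <- reshape(
  pheno[, setdiff(names(pheno), "ID")],
  idvar     = "PAIR",
  timevar   = "sib",
  direction = "wide",
  sep       = "_"
)
print(family.wide)
```

In this layout each sibling's variables are separate columns of the same row, so the dependence within a family is handled by the model itself rather than by a cluster= argument.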

Yves.

Xueying Qin

Sep 25, 2020, 3:07:39 AM
to lav...@googlegroups.com
Hi Yves, 

All my analyses were based on wide-format data. A quick question: I am not sure whether my wide-format data are exactly what you meant. My data look like this:

ID  PAIR  sex  outcome  r1    r2    r3    r4    r5    m1    m2    m3    m4    m5
A1  A     1    1        0.42  0.54  0.19  0.34  0.25  0.15  0.24  0.07  0.64  0.97
A2  A     1    1        0.69  0.66  0.89  0.58  0.35  0.37  0.36  0.45  0.74  0.31
B1  B     0    1        0.18  0.26  0.85  0.32  0.91  0.02  0.30  0.25  0.61  0.65
B2  B     0    0        0.50  0.54  0.72  0.66  0.61  0.26  0.30  0.56  0.66  1.00

I will read the paper you suggested. Thank you sooooooooo much. 

Regards

Xueying            


Yves Rosseel

Sep 25, 2020, 3:33:18 AM
to lav...@googlegroups.com
On 9/25/20 9:07 AM, Xueying Qin wrote:
> Hi Yves,
>
> All my analyses were based on wide-format data.

In that case, don't use the cluster= argument. That is only needed for
data in long format.

Yves.

Xueying Qin

Sep 25, 2020, 8:32:25 AM
to lav...@googlegroups.com
Thank you, Yves. I am very impressed with the paper you suggested. The formulas and statistics are hard for me to understand completely, since I am not a statistician, but I could follow the results, and I now know that I am doing the right thing with my path diagram.

Thank you both, @Terrence and @Yves.

Regards 

Xueying 



