What does it mean when the correlation between slope and intercept in a latent growth curve model is > 1?

Randy Stache

Jan 14, 2021, 3:22:57 PM
to lavaan
Dear all,


I am trying to run a multigroup analysis with a latent growth curve model in lavaan (see the attached picture and the syntax below):


library(lavaan)

mod = '
Int =~ 1*y1 + 1*y2 + 1*y3
Slo =~ 1*y1 + 2*y2 + 3*y3

Int ~~ Slo

Int ~ X1 + country_2  # with country_1 as reference category
Slo ~ X1 + country_2
X1 ~ country_2
X1 ~ 1
'
fit <- growth(mod, data = data, estimator = 'ML',
              missing = "FIML", group = 'sex')

summary(fit, standardized = TRUE, fit.measures = TRUE)

[Attachment: Example.PNG — diagram of the model]

The following warning turns up at the end of the estimation:

Warning message:
In lav_start_check_cov(lavpartable = lavpartable, start = START) :
  lavaan WARNING: starting values imply a correlation larger than 1;
  variables involved are: Int Slo [in block 2]


After some first checks, it turns out that the problem of the correlation > 1 between intercept and slope occurs only in one group (male) within country_2. If I omit either the regression of the intercept or the regression of the slope on the country variable, the multigroup model runs perfectly. Model fit is always very good (CFI > 0.97), but the slopes are small and non-significant, so there seems to be no change over time. However, an LR test indicated that a random-intercept random-slope model is significantly better than a random-intercept fixed-slope model.


Now my question is: what does it mean when, in a certain group, the correlation between intercept and slope is estimated to be > 1 as soon as a certain conditional variable is included? Is this a problem of multicollinearity? It seems to me that the whole issue is located in the following part of the model:

[Attachment: Example2.PNG — the part of the model in question]

Let me know if you need further information. I would be very grateful for your help.


Best Wishes

Randy

Mauricio Garnier-Villarreal

Jan 15, 2021, 8:58:10 AM
to lavaan
Randy

A couple of things:
mod = '
Int =~ 1*y1 + 1*y2 + 1*y3
Slo =~ 0*y1 + 1*y2 + 2*y3
## usually you would put 0 as the first loading for the slope.
## you can put the 0 anywhere you want, but the time point where the 0 is
## located is the time point at which the intercept is interpreted

Int ~~ Slo

Int ~ X1 + country_2  # with country_1 as reference category
Slo ~ X1 + country_2
X1 ~ country_2
X1 ~ 1  ## this line is unnecessary
'
Since I can't see your data, make sure the categorical variable is coded as binary 0/1, so it can be used this way.

If the high correlation shows up conditional on a group variable, it would be the most likely correlation for the reference group.

A high correlation between intercept and slope would indicate that the higher the starting point, the higher the rate of change.

##
Model fit is always very good (CFI > 0.97), but the slopes are small and non-significant, so there seems to be no change over time. However, an LR test indicated that a random-intercept random-slope model is significantly better than a random-intercept fixed-slope model.
##
When you tested the random-intercept fixed-slope model, you were still estimating a slope mean, so the two models are not very different. If you wanted to test whether the mean slope equals 0, you would have to add Slo ~ 0*1.
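
For example, a minimal sketch of that comparison, assuming your variable names and data object:

library(lavaan)

## random intercept, random slope, free slope mean
mod_free <- '
Int =~ 1*y1 + 1*y2 + 1*y3
Slo =~ 0*y1 + 1*y2 + 2*y3
Int ~~ Slo
'

## same model, but the mean of the slope is fixed to 0
mod_slope0 <- '
Int =~ 1*y1 + 1*y2 + 1*y3
Slo =~ 0*y1 + 1*y2 + 2*y3
Int ~~ Slo
Slo ~ 0*1
'

fit_free   <- growth(mod_free,   data = data, estimator = "ML", missing = "FIML")
fit_slope0 <- growth(mod_slope0, data = data, estimator = "ML", missing = "FIML")

## LR test of "mean slope == 0"
lavTestLRT(fit_slope0, fit_free)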

Randy Stache

Jan 15, 2021, 12:35:36 PM
to lavaan
Thank you for the advice.

1) I fixed the syntax to "Slo =~ 0*y1 + 1*y2 + 2*y3". That really makes more sense.
2) I am sure the categorical variable is binary.
3) Sorry, I wrote something wrong in my request: in the beginning I compared an unconditional random-intercept random-slope model with an unconditional random-intercept model:

mod1 = '
Int =~ 1*y1 + 1*y2 + 1*y3
'

vs.

mod2 = '
Int =~ 1*y1 + 1*y2 + 1*y3
Slo =~ 0*y1 + 1*y2 + 2*y3

Int ~~ Slo
'
and the improvement in chi² was significant with p ≈ 0.01. After that I started to add the conditional variables and tested the model without the separation by group. It worked well and the fit was very good. By implementing the multigroup option I got the problem with the correlation of Int and Slo described above.
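
A sketch of that comparison (assuming the same data object as above):

fit1 <- growth(mod1, data = data, estimator = "ML", missing = "FIML")
fit2 <- growth(mod2, data = data, estimator = "ML", missing = "FIML")

## chi-square difference test: random intercept only vs. random intercept + random slope
anova(fit1, fit2)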

4) Could you be more specific about what this means?
"If the high correlation shows up conditional on a group variable, it would be the most likely correlation for the reference group.
A high correlation between intercept and slope would indicate that the higher the starting point, the higher the rate of change."
--> I know how to interpret the covariance between slope and intercept in regular models and what a group difference means for this parameter. But what could be the reason for, and what does it mean, that the estimated correlation in the male group is 1.189 when slope and intercept are both regressed on country_2? And moreover, how should I deal with it? I would very much appreciate another brief explanation.

Thank you a lot and have a nice weekend!

Mauricio Garnier-Villarreal

Jan 17, 2021, 4:08:12 PM
to lavaan
Randy

When looking at an out-of-bounds correlation (std.all), as seems to be the case here, you first need to see whether it is an estimation problem or a sample limitation.
A sample limitation would be that, in your case, you are working with multiple groups and binary covariates, so the correlation represents the partial correlation as a function of these binary variables. Check the cross-table of sample sizes for group * binary variable; if the sample size in some of the cells is very small, you may be asking too big a model of too small a sample.
On the other hand, if you don't see any other estimation problem (eigenvalues, very large SEs, etc.), it would be good to check whether the correlation is significantly different from 1, meaning that you want to see whether the inference says the correlation is far from that bound (a correlation of 1.89 seems very far from 1). You could argue that the estimation is still fine but the out-of-bounds value is needed to estimate the best solution.

Kolenikov, S., & Bollen, K. A. (2012). Testing Negative Error Variances: Is a Heywood Case a Symptom of Misspecification? Sociological Methods & Research, 41(1), 124–167. https://doi.org/10.1177/0049124112442138
Savalei, V., & Kolenikov, S. (2008). Constrained versus unconstrained estimation in structural equation modeling. Psychological Methods, 13(2), 150–170. https://doi.org/10.1037/1082-989X.13.2.150
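
To check whether 1 falls inside the confidence interval of that correlation, one option is to define the correlation as a derived parameter so lavaan reports a delta-method SE and CI for it. A minimal sketch for the unconditional two-group model (the labels vI1, cIS2, rIS2, etc. are just illustrative):

mod_corr <- '
Int =~ 1*y1 + 1*y2 + 1*y3
Slo =~ 0*y1 + 1*y2 + 2*y3

Int ~~ c(vI1, vI2)*Int
Slo ~~ c(vS1, vS2)*Slo
Int ~~ c(cIS1, cIS2)*Slo

## intercept-slope correlation per group, with delta-method SE/CI
rIS1 := cIS1 / sqrt(vI1 * vS1)
rIS2 := cIS2 / sqrt(vI2 * vS2)
'

fit_corr <- growth(mod_corr, data = data, group = "sex",
                   estimator = "ML", missing = "FIML")

## look at whether 1 lies inside the confidence interval for rIS2
parameterEstimates(fit_corr)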

Now, the interpretation of a high correlation between growth factors is different from the one in a CFA. In a CFA you would say the factors are very similar to each other, which might indicate that you need 1 factor instead of 2. But in a growth curve you can't say that, because the factors are describing the growth over time. So you could say that the participants with a higher intercept also presented the largest increase over time; the factors are still different, but they describe the conditional relation between them over time.

Could you paste the output here so we can look at it in more detail?

hope this makes sense

Randy Stache

Jan 21, 2021, 6:48:04 AM
to lavaan
Many thanks for the explanations. I think I'm beginning to understand what's going on:

1) estimation problem vs. sample limitation

        |   male   female |   Total
--------+-----------------+--------
 EN     |     48      122 |     170
 GE     |    296      795 |   1,091
 SW     |     90      199 |     289
--------+-----------------+--------
 Total  |    434    1,116 |   1,550

There really seem to be relatively few cases for certain combinations, but calculations seem to me to be possible in principle.

2) The correlation between intercept and slope is -0.739 (**) in group 1 and +1.381 (n.s.) in group 2. When I fix this covariance for both groups, the warning message no longer appears. The model is then slightly worse than the one with the covariance freely estimated (p for the chi² difference = 0.047), but I am sceptical about the fit.
The results of both models are attached.
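
(For clarity, a sketch of one way such a cross-group constraint can be written in lavaan, assuming it means holding the covariance equal across the two groups; the label cIS is illustrative:)

mod_eq <- '
Int =~ 1*y1 + 1*y2 + 1*y3
Slo =~ 0*y1 + 1*y2 + 2*y3

## same label in both groups => covariance constrained equal across groups
Int ~~ c(cIS, cIS)*Slo
'

fit_eq <- growth(mod_eq, data = data, group = "sex",
                 estimator = "ML", missing = "FIML")
fit_uncon <- growth(mod2, data = data, group = "sex",
                    estimator = "ML", missing = "FIML")

## chi-square difference test: constrained vs. freely estimated covariance
lavTestLRT(fit_eq, fit_uncon)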

It seems to me that "the estimation is still fine but the out-of-bounds values are needed to estimate the best solution", as you said. Of course, this could have something to do with the low case numbers. What do you think?
[Attachment: uncond_LCM_MG.txt — output of both models]

Mauricio Garnier-Villarreal

Jan 21, 2021, 9:21:04 AM
to lavaan
Randy

The p-value for the high correlation doesn't really test what we want here. We want to test whether 1 is within the range of the estimate, while the p-value is testing whether 0 is.

Which country is 1, 2 and 3 in your covariates? I think this is where your sample size becomes the issue. If you have EN as your country 1 (reference group), then that correlation might be the correlation for the N = 48 cell, and given that small sample the range (CI) is large and could include values below 1 as well. But you are letting the smallest group carry the heaviest part of the estimation. I would recommend making GE your reference group, as it has the largest sample sizes.
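
A quick sketch of that recoding, assuming the raw country variable is a single factor called country with levels EN, GE and SW (the names are illustrative):

## make GE (the largest group) the reference category by creating
## dummy variables for the other two countries
data$country_EN <- as.numeric(data$country == "EN")
data$country_SW <- as.numeric(data$country == "SW")

## then regress the growth factors on both dummies instead of country_2, e.g.
## Int ~ X1 + country_EN + country_SW
## Slo ~ X1 + country_EN + country_SW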

Patrick (Malone Quantitative)

Jan 21, 2021, 9:24:16 AM
to lav...@googlegroups.com
Though if the correlation is above 1 and the p-value doesn't rule out a fixed null of 0, it won't rule out 1, either.



--
Patrick S. Malone, Ph.D., Malone Quantitative
NEW Service Models: http://malonequantitative.com

He/Him/His

Mauricio Garnier-Villarreal

Jan 21, 2021, 9:40:18 AM
to lavaan
The p-value is specifically the probability against the null hypothesis of 0, so the hypothesis against 1 would have a different p-value; I am not sure I can make intuitive sense of their relation. It also gets more odd here because the p-value is for the covariance, not the correlation. I'm just saying I don't know an easy way to extrapolate that p-value to a test of correlation == 1.

Patrick (Malone Quantitative)

Jan 21, 2021, 10:06:54 AM
to lav...@googlegroups.com
No, I didn't mean you could extrapolate the p-value. Just that, all else being equal, the p-value will be larger for a new null value (1) that lies between the old null value (0) and the observed value (1.381). A smaller difference is harder to reject than a larger one (all else equal).
