Transformations of parameters of growth models using orthogonal polynomials to the original time metric


Wenyuan Liu

Nov 12, 2023, 7:27:46 PM
to lavaan
Dear Lavaan Users,

We are fitting a cubic growth model (using the lavaan growth() function) and have used orthogonal polynomials to estimate the parameters, because the models using the original measured time metric produced errors.

The model now estimates without errors, but for clearer interpretation we want to re-express (translate) the parameters from the orthogonal-polynomial model into those of the corresponding model on the original measured time metric.

We have tried to do these transformations in R but could not find a package or function that achieves this. Does anyone happen to know whether such a tool exists in R, or do we have to perform the transformations by hand?

Thank you very much for any answers.

Best wishes,
Wenyuan

Terrence Jorgensen

Nov 15, 2023, 9:58:46 AM
to lavaan
I'm not sure why you don't just fit the model using orthogonal contrast codes instead of transforming from the usual parameterization (time = 0, 1, ... T).  contr.poly() provides orthonormal polynomial contrasts, and is quite flexible with uneven spacing of occasions.  

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Wenyuan Liu

Nov 16, 2023, 12:21:21 PM
to lavaan
Thank you very much for the response, Terrence!

I am sorry, I may not fully understand your point, as I am not strong in statistics. It would be helpful if you could provide an example of the code. Any help and thoughts would really be appreciated!

But it may be good to let you know that I used poly() to obtain orthogonal polynomials to replace the original measured time metric. With the original metric used directly, we got the optimizer issue shown below:
> model_cubic <- '
+ i =~ 1*T1 + 1*T2 + 1*T3 + 1*T4 + 1*T5 + 1*T6
+ s =~ 0*T1 + 2*T2 + 4*T3 + 8*T4 + 11*T5 + 14*T6
+ q =~ 0*T1 + 4*T2 + 16*T3 + 64*T4 + 121*T5 + 196*T6
+ c =~ 0*T1 + 8*T2 + 64*T3 + 512*T4 + 1331*T5 + 2744*T6
+ '
> fit_cubic <- growth(model_cubic, data = data, sampling.weights = "WEIGHT2")
Warning message:
In lavaan::lavaan(model = model_non_ASD_cubic, data = data_wide_non_ASD, :
  lavaan WARNING: the optimizer (NLMINB) claimed the model converged, but not
  all elements of the gradient are (near) zero; the optimizer may not have
  found a local solution. Use check.gradient = FALSE to skip this check.
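For what it's worth, one plausible source of a gradient warning like this is purely numerical: the raw cubic loadings run from 1 up to 2744, which makes the loading matrix badly conditioned, whereas an orthonormal basis is perfectly conditioned. A small Python sketch of that conditioning gap (Python is used only for illustration; the thread's models are in R, and the QR construction here just mimics what poly() does):

```python
import numpy as np

# Measurement occasions on the centred raw metric used above
t = np.array([0.0, 2.0, 4.0, 8.0, 11.0, 14.0])

# Raw cubic loading matrix: columns 1, t, t^2, t^3 (entries up to 2744)
X_raw = np.vander(t, N=4, increasing=True)

# Orthonormalized version of the same column space (what R's poly() /
# contr.poly() produce), built via QR with signs fixed to R's convention
Q, R = np.linalg.qr(X_raw)
P = Q * np.sign(np.diag(R))

print(np.linalg.cond(X_raw))  # very large: ill-conditioned raw design
print(np.linalg.cond(P))      # ~1: an orthonormal basis is perfectly conditioned
```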

The orthogonal polynomials we used to model the trajectory of the variables of interest are shown below:

degree <- 3
x <- c(0, 2, 4, 8, 11, 14)
mypoly <- poly(x = x, degree = degree, simple = TRUE)
int <- rep(1/sqrt(6), 6)
TSprime_poly <- cbind(int, mypoly)

TSprime_poly
           int          1            2          3
[1,] 0.4082483 -0.5352015  0.519842695 -0.4427470
[2,] 0.4082483 -0.3705241 -0.004725843  0.3554442
[3,] 0.4082483 -0.2058467 -0.353194559  0.4815676
[4,] 0.4082483  0.1235080 -0.521832524 -0.2174819
[5,] 0.4082483  0.3705241 -0.186048965 -0.5258859
[6,] 0.4082483  0.6175402  0.545959194  0.3491029


Based on these orthogonal polynomials, we refitted the model as below and obtained all estimates normally:

> model_cubic <- ' i =~ 0.4082483*T1 + 0.4082483*T2 + 0.4082483*T3 + 0.4082483*T4 + 0.4082483*T5 + 0.4082483*T6
s =~ -0.5352015*T1 + -0.3705241*T2 + -0.2058467*T3 + 0.1235080*T4 + 0.3705241*T5 + 0.6175402*T6
q =~ 0.519842695*T1 + -0.004725843*T2 + -0.353194559*T3 + -0.521832524*T4 + -0.186048965*T5 + 0.545959194*T6
c =~ -0.4427470*T1 + 0.3554442*T2 + 0.4815676*T3 + -0.2174819*T4 + -0.5258859*T5 + 0.3491029*T6
'
> fit_cubic <- growth(model_cubic, data = data, sampling.weights = "WEIGHT2")


However, with orthogonal polynomials the intercept now represents the mean across time, which is not what we intended to interpret. Based on the description below (p. 91, Hedeker & Gibbons, 2006), the parameters estimated with orthogonal polynomials can be translated into ones based on the original measured time, which is the metric we want to interpret:

.....translating parameters.....The parameters from a model using orthogonal polynomials can be directly related to those in the corresponding model that uses the original metric for time. One is simply a reexpressed or translated version of the other......
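The algebra behind that translation is a change of basis: the orthonormal loading matrix spans the same column space as the raw design with columns 1, t, t^2, t^3, so an invertible 4x4 matrix links the two sets of growth-factor parameters. A Python sketch of the idea (illustrative only; the QR construction mirrors R's poly(), and the example estimates are made-up stand-ins, not your fitted values):

```python
import numpy as np

# Centred measurement times and the raw cubic design (columns 1, t, t^2, t^3)
t = np.array([0.0, 2.0, 4.0, 8.0, 11.0, 14.0])
X_raw = np.vander(t, N=4, increasing=True)

# Orthonormal polynomial basis over the same column space (reproduces the
# poly()/contr.poly() columns in R, with signs fixed to R's convention)
Q, R = np.linalg.qr(X_raw)
P = Q * np.sign(np.diag(R))

# Change-of-basis matrix B with P = X_raw @ B (exact, since the column
# spaces coincide; lstsq just solves the linear system)
B = np.linalg.lstsq(X_raw, P, rcond=None)[0]

# Hypothetical growth-factor means on the orthonormal metric
eta_orth = np.array([18.0, -0.9, 0.8, -1.4])

# Translated means on the original time metric: the intercept, linear,
# quadratic, and cubic coefficients of the raw-time polynomial
eta_raw = B @ eta_orth

# Both parameterizations imply the identical fitted mean trajectory
assert np.allclose(X_raw @ eta_raw, P @ eta_orth)
```

The factor covariance matrix translates the same way (Psi_raw = B Psi_orth B'), which is the reexpression Hedeker & Gibbons describe.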


Another question concerns the coding of time. Originally, for ease of interpretation, we centred time at the first time point, recoding 3/5/7/11/14/17 to 0/2/4/8/11/14, but got the optimizer problems described above. Now, with orthogonal polynomials, we face parameter transformations, which is a challenge for plotting trajectories because I do not know how to rescale back to the original measured time. Recently, I saw someone use another strategy to code time in growth models:

the original time is 0 months, 4 months, 36 months, 48 months.....Three LGM models were considered to specify the linear and non-linear growth components: (1) linear, (2) quadratic (non-linear), and (3) cubic (non-linear). In the linear model, the loadings of the latent intercept were fixed as 1 across four occasions, and the loadings for the linear change were fixed as 0, 0.3, 3, and 4 according to the time of measurement (i.e., the interval between T1 and T2 was four months; the interval between T2 and T3 was 32 months, and the interval between T3 and T4 was 12 months).....https://link.springer.com/article/10.1007/s10964-022-01727-w#Tab4 


I am just wondering whether it is suitable to recode time in my study using such a strategy, which may reduce the errors we came across: from 0/2/4/8/11/14 to 0/1/2/4/5.5/7.
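One note on that recoding: dividing time by 2 only changes the units, so the fitted trajectory is unchanged and the degree-k coefficient converts back by a factor of 2^k; it also shrinks the largest cubic loading from 2744 to 343, which can ease numerical trouble. A Python sketch of the unit conversion (illustrative; the coefficients below are made up):

```python
import numpy as np

t = np.array([0.0, 2.0, 4.0, 8.0, 11.0, 14.0])
u = t / 2.0  # rescaled time metric: 0, 1, 2, 4, 5.5, 7

# Hypothetical cubic-trajectory coefficients on the original metric
beta_t = np.array([5.0, 1.2, -0.3, 0.02])  # intercept, linear, quad, cubic

# With u = t/2, a + b*t + c*t^2 + d*t^3 == a + (2b)*u + (4c)*u^2 + (8d)*u^3,
# so the degree-k coefficient on the halved metric is scaled by 2**k
beta_u = beta_t * 2.0 ** np.arange(4)

X_t = np.vander(t, N=4, increasing=True)
X_u = np.vander(u, N=4, increasing=True)
assert np.allclose(X_t @ beta_t, X_u @ beta_u)  # identical fitted curve
```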


I tried this new coding strategy to fit the cubic model, but got another warning:

Warning message:
In lav_model_vcov(lavmodel = lavmodel, lavsamplestats = lavsamplestats, :
  lavaan WARNING: The variance-covariance matrix of the estimated parameters
  (vcov) does not appear to be positive definite! The smallest eigenvalue
  (= 3.935873e-13) is close to zero. This may be a symptom that the model
  is not identified.


With my limited knowledge, I do not really understand what this means. Could you please suggest what to do next, based on all the descriptions above?


Thank you very much.


Best,


Wenyuan


Terrence Jorgensen

Nov 22, 2023, 7:09:14 AM
to lavaan
It would be helpful if you provide an example of the code

Thank you for doing the work yourself first.  You already implemented what I was referring to:

the orthogonal polynomials we used to model the trajectory of variables of interest are shown below:

degree <- 3
x <- c(0, 2, 4, 8, 11, 14)
mypoly <- poly(x = x, degree = degree, simple = TRUE)
int <- rep(1/sqrt(6), 6)
TSprime_poly <- cbind(int, mypoly)

TSprime_poly
           int          1            2          3
[1,] 0.4082483 -0.5352015  0.519842695 -0.4427470
[2,] 0.4082483 -0.3705241 -0.004725843  0.3554442
[3,] 0.4082483 -0.2058467 -0.353194559  0.4815676
[4,] 0.4082483  0.1235080 -0.521832524 -0.2174819
[5,] 0.4082483  0.3705241 -0.186048965 -0.5258859
[6,] 0.4082483  0.6175402  0.545959194  0.3491029


Based on these orthogonal polynomials, we refitted the model as below and obtained all estimates normally:

> model_cubic <- ' i =~ 0.4082483*T1 + 0.4082483*T2 + 0.4082483*T3 + 0.4082483*T4 + 0.4082483*T5 + 0.4082483*T6
s =~ -0.5352015*T1 + -0.3705241*T2 + -0.2058467*T3 + 0.1235080*T4 + 0.3705241*T5 + 0.6175402*T6
q =~ 0.519842695*T1 + -0.004725843*T2 + -0.353194559*T3 + -0.521832524*T4 + -0.186048965*T5 + 0.545959194*T6
c =~ -0.4427470*T1 + 0.3554442*T2 + 0.4815676*T3 + -0.2174819*T4 + -0.5258859*T5 + 0.3491029*T6
'
> fit_cubic <- growth(model_cubic, data = data, sampling.weights = "WEIGHT2")


That is the same as I would have done, except I just use the contr.poly() function, which gives results for all T-1 polynomials.

contr.poly(1:length(x), scores = x)
             .L           .Q         .C          ^4          ^5
[1,] -0.5352015  0.519842695 -0.4427470  0.24223952 -0.14815996
[2,] -0.3705241 -0.004725843  0.3554442 -0.50228719  0.56337366
[3,] -0.2058467 -0.353194559  0.4815676  0.09658215 -0.65190380
[4,]  0.1235080 -0.521832524 -0.2174819  0.56563215  0.42253024
[5,]  0.3705241 -0.186048965 -0.5258859 -0.57450718 -0.23412931
[6,]  0.6175402  0.545959194  0.3491029  0.17234054  0.04828917


Of course, you don't need to model all higher-order polynomials, but if you do, then you will have a saturated model and can test whether something more complex than cubic is required to fit the data well.



however, the intercept after using orthogonal polynomials has changed to means across time, which is not what we were expected to explain


I don't know whose expectation you want to meet, but if you want the intercept to be the mean at the first occasion, then you cannot have orthogonal polynomial contrasts.  Contrasts have means of zero by definition, and any pair of contrasts are orthogonal/uncorrelated because their products have means of zero.  Interpreting the usual intercept as "initial status" is just something you give up when you define growth factors as polynomial contrasts.  

Voelkle, M. C., & McKnight, P. E. (2012). One size fits all? A Monte-Carlo simulation on the relationship between repeated measures (M)ANOVA and latent curve modeling. Methodology, 8(1), 23–38. https://doi.org/10.1027/1614-2241/a000044
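The two properties invoked here, that contrasts have zero means and that pairwise products also have zero means, are easy to verify numerically. A small Python check with the same time scores (the QR construction is an assumption standing in for what contr.poly(6, scores = x) returns in R):

```python
import numpy as np

t = np.array([0.0, 2.0, 4.0, 8.0, 11.0, 14.0])

# Orthonormal polynomial contrasts for 6 occasions, built via QR of the
# full Vandermonde matrix; dropping the constant column leaves the
# 5 contrasts that contr.poly() would give for these scores
Q, R = np.linalg.qr(np.vander(t, N=6, increasing=True))
contrasts = (Q * np.sign(np.diag(R)))[:, 1:]

# Every contrast sums (hence averages) to zero ...
assert np.allclose(contrasts.sum(axis=0), 0.0)

# ... and the columns are orthonormal: pairwise products sum to zero,
# which is why the polynomial growth factors are uncorrelated by construction
assert np.allclose(contrasts.T @ contrasts, np.eye(5))
```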

The latent "level" or "center-cept" is each person's mean across occasions, so that growth-factor's mean is the grand mean across all occasions and subjects.  Interpreting variability in that quantity is just as valid as interpreting variability in means on the first occasion.

Wainer, H. (2000). The centercept: An estimable and meaningful regression parameter. Psychological Science, 11(5), 434–436. https://doi.org/10.1111/1467-9280.00284

Wenyuan Liu

Jan 17, 2024, 3:26:27 PM
to lavaan
Hi Terrence,

Thank you very much for your responses. Good to know that we have been running growth models with orthogonal polynomials correctly!

Response 1:
==================================================================================================
Of course, you don't need to model all higher-order polynomials, but if you do, then you will have a saturated model and can test whether something more complex than cubic is required to fit the data well.
==================================================================================================
My response to this suggestion: as we only have six time points of data, modelling a quartic (or higher) growth term may not be adequate and could lead to overparameterization. If this reasoning does not make sense, please correct me, thanks!

Response 2:
==================================================================================================
I don't know whose expectation you want to meet, but if you want the intercept to be the mean at the first occasion, then you cannot have orthogonal polynomial contrasts.  Contrasts have means of zero by definition, and any pair of contrasts are orthogonal/uncorrelated because their products have means of zero.  Interpreting the usual intercept as "initial status" is just something you give up when you define growth factors as polynomial contrasts. 
==================================================================================================
For this issue, I understand that the interpretation of the intercept changes after defining growth factors as polynomial contrasts. But as we registered this study, we are more interested in the typical initial status, so we have performed the transformations manually, following the procedures described in Hedeker & Gibbons (2006). Thank you for sharing those readings!

New query:
==================================================================================================
Could you have a look at a new model-convergence issue that arose when I used FIML to handle missing data?

In lavaan::lavaan(model = model_non_ASD_cubic, data = data_wide_non_ASD, :
  lavaan WARNING: the optimizer (NLMINB) claimed the model converged, but not
  all elements of the gradient are (near) zero; the optimizer may not have
  found a local solution. Use check.gradient = FALSE to skip this check.

As the warning message above shows, I used lavInspect(fit_non_ASD_cubic, "optim.gradient") to check the gradient, some elements of which were negative. I am not sure what the next steps are to address this issue; do you have any ideas?
> q <- lavInspect(fit_non_ASD_cubic, "optim.gradient")
> formatted_result <- sprintf("%.4f", q)
> print(formatted_result)
 [1] "0.0003"  "-0.0031" "-0.0026" "-0.0018" "-0.0002" "0.0003"  "-0.0013"
 [8] "0.0017"  "-0.0027" "-0.0029" "0.0003"  "-0.0011" "0.0009"  "0.0009"
[15] "0.0001"  "-0.0016" "0.0009"  "-0.0000" "-0.0017" "0.0001"

The output for this new query is shown below.

> ## cubic model
> model_non_ASD_cubic <- '
+ i =~ 0.4082483*T1 + 0.4082483*T2 + 0.4082483*T3 + 0.4082483*T4 + 0.4082483*T5 + 0.4082483*T6
+ s =~ -0.5352015*T1 + -0.3705241*T2 + -0.2058467*T3 + 0.1235080*T4 + 0.3705241*T5 + 0.6175402*T6
+ q =~ 0.519842695*T1 + -0.004725843*T2 + -0.353194559*T3 + -0.521832524*T4 + -0.186048965*T5 + 0.545959194*T6
+ c =~ -0.4427470*T1 + 0.35544428*T2 + 0.4815676*T3 + -0.2174819*T4 + -0.5258859*T5 + 0.3491029*T6
+ '
> fit_non_ASD_cubic <- growth(model_non_ASD_cubic, data = data_wide_non_ASD,
+                             missing = "FIML", sampling.weights = "WEIGHT2")
Warning messages:
1: In lav_data_full(data = data, group = group, cluster = cluster, :
  lavaan WARNING: some cases are empty and will be ignored:
2: In lavaan::lavaan(model = model_non_ASD_cubic, data = data_wide_non_ASD, :
  lavaan WARNING: the optimizer (NLMINB) claimed the model converged, but not
  all elements of the gradient are (near) zero; the optimizer may not have
  found a local solution. Use check.gradient = FALSE to skip this check.
> summary(fit_non_ASD_cubic, fit.measure = TRUE, standardized = TRUE)
lavaan 0.6.15 did NOT end normally after 83 iterations
** WARNING ** Estimates below are most likely unreliable

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        20

                                                  Used       Total
  Number of observations                         16488       16643
  Number of missing patterns                        63
  Sampling weights variable                    WEIGHT2

Parameter Estimates:

  Standard errors                             Sandwich
  Information bread                           Observed
  Observed information based on                Hessian

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  i =~
    T1                0.408                               3.452    0.701
    T2                0.408                               3.452    0.774
    T3                0.408                               3.452    0.733
    T4                0.408                               3.452    0.688
    T5                0.408                               3.452    0.670
    T6                0.408                               3.452    0.640
  s =~
    T1               -0.535                              -2.314   -0.470
    T2               -0.371                              -1.602   -0.359
    T3               -0.206                              -0.890   -0.189
    T4                0.124                               0.534    0.106
    T5                0.371                               1.602    0.311
    T6                0.618                               2.669    0.495
  q =~
    T1                0.520                                  NA       NA
    T2               -0.005                                  NA       NA
    T3               -0.353                                  NA       NA
    T4               -0.522                                  NA       NA
    T5               -0.186                                  NA       NA
    T6                0.546                                  NA       NA
  c =~
    T1               -0.443                                  NA       NA
    T2                0.355                                  NA       NA
    T3                0.482                                  NA       NA
    T4               -0.217                                  NA       NA
    T5               -0.526                                  NA       NA
    T6                0.349                                  NA       NA

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  i ~~
    s                 4.650       NA                      0.127    0.127
    q                -8.517       NA                     -1.308   -1.308
    c                -0.197       NA                     -0.026   -0.026
  s ~~
    q                -0.828       NA                     -0.249   -0.249
    c                -3.476       NA                     -0.893   -0.893
  q ~~
    c                -1.211       NA                     -1.746   -1.746

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .T1                0.000                               0.000    0.000
   .T2                0.000                               0.000    0.000
   .T3                0.000                               0.000    0.000
   .T4                0.000                               0.000    0.000
   .T5                0.000                               0.000    0.000
   .T6                0.000                               0.000    0.000
    i                18.434       NA                      2.180    2.180
    s                -0.902       NA                     -0.209   -0.209
    q                 0.805       NA                         NA       NA
    c                -1.388       NA                         NA       NA

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .T1               13.520       NA                     13.520    0.557
   .T2                6.041       NA                      6.041    0.304
   .T3                7.163       NA                      7.163    0.323
   .T4                8.986       NA                      8.986    0.358
   .T5                8.258       NA                      8.258    0.312
   .T6               14.333       NA                     14.333    0.493
    i                71.477       NA                      1.000    1.000
    s                18.686       NA                      1.000    1.000
    q                -0.593       NA                         NA       NA
    c                -0.811       NA                         NA       NA

Warning message:
In lav_object_summary(object = object, header = header, fit.measures = fit.measures, :
  lavaan WARNING: fit measures not available if model did not converge

Thank you very much for your support here!

Best wishes,

Wenyuan