conducting multilevel sem with nonnormal data.

You really shouldn't be running an MLSEM; even the L1 model has a low N, and L2 is obscenely low. You can't trust results based on asymptotic theory so far from the asymptote. But to account for nonnormality, you could set estimator = "MLR".

Looking at your model, your only strictly Level-2 variable is the exogenous predictor coach_ff. So you could run separate multilevel regression models with lmer() in the lme4 package, then use the mediate package to estimate and test indirect effects (it will use Bayesian or bootstrap methods to calculate the CI, which are perfectly acceptable).

Terrence D. Jorgensen

Assistant Professor, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam

http://www.uva.nl/profile/t.d.jorgensen

simon taylor

unread,

Jun 20, 2019, 4:21:43 PM6/20/19

to lavaan

Thank you for your feedback, I really appreciate it! I knew my sample size was small, but I've seen papers published in my field that have around the same numbers, however, they might have used different programs. I used your advice regarding the estimator="MLR" and reduced the amount of correlations and the output has improved in terms of providing me with more data, however, the biggest issue is that it says degrees of freedom=-1 I know I'm asking a lot, but I was hoping you might be able to take a quick look at the improved input and the output and see whether it looks any better now.

#calculate level 2 model with coach FF as explanatory variable

mlm <- 'level: within
athlete_ff ~wc*ccb + wb*pnts
pnts ~ wa*ccb

        level: between
        ccb~coach_ff
        athlete_ff ~ bc*ccb + bb*pnts
        pnts ~ ba*ccb

#correlation

athlete_ff~~pnts
athlete_ff~~ccb
pnts~~ccb
ccb~~coach_ff

#defined parameters
wi := wa * wb
bi := ba * bb
'

fit4 <- sem(mlm, data = Rdata_complete, cluster = "team",optim.method= "em", estimator= "MLR", std.lv=TRUE)
summary(fit4)

lavaan 0.6-3 ended normally after 5398 iterations

  Optimization method                               EM
  Number of free parameters                         22

                                                  Used       Total
  Number of observations                           151         157
  Number of clusters [team]                         38

  Estimator                                         ML      Robust
  Model Fit Test Statistic                          NA          NA
  Degrees of freedom                                -1          -1
  Scaling correction factor                           
    for the Yuan-Bentler correction (Mplus variant)

Parameter Estimates:

  Information                                 Observed
  Observed information based on                Hessian
  Standard Errors                   Robust.huber.white


Level 1 [within]:

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)
  athlete_ff ~                                        
    ccb       (wc)   -0.030    0.078   -0.387    0.699
    pnts      (wb)    0.300    0.067    4.471    0.000
  pnts ~                                              
    ccb       (wa)    0.665    0.092    7.254    0.000

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)
   .athlete_ff        0.000                           
   .pnts              0.000                           
   .ccb               0.000                           

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .athlete_ff        0.542    0.076    7.124    0.000
   .pnts              0.847    0.095    8.947    0.000
   .ccb               0.952    0.167    5.707    0.000


Level 2 [team]:

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)
  ccb ~                                               
    coach_ff          0.007    0.089    0.078    0.938
  athlete_ff ~                                        
    ccb       (bc)   -0.764    0.504   -1.517    0.129
    pnts      (bb)    0.674    0.436    1.545    0.122
  pnts ~                                              
    ccb       (ba)   -0.824    6.449   -0.128    0.898

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)
 .athlete_ff ~~                                       
   .pnts              0.287    1.542    0.186    0.852
   .ccb               0.209    0.170    1.229    0.219
 .pnts ~~                                             
   .ccb               0.382    1.827    0.209    0.834
 .ccb ~~                                              
    coach_ff          0.011    0.003    4.053    0.000

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)
   .ccb               2.655    0.119   22.293    0.000
   .athlete_ff       -0.023    1.731   -0.013    0.989
   .pnts              4.847   17.172    0.282    0.778
    coach_ff         -0.579    0.129   -4.500    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .ccb               0.281    0.149    1.883    0.060
   .athlete_ff        0.155    0.331    0.469    0.639
   .pnts              0.558    4.980    0.112    0.911
    coach_ff          0.629    0.108    5.838    0.000

Defined Parameters:
                   Estimate  Std.Err  z-value  P(>|z|)
    wi                0.199    0.055    3.600    0.000
    bi               -0.555    4.300   -0.129    0.897

Is there a way to improve the df, or do I have to scrap the idea of using lavaan and use another method, such as lmer? I have tried the mediate package with lmer () and while it offers direct/indirect effects, the CI's are all 0 and for the p-value it just says NA.

Best regards,

Simon

Terrence Jorgensen

unread,

Jun 21, 2019, 10:29:59 AM6/21/19

to lavaan

it says degrees of freedom=-1

You need to delete the correlations. You are estimating the same bivariate relationship as both a slope and a correlation, which is redundant unless you make other restrictive assumptions.

simon taylor

unread,

Jun 26, 2019, 11:00:14 AM6/26/19

to lavaan

Hi Terrence,

Thank you for your reply and apologies for the late response. I have spent some time trying to understand how to use lmer and mediation package, but there seems to be less documentation out there in comparison to lavaan. Also, I am unable to figure out how to deal with my skewed data. Do you have any sources or information you might be able to provide me with, which might help?