dummy variables and path coefficients

LBrigham

Jun 29, 2018, 10:42:51 AM
to lavaan
Hello all,

I have an exogenous nominal variable (K=3) that I've included in an SEM as K-1 dummies. 

Necessarily, I am able to get standardized path coefficients for the two dummy variables that are included in the model. Is there a way to find the path coefficient of the third variable that is not explicitly included in the model? Would I have to run a separate model with that dummy alone, or would that be breaking some rule?

For context, the exogenous variable in question is "plant family" and the dependent variable is "interior root microbial structure." 

Thank you for any advice you can give.

Best,
LB

Terrence Jorgensen

Jun 29, 2018, 11:37:55 AM
to lavaan
> Is there a way to find the path coefficient of the third variable that is not explicitly included in the model? Would I have to run a separate model with that dummy alone, or would that be breaking some rule?

There isn't one.  If you want to have an intercept for each group instead of having k-1 group comparisons to a reference group, you could fix the DV's intercept to zero and include all k=3 dummies as predictors.

DV ~ 0*1

That would mean you don't have tests of group differences in intercepts (equivalent to differences in adjusted means).  But you can simply create user-defined parameters to calculate any group difference you want.  Just label each dummy code's effect:

DV ~ 0*1 + g1*dummy1 + g2*dummy2 + g3*dummy3
diff12 := g1 - g2
diff13 := g1 - g3
diff23 := g2 - g3


Terrence D. Jorgensen
Postdoctoral Researcher, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

LBrigham

Jun 29, 2018, 11:49:04 AM
to lavaan
Hello,

Thank you for your helpful response. I'm using lavaan in R, and I get an error message when I try this: "Error in lav_model_estimate(lavmodel = lavmodel, lavsamplestats = lavsamplestats,  : 
  lavaan ERROR: initial model-implied matrix (Sigma) is not positive definite;
  check your model and/or starting parameters."

Have I made a mistake in the code?

test <- '
PCoA1_S ~ pH_15
PCoA1_R ~ 0*1 + PCoA1_S + g1*host_family.fAsteraceae + g2*host_family.fCyperaceae + g3*host_family.fPoaceae
diff12 := g1 - g2
diff13 := g1 - g3
diff23 := g2 - g3
'
test.fit <- sem(test, data = df)

Where PCoA1_S is the soil microbial structure and PCoA1_R is the root interior microbial structure. 

Terrence Jorgensen

Jun 30, 2018, 5:58:09 AM
to lavaan
> Have I made a mistake in the code?

Whoops! No, that was my mistake. I was thinking of what is possible with OLS regression (using the lm() function). This is not possible in SEM / covariance structure analysis because the input data are summarized by the covariance matrix, which is linearly dependent when there is a dummy code for every group.
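To see the linear dependency concretely, here is a minimal base-R sketch. The group labels and sizes are made up for illustration (they are not the poster's data). It builds one dummy per level, then compares the smallest eigenvalue of the covariance matrix with all k dummies against the one with k - 1 dummies.

```r
# Hypothetical grouping variable with k = 3 levels (illustrative sizes only).
family <- factor(rep(c("Asteraceae", "Cyperaceae", "Poaceae"),
                     times = c(30, 25, 35)))

# One dummy column per level; every row sums to 1 across the three columns.
dummies <- model.matrix(~ 0 + family)

# With all k dummies the covariance matrix is singular:
# its smallest eigenvalue is zero up to rounding error.
S_all <- cov(dummies)
min_eig_all <- min(eigen(S_all, symmetric = TRUE)$values)

# Dropping one dummy (k - 1 codes) removes the dependency.
S_k1 <- cov(dummies[, -3])
min_eig_k1 <- min(eigen(S_k1, symmetric = TRUE)$values)

print(c(all_k = min_eig_all, k_minus_1 = min_eig_k1))
```

A zero eigenvalue means the matrix is not positive definite, which is consistent with the error message quoted earlier in the thread.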

So go back to k - 1 dummy codes like you had before.  Those slopes are the group-(adjusted-)mean differences from the reference group. To calculate differences between any other pair of groups, you just subtract their slopes from each other.  Since they are both differences from the same reference group, the reference-group mean (i.e., the intercept) cancels out.   
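The subtraction described above can also be requested inside the model syntax, so that lavaan reports a standard error and z-test for the difference. The following is only a sketch on simulated data: the data frame, the shortened dummy names (fAsteraceae, fCyperaceae), and the effect sizes are assumptions for illustration, with Poaceae assumed to be the reference group.

```r
library(lavaan)

# Simulated stand-in data; names mimic the thread but all values are made up.
set.seed(123)
n <- 300
family <- factor(rep(c("Asteraceae", "Cyperaceae", "Poaceae"), each = n / 3))
df <- data.frame(
  pH_15       = rnorm(n),
  fAsteraceae = as.numeric(family == "Asteraceae"),
  fCyperaceae = as.numeric(family == "Cyperaceae")
)
df$PCoA1_S <- 0.5 * df$pH_15 + rnorm(n)
df$PCoA1_R <- -0.3 * df$PCoA1_S + 0.2 * df$fAsteraceae +
  0.6 * df$fCyperaceae + rnorm(n)

# k - 1 dummies with labelled slopes; Poaceae is the omitted reference group.
model <- '
  PCoA1_S ~ pH_15
  PCoA1_R ~ PCoA1_S + g1*fAsteraceae + g2*fCyperaceae
  # Asteraceae minus Cyperaceae: the reference-group mean cancels out
  diff12 := g1 - g2
'
fit <- sem(model, data = df)

# Slopes and the user-defined difference, with SEs and z-tests.
pe <- parameterEstimates(fit)
print(pe[pe$op %in% c("~", ":="),
         c("lhs", "op", "rhs", "label", "est", "se", "z", "pvalue")])
```

Here g1 and g2 are each a comparison with the reference group, and diff12 is the one remaining pairwise comparison, so all three group contrasts are available from the k - 1 model.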

Laurel Brigham

Jun 30, 2018, 5:25:38 PM
to lavaan
Okay, I'm back to K-1. I suppose even if I can find the path coefficient for the variable not included in the model, I wouldn't know if it was significant. Am I able to run a separate model with only the variable not included in the SEM with K-1? This would allow me to find out whether it was significant and get the standardized path coefficient. I'm not sure if this is a sound move, however.

Here's a sample of the output from the SEM. The model output doesn't include the slope, but does include the std. path coefficient (Std.all).

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  PCoA1_S ~                                                             
    pH_15             0.349    0.041    8.519    0.000    0.349    0.807
  PCoA1_R ~                                                             
    PCoA1_S          -0.330    0.161   -2.043    0.041   -0.330   -0.283
    hst_fmly.fAstr    0.023    0.065    0.347    0.729    0.023    0.048
    hst_fmly.fCypr    0.147    0.049    2.978    0.003    0.147    0.420

Laurel Brigham

Jul 1, 2018, 7:14:14 PM
to lavaan
I've come up with a solution: Compare the full model to a nested model without the dummy variables using the anova function in lavaan. This tells me whether the dummy variables add to the model. Thank you for your help.

Edward Rigdon

Jul 2, 2018, 8:51:53 AM
to lav...@googlegroups.com
Models with different numbers of observed variables, or different observed variables, are not nested. To obtain nested models, include the dummy variables but do not let them have effects on any other variables; only let them correlate among themselves. Then the model with the paths from the dummies to other variables fixed to 0 will be nested within another model where the same paths are free to be estimated.
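A rough sketch of that comparison in lavaan follows, again on simulated data with made-up names and values. It assumes lavaan's default handling of exogenous observed variables (fixed.x = TRUE), under which the dummies stay in both models and keep their sample covariances; only the two paths differ between the models.

```r
library(lavaan)

# Simulated stand-in data; names mimic the thread but all values are made up.
set.seed(123)
n <- 300
family <- factor(rep(c("Asteraceae", "Cyperaceae", "Poaceae"), each = n / 3))
df <- data.frame(
  pH_15       = rnorm(n),
  fAsteraceae = as.numeric(family == "Asteraceae"),
  fCyperaceae = as.numeric(family == "Cyperaceae")
)
df$PCoA1_S <- 0.5 * df$pH_15 + rnorm(n)
df$PCoA1_R <- -0.3 * df$PCoA1_S + 0.2 * df$fAsteraceae +
  0.6 * df$fCyperaceae + rnorm(n)

# Full model: dummy paths are freely estimated.
model_free <- '
  PCoA1_S ~ pH_15
  PCoA1_R ~ PCoA1_S + fAsteraceae + fCyperaceae
'
# Restricted model: same variables, dummy paths fixed to 0.
model_zero <- '
  PCoA1_S ~ pH_15
  PCoA1_R ~ PCoA1_S + 0*fAsteraceae + 0*fCyperaceae
'
fit_free <- sem(model_free, data = df)
fit_zero <- sem(model_zero, data = df)

# Likelihood-ratio (chi-square difference) test with 2 df:
# a joint test of the plant-family effect on PCoA1_R.
lrt <- anova(fit_zero, fit_free)
print(lrt)
```

Because both models are fit to the same set of observed variables, the chi-square difference is interpretable as an omnibus test of the two dummy paths.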
