Define new parameters that are not estimated but determined

Markus Jansen

Sep 22, 2022, 6:38:39 AM
to lavaan
Dear all,

I am trying to define new parameters in a lavaan model. For example, I estimate a parameter that is actually a difference between two loadings, and I would like to determine the two parameters it depends on (a minimal example, of course not necessarily identified):

fac1 =~ x + y + z

x == a + b
y == c + d
z == e + f

Of course, one parameter is fixed for identification. If I do it like this, I get an error that the non-fixed parameters are unknown, which makes sense. But how can I define new parameters on which later-defined constraints depend? I already found this: https://groups.google.com/g/lavaan/c/_-4zClO8oj8
where it is explained to define phantom parameters:
a =~ 0
a ~ NA*1 + label("a")*1 + .0?1

While that works for the estimation, I cannot compute factor scores, because the system is singular due to the "estimates" being determined.

I know, in Mplus this is achieved with

fac1 by x y z;

MODEL CONSTRAINT: NEW a b c d e f;
x = a + b;
y = c + d;
z = e + f;

But how can that be achieved in lavaan?

Thanks in advance!

Markus

Edward Rigdon

Sep 22, 2022, 8:50:35 AM
to lav...@googlegroups.com
The operator to define a new parameter is :=
You must first label parameters so that you can work with them:
f =~ a*x1 + b*x2 + c*x3
d := a + b
The lavaan tutorial (https://lavaan.ugent.be/tutorial/) explains this well.
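
A minimal runnable sketch of the := approach, assuming the HolzingerSwineford1939 data that ships with lavaan stands in for the real data. The factor name visual and the use of std.lv = TRUE are illustrative choices, not part of the question:

```r
library(lavaan)

# Label the loadings so they can be referenced by name.
model <- '
  visual =~ a*x1 + b*x2 + c*x3

  # d is a defined parameter: computed from a and b after estimation
  d := a + b
'

# std.lv = TRUE fixes the factor variance to 1, so all three loadings are free.
fit <- cfa(model, data = HolzingerSwineford1939, std.lv = TRUE)

# Defined parameters appear with op ":=" in the parameter table.
pe <- parameterEstimates(fit)
pe[pe$op == ":=", c("lhs", "op", "rhs", "est", "se")]
```

Here d is only a function of parameters that the model already estimates; it does not feed back into the model.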

Markus Jansen

Sep 22, 2022, 9:50:58 AM
to lavaan
Thank you. The problem is the other way around; that is, I have named the parameters

fac1 =~ x*x1 + y*x2 + z*x3

but I don't want a new parameter (d in your example) to depend on x, y, and z. I need x, y, and z to depend on the new parameters (a to f). Taking your example

x := a + b
y := c + d
z := e + f

This leads to yet undefined parameters...

Edward Rigdon

Sep 22, 2022, 3:51:21 PM
to lav...@googlegroups.com
Yes, given that the new values are themselves undefined.
Where do those other variables get their values?


Markus Jansen

Sep 22, 2022, 4:14:38 PM
to lavaan
The new values (in my example a to f) are the solutions of the equations. For example if I have

fac1 =~ x*x1 + y*x2 + z*x3

x := a + b
y := b + c
z := c + d

If I fix one parameter to 1, these are just solvable equations, since x, y, and z were estimated. For example, let x = .25, y = .40, z = .50, and a = 1; then

x =  1 + (-.75)
y = -.75 + 1.15
z = 1.15 + (-.65)
so
b = -.75
c = 1.15
d = -.65

But lavaan does not seem to be able to work with this at the moment. Of course, in this simple example I could just change the order; however, this is not always possible. And it is not always the case that one of the parameters is fixed (the model is still identified). If I define the parameters like this

a =~ 0
a ~ NA*1 + label("a")*1 + .0?1

b =~ 0
b ~ NA*1 + label("b")*1 + .0?1
...

I can estimate all model parameters, and get the solutions I would expect. So this works. However, the parameters a to f are treated as variables in a way, that factor scores cannot be estimated because the system is singular. On the other hand factor scores can be estimated, if I just define (:=) the parameters to the known values, however, the values are generally not known.
So am I right that lavaan is currently not capable of doing this?

Alex Schoemann

Sep 23, 2022, 9:35:00 AM
to lavaan
I think some of the confusion stems from your use of :=. With that operator, the new parameter name should be on the left side. I'm not sure exactly what happens when you use labels from the model on the left side (but I'm guessing it doesn't help). So you would need to express everything as a function of the new parameters. You can also specify a fixed value for a new parameter using :=. In your case, assuming a is fixed to 1, I think something like this will work:

a := 1
b := x - a
c := y - b
d := z - c

Markus Jansen

Sep 23, 2022, 9:40:28 AM
to lavaan
Thank you. I know that with := the new parameter name should be on the left side. Sadly, it is not always the case that one of the new parameters is fixed (the system is still solvable within the model). I will try to find a solution by rearranging everything, and hope for the best.

Thank you all for your help!

Terrence Jorgensen

Oct 8, 2022, 3:54:28 AM
to lavaan
In your original post, you were trying to use a parameter operator on variables (x, y, and z) rather than parameters (in both lavaan and Mplus example syntax).  Then you updated your syntax so that x, y, and z were parameters:

fac1 =~ x*x1 + y*x2 + z*x3

x := a + b
y := b + c
z := c + d
... 
I know, in Mplus this is achieved with

fac1 by x y z;

MODEL CONSTRAINT: NEW a b c d e f;
x = a + b;
y = c + d;
z = e + f;
 

However, later in that same post, you still conflated parameters and variables:

If I define the parameters like this

a =~ 0
a ~ NA*1 + label("a")*1 + .0?1

b =~ 0
b ~ NA*1 + label("b")*1 + .0?1

You used "a" (and "b") as both a phantom-factor's name (i.e., a variable) and as the name of that variable's mean.  That is easy enough to distinguish by using different arbitrary names for the phantom constructs:

foo =~ 0
foo ~ NA*1 + label("a")*1 + .0?1


I'm being pedantically specific because I want to make sure I understand what you are actually trying to do.  Later in that same post, you referred to wanting factor-score estimates (from lavPredict()?), but I'm not sure you were using the term to mean subject-level variables because you were referring to them being estimated when you fit the model (which does not happen):

the parameters a to f are treated as variables in a way, that factor scores cannot be estimated because the system is singular. On the other hand factor scores can be estimated, if I just define (:=) the parameters to the known values, however, the values are generally not known.

This makes it sound like you are just referring to the model parameters (e.g., the mean and variance of the factor scores) rather than factor scores.  Is that right?  If you are looking for casewise estimates of what (e.g.) subject 1's value of "a" is, that is not possible in lavaan.

Assuming you are trying to do what you posted using Mplus

MODEL CONSTRAINT: NEW a b c d e f;
x = a + b;
y = c + d;
z = e + f;

That is NOT equivalent to using the := operator, which defines a NEW parameter.  The model constraint syntax in lavaan uses the same logical operators as R, e.g., set 2 parameters to equality using ==

x == a + b
y == b + c
z == c + d
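
To make the distinction between == and := concrete, here is a hedged sketch using the HolzingerSwineford1939 data that ships with lavaan. The labels l1 to l3 and the name gap are hypothetical, chosen only for this illustration:

```r
library(lavaan)

model <- '
  visual =~ l1*x1 + l2*x2 + l3*x3 + x4

  # Equality constraint: part of the model, restricts the estimates.
  l2 == l3

  # Defined parameter: restricts nothing, computed from the estimates.
  gap := l1 - l3
'

fit <- cfa(model, data = HolzingerSwineford1939, std.lv = TRUE)

# Loadings and the defined parameter side by side.
pe <- parameterEstimates(fit)
pe[pe$op %in% c("=~", ":="), c("lhs", "op", "rhs", "est", "se")]
```

The loadings of x2 and x3 should come out equal because of the == line, whereas gap is simply reported alongside the other estimates.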
 
So am I right that lavaan is currently not capable of doing this?

I don't know if it would work with your data, but in principle it should work the same way as I showed in the post you linked to when you started this thread.  Here's the 2 important distinctions:
  • lavaan's analog to the Mplus NEW operator is := only in the case that it is a function of estimated parameters; otherwise, the analog to NEW is to create a phantom construct that is unconnected to anything (with variance == 1) and label its mean, to be estimated subject to your constraints.
  • And whereas Mplus requires an explicit MODEL CONSTRAINT: command for that portion of model syntax, lavaan just recognizes any such equality (==) or inequality (!=, >, <, >=, or <=) constraints in your model syntax.
john =~ 0
john ~~ 1*john
john ~ NA*1 + label("a")*1 + .0?1

paul =~ 0
paul ~~ 1*paul
paul ~ NA*1 + label("b")*1 + .0?1

george =~ 0
george ~~1*george
george ~ NA*1 + label("c")*1 + .0?1

Sir_Richard_Starkey =~ 0
Sir_Richard_Starkey ~~ 1*Sir_Richard_Starkey
Sir_Richard_Starkey ~ NA*1 + label("d")*1 + .0?1

## don't forget orthogonality constraints, or simply set 
##     lavaan(..., orthogonal = TRUE)
## and explicitly estimate factor covariances of interest
## in the model syntax, if there are any

fac1 =~ x*x1 + y*x2 + z*x3

x == a + b
y ==     b + c
z ==         c + d

In principle, this should "work", but your model is not identified because you are trying to estimate 3 loadings as a function of 4 parameters, giving you df = −1.  The problem would be worse in your original post (a-f), where the first factor loading was the sum of a and b without any other constraints on a or b (thus, an infinite number of pairs could provide identical solutions), even if you had nonnegative df.  For example, the syntax above wouldn't be an identified model even if you had a 4th indicator (giving you df = 1):

HS.model <- ' visual  =~ x*x1 + y*x2 + z*x3 + x4

john =~ 0
john ~~ 1*john
john ~ NA*1 + label("a")*1 + .0?1

paul =~ 0
paul ~~ 1*paul
paul ~ NA*1 + label("b")*1 + .0?1

george =~ 0
george ~~1*george
george ~ NA*1 + label("c")*1 + .0?1

Sir_Richard_Starkey =~ 0
Sir_Richard_Starkey ~~ 1*Sir_Richard_Starkey
Sir_Richard_Starkey ~ NA*1 + label("d")*1 + .0?1


x == a + b
y ==     b + c
z ==         c + d
'

fit <- cfa(HS.model, data = HolzingerSwineford1939, std.lv = TRUE)

The summary() shows your constraints are met, but the warning message says its vcov() is singular.  

  visual =~                                                
    x1         (x)          0.912    0.088   10.312    0.000
    x2         (y)          0.498    0.079    6.310    0.000
    x3         (z)          0.642    0.078    8.230    0.000
    x4                      0.493    0.078    6.307    0.000

 
   .john       (a)    0.595    0.078    7.658    0.000    0.595    0.595
   .paul       (b)    0.317    0.047    6.719    0.000    0.317    0.317
   .george     (c)    0.182    0.054    3.348    0.001    0.182    0.182
   .Sr_Rchrd_S (d)    0.460    0.067    6.904    0.000    0.460    0.460


Running the model again with different starting values gives you the same factor-loading estimates, but completely different values of a, b, c, and d to achieve those loadings.

john ~ NA*1 + label("a")*1 + .5?1
paul ~ NA*1 + label("b")*1 + -.5?1
george ~ NA*1 + label("c")*1 + 13?1
Sir_Richard_Starkey ~ NA*1 + label("d")*1 + 0?1


Yields:

visual =~                  (same)                              
    x1         (x)          0.912    0.088   10.312    0.000
    x2         (y)          0.498    0.079    6.310    0.000
    x3         (z)          0.642    0.078    8.230    0.000
    x4                      0.493    0.078    6.307    0.000


              (different)
   .john       (a)    2.386    0.078   30.696    0.000
   .paul       (b)   -1.474    0.047  -31.278    0.000
   .george     (c)    1.972    0.054   36.318    0.000
   .Sr_Rchrd_S (d)   -1.330    0.067  -19.948    0.000
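
Why the two runs can differ is visible in the constraint equations themselves. The following base-R sketch needs no lavaan; the matrix A and the vector names are introduced here only for illustration, and the numbers are the rounded estimates printed above:

```r
# Rows encode x = a + b, y = b + c, z = c + d.
A <- matrix(c(1, 1, 0, 0,
              0, 1, 1, 0,
              0, 0, 1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("x", "y", "z"), c("a", "b", "c", "d")))

# Three equations in four unknowns: the rank cannot exceed 3.
qr(A)$rank

# Rounded estimates of a to d from the two runs.
run1 <- c(a = 0.595, b = 0.317, c = 0.182, d = 0.460)
run2 <- c(a = 2.386, b = -1.474, c = 1.972, d = -1.330)

# Both sets imply the same loadings, up to rounding in the printed output.
implied <- cbind(run1 = drop(A %*% run1), run2 = drop(A %*% run2))
print(implied)

# Adding any multiple of (1, -1, 1, -1) to a-d leaves the loadings unchanged.
null_dir <- c(1, -1, 1, -1)
stopifnot(all(A %*% null_dir == 0))
```

The two runs differ by roughly 1.79 times that direction, which is exactly the kind of change the constraints cannot detect.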


Maybe you posted an oversimplified example to ask about the syntax, but your real model has more moving parts with sufficient constraints.  I'd recommend empirically checking identification like this, if you get your syntax working.

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Markus Jansen

Oct 8, 2022, 6:02:10 AM
to lavaan
Thank you for your reply. It took me some time to understand the difference between the labeled parameters and constraint parameters, that are newly defined. The example is oversimplified, but the real cases are all identified. Model identification is not a problem in this case.

Now, my solution is simply to define the new parameters in the order they are needed (they must be defined; the phantom variables help with the model definition, but not with the factor scores per individual). If before I had

x == a + b
y == b + c
z == c + d

I now have (in this example a := 1)

b := x - a
c := y - b
d := z - c

The only problem is/was that this changes for each specified model, and the order seems to matter.

That way, I am able to define the parameters and get correct model estimation results and factor scores per individual.