NA in parameterEstimates output


Jill Jacobson
Feb 14, 2023, 11:36:25 PM
to lav...@googlegroups.com
I am doing a mediation analysis with percentile bootstrap confidence intervals (syntax below). I ran the same model last year, and the parameterEstimates() output had values for se, z, and p for the defined parameters (ab and total). This year I get values only for the bootstrap confidence intervals (pasted below). I tried a different model and again got NA for se, z, and p, though I do get the confidence interval boundaries (output also below). I'm using the confidence intervals to determine significance, so I don't need the p value, but I wondered what causes lavaan to put NA in the output and why this changed this year. I have been Googling for a couple of hours and haven't found an answer yet.

library(lavaan)
#Simple mediation model
med.model.boot <- '#direct effect and b path
   relatdet ~ c*morality + b*identity
   #mediator a path
   identity ~ a*morality
   # indirect effect (a*b)
   ab := a*b
   #total effect
   total := c + (a*b)'
fit.med.model.boot <- sem(med.model.boot, data = Lab5, se = "bootstrap", bootstrap = 5000)
parameterEstimates(fit.med.model.boot, boot.ci.type = "perc")
summary(fit.med.model.boot, fit.measures = TRUE)
> parameterEstimates(fit.med.model.boot, boot.ci.type = "perc")
       lhs op      rhs label    est    se      z pvalue ci.lower ci.upper
1 relatdet  ~ morality     c -0.831 0.107 -7.788      0   -1.044   -0.625
2 relatdet  ~ identity     b -0.672 0.147 -4.575      0   -0.958   -0.376
3 identity  ~ morality     a  0.374 0.041  9.050      0    0.294    0.460
4 relatdet ~~ relatdet        0.940 0.073 12.798      0    0.791    1.079
5 identity ~~ identity        0.211 0.019 11.154      0    0.175    0.251
6 morality ~~ morality        0.480 0.000     NA     NA    0.480    0.480
7       ab :=      a*b    ab -0.252    NA     NA     NA   -0.370   -0.140
8    total :=  c+(a*b) total -1.083    NA     NA     NA   -1.259   -0.910

Other model output with same issue
> parameterEstimates(fit.med.model.boot, boot.ci.type = "perc")
         lhs op        rhs label    est    se      z pvalue ci.lower ci.upper
1 Depression  ~  NegEvents     c  0.369 0.073  5.045      0    0.228    0.518
2 Depression  ~ Rumination     b  0.737 0.066 11.094      0    0.601    0.865
3 Rumination  ~  NegEvents     a  0.252 0.064  3.950      0    0.131    0.382
4 Depression ~~ Depression       68.358 8.044  8.498      0   53.600   85.258
5 Rumination ~~ Rumination       58.637 4.306 13.619      0   50.262   66.920
6  NegEvents ~~  NegEvents       52.121 0.000     NA     NA   52.121   52.121
7         ab :=        a*b    ab  0.186    NA     NA     NA    0.093    0.292
8      total :=    c+(a*b) total  0.555    NA     NA     NA    0.394    0.721

Shu Fai Cheung (張樹輝)
Feb 15, 2023, 12:03:35 AM
to lavaan
Which version of lavaan are you using? How about running the example from the lavaan website and seeing whether the problem happens there too?


In this example, fit the model and get the CIs as in your case:

fit <- sem(model, data = Data, se = "bootstrap", bootstrap = 5000)
parameterEstimates(fit, boot.ci.type = "perc")

I tried this on my computer, and it shows the SEs, z values, and p values for ab and total:

> fit <- sem(model, data = Data, se = "bootstrap", bootstrap = 5000)
> parameterEstimates(fit, boot.ci.type = "perc")

    lhs op     rhs label   est    se     z pvalue ci.lower ci.upper
1     Y  ~       X     c 0.036 0.113 0.321  0.748   -0.182    0.273
2     M  ~       X     a 0.474 0.098 4.835  0.000    0.283    0.665
3     Y  ~       M     b 0.788 0.089 8.856  0.000    0.601    0.950
4     Y ~~       Y       0.898 0.147 6.117  0.000    0.605    1.172
5     M ~~       M       1.054 0.166 6.369  0.000    0.728    1.380
6     X ~~       X       0.999 0.000    NA     NA    0.999    0.999
7    ab :=     a*b    ab 0.374 0.085 4.398  0.000    0.213    0.545
8 total := c+(a*b) total 0.410 0.137 2.985  0.003    0.147    0.691

If the same problem happens on your computer, then it may be due to your version of lavaan, other packages in your R installation, the OS, etc.

If the problem does not happen, then it may be due to the dataset, or a combination of the dataset and other factors.

-- Shu Fai

Jill Jacobson
Feb 16, 2023, 8:12:02 PM
to lavaan
Thanks, Shu Fai. It also was happening for a colleague with yet another data set on his computer (his is a PC; mine was a Mac laptop). The next day I tried what I posted on my iMac, and the full reporting was there. My students do their assignment tomorrow, so we'll see what happens on a variety of OSes.
Jill

Shu Fai Cheung (張樹輝)
Feb 16, 2023, 10:12:36 PM
to lavaan
Hi Jill,

Did you get any warning message when doing the bootstrapping? I created an artificial case that can reproduce a similar problem:

# Adapted from https://lavaan.ugent.be/tutorial/mediation.html
library(lavaan)
#> This is lavaan 0.6-14
#> lavaan is FREE software! Please report any bugs.
set.seed(1235)
n <- 50
sigma <- lav_matrix_lower2full(c(1.00,
                                  .99, 1.00,
                                  .99,  .99, 1.00), diagonal = TRUE)
Data <- data.frame(MASS::mvrnorm(n, c(X = 0, M = 0, Y = 0), sigma, empirical = TRUE))
cov(Data)
#>      X    M    Y
#> X 1.00 0.99 0.99
#> M 0.99 1.00 0.99
#> Y 0.99 0.99 1.00
cor(Data)
#>      X    M    Y
#> X 1.00 0.99 0.99
#> M 0.99 1.00 0.99
#> Y 0.99 0.99 1.00
model <- ' # direct effect
             Y ~ c*X
           # mediator
             M ~ a*X
             Y ~ b*M

           # indirect effect (a*b)
             ab := a*b
           # total effect

             total := c + (a*b)
         '
fit <- sem(model, data = Data, se = "bootstrap", bootstrap = 100, iseed = 9)
#> Warning in lav_model_nvcov_bootstrap(lavmodel = lavmodel, lavsamplestats =
#> lavsamplestats, : lavaan WARNING: 1 bootstrap runs failed or did not converge.

parameterEstimates(fit, boot.ci.type = "perc")
#>     lhs op     rhs label   est    se      z pvalue ci.lower ci.upper
#> 1     Y  ~       X     c 0.497 0.092  5.398      0    0.342    0.730
#> 2     M  ~       X     a 0.990 0.015 65.506      0    0.957    1.017
#> 3     Y  ~       M     b 0.497 0.097  5.144      0    0.259    0.672
#> 4     Y ~~       Y       0.015 0.003  4.618      0    0.008    0.020
#> 5     M ~~       M       0.020 0.004  4.904      0    0.012    0.026
#> 6     X ~~       X       0.980 0.000     NA     NA    0.980    0.980
#> 7    ab :=     a*b    ab 0.493    NA     NA     NA    0.257    0.655
#> 8 total := c+(a*b) total 0.990    NA     NA     NA    0.963    1.020

I know the source of the problem in this artificial case. At least one bootstrap run had a problem, and this affects the computation of the VCOV and hence the SEs for ab and total. However, the CIs are computed differently and so are not affected.

Your case may be different, but this may give some hints on what happened.

How about setting the seed for the random number generator (iseed) when calling sem(), so that the bootstrapping results are reproducible? If the source of the problem is similar, you should see the NA problem only for some values of iseed. If this is due to the random sampling in bootstrapping, then that would also explain why the NA problem occurred only in some runs. The apparent randomness of the problem may not be due to computers or OSes, but to the random sampling.

-- Shu Fai

Jill Jacobson
Feb 17, 2023, 11:58:17 AM
to lav...@googlegroups.com
Thanks, Shu Fai. It definitely does seem to be related to getting the warning that some of the bootstrap runs failed or did not converge. Every one of my students (N = 27) today had this problem. They all had just downloaded lavaan, so I think we can rule out the lavaan version as the source of the problem. The R versions might have varied, but I know my version is up to date on both computers. I gathered OS information: most students were using macOS Monterey (12.4-12.6.3), but students using Ventura (13.2), which is the newest, or Big Sur (11.6), which is older, also had the problem. In addition, for students using PCs, the problem emerged on Windows 10 and 11. That's a wide range of OSes, so I don't think the OS can be the source. I thought perhaps it was occurring only on laptops, which often have less memory, because I didn't have the problem anymore when I switched from my Mac laptop to my iMac, but a student also had it happen on the PC desktop in the lab room.

One student discovered that it didn't happen when she used the lavaan() function with auto.var = TRUE instead of the sem() syntax listed in my original email, but that was true only when she dropped the se = "bootstrap" argument. As long as students had the se = "bootstrap" argument, the NAs appeared with either lavaan() or sem(). Can you tell me what difference that makes? What would lavaan be using for the se if there's no se = "bootstrap", but the bootstrap = (number of samples) argument is included, and in parameterEstimates() we include the boot.ci.type argument? I wasn't sure whether the confidence intervals being reported would be from the percentile bootstrap or calculated without bootstrapping. I also wasn't sure whether the z and p values would change if the se weren't coming from the bootstrap samples.

Thanks for your help,
Jill


Shu Fai Cheung (張樹輝)
Feb 17, 2023, 8:22:12 PM
to lavaan
I will try to explain what *might* have happened, based on my limited understanding. Other members, please correct me if I am wrong.

1) Are the CIs bootstrap CIs?

We can rule out the possibility that the CIs are normal-theory CIs based on the SE by checking whether the CIs are symmetric about the point estimates.


parameterEstimates(fit, boot.ci.type = "perc")
#>     lhs op     rhs label   est    se      z pvalue ci.lower ci.upper
#> 1     Y  ~       X     c 0.497 0.092  5.398      0    0.342    0.730
#> 2     M  ~       X     a 0.990 0.015 65.506      0    0.957    1.017
#> 3     Y  ~       M     b 0.497 0.097  5.144      0    0.259    0.672
#> 4     Y ~~       Y       0.015 0.003  4.618      0    0.008    0.020
#> 5     M ~~       M       0.020 0.004  4.904      0    0.012    0.026
#> 6     X ~~       X       0.980 0.000     NA     NA    0.980    0.980
#> 7    ab :=     a*b    ab 0.493    NA     NA     NA    0.257    0.655
#> 8 total := c+(a*b) total 0.990    NA     NA     NA    0.963    1.020

If the CIs are based on the SE, then they must be symmetric. That is, their distances from the point estimates (est) are the same. If they are not, and you set se = "bootstrap" and ask for a boot.ci.type, they should be bootstrap CIs.

For my example:

est <- parameterEstimates(fit, boot.ci.type = "perc")
abs(est$ci.lower - est$est)
#> [1] 0.155503344 0.033346802 0.238254959 0.006786193 0.007784402 0.000000000
#> [7] 0.235291385 0.027343440
abs(est$ci.upper - est$est)
#> [1] 0.232163895 0.027023086 0.174776320 0.005492289 0.006801941 0.000000000
#> [7] 0.162719272 0.029812472

They are not symmetric about the point estimates. They are bootstrap CIs, as expected.

If I did not ask for bootstrapping, this is the result:

fit <- sem(model, data = Data)

parameterEstimates(fit, boot.ci.type = "perc")
#>     lhs op     rhs label   est    se      z pvalue ci.lower ci.upper
#> 1     Y  ~       X     c 0.497 0.123  4.055      0    0.257    0.738
#> 2     M  ~       X     a 0.990 0.020 49.624      0    0.951    1.029
#> 3     Y  ~       M     b 0.497 0.123  4.055      0    0.257    0.738
#> 4     Y ~~       Y       0.015 0.003  5.000      0    0.009    0.020
#> 5     M ~~       M       0.020 0.004  5.000      0    0.012    0.027
#> 6     X ~~       X       0.980 0.000     NA     NA    0.980    0.980
#> 7    ab :=     a*b    ab 0.493 0.122  4.042      0    0.254    0.731
#> 8 total := c+(a*b) total 0.990 0.020 49.624      0    0.951    1.029
est <- parameterEstimates(fit, boot.ci.type = "perc")
abs(est$ci.lower - est$est)
#> [1] 0.240446325 0.039101159 0.240446325 0.005752642 0.007644644 0.000000000
#> [7] 0.238835344 0.039101159
abs(est$ci.upper - est$est)
#> [1] 0.240446325 0.039101159 0.240446325 0.005752642 0.007644644 0.000000000
#> [7] 0.238835344 0.039101159

They are not symmetric about the point estimates. They are not bootstrap CIs.

2) Why NAs for SE and z but bootstrap CIs available?

My guess is that, when computing bootstrap SEs for user-defined parameters, NAs are not removed. Therefore, if NA is present in any of the bootstrap samples, lavaan returns NA for the SE, and hence also for z and pvalue.

However, NAs are removed when forming percentile bootstrap CIs. A percentile bootstrap CI does not need the SE, so percentile bootstrap CIs can be returned even when the SEs are NA.
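A minimal sketch of this difference, using a made-up vector of bootstrap estimates of ab (the numbers are hypothetical, not from the models above):

```r
# Hypothetical bootstrap estimates of ab; one bootstrap run failed and gave NA
ab_boot <- c(0.21, 0.25, NA, 0.19, 0.30, 0.24)

# If NAs are not removed, the bootstrap SE is NA,
# and so are the z and p values derived from it
sd(ab_boot)  # NA

# The percentile CI limits can still be formed after dropping the NA
quantile(ab_boot, c(0.025, 0.975), na.rm = TRUE)
```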

This usually is not an issue. If we prefer percentile bootstrap CIs to z-based CIs, as for indirect effects, then it means that we allow for asymmetry in the sampling distribution, and the SEs and p values are not so useful. Actually, in the case of indirect effects, the p values based on bootstrap SEs should not be interpreted, in my opinion, if we prefer percentile (or BCa) bootstrap CIs.

3) How to check whether NAs in SEs are really due to NAs in bootstrap samples?

We can extract the bootstrap estimates using lavInspect(), asking for "boot":

fit_boot <- lavInspect(fit, "boot")

Two attributes, "error.idx" and "nonadmissible", are relevant. "error.idx" stores the indices of the bootstrap samples that encountered an error for whatever reason.

In my example, the estimates are NA in one of the bootstrap samples (the 12th):

(error.idx <- attr(fit_boot, "error.idx"))
#> [1] 12
(nonad <- attr(fit_boot, "nonadmissible"))
#> integer(0)

These are the estimates in the 10th to 14th bootstrap samples:

fit_boot[sort(c(error.idx - 1:2, error.idx, error.idx + 1:2)), ]
#>              c         a         b       Y~~Y       M~~M
#> [1,] 0.6305563 0.9740943 0.3755701 0.01131400 0.02023379
#> [2,] 0.4621861 0.9868740 0.5601131 0.01203450 0.01634239
#> [3,]        NA        NA        NA         NA         NA
#> [4,] 0.3801079 0.9639834 0.6537283 0.01060975 0.01427636
#> [5,] 0.4485607 0.9834801 0.5577716 0.00878299 0.01543722

If you see NAs in SEs, you can check whether there are NAs in some bootstrap samples. If yes, then this may explain why you see NAs in the SEs.

Note that these NAs in bootstrap samples may be "natural" and "normal." Bootstrapping involves random sampling. Therefore, even if a model has no problem in estimation when fitted to the full sample, problems can occur in some random bootstrap samples.

4) Setting iseed when asking for bootstrapping CIs.

When doing bootstrapping, regardless of the program used (PROCESS, lavaan, etc.), I usually set the seed for the random number generator. In lavaan, this can be done using iseed (note that iseed does not work in some previous versions, so make sure you use the latest version of lavaan):

fit <- sem(model, data = Data, se = "bootstrap", bootstrap = 5000, iseed = 9)

This allows us to produce the same set of bootstrap samples across platforms, as long as the random number generator is the same.

In your case, you can try setting iseed to some numbers and doing the analysis. If the NA problem occurs when iseed is set to a particular number, the same problem should also occur on other platforms. On the other hand, if the NA problem does not occur for a number, then you should see no NAs on other platforms either when you set iseed to that same number.
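For example, one could count the failed bootstrap runs for a few seeds (a sketch, assuming the model and Data objects from the artificial example above; the seed values are arbitrary):

```r
library(lavaan)

# Count failed bootstrap runs for a few arbitrary seeds
for (s in c(1, 9, 1234)) {
  fit_s <- sem(model, data = Data, se = "bootstrap", bootstrap = 100, iseed = s)
  n_failed <- length(attr(lavInspect(fit_s, "boot"), "error.idx"))
  cat("iseed =", s, "failed runs:", n_failed, "\n")
}
```

If the NA problem tracks the seed, then both the counts and the NAs in the output should replicate across machines for the same iseed.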

Hope this helps.

-- Shu Fai

Shu Fai Cheung (張樹輝)
Feb 17, 2023, 8:24:00 PM
to lavaan
Correction:

> abs(est$ci.lower - est$est)
> #> [1] 0.240446325 0.039101159 0.240446325 0.005752642 0.007644644 0.000000000
> #> [7] 0.238835344 0.039101159
> abs(est$ci.upper - est$est)
> #> [1] 0.240446325 0.039101159 0.240446325 0.005752642 0.007644644 0.000000000
> #> [7] 0.238835344 0.039101159
>
> They are not symmetric about the point estimates. They are not bootstrap CIs.
>
> 2) Why NAs for SE and z but bootstrap CIs available?

Sorry. I meant "They are symmetric about the point estimates. They are not bootstrap CIs."

-- Shu Fai

Shu Fai Cheung (張樹輝)
Feb 17, 2023, 8:30:04 PM
to lavaan
Correction again. I was not precise enough in my previous message. Sorry.

Percentile bootstrap CIs can be nearly symmetric about the point estimates, though they are generally not *exactly* symmetric. CIs based on SEs are exactly symmetric about the point estimates.

And I meant "percentile bootstrap CIs" or similar CIs (e.g., BCa CIs) when I wrote just "bootstrap CIs". In my area, we rarely use bootstrap SEs to form symmetric CIs, though it can be done.
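As a quick numeric illustration of the distinction (the est and se values here are made up, not from the models above): a CI based on an SE is symmetric by construction, whereas a percentile CI simply takes quantiles of the bootstrap estimates:

```r
est <- 0.493
se  <- 0.122

# Normal-theory (Wald) CI: exactly symmetric about est by construction
wald_ci <- est + c(-1, 1) * qnorm(0.975) * se
wald_ci - est  # same distance on both sides

# Percentile CI: quantiles of the bootstrap estimates;
# typically close to, but not exactly, symmetric about est
set.seed(1)
boot_est <- rnorm(1000, mean = est, sd = se)  # stand-in for real bootstrap draws
quantile(boot_est, c(0.025, 0.975))
```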

-- Shu Fai