Missing data and WLSMV


Ulrich Schroeders

Feb 16, 2013, 5:32:40 AM
to lav...@googlegroups.com
Dear Yves,

When fitting a measurement model with the WLSMV estimator, cases with missing values are deleted listwise. I read the thread "missing data - ordinal variables" (started by Fabio Sierra on Oct 1st, 2012) and wonder whether there is another option besides multiple imputation, with its problems of aggregating fit statistics.
In Mplus, all available information is used, that is, pairwise present data (which they confusingly call FIML). Is there a workaround to achieve the same in lavaan? For instance, specifying a WLS.V matrix with values from a second model run on the raw data (WLS.V = fit@SampleStats@WLS.V)? Would this take missing data into account correctly?

Kind regards,
Ulrich
PS: Keep up the good work!

yrosseel

Feb 17, 2013, 5:34:25 AM
to lav...@googlegroups.com
On 02/16/2013 11:32 AM, Ulrich Schroeders wrote:
> Dear Yves,
>
> When fitting a measurement model with the WLSMV estimator, data
> containing missing values is listwise deleted.

True indeed.

> ("missing data - ordinal variables", started by Fabio Sierra on Oct,
> 1st, 2012) and wonder if there is a second option, besides multiple
> imputation and the problems of aggregating fit statistics?

Not in the current version.

> In Mplus, all available information is used, that is, pairwise present
> data (which they call confusingly FIML).

True. A document explaining what Mplus is doing can be found here:

www.statmodel.com/download/GstrucMissingRevision.pdf

> achieve the same in lavaan? For instance, specifying a WLS.V matrix with
> values from a second model running with the raw data (WLS.V =
> fit@SampleStats@WLS.V)? Is this a method for taking into account missing
> data correctly?

No. What we need is a missing="pairwise" option, which can be used with
the WLS estimator. But this will result in a different WLS.V matrix.

It is on my TODO list.

Yves.

Ulrich Schroeders

Feb 17, 2013, 6:23:34 AM
to lav...@googlegroups.com
Dear Yves,

thanks for the quick reply and for pointing out the paper.
Can you estimate when you'll find the time to implement a missing="pairwise" option? :-)

Thanks in advance, kind regards,
Ulrich

yrosseel

Feb 17, 2013, 6:26:13 AM
to lav...@googlegroups.com
On 02/17/2013 12:23 PM, Ulrich Schroeders wrote:
> Dear Yves,
>
> thanks for the quick reply and for pointing out the paper.
> Can you estimate when you'll find the time to implement a
> missing="pairwise" option? :-)

Since this falls within the 'categorical part' of lavaan, it has
priority. It is currently on the TODO list for 0.5-13.

Yves.

Ulrich Schroeders

Feb 17, 2013, 6:32:06 AM
to lav...@googlegroups.com
Great, love to hear that! :-)
Thanks in advance, kind regards, Ulrich

Ulrich Schroeders

May 17, 2013, 3:32:18 PM
to lav...@googlegroups.com
Dear Yves,

I saw that the new version 0.5-13 was released last week.
Unfortunately, the pairwise present data option for categorical data hasn't made it into the final release, or have I missed some option?
Sorry to bother you again with this issue, but can you make another guess as to when the functionality will be added? ;-)

Thanks again, kind regards,
Ulrich

yrosseel

May 26, 2013, 1:34:54 PM
to lav...@googlegroups.com
On 05/17/2013 09:32 PM, Ulrich Schroeders wrote:
> Dear Yves,
>
> I saw that the new version 0.5-13 has been released last week.
> Unfortunately, the pairwise present data option for categorical data
> hasn't made it in the final release or have I been missing some option?
> Sorry, to bother you again with this issue. But can you make another
> guess when the functionality will be added? ;-)

I would hope that 0.5-14 will have this feature, but can not make any
promises.

Yves.

Yves Rosseel

Jul 21, 2013, 8:41:52 AM
to lav...@googlegroups.com
In version 0.5-14 (just released), you can add missing="pairwise" to deal with missing data in the categorical/WLSMV case.

Yves.
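
A minimal sketch of the option announced above, assuming a reasonably recent lavaan is installed. The data are simulated, and the item names y1 to y6 and the factor name f are made up for illustration; only missing="pairwise" in combination with WLSMV comes from the post. It compares the number of cases used under the default (listwise deletion) with the number used under pairwise present data.

```r
library(lavaan)

set.seed(1)
n   <- 500
eta <- rnorm(n)  # hypothetical latent factor scores

# Six four-category ordinal items loading on one factor (simulated)
items <- sapply(1:6, function(j) {
  y <- 0.7 * eta + rnorm(n, sd = sqrt(1 - 0.7^2))
  as.integer(cut(y, breaks = c(-Inf, -0.8, 0, 0.8, Inf)))
})
colnames(items) <- paste0("y", 1:6)
dat <- as.data.frame(items)

# Delete 10% of the values in each item completely at random
for (v in names(dat)) dat[sample(n, 50), v] <- NA

model <- 'f =~ y1 + y2 + y3 + y4 + y5 + y6'

# Default: cases with any missing value are dropped (listwise deletion)
fit_lw <- cfa(model, data = dat, ordered = names(dat),
              estimator = "WLSMV")

# Pairwise: each threshold/polychoric correlation uses all available cases
fit_pw <- cfa(model, data = dat, ordered = names(dat),
              estimator = "WLSMV", missing = "pairwise")

nobs_lw <- lavInspect(fit_lw, "nobs")
nobs_pw <- lavInspect(fit_pw, "nobs")
print(c(listwise = nobs_lw, pairwise = nobs_pw))
```

With this amount of missingness the listwise fit discards a large share of the cases, while the pairwise fit keeps every case that has at least one observed item.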

Ulrich Schroeders

Jul 26, 2013, 3:07:24 PM
to lav...@googlegroups.com
Dear Yves,

thank you very much for implementing this new feature! :-)
However, I got an error which, translated, reads something like
"error in if (fx < 0) fx <- 0 : missing value, where TRUE/FALSE is needed".

Do you have any suggestions as to what I'm doing wrong?

Thanks in advance, kind regards,
Ulrich

--
Dr. Ulrich Schroeders
Institute for Educational Quality Improvement
Unter den Linden 6, 10099 Berlin, Germany
url  http://www.iqb.hu-berlin.de  | Skype  ulrich.schroeders



yrosseel

Jul 27, 2013, 3:23:07 AM
to lav...@googlegroups.com
On 07/26/2013 09:07 PM, Ulrich Schroeders wrote:
> Dear Yves,
>
> thank you very much for implementing this new feature! :-)
> However, I got an error; translated something like
> "error in if (fx < 0) fx <- 0 : missing value, where TRUE/FALSE is needed".
>
> Do you have any suggestions what I'm making wrong?

Hm. Hard to say. Would you be able to send me your Rscript and a snippet
of your data (just enough to replicate this)?

Yves.

Franziska Zuber

Aug 20, 2013, 8:58:54 AM
to lav...@googlegroups.com
Dear Yves,

I was just using the new feature of missing=pairwise with estimator=WLSMVS in a CFA model, and I know how to extract the sample covariance matrix used by lavaan to fit the model using the inspect(fit, "sampstat") command.
Is there a way to also obtain "covariance coverage" output (i.e. the percentage of covariance data available) similar to what Mplus provides, or alternatively the N used for calculating each covariance (and maybe even the missing data patterns, as in Mplus)?

Many thanks,
Franziska

Yves Rosseel

Aug 21, 2013, 1:47:00 PM
to lav...@googlegroups.com
On 08/20/2013 02:58 PM, Franziska Zuber wrote:
> Is there a way to also obtain "covariance coverage output" (i.e.
> percentage of covariance data available) similar to what is provided in
> Mplus, or alternatively the /N/ used for calculating each covariance,
> (and maybe even the missing data patterns as in Mplus)?

Good point. I added this in the dev version (0.5-15):

- inspect(fit, "coverage") will return the pairwise coverage
- inspect(fit, "patterns") will return the missing patterns

You can install the dev version by typing in R:

install.packages("lavaan", repos="http://www.da.ugent.be", type="source")

Yves.
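
To make clear what "pairwise coverage" and "missing patterns" mean, here is a small base-R sketch that computes both by hand. It is not lavaan's internal code; the toy data frame and the object names (obs, n_pair, coverage, patterns) are assumptions made for illustration. On a fitted model, the inspect() calls quoted above should give the corresponding lavaan output.

```r
# Toy data: three items, four cases, some values missing
dat <- data.frame(y1 = c(1, 2, NA, 4),
                  y2 = c(1, NA, NA, 2),
                  y3 = c(3, 3, 1, NA))

# Indicator matrix: 1 = observed, 0 = missing
obs <- ifelse(is.na(as.matrix(dat)), 0, 1)

# Number of cases with both variables observed, for every pair
n_pair <- crossprod(obs)

# Pairwise coverage: proportion of cases available for each pair
coverage <- n_pair / nrow(obs)

# Missing-data patterns: one string of 0/1 per case, tabulated
patterns <- table(apply(obs, 1, paste, collapse = ""))

print(n_pair)
print(round(coverage, 2))
print(patterns)
```

The diagonal of coverage gives the proportion of observed values per variable, and the off-diagonal entries give the share of cases that contribute to each bivariate table.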

Franziska Zuber

Aug 22, 2013, 3:02:15 AM
to lav...@googlegroups.com
Dear Yves,

Many thanks for your reply and the hint to the dev version.

Best regards,
Franziska

Franziska Zuber

Aug 11, 2014, 7:53:32 AM
to lav...@googlegroups.com
Dear Yves,

Related to our conversation about WLSMV(S) estimator for categorical data, and to an error message mentioned by Ulrich Schroeders, I have the following two questions:

1. When I used lavaan 0.5-14 and dev 0.5-15 a year ago, I obtained convergence for a model using WLSMVS estimator and missing="pairwise", while today, using the same data and the same options when fitting the model, with version 0.5-16, the model does not converge.

I have looked through the version history on the lavaan project website, but not been able to find an explanation there.
(The bugs mentioned for 0.5-15 concerning the polychoric correlation between two variables and missing="pairwise" (if data are categorical) do not seem to apply because I use 0.5-16, and moreover don't have binary variables, nor do I have exogenous covariates as I am running a CFA model).

Has anything in the algorithms behind WLSMVS changed that might explain the divergent results?

When I looked again at the results of the converged model from the old lavaan version, I noticed that some fit indices were almost too good to be true: CFI = 1, TLI = 1.018, RMSEA = 0.000, 90% ConfInt RMSEA = [0.000, 0.000]. I am not sure whether this information helps you.


2. When I specify my model slightly differently, I also see a new error message which never appeared in the same analysis with the previous versions:
Error in if (fx < 0) fx <- 0 : missing value where TRUE/FALSE needed
In addition: Warning message: In lav_samplestats_from_data(lavdata = lavdata, missing = lavoptions$missing, : lavaan WARNING: number of observations (427) too small to compute Gamma

Ulrich mentioned the first error (but not the warning) in his post in this thread on 7/26/2013, and you might have found the reason for it, but no further answer is visible in the thread.

In advance many thanks for your help,
Franziska

yrosseel

Sep 4, 2014, 11:58:14 AM
to lav...@googlegroups.com
On 08/11/2014 01:53 PM, Franziska Zuber wrote:
> 1. When I used lavaan 0.5-14 and dev 0.5-15 a year ago, I obtained
> convergence for a model using WLSMVS estimator and missing="pairwise",
> while today, using the same data and the same options when fitting the
> model, with version 0.5-16, the model does not converge.

Hm. I'm not sure. Can you send me the script/data?

> Has anything in the algorithms behind WLSMVS changed that might explain
> the divergent results?

No, but there must be an explanation.

> 2. When I specify my model slightly differently, I also see a new error
> message which never appeared in the same analysis with the previous
> versions:
>
> Error in if (fx < 0) fx <- 0 : missing value where TRUE/FALSE needed

This should never happen. Something went wrong, but we should catch the
problem much earlier.

> In addition: Warning message: In lav_samplestats_from_data(lavdata =
> lavdata, missing = lavoptions$missing, : lavaan WARNING: number of
> observations (427) too small to compute Gamma

Again, if you could provide me the script + data, I will investigate
this further.

Yves.

David Disabato

Sep 14, 2014, 8:41:17 PM
to lav...@googlegroups.com
Dear Yves,

Related to Franziska's conversation about the WLSMV(S) estimator for ordinal-categorical data, I am obtaining similar error messages and questionable output. I am using lavaan 0.5-16, which I updated today. I am trying to fit a one-factor CFA of 8 four-point Likert scale items (N = 455) using the WLSMVS estimator and missing="pairwise".

1. When I specify missing="pairwise" I receive the following error message: 
Error in if (fx < 0) fx <- 0 : missing value where TRUE/FALSE needed
In addition: Warning message:
In lav_data_full(data = data, group = group, group.label = group.label,  :
  lavaan WARNING: some cases are empty and will be removed:
  34 38 49 66 70 76 93 104 125 166 169 178 211 237 245 280 288 295 354 364 397 410 421 422 428 444 449
When I specify missing = "default", the model converges and the output contains two columns. One that uses regular DWLS and one that uses Robust DWLS. While the regular DWLS column says the sample size is equal to that expected for listwise deletion, the Robust DWLS column says the sample size is equal to that expected for pairwise deletion. Is the regular DWLS column reporting listwise deletion results and the Robust DWLS column pairwise deletion results, or is their difference getting at something else?

2. When I specify missing = "default", the model converges and the output contains two columns. I noticed that the fit indices in the regular DWLS column are likely upwardly biased (i.e., better fit than expected). Past research has found poor model fit for this particular CFA with other samples. But my regular DWLS column states the test statistic = 19.43 (p = .494), CFI = 1.00, TLI = 1.01, RMSEA = .000. I don't understand how this is possible. In particular, the non-significant test statistic with a sample size above 400 seems questionable. My average standardized factor loading is .35, ranging from .19 to .49. In the Robust DWLS column, the model fit is more believable: test statistic = 17.59 (p = .129), CFI = .89, TLI = .89, RMSEA = .033. However, the test statistic is still non-significant. Are the Robust DWLS column fit statistics accurate and the regular DWLS column fit statistics biased?

In advance many thanks for your help,
David Disabato

Yves Rosseel

Sep 18, 2014, 2:51:56 AM
to lav...@googlegroups.com
> 1. When I specify missing="pairwise" I receive the following error message:
>
> Error in if (fx < 0) fx <- 0 : missing value where TRUE/FALSE needed

Ha. Would you be able to send me your script and a snippet of the data
(just enough to replicate this)?

This is most likely related to the fact that some of your bivariate
frequency tables contain a lot of zeroes. Look at the output of

lavTables(fit)

If you see lots of zeroes in the obs.freq column, and you have
convergence issues, you may try to play with the zero.add= argument.

> When I specify missing = "default", the model converges and the output
> contains two columns. One that uses regular DWLS and one that uses
> Robust DWLS. While the regular DWLS column says the sample size is equal
> to that expected for listwise deletion, the Robust DWLS column says the
> sample size is equal to that expected for pairwise deletion. Is the
> regular DWLS column reporting listwise deletion results and the Robust
> DWLS column pairwise deletion results, or is their difference getting at
> something else?

No. When missing="default", you get listwise deletion, and then you
always get a line like this:

                            Used   Total
  Number of observations     220     301

where '220' is the number of observations used in the analysis, while
301 is the total sample size. It has nothing to do with the estimator.

> 2. When I specify missing = "default", the model converges and the
> output contains two columns. I noticed that the fit indices in the
> regular DWLS column are likely upwardly biased

You should not trust the DWLS column. The 'Robust' column gives much more
reliable results. Not always in your favor though.

> RMSEA = .033. However, the test statistic is still non-significant. Are
> the Robust DWLS column fit statistics accurate and the regular DWLS
> column fit statistics biased?

Yes.

Yves.
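
The advice above about bivariate frequency tables with many zeroes can be checked on the raw data before fitting. The following base-R sketch counts the empty cells in every pairwise table; the helper count_empty_cells() and the toy data are hypothetical and are not part of lavaan. On a fitted model, lavTables(fit) as mentioned above gives the observed frequencies directly.

```r
# Hypothetical helper: for every pair of ordinal variables, build the
# bivariate frequency table (pairwise present data) and count empty cells.
count_empty_cells <- function(dat) {
  pairs <- combn(names(dat), 2, simplify = FALSE)
  out <- lapply(pairs, function(p) {
    lev1 <- sort(unique(na.omit(dat[[p[1]]])))
    lev2 <- sort(unique(na.omit(dat[[p[2]]])))
    # table() drops cases where either variable is NA
    tab <- table(factor(dat[[p[1]]], levels = lev1),
                 factor(dat[[p[2]]], levels = lev2))
    data.frame(lhs = p[1], rhs = p[2],
               n.pair  = sum(tab),       # cases observed on both variables
               n.cells = length(tab),    # cells in the bivariate table
               n.empty = sum(tab == 0))  # cells with zero frequency
  })
  do.call(rbind, out)
}

# Toy data for illustration only
dat <- data.frame(y1 = c(1, 1, 2, 2, NA),
                  y2 = c(1, 1, 2, 2, 1),
                  y3 = c(1, 2, 1, 2, NA))

res <- count_empty_cells(dat)
print(res)
```

Pairs with a high share of empty cells are the ones most likely to cause trouble when the polychoric correlations are estimated; these are the cases where collapsing categories or experimenting with the zero.add= argument mentioned above may be worth trying.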

Wessel van Eeden

Feb 27, 2015, 10:46:41 AM
to lav...@googlegroups.com
Dear Yves,

Is the "Error in if (fx < 0) fx <- 0 : missing value where TRUE/FALSE needed" issue fixed or is there a solution? I have the same problem here. 



yrosseel

Mar 13, 2015, 9:09:28 AM
to lav...@googlegroups.com
Most cases were due to a huge number of empty cells in large bivariate
tables. Can you try this with the development version (0.5-18) and
report back to me if you still get this?

Yves.

Ivan W

Oct 3, 2015, 5:34:49 PM
to lavaan
Hello Yves,

I am getting this error using "pairwise" with "wlsmv".

Ivan

Yves Rosseel

Oct 5, 2015, 5:34:21 AM
to lav...@googlegroups.com
On 10/03/2015 11:34 PM, Ivan W wrote:
> Hello Yves,
>
> I am getting this error using "pairwise" with "wlsmv".

Could you provide me a reproducible example?

Yves.