CFA Warnings

Gian Mauro Manzoni

Jan 14, 2015, 5:50:48 PM
to lav...@googlegroups.com
Dear lavaan group,
I ran a single-factor CFA on a dataset consisting of both dichotomous and Likert-type items (234 obs. of 20 variables/items).
Fit statistics and all the other results were computed, but the following warnings were also shown (R reported 50 or more warnings).
   
> warnings()
Warning messages:
1: In pc_cor_TS(fit.y1 = FIT[[i]], fit.y2 = FIT[[j]], method = optim.method,  ... :
  lavaan WARNING: empty cell(s) in bivariate table of YFAS_2 x YFAS_1
2: In pc_cor_TS(fit.y1 = FIT[[i]], fit.y2 = FIT[[j]], method = optim.method,  ... :
  lavaan WARNING: empty cell(s) in bivariate table of YFAS_3 x YFAS_1
3: In pc_cor_TS(fit.y1 = FIT[[i]], fit.y2 = FIT[[j]], method = optim.method,  ... :
  lavaan WARNING: empty cell(s) in bivariate table of YFAS_4 x YFAS_1
4: In pc_cor_TS(fit.y1 = FIT[[i]], fit.y2 = FIT[[j]], method = optim.method,  ... :
  lavaan WARNING: empty cell(s) in bivariate table of YFAS_5 x YFAS_1
5: In pc_cor_TS(fit.y1 = FIT[[i]], fit.y2 = FIT[[j]], method = optim.method,  ... :
  lavaan WARNING: empty cell(s) in bivariate table of YFAS_6 x YFAS_1
6: In pc_cor_TS(fit.y1 = FIT[[i]], fit.y2 = FIT[[j]], method = optim.method,  ... :

I will appreciate any clarification.

Thanks a lot!

Mauro

Gian Mauro Manzoni

Jan 15, 2015, 5:44:46 AM
to lav...@googlegroups.com
Dear group,
I have just found a previous post on the same warnings and I now understand their meaning.
Yves commented that "This is not necessarily problematic, but you should be aware of it".
I wonder if the warnings also mean that the latent variables are not normal. I have some concerns about that because I noticed that some of the Likert items have skewed distributions, as expected. Is this a severe problem for tetrachoric and polychoric correlations?

Thanks a lot for any assistance!

Mauro      
   
 


Yves Rosseel

Jan 15, 2015, 5:51:22 AM
to lav...@googlegroups.com
On 01/15/2015 11:44 AM, Gian Mauro Manzoni wrote:
> I wonder if the warnings mean also that the latent variables are not
> normal.

No, not at all. Note that your sample size is not very large (n=234) for
20 ordered items.

> the likert items have skewed distributions as expected. Is this a severe
> problem for tetrachoric and polychoric correlations?

Usually not. It might become a problem if you have some response
categories that are empty or almost empty. In that case, all you can do
is merge some response categories, and reduce the number of response
categories.
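
Merging categories can be sketched in base R as follows. The item and its counts are made up for illustration (they only match the n = 234 mentioned above), and which categories to merge is a substantive decision, so treat this as one possible approach rather than a recommendation for these data.

```r
# Hypothetical 5-point item in which the top category is almost empty
item <- rep(1:5, times = c(80, 70, 50, 32, 2))
table(item)

# Merge categories 4 and 5 into a single top category
item_merged <- ifelse(item >= 4, 4, item)
item_merged <- factor(item_merged, levels = 1:4, ordered = TRUE)
table(item_merged)
```

The merged item has four ordered categories, and the sparse top category has been absorbed into its neighbour.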

Yves.

Bethan Thompson

Nov 29, 2016, 6:36:10 AM
to lavaan
I just wanted to follow up on this thread as I also get a number of these error messages when I run a CFA with DWLS. I have a sample size of 540. Looking at an example pair of items I can see that some of the response categories are lowish, but not so low that I would have anticipated a problem. The lowest combined pairings would be response counts of around 50 with 30. For example I get the warning for these two items: 

$`lavaan WARNING: empty cell(s) in bivariate table of Q9_3 x Q9_2`
pc_cor_TS(fit.y1 = FIT[[i]], fit.y2 = FIT[[j]], method = optim.method, 
    zero.add = zero.add, zero.keep.margins = zero.keep.margins, 
    zero.cell.warn = zero.cell.warn, Y1.name = ov.names[i], Y2.name = ov.names[j])

I ran the following code to see if I could work out if there was a particular threshold after which the warnings are generated but have not been able to work it out after comparing to other pairs. 

library(polycor)  # provides polychor()
polychor(Test3$Q9_2, Test3$Q9_3, ML = TRUE, std.err = TRUE)

Polychoric Correlation, ML est. = 0.5083 (0.03849)
Test of bivariate normality: Chisquare = 46.65, df = 15, p = 4.185e-05

  Row Thresholds
  Threshold Std.Err.
1   -0.6301  0.05761
2    0.1581  0.05357
3    0.6632  0.05758
4    0.9979  0.06412


  Column Thresholds
  Threshold Std.Err.
1   -0.2334  0.05399
2    0.5749  0.05662
3    1.1500  0.06846
4    1.4700  0.08040

I was hoping someone might be able to shed some light on at what point I should consider these warnings advisory and when I should consider dropping items/combining response categories. 

Many thanks

R McLaren

Nov 29, 2016, 10:59:55 AM
to lavaan
Bethan Thompson's question is one I've wanted to ask as well but put another way...

Is there a list of lavaan warning messages so we can read and determine the meaning of warnings, and determine which warnings are 'fatal' for our objectives?

Thanks, RM

Terrence Jorgensen

Nov 30, 2016, 4:12:45 AM
to lavaan
> I just wanted to follow up on this thread as I also get a number of these error messages when I run a CFA with DWLS.

They aren't error messages, they are warnings.

> The lowest combined pairings would be response counts of around 50 with 30.

That contradicts your posted warning message.  Are 50 and 30 your lowest marginal counts?  The warnings are not about marginal counts, they are about joint counts. 
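
The difference between marginal and joint counts can be illustrated with a small base-R sketch. The two items and all counts below are invented; they are chosen only so that the smallest marginal counts are 50 and 30 while one joint cell is still empty.

```r
# Hypothetical data: two 3-category items, n = 540
q1 <- rep(1:3, times = c(300, 190, 50))
q2 <- c(rep(1:3, times = c(200, 90, 10)),  # respondents with q1 == 1
        rep(1:3, times = c(70, 100, 20)),  # respondents with q1 == 2
        rep(1:3, times = c(30, 20, 0)))    # respondents with q1 == 3

table(q1)  # marginal counts of q1 (smallest is 50)
table(q2)  # marginal counts of q2 (smallest is 30)

# Joint (bivariate) counts: this is the table the warning refers to
joint <- table(q1, q2)
joint

# Row/column positions of the empty cells
which(joint == 0, arr.ind = TRUE)
```

Neither item has an empty category on its own, yet nobody chose category 3 on both items, so the bivariate table contains an empty cell.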

> I was hoping someone might be able to shed some light on at what point I should consider these warnings advisory and when I should consider dropping items/combining response categories.

If your model fails to converge on a solution, then you will see an error message telling you so (and you won't have any model results).  If you additionally see these warnings about sparse data, then that might be why you have trouble with convergence.  If you see the warnings but your model converged, then you can still interpret your model results. 

Terrence D. Jorgensen
Postdoctoral Researcher, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Terrence Jorgensen

Nov 30, 2016, 4:55:08 AM
to lavaan
> Is there a list of lavaan warning messages so we can read and determine the meaning of warnings, and determine which warnings are 'fatal' for our objectives?

Errors are fatal (you will not get results at all).  Warnings are not fatal (you will get results, although your confidence in them should be shaky until you investigate the potential problem indicated by the warning).

In the specific case in this thread, zero frequencies in bivariate tables make estimation of polychoric correlations more difficult. lavaan's default behavior is to add 0.5 to empty cells when the variables are binary (see the zero.add argument on the ?lavaan help page), which performs well unless their thresholds have opposite signs and the correlation is large.  You can inspect your output to check whether that is the case for binary-variable pairs that have tables with zeros (use the lavTables() function or check the warnings) -- if your data conform to those conditions, then you can have confidence in the results.  For more categories, no adjustment is recommended.  You can find further details in this article:
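
The threshold-sign condition for a pair of binary items can be checked by hand with a rough base-R sketch like the one below. The data are invented, and the thresholds are computed directly from the marginal proportions (the standard-normal quantile of the proportion answering 0) rather than taken from a fitted lavaan model, so this is only an approximation of what you would read off your own output.

```r
# Hypothetical binary items with one empty cell in their 2 x 2 table
y1 <- rep(c(0, 1), times = c(150, 84))
y2 <- c(rep(c(0, 1), times = c(110, 40)),  # respondents with y1 == 0
        rep(c(0, 1), times = c(0, 84)))    # respondents with y1 == 1

tab <- table(y1, y2)
tab
has_empty_cell <- any(tab == 0)

# Threshold of each binary item under an underlying-normal model:
# the standard-normal quantile of the proportion answering 0
tau1 <- qnorm(mean(y1 == 0))
tau2 <- qnorm(mean(y2 == 0))

# TRUE when one threshold is positive and the other negative
opposite_signs <- sign(tau1) * sign(tau2) < 0
c(tau1 = tau1, tau2 = tau2)
opposite_signs
```

In this made-up pair the table has an empty cell and the thresholds have opposite signs, which is the situation described above as the one where the 0.5 adjustment deserves a closer look if the estimated correlation is also large.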


The other most common warning message that I notice leaves users feeling unsure about their results (which is the purpose of giving a warning) is about non-positive definite (NPD) matrices.  These come in 3 flavors, 2 of which provide error messages:
  • input data: If the sample covariance matrix is NPD, then you get an error because the optimizer cannot reproduce a matrix that cannot exist in the first place. This should not happen when fitting a model to raw data. If you are providing summary statistics, then there must either be a typo or the method of calculating the covariance matrix is at fault (e.g., using pairwise deletion when there are missing data -- use FIML instead!)
  • starting values: Sometimes (rarely) the default starting values do not make sense for the specific model, in which case the user just needs to provide manual starting values for the parameters in the error message.
  • parameter estimates: An NPD (residual) covariance matrix of latent variables (psi) or observed variables (theta) will probably have a Heywood case, but not always (a determinant can be negative even if all variances are positive and correlations are within +/-1).  If there is a Heywood case, you can investigate (a) whether it is evidence of model misspecification by testing model fit, using approximate fit indices, and inspecting correlation residuals, and (b) whether there is evidence against the null hypothesis that sampling error is enough to explain the out-of-bounds value (read doi:10.1177/0049124112442138).
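
The point that a matrix can be NPD without any Heywood case can be illustrated with a small base-R sketch. The 3 x 3 matrix below is invented for illustration; with a fitted model you would instead check a model-implied matrix (for example the one returned by lavInspect(fit, "cov.lv")).

```r
# Hypothetical correlation matrix: unit variances, all correlations within +/-1
R <- matrix(c( 1.0,  0.9, -0.9,
               0.9,  1.0,  0.9,
              -0.9,  0.9,  1.0), nrow = 3, byrow = TRUE)

# A symmetric matrix is positive definite only if all eigenvalues are > 0
ev <- eigen(R, symmetric = TRUE, only.values = TRUE)$values
ev
det(R)

# TRUE when at least one eigenvalue is negative
is_npd <- any(ev < 0)
is_npd
```

Every entry looks admissible on its own, but the three correlations are mutually inconsistent, so one eigenvalue (and the determinant) is negative.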

Bethan Thompson

Nov 30, 2016, 5:51:32 AM
to lavaan
Terrence, thanks so much for your explanations. Very helpful. I was indeed confusing marginal and joint counts. I've now accessed the relevant table using lavTables(), which I hadn't managed to find before. This has cleared things up in my mind and gives me a way to pull out the number of zero counts per pair without constructing my own table. I will be clearer about my warning/error terminology in the future!! The article on this is very interesting too.

Many thanks

Bethan