Data with "missing" ordinal levels.

Bryan Maloney

Nov 8, 2022, 11:31:10 AM

to lavaan

I'm looking at a data set where one of the variables is an ordinal scale with six possible levels. However, the specific data set ended up only having members of four of those levels. We have representatives of 1, of 2, of 5, and of 6, but no 3 or 4. No, I can't go back to the clinicians and get data to fill things in. I got what I was handed.

How do I account for the fact that there is a very real gap between the 1 and 2 samples and the 5 and 6 samples? In the real world, using this scale, we can find people who score 3 and 4, but they weren't included in the data the clinicians collected.

Jeremy Miles

Nov 8, 2022, 11:38:19 AM

to lav...@googlegroups.com

If you assume that the relationship is linear, then this isn't (IMHO) a huge deal. You're sampling from the ends of the scatterplot, which is where you see a relationship, not the big blob of points in the middle.

There's even an argument that you should deliberately collect data like this, so as to maximize your power. This paper presents the argument: https://www2.psych.ubc.ca/~schaller/528Readings/McClelland1997.pdf. Its Table 1 gives the efficiency of various sampling strategies (for a 5-point scale, not the 6-point scale you have).

It also means that your (standardized) effect sizes are likely to overestimate the population effect sizes, because the variance of the scale in your sample is larger than it would be in a population that includes the middle levels.
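The point about standardized effect sizes can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the six-level predictor `x`, the outcome `y`, the slope of 0.5, and the noise level are all invented for the demonstration and are not taken from the data discussed in this thread.

```r
# Simulated illustration; every number here is an assumption for the demo.
set.seed(1)
n <- 60000
d <- data.frame(x = sample(1:6, n, replace = TRUE))  # all six levels present
d$y <- 0.5 * d$x + rnorm(n, sd = 2)                  # linear effect plus noise

# Drop levels 3 and 4, mimicking the data set described above.
d_trunc <- d[d$x %in% c(1, 2, 5, 6), ]

# The unstandardized slope is roughly the same in both samples ...
b_full  <- coef(lm(y ~ x, data = d))[["x"]]
b_trunc <- coef(lm(y ~ x, data = d_trunc))[["x"]]

# ... but the predictor's variance is larger without the middle levels,
# so the correlation (a standardized effect size) is larger too.
r_full  <- cor(d$x, d$y)
r_trunc <- cor(d_trunc$x, d_trunc$y)

print(c(b_full = b_full, b_trunc = b_trunc))
print(c(var_full = var(d$x), var_trunc = var(d_trunc$x)))
print(c(r_full = r_full, r_trunc = r_trunc))
```

If the relationship really is linear, the unstandardized slope is the safer quantity to report from a sample like this; the standardized one depends on which levels happened to be sampled.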

Jeremy


Bryan Maloney

Nov 8, 2022, 12:17:13 PM

to lavaan

How do I use that data with lavaan? When I try, I get an error message telling me that some of the levels are missing. How do I tell lavaan not to care about that?

Terrence Jorgensen

Nov 8, 2022, 2:41:01 PM

to lavaan

I think by "linear effect", Jeremy is suggesting that you treat the variable as continuous, interval-level data. If your gut reaction is "but they aren't normal", recall that treating them as ordinal makes the assumption that the observed responses are a crude discretization of an underlying latent response that is assumed to be normal. If you treat them as continuous, at least you have the option of correcting for the observed deviation from normality, which is not available for unobserved kurtosis in latent responses.

https://doi.org/10.3389/feduc.2020.589965
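As a rough sketch of what treating the variable as continuous could look like in lavaan: leave it numeric, do not list it in `ordered=`, and request a robust estimator. Reading "correcting for the observed deviation from normality" as `estimator = "MLR"` is an assumption here, and the data, the variable names `sev` and `out`, and the one-line model are all hypothetical.

```r
library(lavaan)

# Hypothetical data: 'sev' stands in for the six-level scale (levels 3 and 4
# never observed) and 'out' for some outcome. Both names are made up.
set.seed(2)
d <- data.frame(sev = sample(c(1, 2, 5, 6), 300, replace = TRUE))
d$out <- 0.4 * d$sev + rnorm(300)

model <- 'out ~ sev'

# sev stays numeric and is not declared in ordered=, so lavaan treats it as
# continuous. estimator = "MLR" gives robust (Huber-White) standard errors
# and a scaled test statistic.
fit <- sem(model, data = d, estimator = "MLR")
summary(fit, standardized = TRUE)
```

Because no thresholds are estimated for a continuous variable, the unobserved levels 3 and 4 never enter the model, and the gap between 2 and 5 is kept as a three-point distance on the scale.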

Terrence D. Jorgensen

Assistant Professor, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam

http://www.uva.nl/profile/t.d.jorgensen

Jeremy Miles

Nov 8, 2022, 3:43:06 PM

to lav...@googlegroups.com

Easiest way is probably (if your data frame is called d, and your variable called var):

d$var <- as.numeric(as.factor(d$var))
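One thing worth checking before relying on that one-liner: `as.numeric()` on a factor returns the internal level codes, not the labels. Whether the gap between 2 and 5 survives therefore depends on how the factor was built. The sketch below uses a made-up vector `scores`, not the data from this thread.

```r
# Hypothetical example: the same responses stored two different ways.
scores <- c(1, 2, 5, 6, 6, 1)

f_all  <- factor(scores, levels = 1:6)  # six levels declared, 3 and 4 empty
f_seen <- factor(scores)                # only the four observed levels

print(table(f_all))                     # counts per level, including empty ones

# Level codes: these equal the labels only when all six levels are declared.
codes_all  <- as.numeric(f_all)         # gap between 2 and 5 is preserved
codes_seen <- as.numeric(f_seen)        # observed levels are renumbered 1 to 4

# Going through the labels recovers the original scores in both cases.
vals_all  <- as.numeric(as.character(f_all))
vals_seen <- as.numeric(as.character(f_seen))

print(rbind(codes_all, codes_seen, vals_all, vals_seen))
```

If the column is an (ordered) factor with all six levels declared, the one-liner keeps the 1, 2, 5, 6 spacing. If only the observed levels exist, it silently recodes them to 1 through 4, and `as.numeric(as.character(d$var))` is the safer conversion.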

Bryan Maloney

Nov 9, 2022, 7:58:09 AM

to lavaan

The light is on! Thanks.
