Establishing whether data is MCAR, MAR or NMAR


James Nobles

Oct 6, 2014, 11:14:27 AM
to missin...@googlegroups.com
Good Afternoon,

Hopefully this question will be a fairly straightforward one to answer, although, as in most cases, it may not be!

I am not an expert in statistics or in the use of statistical packages, but I am very keen to use imputation in one of my studies. Below is an outline of my study:

- The aim is to identify participant characteristics that are predictive of drop out or completion in a weight management programme. To do so, I have established how often participants attended the given sessions, and assigned them to a completion group (Completer, Non-Initiator, Initiator, Non-Completer, etc.). Each participant, on entry to the programme, completed a questionnaire to some extent, and it is this information, along with demographic and anthropometric data, that makes up the pre-intervention participant characteristics. These data will then be entered into a logistic regression model to determine the variables most predictive of completion, or not, of a weight management programme.

I have missing data in a number of the pre-intervention variables, with between 5% and 45% of values missing. Within SPSS I have completed the 'Analyse Patterns' test, which, to my novice eyes and from what I have read, seems to have identified a pattern (please see the attached output). I believe that this pattern could be attributed to one of the completion groups: the non-initiators (those who sign up to the programme and do not attend a session). Missing values are much more common in this group because many of them have not completed the pre-intervention questionnaire. Would this therefore suggest that the data are missing not at random?

If NMAR is assumed, then I had considered the option of controlling for this group and running further analysis without those who are classified as non-initiators.

Any assistance here would be greatly appreciated. 

Many thanks, 

James 


Missing Values Pattern Output.docx

Jonathan Bartlett

Oct 7, 2014, 8:08:00 PM
to missin...@googlegroups.com
Hi James

The patterns of missingness, in and of themselves, can't tell you whether the data are missing completely at random, missing at random, or missing not at random. Looking at the patterns is always a good idea, because it allows you to understand whether groups of variables tend to be either all missing or all observed. In your case you said that one of your patterns is likely explained by whether someone was a non-initiator. This is useful information, because you can hypothesize the mechanism leading to missingness in variables. To decide whether this missingness is MCAR, MAR or MNAR, you have to decide whether you think the chance of missingness is related to the values of the variables which are subject to missingness. If it is, the data are not MCAR. If you believe that you have fully observed variables that explain whether or not someone's values are missing, then you might believe the MAR assumption is plausible. Lastly, if the values of the partially observed variables differ between those with the variables observed and those with them missing, even after accounting for your fully observed variables, then your data are probably MNAR.
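
A small simulation can make the three definitions concrete. The sketch below is illustrative only: the binary covariate `x` (standing in for something like non-initiator status), the outcome `y`, and all of the missingness probabilities are assumptions chosen for the example, not values from James's data. It compares the mean of `y` among complete cases, and the mean after stratifying on the fully observed `x`, with the mean that would have been seen with no missing data.

```python
import random
import statistics


def simulate(mechanism, n=20000, seed=0):
    """Return (full-data mean, complete-case mean, x-stratified mean) of y.

    All numbers here are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random() < 0.5                 # fully observed binary covariate
        y = rng.gauss(1.0 if x else 0.0, 1.0)  # variable subject to missingness
        if mechanism == "MCAR":
            p_miss = 0.3                       # unrelated to x or y
        elif mechanism == "MAR":
            p_miss = 0.6 if x else 0.1         # depends only on observed x
        elif mechanism == "MNAR":
            p_miss = 0.6 if y > 0.5 else 0.1   # depends on y itself
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        rows.append((x, y, rng.random() >= p_miss))

    full_mean = statistics.fmean([y for _, y, _ in rows])
    complete_case = statistics.fmean([y for _, y, obs in rows if obs])

    # Average the observed mean within each level of x, weighted by the
    # share of all participants (observed or not) at that level.
    stratified = 0.0
    for level in (False, True):
        stratum = [(y, obs) for x, y, obs in rows if x == level]
        observed_mean = statistics.fmean([y for y, obs in stratum if obs])
        stratified += len(stratum) / n * observed_mean
    return full_mean, complete_case, stratified


if __name__ == "__main__":
    for mech in ("MCAR", "MAR", "MNAR"):
        full, cc, strat = simulate(mech)
        print(f"{mech}: full={full:.3f} complete-case={cc:.3f} stratified={strat:.3f}")
```

Under these assumed settings, the complete-case mean should sit close to the full-data mean only under MCAR; stratifying on `x` should recover it under MAR; and neither approach should recover it under MNAR, because the missingness depends on the unobserved values themselves.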

As a next step I'd recommend looking at the 'Missingness mechanism' pages at our site: http://missingdata.lshtm.ac.uk/index.php?option=com_content&view=category&id=40:missingness-mechanisms&Itemid=96&layout=default

Best wishes
Jonathan

James Nobles

Oct 8, 2014, 7:47:27 AM
to missin...@googlegroups.com
Thanks for taking the time on this Jonathan, 

I have run a couple more tests to control for the non-initiators, different programmes, gender, etc. (independently), and the patterns, as well as the results of Little's MCAR test, are still very similar. As far as I am aware, the MCAR assumption does not now seem to be a valid one. I shall have a read of the suggested pages and come back to you, if you wouldn't mind, to discuss further. Once I have made the assumptions, I feel reasonably confident about imputing the missing values and running the analysis on these - I just need to get to that stage.

Many thanks for your help.

All the best, 

James  

mohankar...@gmail.com

Oct 10, 2014, 2:38:15 AM
to missin...@googlegroups.com

Traditionally, theory has advised us that patterns of missingness, in and of themselves, can't tell us whether the data are MCAR, MAR or MNAR. One therefore needs to hypothesize the mechanism leading to missingness and, as Jonathan has explained, based on this hypothesis, something can be said about whether the data are MCAR, MAR or MNAR.
Recent developments in graphical models have supplemented this practice with three findings:
1. There are simple tests that, when applied to the data itself, can refute the MCAR or MAR hypothesis.
2. Once the user goes through the exercise of hypothesizing the mechanism leading to missingness, more can be achieved than simply classifying data as MCAR, MAR or MNAR (such classification can easily be determined by inspection). One can actually determine whether the parameters of interest can be estimated bias-free and, if so, how. This scheme extends into MNAR problems, where traditional methods (e.g. Maximum Likelihood or Multiple Imputation) are helpless.
3. The hypothesized mechanism often has testable implications, allowing us to reject hypotheses which are incompatible with the data.
For summaries of these developments please see:
http://ftp.cs.ucla.edu/pub/stat_ser/r417.pdf
http://ftp.cs.ucla.edu/pub/stat_ser/r410.pdf
http://ftp.cs.ucla.edu/pub/stat_ser/r415.pdf
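
To illustrate the general idea behind a test that can refute MCAR, here is a minimal sketch. It is not taken from the reports linked above; it is an ordinary two-proportion z-test, used on the assumption that under MCAR the chance of a value being missing should not differ between levels of a fully observed variable (for example non-initiator versus everyone else). The function name and the counts in the usage example are hypothetical.

```python
import math


def missingness_z_test(missing_flags, group_flags):
    """Two-sided two-proportion z-test on missingness rates.

    missing_flags[i] is True if participant i has the variable missing.
    group_flags[i] is True if participant i is in the group of interest
    (a fully observed binary variable). Returns (z, p_value).
    """
    if len(missing_flags) != len(group_flags):
        raise ValueError("inputs must have the same length")
    n1 = sum(1 for g in group_flags if g)
    n0 = len(group_flags) - n1
    if n0 == 0 or n1 == 0:
        raise ValueError("both groups must be non-empty")
    m1 = sum(1 for m, g in zip(missing_flags, group_flags) if m and g)
    m0 = sum(1 for m, g in zip(missing_flags, group_flags) if m and not g)

    p1, p0 = m1 / n1, m0 / n0
    pooled = (m1 + m0) / (n1 + n0)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n0))
    if se == 0:
        return 0.0, 1.0  # nobody (or everybody) is missing: no evidence either way
    z = (p1 - p0) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: 60 of 100 non-initiators missing, 10 of 100 others.
    group = [True] * 100 + [False] * 100
    missing = [True] * 60 + [False] * 40 + [True] * 10 + [False] * 90
    z, p = missingness_z_test(missing, group)
    print(f"z = {z:.2f}, p = {p:.3g}")
```

A small p-value here is evidence against MCAR only. A large one does not establish MCAR, and a check of this kind cannot separate MAR from MNAR, since that distinction depends on the unobserved values.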

 

Regards,

Karthika

Jonathan Bartlett

Oct 15, 2014, 6:16:44 PM
to missin...@googlegroups.com
Thanks Karthika! I will read these articles with great interest!

Best wishes
Jonathan