Are methods valid under MAR applicable to randomized clinical trial data where dropout is the only type of missingness (monotone missingness)?

Yilei Zhan

Jul 8, 2015, 3:53:23 PM

to missin...@googlegroups.com

When conducting randomized clinical trials, if a patient intends to drop out after visit j-1 but before visit j, it is possible to have clinicians measure his/her response variable upon withdrawal (here my outcome of interest is a continuous variable), and then we can approximately treat this single measurement as Y_ij, a regular response for this patient at visit j. It is reasonable to assume this is true for all patients who withdraw after baseline, and also suppose that dropout is the only type of missingness (patients who withdraw will not come back).

In this setting, intuitively, the reason a patient withdraws, say at visit j, could only depend on the previous and current measurements of this patient, Y_i1, ..., Y_ij, which are all observed in the data set. Doesn't this mean that, conditional on the observed data, the reason for an observation being missing will not depend on unobserved data? If so, this type of data could never be MNAR. I am very confused about it. Do I understand MNAR correctly?

Jonathan Bartlett

Jul 12, 2015, 1:55:29 PM

to missin...@googlegroups.com, zhany...@gmail.com

A good way of thinking about MAR in this setting is as follows. Imagine that there are two patients who, up to and including visit j-1, have the same values of the outcome variable (and of any other variables that might have been measured, and which could in principle be conditioned on in a model). Then consider the probability that each of these two patients now drops out (i.e. misses visit j and all subsequent visits). If this probability depends on Y_j, the value that would be measured if the patient attended visit j, even after adjusting for past data, then the missingness mechanism is MNAR. For example, if patients drop out when their condition suddenly deteriorates, and this deterioration is not predictable from the past data, then it is MNAR. But if instead the probability of dropout at time j doesn't depend on Y_j, conditional on the past, then it is MAR.
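
As a rough formalisation of the description above (the notation is an assumption added for illustration, not taken from the thread): write D for the first visit a patient misses and Y_1, ..., Y_J for the intended measurements, and assume the first measurement is always observed. With dropout as the only source of missingness, MAR says that the chance of dropping out at visit j, among those still in the study, may depend only on the measurements taken before it:

```latex
% D = first missed visit (hypothetical notation); J = last planned visit.
% MAR for monotone dropout: the hazard of dropping out at visit j
% depends only on the history observed up to visit j-1.
P(D = j \mid D \ge j,\; Y_1, \dots, Y_J)
  = P(D = j \mid D \ge j,\; Y_1, \dots, Y_{j-1}),
  \qquad j = 2, \dots, J.
```

If the left-hand side still varies with Y_j (or later values) once Y_1, ..., Y_{j-1} are held fixed, the mechanism is MNAR. Other fully observed variables, such as treatment arm, can be added to the conditioning set on both sides.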

Yilei Zhan

Jul 15, 2015, 11:25:51 AM

to missin...@googlegroups.com

Prof. Bartlett, thank you for your answer.

In your example, what if, when the patient decides to drop out, he or she still gets one more measurement upon dropout (after visit j-1 but before visit j)? In today's randomized trials, it is possible to have most patients follow this rule: patients are asked to fill in a survey on their reasons for dropping out and to see the clinicians for a measurement before leaving the study. Thus, even if patients drop out because of sudden deterioration, we have information exactly at the dropout timepoint, and all we don't know is what happens after dropout. And it seems more reasonable to relate the reasons for dropout to measurements at the dropout timepoint than to unobserved measurements after dropout.

Jonathan Bartlett

Aug 3, 2015, 3:35:38 PM

to Missing Data

Hi Yilei

I now understand, thank you. I would need to think about this a bit more carefully, but in the setup you describe it is not entirely clear to me what MAR would constitute. The definition is generally made in a setup in which you have a set of intended measurements to be made. In your study design, I think you would have to include these extra potential 'between visit' measurements as part of what would constitute the 'full data'. However, no patient would then have full data, since those completing the study would have no intermediate visit measurements made. Nevertheless, if the probability of dropping out between j-1 and j did not depend on the measurements at visits j, j+1, ..., J, conditional on measurements 1, ..., j-1 plus the measurement at the 'dropout' visit, then I think MAR would be satisfied. I guess you would then need to use an analysis method that can incorporate these measurements (e.g. mixed models treating time as a continuous variable and suitable modelling of the correlation structure), unless you treat the dropout measurement between j-1 and j as if it were the visit j measurement, as you suggested in your first message.

In summary, I agree with your intuition that you ought to be able to make a much more plausibly valid analysis with these additional measurements, but I'd need to spend some more time thinking about it to give a more definitive answer!

Best wishes

Jonathan

Yilei Zhan

Aug 3, 2015, 4:26:15 PM

to Missing Data

Hi Prof. Bartlett,

Thank you for your reply and the great explanations. Your answer helps a lot. It is interesting and inspiring to apply statistical models treating time as a continuous variable, as you suggested at the end of the first paragraph. On the other hand, is it common/valid to treat the dropout measurement between j-1 and j as if it were the visit j measurement (thus considering time as a categorical variable)?

I look forward to hearing more from you.

Thanks,

Yilei

Jonathan Bartlett

Jan 25, 2016, 7:32:08 AM

to Missing Data

Hi Yilei

I've had some further thoughts on your earlier question about the conditions under which MAR holds. Consider a very simple trial, with baseline measurement Y0, treatment allocation Z, and a single follow-up measurement Y1. Suppose that, in the spirit of your earlier questions, whether a patient completes the follow-up (and has Y1 measured) depends causally only on Z and Y0, i.e. the patient/physician decides whether the patient will drop out as a function of fully observed variables. Then intuitively it may seem that MAR holds, since dropout is determined by Z and Y0, which are fully observed. However, now suppose that the distribution of Y1 depends on the dropout indicator R, in addition to Z and Y0. This would be the case, for example, if the expected outcome at follow-up changes depending on whether the patient decided to drop out or not. In this case, the dropout indicator R and Y1 are statistically dependent, conditional on Z and Y0, despite the fact that we assumed R was generated depending only on Z and Y0. This is a consequence of the assumed effect of R on Y1. In this case, the data are not MAR; they are MNAR.
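
One way to see where this dependence comes from is Bayes' rule. Writing f(y1 | r, z, y0) for the conditional density of Y1 given R, Z and Y0 (notation assumed here for illustration, with R = 1 meaning Y1 is observed), the probability of being observed given everything, including Y1, is:

```latex
% f(. | r, z, y0): conditional density of Y1 (hypothetical notation).
% R = 1: follow-up completed, Y1 observed; R = 0: patient dropped out.
P(R = 1 \mid Z, Y_0, Y_1)
  = \frac{f(Y_1 \mid R = 1, Z, Y_0)\, P(R = 1 \mid Z, Y_0)}
         {\sum_{r \in \{0, 1\}} f(Y_1 \mid R = r, Z, Y_0)\, P(R = r \mid Z, Y_0)}
```

If the densities for R = 1 and R = 0 differ, the right-hand side varies with Y1, so missingness depends on the unobserved value even though R was generated from Z and Y0 alone. If the two densities coincide, they cancel and the expression reduces to P(R = 1 | Z, Y0), which is the MAR case.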

As an extreme example of the above scenario, suppose that the patient tosses a coin to decide whether they will drop out following their baseline visit. Here intuitively missingness would be completely at random, since it is determined by a coin toss. However, let's (as above) now suppose that if the patient drops out they no longer receive the study treatment, and this affects their outcome value Y1 (in expectation). In this case there will be an association between R and Y1, and the data are MNAR.
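
The coin-toss scenario can be checked with a small simulation. The sketch below is only illustrative: the distributions, the size of the treatment effect, and the function name `simulate_coin_toss_dropout` are all assumptions, and only the treated arm is simulated. Dropout is decided by a fair coin, but patients who drop out stop treatment, so their (unobserved) Y1 no longer carries the treatment effect.

```python
import random
import statistics


def simulate_coin_toss_dropout(n=100_000, treatment_effect=-2.0, seed=0):
    """Simulate the treated arm under coin-toss dropout (assumed numbers).

    Y0 ~ N(10, 1). A fair coin decides whether the patient stays (R = 1).
    Patients who stay remain on treatment: Y1 = Y0 + treatment_effect + noise.
    Patients who drop out stop treatment:  Y1 = Y0 + noise.
    Returns (mean of Y1 over all patients, mean of Y1 over completers).
    """
    rng = random.Random(seed)
    y1_all, y1_completers = [], []
    for _ in range(n):
        y0 = rng.gauss(10.0, 1.0)
        stays = rng.random() < 0.5  # coin toss, unrelated to Y0
        effect = treatment_effect if stays else 0.0  # dropout removes treatment
        y1 = y0 + effect + rng.gauss(0.0, 1.0)
        y1_all.append(y1)
        if stays:
            y1_completers.append(y1)  # Y1 is only observed for completers
    return statistics.mean(y1_all), statistics.mean(y1_completers)


full_mean, completer_mean = simulate_coin_toss_dropout()
print(f"mean of Y1 over all patients: {full_mean:.2f}")
print(f"mean of Y1 over completers:   {completer_mean:.2f}")
```

Under these assumed numbers the completers' mean sits near 8 while the mean over all randomized patients sits near 9: R and Y1 are associated even though R came from a coin toss, so an analysis of observed values alone does not recover the distribution of Y1 among all patients.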

Given the above, I think my previous optimism regarding data being plausibly MAR in the sort of scenario you were envisaging is much reduced...