Multiple imputation and permutation tests

Liz

Mar 18, 2013, 12:20:44 PM
to meds...@googlegroups.com
Dear all,
I am reanalysing the results of a small RCT in response to some comments from reviewers. Although the level of missing data was low, one reviewer would prefer me to perform multiple imputation, which I am happy to do, but also recommends performing permutation tests instead of ANCOVA because tests of normality are flawed and we cannot be certain our data are suited to parametric analysis. I think I am going to stick to my guns with the main analysis, since it is my understanding that it is very difficult if not impossible to include a covariate in a permutation test, and we specified in advance that we would control for baseline values. I thought it might be worth investigating permutation tests nevertheless as a sensitivity analysis. Whilst I am able to perform a permutation test on my complete case data, how would you recommend I go about performing multiple permutation tests following MI, and combining the results, preferably in Stata? I've got 20 imputed datasets for each of my primary and secondary outcomes at each of two time-points.
Thanks in advance,
Liz

Martin Bland

Mar 18, 2013, 1:05:40 PM
to meds...@googlegroups.com
I think you should resist calls for a non-parametric method.  If the assumptions of the parametric method are met, rank-based methods are inefficient and they don't give confidence intervals unless you make assumptions almost as strong as those for a t test.  The CONSORT statement would encourage you to produce a confidence interval for the difference, as would all the major journals.  However, no doubt you have colleagues who need the publication.  I would do a parametric analysis with multiple imputation.  Do a nonparametric test on the available data if you must.

Martin




--
--
To post a new thread to MedStats, send email to MedS...@googlegroups.com .
MedStats' home page is http://groups.google.com/group/MedStats .
Rules: http://groups.google.com/group/MedStats/web/medstats-rules
 
---
You received this message because you are subscribed to the Google Groups "MedStats" group.
To unsubscribe from this group and stop receiving emails from it, send an email to medstats+u...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 



--
***************************************************
J. Martin Bland
Prof. of Health Statistics
Dept. of Health Sciences
ARRC Building
University of York
Heslington
York YO10 5DD

Email: martin...@york.ac.uk
Phone: 01904 321334     Fax: 01904 321382
Web site: http://martinbland.co.uk/

***************************************************

Kornbrot, Diana

Mar 18, 2013, 1:20:44 PM
to meds...@googlegroups.com
In my naivete, I would also use multiple imputation and then normal-based methods (often known as parametric methods).
But I note from earlier posts that analysis of the difference is subject to regression-to-the-mean bias, and that analysing the final value with the initial value as a covariate is the recommended method.

There are always bootstraps.

Best

Diana



Emeritus Professor Diana Kornbrot
email:  d.e.ko...@herts.ac.uk    
 web:    http://dianakornbrot.wordpress.com/
Work
Department of Psychology
School of Life and Medical Sciences
University of Hertfordshire
College Lane, Hatfield, Hertfordshire AL10 9AB, UK
voice:   +44 (0) 170 728 4626
Home
19 Elmhurst Avenue
London N2 0LT, UK
voice:   +44 (0) 208  444 2081
mobile: +44 (0) 740 318 1612


Liz

Mar 18, 2013, 1:50:58 PM
to meds...@googlegroups.com
Martin
Thanks for the advice. I may run into difficulties because I don't think this reviewer is going to be convinced that my data meet the assumptions, even if I've tested them, because the tests can't be trusted. However, CONSORT would add another layer to my defence of parametric methods, so I will stick to plan A.

Diana
Thanks - I hadn't considered bootstraps but will try MI as first port of call since the reviewer specifically asked for it.

Just for my own understanding, assuming I did want to run permutation tests after MI, how would I combine the results? I don't think Stata would run permute after mi and combine the results for me, but I could run permute in each imputed dataset separately. Each of the permutation tests would give me a proportion of the permutations where the observed effect, t in the case of regression, was exceeded. Would these be combined in Rubin's rules as proportions? I know one can't combine p-values, but are the results of a permutation test a special case?

Thompson,Paul

Mar 18, 2013, 1:52:26 PM
to meds...@googlegroups.com

How many cases do you have? If the number is small, things may be dicey.

Adrian Sayers

Mar 18, 2013, 2:21:23 PM
to meds...@googlegroups.com
Look up Rubin's rules.

It's basically the mean of the means for the parameter, and something a bit trickier for the variance.
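
For anyone following along, here is a minimal sketch of that pooling in Python, assuming each imputed dataset has already produced a point estimate and a squared standard error. The "trickier" variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance. The function name pool_rubin and the example numbers are invented for illustration; in Stata, mi estimate does this pooling for you.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and squared standard errors (Rubin's rules)."""
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and one variance per estimate")
    pooled = mean(estimates)          # the "mean of the means"
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # sample variance of the estimates (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical coefficients and squared standard errors from three imputed datasets.
pooled, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total ** 0.5)  # pooled estimate and its standard error
```

The sketch stops at the pooled estimate and total variance; a confidence interval would also need the Rubin degrees of freedom, which are omitted here.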

It strikes me as odd: if they are unwilling to accept your assumptions for a parametric model, I find it hard to see how they would accept your assumptions for an imputation model.

bw
A

Liz

Mar 19, 2013, 5:11:53 AM
to meds...@googlegroups.com
Paul
I have 38 per group. Not tiny, but certainly not large numbers.

Adrian
Thanks. I am still having some trouble with Rubin's rules in this specific case. With Rubin's rules, you would combine the coefficients obtained from 20 standard regressions, one from each of the 20 imputed datasets. My understanding of permutation tests is that if you run a permuted regression in each set, you still obtain only one regression coefficient for each (the 'observed' coefficient prior to permutation), but the significance of the test is derived from the proportion of the (say 100,000) permuted replicates in which the coefficient was at least as extreme, i.e. if the proportion is less than 5%, this indicates significance at p < 0.05. As far as I can make out, if I just combined the regression coefficients from 20 permuted regressions using Rubin's rules, this would not take into account the fact that they came from permutation tests; it would be the same as combining the results from standard regression. How would one combine the permutation test results in a way that accounted for the permutation? The only way I can think of would be to combine the proportion, not the coefficient. Does this sound right? I believe you cannot usually combine p-values themselves using Rubin's rules.
I'm going to stick to parametric analysis, but I'd still like to know how I would go about performing permutation tests after MI, because it'll improve my understanding of both permutation tests and Rubin's rules.
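
The permutation p-value described above can be sketched in Python for the simplest case: a two-group comparison with no covariate, where the regression coefficient for group equals the difference in means. This is an illustration, not Stata's permute. The function name, the toy data, and the choice to count the observed arrangement as one of the replicates (the +1 in numerator and denominator) are all assumptions.

```python
import random
from statistics import mean


def permutation_p_value(treated, control, n_perm=10000, seed=1):
    """Two-sided Monte Carlo permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if abs(diff) >= abs(observed):
            as_extreme += 1
    # Assumption: count the observed arrangement as one replicate, so p is never 0.
    return observed, (as_extreme + 1) / (n_perm + 1)


# Toy outcome values, invented for illustration.
effect, p = permutation_p_value([10, 11, 12, 13, 14], [0, 1, 2, 3, 4], n_perm=2000)
print(effect, p)
```

Note that the observed effect is computed once, before any shuffling; the permutations only supply the reference distribution, which is why pooling the coefficients alone would ignore the permutation step.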


Adrian Sayers

Mar 19, 2013, 5:57:02 AM
to meds...@googlegroups.com
I think I rather naively suggested looking at Rubin's rules for your problem.

I wonder if your situation is more like a double bootstrap, i.e. you would need to run a permutation test on every imputed dataset, and then work out a sensible way of averaging the permutation results across the imputations.

I suspect this is unlikely to be a trivial undertaking and would probably require a computing cluster.
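
A rough Python sketch of the first half of that idea: run the same permutation test in each imputed dataset and collect the per-imputation proportions. How to pool them is left open here, as it is in this thread; the summary printed at the end is purely descriptive and is not a validated combining rule. All names and the toy data are hypothetical.

```python
import random
from statistics import mean, median


def permutation_p_value(treated, control, n_perm, rng):
    """Two-sided Monte Carlo permutation p-value for a difference in means."""
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        if abs(mean(pooled[:n_treated]) - mean(pooled[n_treated:])) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # assumption: observed arrangement counted once


def permutation_per_imputation(imputed_datasets, n_perm=2000, seed=1):
    """Return one permutation p-value per imputed dataset (no pooling attempted)."""
    rng = random.Random(seed)
    return [permutation_p_value(t, c, n_perm, rng) for t, c in imputed_datasets]


# Three toy "imputed datasets": (treated, control) pairs differing in one imputed value.
datasets = [
    ([5.1, 6.0, 7.2, 6.5], [4.0, 4.8, 5.5, imputed])
    for imputed in (4.2, 5.0, 5.9)
]
p_values = permutation_per_imputation(datasets)
print(min(p_values), median(p_values), max(p_values))  # descriptive summary only
```

The cost grows as the number of imputations times the number of permutations: 20 imputed datasets with 100,000 replicates each means 2,000,000 model fits per outcome and time-point.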

bw
Adrian



Liz

Mar 19, 2013, 6:22:29 AM
to meds...@googlegroups.com
I thought it might not be straightforward and I haven't been able to find any references which mention both permutation/randomisation/exact tests and multiple imputation. It might strengthen my case for keeping to parametric analysis if I could demonstrate that it would be impractical to run both MI and permutation tests. I'd like to work out how one would combine the results so I know exactly what I'd be objecting to. 
Incidentally I take your point about the assumptions of MI wrt parametric analysis and may also include this in my response; I've used predictive mean matching in the imputation model, but this still uses regression.

Belinda Dawson

Mar 28, 2013, 12:38:32 PM
to meds...@googlegroups.com

Hi,

I have a question about SAS PROC FORECAST.

I projected some percentages, but some of the results are above 100%. The percentages should lie between 0% and 100%, and I do not know how to set limits on the forecast values. Thanks for your help.

Adrian Sayers

Mar 28, 2013, 2:33:24 PM
to meds...@googlegroups.com
Can you use a different link function? With a more appropriate link you should be able to keep the forecasts within the appropriate bounds.
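
One way to read that suggestion, sketched in Python rather than SAS: model the percentages on the logit scale and back-transform, so every forecast stays strictly between 0% and 100%. The straight-line trend, the function names and the example series are assumptions for illustration; this is not a PROC FORECAST option.

```python
import math


def logit(p):
    """Map a proportion in (0, 1) to the whole real line."""
    return math.log(p / (1 - p))


def inv_logit(x):
    """Map any real number back to a proportion in (0, 1)."""
    return 1 / (1 + math.exp(-x))


def forecast_percent(percents, steps):
    """Fit a least-squares line to logit(percent / 100) over time and extrapolate."""
    if len(percents) < 2 or not all(0 < p < 100 for p in percents):
        raise ValueError("need at least two percentages strictly between 0 and 100")
    y = [logit(p / 100) for p in percents]
    n = len(y)
    t_bar = (n - 1) / 2               # mean of the time index 0, 1, ..., n - 1
    y_bar = sum(y) / n
    slope = (sum((t - t_bar) * (v - y_bar) for t, v in enumerate(y))
             / sum((t - t_bar) ** 2 for t in range(n)))
    intercept = y_bar - slope * t_bar
    # Back-transform each extrapolated logit to the percentage scale.
    return [100 * inv_logit(intercept + slope * (n - 1 + k))
            for k in range(1, steps + 1)]


# Hypothetical series approaching 100%; a line fitted to the raw values would overshoot.
print(forecast_percent([80, 85, 90, 94, 97], steps=5))
```

Observed values of exactly 0% or 100% have no finite logit, so they would need separate handling before this transformation could be used.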

A

Basilio de Braganca Pereira

Mar 28, 2013, 3:21:51 PM
to meds...@googlegroups.com
Have a look at beta regression.
Basilio
