retrospective power calculations


Bland, M.

15 Nov 2007, 4:32:06 a.m.
to MedStats
Can anybody suggest a good reference for the non-statistician about why
retrospective power calculations should not be done? I recall this
being discussed in the past.

Martin

--
***************************************************
J. Martin Bland
Prof. of Health Statistics
Dept. of Health Sciences
Seebohm Rowntree Building Area 2
University of York
Heslington
York YO10 5DD

Email: mb...@york.ac.uk
Phone: 01904 321334
Fax: 01904 321382
Web site: http://martinbland.co.uk/
***************************************************

kornbrot

15 Nov 2007, 6:46:32 a.m.
to MedS...@googlegroups.com
I tried ‘retrospective power’ in Google.
The following may be useful:
http://statpages.org/postpowr.html
How to do it, with a link to a manuscript by Lenth on why NOT to do it.
Wiki:
http://wiki.math.yorku.ca/index.php/Statistics:_Post_hoc_power_analysis

Zumbo, B.D. and Hubley, A.M. (1998). A Note on Misconceptions Concerning Prospective and Retrospective Power. Journal of the Royal Statistical Society: Series D (The Statistician), 47(2), 385-388.
May be too technical for your intended audience?

Best

diana

Professor Diana Kornbrot
 
School of Psychology
 University of Hertfordshire
 College Lane, Hatfield, Hertfordshire AL10 9AB, UK

 email: 
d.e.ko...@herts.ac.uk    
 
web:    http://web.mac.com/kornbrot/iweb/KornbrotHome.html                   
 voice:   +44 (0) 170 728 4626
 fax:      +44 (0) 170 728 5073
Home
 
19 Elmhurst Avenue
 London N2 0LT, UK
   
 voice: +44 (0) 208 444 2081
 fax:    +44 (0) 870 706 4997





Luiz Alberto S Melo Jr

15 Nov 2007, 8:52:08 a.m.
to MedS...@googlegroups.com
Goodman SN, Berlin JA.
The use of predicted confidence intervals when planning experiments and the
misuse of power when interpreting results.
Ann Intern Med. 1994;121(3):200-6. Erratum in: Ann Intern Med
1995;122(6):478.

Luiz

Luiz Alberto S Melo Jr
Dept of Ophthalmology
Federal University of Sao Paulo
Brazil

Bland, M.

15 Nov 2007, 7:54:18 a.m.
to MedS...@googlegroups.com
Thanks very much, I will have a look.

Martin


Richard Goldstein

15 Nov 2007, 8:09:30 a.m.
to MedS...@googlegroups.com

Have a look at Russ Lenth's paper:

http://www.stat.iowa.edu/~rlenth/Power/2badHabits.pdf

Rich

Dale Glaser

15 Nov 2007, 12:26:43 p.m.
to MedS...@googlegroups.com
Hi all... I have gathered a couple of the initial papers on interim analysis, but I was curious whether any of you can point me to a recent paper that discusses the pros and cons of interim analysis.
 
thank you very much......dale


Dale Glaser, Ph.D.
Principal--Glaser Consulting
Lecturer/Adjunct Faculty--SDSU/USD/AIU
President, San Diego Chapter of
American Statistical Association
3115 4th Avenue
San Diego, CA 92103
phone: 619-220-0602
fax: 619-220-0412
email: glaser...@sbcglobal.net
website: www.glaserconsult.com

SR Millis

15 Nov 2007, 4:29:48 p.m.
to MedS...@googlegroups.com
Hoenig, J. M., & Heisey, D. M. (2001). The abuse of
power: The pervasive fallacy of power calculations for
data analysis. The American Statistician, 55(1),
19-24.

SR Millis

--- "Bland, M." <mb...@york.ac.uk> wrote:
> Can anybody suggest a good reference for the
> non-statistician about why
> retrospective power calculations should not be done?
> I recall this
> being discussed in the past.


Scott R Millis, PhD, MEd, ABPP (CN,CL,RP), CStat
Professor & Director of Research
Dept of Physical Medicine & Rehabilitation
Wayne State University School of Medicine
261 Mack Blvd
Detroit, MI 48201
Email: smi...@med.wayne.edu
Tel: 313-993-8085
Fax: 313-966-7682

kornbrot

17 Nov 2007, 4:35:50 a.m.
to MedS...@googlegroups.com
Another good source is:
http://www.childrens-mercy.org/stats/size/posthoc.asp

‘dr mean’ has much useful advice in a form that is suitable for non-experts.
It is backed up by solid peer-refereed sources with links, enclosed below for information – but you need the actual page for the links.
Some, but not all, have already been suggested.

There is a movement suggesting that, post hoc, confidence intervals are ALWAYS superior to power calculations. Personally, I am not convinced [reasons in a future post], but these arguments certainly need to be more widely known.

    Negative results of randomized clinical trials published in the surgical literature: equivalency or error? J. B. Dimick, M. Diener-West, P. A. Lipsett. Arch Surg 2001: 136(7); 796-800. [Medline]

    Post hoc power analysis--another view. J. Fogel. Pharmacotherapy 2001: 21(9); 1150. [Medline] [Full text]

    Post hoc power analysis: an idea whose time has passed? M. Levine, M. H. Ensom. Pharmacotherapy 2001: 21(4); 405-9. [Medline] [Abstract] (Sample Size, Post Hoc Power)

    The use of predicted confidence intervals when planning experiments and the misuse of power when interpreting results. Steven Goodman. Annals of Internal Medicine 1994: 121(3); 200-206. [Medline]

    Resolving discrepancies among studies: the influence of dose on effect size. I. Hertz-Picciotto, R. R. Neutra. Epidemiology 1994: 5(2); 156-63. [Medline]

    The Abuse of Power: The Pervasive Fallacy of Power Calculations for Data Analysis. John M. Hoenig, Dennis M. Heisey. The American Statistician 2001: 55(1); 19-24.

    The Overemphasis On Power Analysis. Thomas Knapp. Nursing Research 1996: 45(6); 379. [Medline]

    Some Practical Guidelines for Effective Sample Size Determination. R.V. Lenth. The American Statistician 2001: 55(3); 187-193. [PDF]

    Confidence limit analyses should replace power calculations in the interpretation of epidemiologic studies. A. H. Smith, M. N. Bates. Epidemiology 1992: 3(5); 449-52. [Medline]

Best

Diana




john l moran

17 Nov 2007, 8:50:14 p.m.
to MedS...@googlegroups.com

This reference may be illustrative of the “movement” referred to earlier in the thread:

 

Anthony J. Onwuegbuzie and Nancy L. Leech. Post Hoc Power: A Concept Whose Time Has Come. Understanding Statistics 3 (4):201-230, 2004.

 

John L Moran

 

 

Department of Intensive Care Medicine

The Queen Elizabeth Hospital

28 Woodville Road

Woodville SA 5011

Australia

Tel 61 08 82227464

Fax 61 08 2226045

Mobile 0414 267 529

E-mail: john....@adelaide.edu.au

Doug Altman

19 Nov 2007, 6:05:08 a.m.
to MedS...@googlegroups.com, MedS...@googlegroups.com
I was unaware of this paper. From its title it takes a diametrically opposite view from those other references on this theme that I am familiar with (but I do not know all of the references that have been circulated). Thus I do not think this paper by Onwuegbuzie and Leech is part of the movement mentioned, which was to condemn such tests, but rather represents an opposite movement.

Google quickly told me that the paper by Onwuegbuzie and Leech can be found as a preprint at
http://www.eric.ed.gov/ERICDocs/data/ericdocs2sql/content_storage_01/0000019b/80/1a/9e/fd.pdf
I don't know how different it might be from their 2004 publication. Nor have I had time to read this paper, but it is clear that they advocate post hoc power calculations when results are non-significant. I would take some persuading that this is a sensible idea.

Doug

_____________________________________________________

Doug Altman
Professor of Statistics in Medicine
Centre for Statistics in Medicine
Wolfson College Annexe
Linton Road
Oxford OX2 6UD

email:  doug....@cancer.org.uk
Tel:    01865 284400 (direct line 01865 284401)
Fax:    01865 284424
www:     http://www.csm-oxford.org.uk/

EQUATOR Network - resources for reporting research
www: http://www.equator-network.org/



Gary Collins

19 Nov 2007, 6:28:25 a.m.
to MedS...@googlegroups.com
There's unfortunately also

Thomas L (1997). Retrospective power analysis. Conservation Biology, 11(1): 276--280.

which tends to cautiously promote post-hoc power calculations under certain conditions.

Gary

John Whittington

19 Nov 2007, 9:51:04 a.m.
to MedS...@googlegroups.com, MedS...@googlegroups.com
At 11:05 19/11/2007 +0000, Doug Altman wrote (in part):

>I don't know how different it might be from their 2004 publication. Nor
>have I had time to read this paper, but it is clear that they advocate
>post hoc power calculations when results are non-significant. I would take
>some persuading that this is a sensible idea.

I guess I'm asking for trouble by expressing this view, but I am a little
heartened to see that I am probably not totally alone.

I see no need for post-hoc power calculations for consumption by
Statisticians; for them, confidence intervals should tell them all they
need to know. However, in the specific case in which:

(a)...the results are 'non-significant', but of a magnitude at least as
great as the 'design value' for the trial
AND
(b)...it transpires that the initial sample size estimation was based on an
under-estimate of variability

... then, I think that post-hoc power calculations may provide a better way
to convey to non-statisticians 'what went wrong'. From the investigator's
viewpoint, they were advised that (on the basis of available information) a
certain sample size would give them a high (say 90%) chance of detecting
their 'minimum effect of interest' as significant. Having undertaken a
trial of the size advised and finding a treatment effect greater (perhaps
much greater) than that 'minimum effect of interest', they may well have
some difficulty in understanding 'what went wrong'. One could explain the
reason (variability much greater than predicted) and even try to
illustrate it with CIs, but I suspect that the explanation they would find
easiest to understand is that, with the benefit of hindsight (knowledge
of the actual variability encountered), the trial of that sample size had
only a much lower chance (quantified by the post-hoc power calculation) of
detecting 'as significant' an effect of that magnitude.

So, in that particular situation, I do see some merit and justification (in
terms of 'communication') in undertaking a post-hoc power calculation - but
I dare say that many/most will probably disagree!
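To make the scenario concrete, here is a minimal numerical sketch (Python with scipy, invented numbers, and a simple normal approximation rather than the exact noncentral t): the trial is sized under one assumed SD, the observed difference comes out larger than the design value yet non-significant because the SD was badly underestimated, and the 'hindsight' power for the design effect, recomputed with the observed SD, is low.

import numpy as np
from scipy.stats import norm

def power(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test for a true mean
    difference `delta`, common SD `sd`, and equal group sizes."""
    se = sd * np.sqrt(2 / n_per_group)
    z_crit = norm.ppf(1 - alpha / 2)
    return norm.cdf(delta / se - z_crit) + norm.cdf(-delta / se - z_crit)

design_delta, design_sd, n = 5.0, 8.0, 54          # planning assumptions
print(f"planned power: {power(design_delta, design_sd, n):.2f}")      # ~0.90

# What the trial actually produced: a difference larger than the design value,
# but with a much larger SD than was assumed at the planning stage.
observed_diff, observed_sd = 6.0, 18.0
se = observed_sd * np.sqrt(2 / n)
z = observed_diff / se
p = 2 * norm.sf(abs(z))
ci = (observed_diff - 1.96 * se, observed_diff + 1.96 * se)
print(f"observed: p = {p:.2f}, 95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")  # non-significant

# 'With what we now know': power of this design to detect the *design* effect,
# using the observed SD in place of the optimistic planning value.
print(f"hindsight power: {power(design_delta, observed_sd, n):.2f}")  # ~0.30

Note that the hindsight power uses only the pre-specified effect of interest; the observed effect never enters the calculation.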

Kind Regards,


John

----------------------------------------------------------------
Dr John Whittington, Voice: +44 (0) 1296 730225
Mediscience Services Fax: +44 (0) 1296 738893
Twyford Manor, Twyford, E-mail: Joh...@mediscience.co.uk
Buckingham MK18 4EL, UK
----------------------------------------------------------------

BXC (Bendix Carstensen)

19 Nov 2007, 10:05:56 a.m.
to MedS...@googlegroups.com
John,
It seems to me that you implicitly assume that another adequately
powered trial would yield the same point estimate and just a narrower
c.i. than the current one.

Which is:
1) not correct
2) the likely message that gets across to clinicians

Hence you have provided an excellent argument why post-hoc power
calculations should not be performed.

Best,
Bendix Carstensen


John Whittington

19 Nov 2007, 10:24:11 a.m.
to MedS...@googlegroups.com
At 16:05 19/11/2007 +0100, BXC (Bendix Carstensen) wrote:

>John,
>It seems to me that you implicitly assume that another adequately
>powered trial would yield the same point estimate and just a narrower
>c.i. than the current one.
>Which is:
>1) not correct
>2) the likely message that gets across to clinicians
>Hence you have provided an excellent argument why post-hoc power
>calculations should not be performed.

I don't think I was making that assumption. That is why I was careful to
speak in terms of 'chances' (i.e. probability/power) of getting a
significant result, not a guarantee that an adequately-powered trial
definitely _would_ have produced a significant result (of any mean magnitude).

So, I'm certainly NOT suggesting that investigators should be led to
believe that they would necessarily have got a 'significant' result (of the
same magnitude or whatever) had they conducted a trial with a sample size
calculated using a 'crystal ball' - and I would be at pains to make sure
that they did not get that impression. However, in answer to their
question about 'what went wrong', it seems to me very reasonable to
demonstrate that, 'with what we now know', it would never have been
suggested that the sample size used would be adequate to give a reasonable
power ('chance of detecting a result as significant').

I would say much the same in relation to any crucial (non-statistical)
assumptions which went into the trial design which proved to have been
incorrect. Again, without making any suggestions about what the result
would have been if correct assumptions had been made, I would want to point
out that we would never have considered the study as designed to be
adequate/satisfactory if we had known 'what we know now'.

BXC (Bendix Carstensen)

19 Nov 2007, 10:39:59 a.m.
to MedS...@googlegroups.com
I cannot help wondering what the average clinician would think of
"what we know now" given a non-significant trial result.

I bet that deep in their souls they will think that the estimated
improvement of, say, 20% (CI (-5, +45)%) is the TRUE effect, even though
we can persuade them to think otherwise.

And hence, as Hoenig and Heisey point out, essentially we are just
providing a deterministic transformation of the p-value, and trying in
vain to convince clinicians that this is all there is to it while they
believe something important has been derived for them - most likely
to the effect that there IS an effect of 20%, we were just not good
enough to show it. It is important to maintain that there is no evidence
of effect, and that the CI is the end of the trial, so to speak.
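For a simple two-sided z-test the Hoenig and Heisey point can be checked directly: 'observed power' (power evaluated at the observed effect and SD) is a fixed, decreasing function of the observed p-value, so it carries no information beyond the p-value itself. A small sketch (Python, illustrative only):

from scipy.stats import norm

def observed_power(p, alpha=0.05):
    """'Observed power' of a two-sided z-test, recovered from its p-value alone."""
    z_obs = norm.ppf(1 - p / 2)          # |z| implied by the two-sided p-value
    z_crit = norm.ppf(1 - alpha / 2)
    return norm.cdf(z_obs - z_crit) + norm.cdf(-z_obs - z_crit)

for p in (0.5, 0.2, 0.1, 0.05, 0.01):
    print(f"p = {p:<4}  observed power = {observed_power(p):.2f}")
# p = 0.05 always maps to an observed power of about 0.5, whatever the
# sample size, SD or observed effect happened to be.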

Best,
Bendix


John Whittington

19 Nov 2007, 11:07:13 a.m.
to MedS...@googlegroups.com
At 16:39 19/11/2007 +0100, BXC (Bendix Carstensen) wrote:

>I cannot help wondering what the average clinician would think of
>"what we know now" given a non-significant trial result.
>
>I bet that deep in their souls they will think that the estimated
>improvement of, say, 20% (CI (-5, +45)%) is the TRUE effect, even though
>we can persuade them to think otherwise.
>
>And hence, as Hoenig and Heisey point out, essentially we are just
>providing a deterministic transformation of the p-value, and trying in
>vain to convince clinicians that this is all there is to it while they
>believe something important has been derived for them - most likely
>to the effect that there IS an effect of 20%, we were just not good
>enough to show it. It is important to maintain that there is no evidence
>of effect, and that the CI is the end of the trial, so to speak.

I think there is probably something of a misunderstanding here.

I hope we can all agree that the first thing to be done is to impress upon
the investigators that 'the result is the result' (i.e. 'no evidence of
effect'), and that no amount of waving of statistical 'magic wands' will
alter that.

I am talking about what happens after that. If a trial of adequate power
has produced 'no evidence of effect', then that will often mean that an
investigator (and, perhaps even more so, a trial sponsor) will not see any
need/merit in investigating the treatment (or whatever) any further. On
the other hand, if it can be shown that the study had proved to be
seriously under-powered, then there could well be a desire to conduct
further, adequately powered, trials.

Bruce Weaver

19 Nov 2007, 12:12:59 p.m.
to MedStats
On Nov 19, 6:05 am, Doug Altman <doug.alt...@cancer.org.uk> wrote:
> I was unaware of this paper. From its title it
> takes a diametrically opposite view from those
> other references on this theme that I am familiar
> with (but I do not know all of the references
> that have been circulated). Thus I do not think
> this paper by Onwuegbuzie and Leech is part of
> the movement mentioned, which was to condemn such
> tests, but rather represents an opposite movement.
>
> Google quickly told me that the paper by
> Onwuegbuzie and Leech can be found as a preprint at
> http://www.eric.ed.gov/ERICDocs/data/ericdocs2sql/content_storage_01/...
> I don't know how different it might be from their
> 2004 publication. Nor have I had time to read
> this paper, but it is clear that they advocate
> post hoc power calculations when results are
> non-significant. I would take some persuading that this is a sensible idea.
>
> Doug

The published article can be downloaded here if your institution
(unlike mine) has a subscription to the journal:

http://www.leaonline.com/doi/abs/10.1207/s15328031us0304_1


--
Bruce Weaver
bwe...@lakeheadu.ca
www.angelfire.com/wv/bwhomedir
"When all else fails, RTFM."

kornbrot

19 Nov 2007, 2:20:48 p.m.
to MedS...@googlegroups.com
Recommendations for post hoc power may depend on whether differences are significant.
SIGNIFICANT
When results are ‘significant’, most suggest that confidence intervals are the most INFORMATIVE way of presenting the results.
Some, including me, consider that effect sizes are also informative, as one may want to know whether the CI is small because of a large sample size or because of small within-group variability.
NON-SIGNIFICANT
Much more controversial. I have surveyed expert opinion on how to interpret NON-SIGNIFICANT results.
Experts disagree! See http://web.mac.com/kornbrot/iweb/KornbrotNonSignificantSummary.htm

A priori power is always recommended to support the determination of N, but it depends on an assumption about the population SD.
The actual results may show that the SD was underestimated. In this event, it is reasonable to give a post hoc power based on the new SD estimate.
In such a situation it is INFORMATIVE to report: the magnitude of the difference was large, but due to large sample variability the power [post hoc] to detect an effect that would have large clinical significance was low. The study needs to be repeated with a larger N, now that we know the population has high variability. Study INCONCLUSIVE.
Conversely, the SD may be similar to the a priori estimate, and it is then reasonable to report: the study had a power of 0.95 to detect a clinically important effect [give the real magnitude, not Cohen's ‘small’, ‘medium’, ‘large’]. Conclusion: it is unlikely that a clinically important effect exists. E.g. the MMR vaccine almost certainly does NOT cause autism.
The fact that one can NOT ever prove the non-existence of an effect does not mean that one always acts as if the effect exists!
Indeed, all the work looking for drug side effects is really trying to show that the evidence is sufficient to act ‘as if’ there is NO effect.
In my view it is always important to report power when results are NON-SIGNIFICANT, and post hoc power may be more reliable than a priori power, particularly if the a priori power is based on a small pilot rather than a large prior literature. Of course, the power estimate itself has a CI.

It is also important to distinguish between ‘SUFFICIENT’ reporting and ‘INFORMATIVE’ reporting.
Means, SDs and N are SUFFICIENT for a 2-group comparison of central tendency on a numeric measure.
With these given, CIs, effect sizes, power, df and p(null) may all be calculated.
Nevertheless, many argue that the means, the CI for the difference and N are a more INFORMATIVE way of providing the same SUFFICIENT information
[this would lose the information about group SDs in a between-group comparison, or individual group means in a within-group comparison – this may ‘matter’].
Experts may [and according to our survey do] have different preferences for which set of SUFFICIENT information is most INFORMATIVE.
However, what is most important is that the information reported is indeed SUFFICIENT. A frightening number of our respondents did not choose to report either N or ANY descriptive statistic.
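As a minimal illustration of the SUFFICIENT point, the following Python sketch (made-up group summaries) recovers the difference, effect size, t statistic, p-value and CI from nothing more than the two groups' means, SDs and Ns:

import numpy as np
from scipy.stats import t

# Group summaries only: mean, SD and n for each of two independent groups.
m1, s1, n1 = 27.3, 6.1, 40
m2, s2, n2 = 24.8, 5.7, 40

df = n1 + n2 - 2
sp = np.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)   # pooled SD
se = sp * np.sqrt(1 / n1 + 1 / n2)                         # SE of the difference
diff = m1 - m2

d = diff / sp                                              # standardised effect size
t_stat = diff / se
p = 2 * t.sf(abs(t_stat), df)
ci_low = diff - t.ppf(0.975, df) * se
ci_high = diff + t.ppf(0.975, df) * se

print(f"difference = {diff:.1f}, d = {d:.2f}, t({df}) = {t_stat:.2f}, "
      f"p = {p:.3f}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")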

Best

Diana

Bruce Weaver

19 Nov 2007, 5:22:59 p.m.
to MedStats
On Nov 19, 2:20 pm, kornbrot <d.e.kornb...@herts.ac.uk> wrote:
> See http://web.mac.com/kornbrot/iweb/KornbrotNonSignificantSummary.htm
>
> A priori power is always recommended to support determination of N, but depends on assumption about population SD.


No argument there.


> Actual result may show an underestimate of SD. In this event, it is reasonable to give a post hoc power based on new SD estimate.


I think the case being discussed would have a larger SD in the
observed data than was used for the a priori sample size estimate,
would it not?


> In such a situation it is INFORMATIVE to report: the magnitude of the difference was large, but due to large sample variability the power [post hoc] to detect an effect that would have large clinical significance was low.


I think the key words there are "to detect an effect that would have
large clinical significance". This sounds like a follow-up power
analysis that uses a revised estimate of the SD (based on the observed
data), but keeps the same measure of effect size as the a priori
sample size estimate.

As I understand it, this is not what is typically meant by "post hoc"
power analysis. I *think* that standard post hoc power analysis uses
both the SD and the effect size measure from the observed data. And
in that case, as Russell Lenth observes (in his "Two Bad Habits"
article), observed power is just a transformation of the observed p-
value.

I can see some merit in doing a follow-up power analysis that uses a
revised estimate of the SD. But using the observed effect size
doesn't make a lot of sense to me. Unless (I suppose) it has
convinced you that you were previously wrong about how large an effect
is practically important.

kornbrot

20 Nov 2007, 3:23:08 a.m.
to MedS...@googlegroups.com
I agree
There is, unfortunately, more than one meaning of ‘post hoc power analysis’.
To be clear:
  1. Post hoc power analysis using the revised OBSERVED SD, but NOT the observed effect magnitude, is informative when results are non-significant. This interpretation is what is usually defended by defenders of post hoc power, and it is a reasonable position. Furthermore, defenders such as the reference cited give algorithms for calculating this reasonable interpretation, rather than the misleading interpretation provided by SPSS etc.
BUT
 2. Post hoc power analysis using the revised OBSERVED SD and the revised OBSERVED effect magnitude is pointless and misleading. Of course the power is low – that is why the result is non-significant. Unfortunately, this is what is supplied by packages such as SPSS, and it is the interpretation attacked by many critics of post hoc power. The criticism is entirely justified, and very necessary, because of its availability in packages.
Bruce Weaver suggests that the correct terminology for the first is ‘follow-up power analysis’. It seems to me an excellent idea to encourage the use of this more accurate terminology, BUT there will still be many out there using a different interpretation.
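A small sketch of the contrast, using a normal approximation and invented numbers: meaning 1 evaluates power at the pre-specified clinically important effect with the observed SD, while meaning 2 ('observed power', as in SPSS) evaluates it at the observed effect and is low whenever the result is non-significant.

from scipy.stats import norm

def power(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test."""
    se = sd * (2 / n_per_group) ** 0.5
    z_crit = norm.ppf(1 - alpha / 2)
    return norm.cdf(delta / se - z_crit) + norm.cdf(-delta / se - z_crit)

n = 50                   # per group
design_effect = 5.0      # clinically important difference, fixed in advance
observed_effect = 2.0    # what the trial actually showed (non-significant)
observed_sd = 12.0

# Meaning 1: follow-up power for the pre-specified effect, with the observed SD.
print("follow-up power  :", round(power(design_effect, observed_sd, n), 2))
# Meaning 2: 'observed power' at the observed effect -- low by construction,
# and no more than a restatement of the non-significant p-value.
print("'observed power' :", round(power(observed_effect, observed_sd, n), 2))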

Best

Diana



kornbrot

20 Nov 2007, 3:28:43 a.m.
to MedS...@googlegroups.com
BUT given a low-power test for side effects, a clinician would probably assume the drug was OK.
Caveat emptor

Diana

BXC (Bendix Carstensen)

20 Nov 2007, 5:53:18 a.m.
to MedS...@googlegroups.com
> -----Original Message-----
> From: MedS...@googlegroups.com
> [mailto:MedS...@googlegroups.com] On Behalf Of kornbrot
> Sent: 20. november 2007 09:29
> To: MedS...@googlegroups.com
> Subject: {MEDSTATS} Re: retrospective power calculations
>
> BUT given a low-power test for side effects, a clinician would
> probably assume the drug was OK. Caveat emptor

Exactly, and that is why it is essential to communicate the information
that the data are also compatible with quite a high rate of side-effects,
and that this cannot be explained away by statistical mumbo-jumbo.

Bendix

John Whittington

20 Nov 2007, 6:02:43 a.m.
to MedS...@googlegroups.com
At 14:22 19/11/2007 -0800, Bruce Weaver wrote:

>On Nov 19, 2:20 pm, kornbrot <d.e.kornb...@herts.ac.uk> wrote:

> > A priori power is always recommended to support determination of N, but
> depends on assumption about population SD.
> > Actual result may show an underestimate of SD. In this event, it is
> reasonable to give a post hoc power based on new SD estimate.
> > In such a situation it is INFORMATIVE to report: the magnitude of the
> difference was large, but due to large sample variability the power [post
> hoc] to detect an effect that would have large clinical significance was low.

Exactly - and that is the situation I've been discussing, and the
particular situation in which I've been 'defending' the concept of post-hoc
power calculations.

>I think the case being discussed would have a larger SD in the
>observed data than was used for the a priori sample size estimate,
>would it not?

I think that Diana and you are saying the same thing - i.e. that the SD
figure used for the a priori sample size estimation proved to be an
underestimate of the actual SD observed in the study.

>I think the key words there are "to detect an effect that would have
>large clinical significance". This sounds like a follow-up power
>analysis that uses a revised estimate of the SD (based on the observed
>data), but keeps the same measure of effect size as the a priori
>sample size estimate.

Indeed - and, as above, that's what I've been discussing and 'defending'.

>As I understand it, this is not what is typically meant by "post hoc"
>power analysis. I *think* that standard post hoc power analysis uses
>both the SD and the effect size measure from the observed data. And
>in that case, as Russell Lenth observes (in his "Two Bad Habits"
>article), observed power is just a transformation of the observed p-
>value.

Ah, if that interpretation of the terminology is correct, then I have not
been talking about 'post hoc power analysis'!! As Bruce, Lenth and many
others have observed, to calculate 'power' retrospectively on the basis of
the observed variability AND the observed effect size would seem to be just
plain silly - and, as they say, mathematically just another way of
presenting what is effectively a p-value from a hypothesis test or a
CI. Even if one forgets the mathematics, the whole concept of 'the power
to detect something that has already been observed' is more than a little
questionable - not that much different from looking for a 'probability'
that last week's lottery had a particular result!

>I can see some merit in doing a follow-up power analysis that uses a
>revised estimate of the SD. But using the observed effect size
>doesn't make a lot of sense to me.

That is certainly my position, and the one I've been trying to present.

>....Unless (I suppose) it has convinced you that you were previously wrong

>about how large an effect is practically important.

Yes, although it's a very dangerous slippery slope to get onto, the
results of a study may make one 'realise' that one's up-front specification
of 'how large an effect is practically important' was incorrect/unrealistic
- but if that were the case, one could engage in those thought processes
simply by looking at the observed effect size, without any need for any
'power calculations'. It also goes without saying that if one decides
retrospectively that the study should have been designed to detect a
smaller effect, the required sample size would have been larger -
again, without any need to 'quantify' that with 'power calculations'. I
therefore do not think that those considerations really represent any
justification for the sort of 'post hoc power calculations' which we all
seem to agree are inappropriate.

BXC (Bendix Carstensen)

20 Nov 2007, 6:14:23 a.m.
to MedS...@googlegroups.com
With John's remarks it seems to be pretty clear that everyone agrees that
post-hoc power in the narrow sense is senseless, as it is just a
transformation of the p-value.

As for the other ways of calculating (post-hoc) power, it seems to
me that they aim at planning the size of a future study based solely on
information from the current one, which to me seems even more futile. There
is nothing wrong with calculations of power (or precision!), but I still
find it hard to understand why it is relevant to discuss future studies
as an integrated part of reporting one.

Best
Bendix Carstensen


Jeremy Miles

20 Nov 2007, 7:12:32 a.m.
to MedS...@googlegroups.com
On 20/11/2007, BXC (Bendix Carstensen) <b...@steno.dk> wrote:
>
> With John's remarks it seems to be pretty clear that everyone agrees that
> post-hoc power in the narrow sense is senseless, as it is just a
> transformation of the p-value.
>

There was some debate on the SPSS list a few years ago about this, and
it turns out that post hoc power (of the kind using the estimates from
the data) isn't a transformation of the p-value in MANOVA (that is, with
multiple outcome variables, to clarify, since MANOVA, like post hoc
power, can mean so many different things). I don't recall that it made
the concept any more useful, though.

[snip]

Jeremy


--
Jeremy Miles
Learning statistics blog: www.jeremymiles.co.uk/learningstats
Psychology Research Methods Wiki: www.researchmethodsinpsychology.com

John Whittington

20 Nov 2007, 7:50:34 a.m.
to MedS...@googlegroups.com
At 04:12 20/11/2007 -0800, Jeremy Miles wrote:

>There was some debate on the SPSS list a few years ago about this, and
>it turns out that post hoc power (of the kind using the estimates from
>the data) isn't a transformation of the p-value in MANOVA (as in with
>multiple outcome variables, to clarify, as MANOVA, like post hoc
>power, can mean so many different things). I don't recall it made
>the concept any more useful though.

Well, yes, once one moves away from the simplest of hypothesis testing
situations, things get more complicated and hence a 'power calculation'
will not literally be a 'transformation of the p-value'. Apart from
anything else, 'a power calculation' is not itself a very straightforward
concept when there are multiple outcome variables. However, I think it
will always remain the case that 'a power calculation' performed using the
observed variability and effect size will be little more than a
manifestation of some (perhaps not the intended/appropriate) hypothesis
test on the data, and therefore neither appropriate nor useful. Indeed,
when the intended analysis is 'more complicated' (as in Jeremy's example),
I would have thought that the argument against that sort of 'power
calculation' would be even stronger (if that is possible!).

Bruce Weaver

26 Nov 2007, 11:00:07 a.m.
to MedStats
On Nov 19, 6:05 am, Doug Altman <doug.alt...@cancer.org.uk> wrote:
> ... it is clear that they advocate
> post hoc power calculations when results are
> non-significant. I would take some persuading that this is a sensible idea.
>
> Doug

I was unable to get the Onwuegbuzie & Leech article through my
library, but Tony Onwuegbuzie sent me a PDF. Here are a couple of
pertinent quotes from the published article. I've also included a
list of the references that are cited in these quotes.

-------------------------
<quote>
On the other hand, the effect of power on a statistically
nonsignificant finding can be assessed more appropriately by using the
observed (true) effect to investigate the performance of an NHST
(Mulaik et al., 1997; Schmidt, 1996; Sherron, 1988). Such a technique
leads to what is often called a post hoc power analysis.
Interestingly, several authors have recommended the use of post hoc
power analyses for statistically nonsignificant findings (Cohen, 1969;
Dayton, Schafer, & Rogers, 1973; Fagely, 1985; Fagley & McKinney,
1983; Sawyer & Ball, 1981; Wooley & Dawson, 1983). </quote> { pp
209-210 }

<quote>
Conveniently, post hoc power analyses can be conducted relatively
easily because some of the major statistical software programs compute
post hoc power estimates. In fact, post hoc power coefficients are
available in SPSS for the general
linear model (GLM). Post hoc power (or "observed power," as it is
called in the SPSS output) "is based on taking the observed effect
size as the assumed population effect, which produces a positively
biased but consistent estimate of the effect" (D. Nichols,^3 personal
communication, November 4, 2002). For example, the post hoc power
procedure for analyses of variance (ANOVAs) and multiple ANOVAs is
contained within the "options" button.^4 It should be noted that due
to sampling error, the observed effect size used to compute the post
hoc power estimate might be very different than the true (population)
effect size, culminating in a misleading evaluation of power. </
quote> { p. 219 }


References

Onwuegbuzie, A.J., & Leech, N.L. (2004). Post Hoc Power: A Concept Whose
Time Has Come. Understanding Statistics, 3(4), 201-230.

Cohen, J. (1969). Statistical power analysis for the behavioral
sciences. New York: Academic.

Dayton, C. M., Schafer, W. D., & Rogers, B. G. (1973). On appropriate
uses and interpretations of power analysis: A comment. American
Educational Research Journal, 10, 231-234.

Fagely, N. S. (1985). Applied statistical power analysis and the
interpretation of nonsignificant results by research consumers.
Journal of Counseling Psychology, 32, 391-396.

Fagley, N. S., & McKinney, I. J. (1983). Reviewer bias for
statistically significant results: A reexamination. Journal of
Counseling Psychology, 30, 298-300.

Mulaik, S. A., Raju, N. S., & Harshman, R. A. (1997). There is a time
and a place for significance testing. In L. L. Harlow, S. A. Mulaik, &
J. H. Steiger (Eds.), What if there were no significance tests? (pp.
65-115). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.

Sawyer, A. G., & Ball, A. D. (1981). Statistical power and effect size
in marketing research. Journal of Marketing Research, 18, 275-290.

Schmidt, F. L. (1996). Statistical significance testing and cumulative
knowledge in psychology: Implications for the training of researchers.
Psychological Methods, 1, 115-129.

Sherron, R. H. (1988). Power analysis: The other half of the coin.
Community/Junior College Quarterly, 12, 169-175.

Wooley, T. W., & Dawson, G. O. (1983). A follow-up power analysis of
the statistical tests used in the Journal of Research in Science
Teaching. Journal of Research in Science Teaching, 20, 673-681.
-------------------------

Clearly, Onwuegbuzie & Leech are advocating post hoc power analysis
that uses both the observed effect size and the observed SD. And I
think the consensus in this thread is that that type of post hoc power
analysis is ill-advised.
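One way to see the bias that D. Nichols's remark (quoted above) alludes to is by simulation. A rough sketch (Python, invented settings, a two-sample z-test with known SD): treating the observed effect as the population effect makes 'observed power' overstate the true power on average when the true power is modest.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, sd, true_delta, alpha = 50, 10.0, 2.9, 0.05     # invented settings
se = sd * np.sqrt(2 / n)                           # SE of the difference in means
z_crit = norm.ppf(1 - alpha / 2)

def power(delta):
    """Two-sided power at a hypothesised true difference `delta`."""
    return norm.cdf(delta / se - z_crit) + norm.cdf(-delta / se - z_crit)

print(f"true power: {power(true_delta):.2f}")

# Simulate many trials' observed differences and the 'observed power' each implies.
obs_delta = rng.normal(true_delta, se, size=50_000)
obs_power = power(np.abs(obs_delta))
print(f"mean 'observed power': {obs_power.mean():.2f}")   # noticeably higher on average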

Bruce Weaver

26 Nov 2007, 11:40:04 a.m.
to MedStats
On Nov 15, 8:09 am, Richard Goldstein <richg...@ix.netcom.com> wrote:
> have a look at Russ Lenth's paper:
>
> http://www.stat.iowa.edu/~rlenth/Power/2badHabits.pdf
>
> Rich

I had read Lenth's "Two Bad Habits" article before, but I'd never
before looked at the Thomas (1997) article Lenth recommends in the
final sentence. It summarizes very nicely a lot of the points that
have been raised in this thread.

Thomas, L. (1997). Retrospective Power Analysis. Conservation Biology,
11, 276-280.