The StatXact manual could probably be taken as an up-to-date and
'authoritative' reference on small sample statistics. It says:
“Fisher, Pearson and Likelihood Ratio Conditional Tests. Most
statisticians automatically pick one or another of these three exact
tests for p-value computations on a single 2 × 2 table. For one-sided
tests Davis (1986) has shown that the p-values computed by all three
methods are the same. For two-sided tests there can be differences.
Pearson and Fisher tend to have the same power and slightly higher
power than likelihood ratio in most designs, while in some cases,
perhaps characterized by heavily unbalanced designs, likelihood ratio
has highest power (Lydersen and Laake (2003), Kang and Kim (2004)).
Choosing any one of these tests implies that you accept the
statistical concept of conditional inference.” (StatXact Version 8
Manual, 2007; Chapter 17, Two Independent Binomial Samples, p. 287)
So it appears the other options mentioned in the manual are not
significantly better than the familiar Fisher exact test. Most
software packages (including Stata) will compute the Fisher exact
test.
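For anyone without Stata or StatXact to hand, here is a minimal sketch of the same computation in Python using scipy.stats.fisher_exact (the 2 x 2 counts below are made up purely for illustration):

from scipy.stats import fisher_exact

# Hypothetical 2 x 2 table: rows = intervention/control, columns = success/failure
table = [[12, 5],
         [6, 11]]

odds_ratio, p_two_sided = fisher_exact(table, alternative="two-sided")
_, p_one_sided = fisher_exact(table, alternative="greater")

print(f"Odds ratio estimate:         {odds_ratio:.2f}")
print(f"Two-sided Fisher exact p:    {p_two_sided:.4f}")
print(f"One-sided (greater) exact p: {p_one_sided:.4f}")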
Hope this helps.
References cited:
Davis LJ (1986). Exact tests for 2 × 2 contingency tables. The American Statistician, 40(2), 139-141.
Kang SH, Kim SJ (2004). A comparison of the three conditional exact tests in two-way contingency tables using the unconditional exact power. Biometrical Journal, 46(3), 320-330.
Lydersen S, Laake P (2003). Power comparison of two-sided exact tests for association in contingency tables using standard, mid p, and randomized test versions. Statistics in Medicine, 22(24), 3859-3871.
John Uebersax PhD
http://www.john-uebersax.com
There is a statistical effect size based on an arcsine transformation (see Cohen, 1988). However, it is better to use what is seen as an important difference in the particular field, also known as the clinical effect size. So what are the ramifications of a 10% difference?
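For reference, Cohen's arcsine-based effect size for two proportions is h = 2*arcsin(sqrt(p1)) - 2*arcsin(sqrt(p2)). A minimal sketch in Python, using the 20% baseline from the original question and a purely illustrative 10% intervention rate:

import math

def cohens_h(p1, p2):
    # Cohen (1988): effect size for the difference between two proportions,
    # computed on the variance-stabilising arcsine scale
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))

h = cohens_h(0.20, 0.10)
print(f"Cohen's h = {h:.3f}")  # about 0.28, between Cohen's 'small' (0.2) and 'medium' (0.5) benchmarks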
Paul
Dr. Paul R. Swank,
Professor and Director of Research
Children's Learning Institute
University of Texas Health Science Center-Houston
-----Original Message-----
From: zhon...@aol.com
Sent: Jan 13, 2010 1:57 PM
To: meds...@googlegroups.com
Subject: Re: {MEDSTATS} How much should be detected? Comparing two proportions, sample size and power
Hi Peter,
Thank you very much for your help! It is about public health law research. The MRSA rate in each state is the percentage we need to detect. We assume the MRSA rate is 20% if a state has no such public health law. The intervention is the law.
Frank
-----Original Message-----
From: Peter Flom <peterflom...@mindspring.com>
To: meds...@googlegroups.com
Sent: Wed, Jan 13, 2010 10:34 am
Subject: Re: {MEDSTATS} How much should be detected? Comparing two proportions, sample size and power
zhon...@aol.com wrote:
<<<
I am doing sample size and power computation. p1 = 20% for the control group is given; p2 is for the intervention group, and we hope p2 < p1. How large a difference should be detected? 10%? 15%? Or is it totally subjective?>>>
How much of what? With what intervention? This is entirely context specific. It's not *subjective* exactly;
it's just not a statistical question.
Peter
Peter L. Flom, PhD Statistical Consultant Website: http://www DOT statisticalanalysisconsulting DOT com/ Writing; http://www.associatedcontent.com/user/582880/peter_flom.html Twitter: @peterflom
In medicine, this is known as the minimum clinically significant
difference. It is the boundary between a difference so small that no one
would adopt the new intervention on the basis of such a meager change
and a difference large enough to make a difference (that is, to convince
people to change their behavior and adopt the new therapy).
Establishing the minimum clinically relevant difference is a tricky
task, but it is something that should be done prior to any research study.
For binary outcomes, the choice is not too difficult in theory. Suppose
that an intervention "costs" X dollars in the sense that it produces
that much pain, discomfort, and inconvenience, in addition to any direct
monetary costs. Suppose the value of a cure is kX, where k is a number
greater than 1. A value of k less than 1, of course, means that even if you
could cure everyone, the costs would outweigh the benefits of the cure.
For k>1, the minimum clinically significant difference in proportions is
1/k. So if the cure is 10 times more valuable than the costs, then you
need to show at least a 10% better cure rate (in absolute terms) than no
treatment or the current standard of treatment. Otherwise, the cure is
worse than the disease.
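To spell out the arithmetic behind that claim (in my notation, not necessarily Steve's): if the treatment cures a fraction D more patients than the comparator, the expected benefit per patient treated is D * kX, against a cost of X, so adopting the treatment only makes sense when

D * kX >= X, i.e. D >= 1/k.

With k = 10 this gives D >= 0.10, i.e. an absolute improvement of at least 10 percentage points in the cure rate.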
It helps to visualize this with certain types of alternative medicine.
If your treatment is aromatherapy, there is almost no cost involved, so
even a very slight probability of improvement might be worth it. But
Gerson therapy, which involves, among other things, coffee enemas, is a
different story. An enema is reasonably safe, but is not totally risk
free. And it involves a substantially greater level of inconvenience
than aromatherapy. So you'd only adopt Gerson therapy if it helped a
substantial fraction of patients. Exactly how many depends on the dollar
value that you place on having to endure a coffee enema, which I will
leave for someone else to quantify.
If there are side effects associated with the treatment that only occur
in a fraction of the patients receiving the treatment, then the
calculations are a bit trickier, but still possible in theory.
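One hedged sketch of how that extension might look (again my notation, not Steve's): if a side effect occurs in a fraction q of treated patients and carries a cost of cX, the break-even condition becomes

D * kX >= X + q * cX, i.e. D >= (1 + q*c) / k,

so the minimum clinically significant difference rises with both the frequency and the cost of the side effect.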
You explained in a later email that the intervention is passing a law.
Ask a politician how much change they would need to see in order to
justify passing the law, and that becomes your minimum clinically
significant difference.
Of course, no one does this, so typically they use a SWAG (if you don't
know this acronym, you'll have to look it up).
--
Steve Simon, Standard Disclaimer
"The first three steps in a descriptive
data analysis, with examples in PASW/SPSS"
Thursday, January 21, 2010, 11am-noon, CST.
Free to all! Details at www.pmean.com/webinars
Perhaps my greatest concern is that, having given a good definition of
"minimum clinically significant difference", indicated that it can be
tricky to establish, and said (with which I would agree) that one should
nevertheless always try to do this, Steve then devotes most of the rest of
what he writes to the matter of cost-benefit (and/or risk-benefit) issues -
which I personally regard as a very different matter.
To my mind, the design of most research should (and usually does) keep
these two concepts separated. Hence, most studies are designed to detect
an effect which is at least as great as the "minimum clinically significant
difference" (which, as Steve says, means exactly what it says), regardless
of 'costs' - particularly when "direct monetary costs" are part of the
overall 'cost' being considered. First, one wants to know whether a
treatment is 'clinically useful'; if it is, then one subsequently has to
look at 'costs' (monetary and otherwise) in order to decide (on the basis
of a whole host of considerations) whether the clinical usefulness
outweighs the 'costs'.
Next, in terms of the simple examples given (relating essentially to
cost-benefit assessment), the very difficult (many would say
'next-to-impossible') problem is the need to ascribe monetary values (or
some other unified measure) to non-monetary costs - like the "pain,
discomfort and inconvenience" mentioned by Steve, but sometimes even things
like a risk of death. At best, this is pretty arbitrary - and consensus
difficult to achieve given the degree of human diversity.
I think that one of the greatest problems with the mathematical approaches
which we are obliged to take to these situations is that 'statistics'
obviously majors on the concept of what happens 'on average', and on
probabilistic information - whereas the real world concerns individual
patients and their clinicians. Steve touches on this (indicating that
calculations may be 'a bit trickier'), the most common situation being when
major 'costs' (such as side effects) can only be handled in probabilistic
terms - and the problems are even greater when the outcome is not a simple
binary one. This is considerably further complicated by the fact that
different patients (and clinicians) will have very different views of what
one might call 'utility'
(i.e. the degree of risk-averseness) - and therefore will have different
views of situations in which one has to balance the probability of benefit
against the probability of harm. In the real world, it does not even
necessarily follow (as suggested by Steve in his example) that a given
patient or clinician will reject a treatment because ('on average') the
costs outweigh the benefits. A treatment which is ('on average') 'more
likely to kill than cure' can, in some people's minds, represent an
acceptable risk in certain situations.
For all these reasons, I think that it is best to first look to determine
whether a treatment achieves the "minimum clinically significant effect",
per se, and then subsequently try to grapple with the complex and partially
subjective issues of cost-benefit balance.
Returning to the issue of "minimum clinically significant difference",
although I rarely see this said, I would suggest that it can in some cases
be unethical to use that as the basis of a sample size estimation for a
study. If there is (as will often be the case) good reason to believe that
a treatment will result in effects considerably greater than the minimum
that would be considered clinically significant, then (quite apart from
cost/time considerations etc.) it would seem difficult to ethically justify
exposing (to the test treatment) the large number of subjects that would be
required if one were designing to be able to detect an effect which was
only "the minimum clinically significant".
That's how I see it, anyway,
Kind Regards,
John
John
----------------------------------------------------------------
Dr John Whittington, Voice: +44 (0) 1296 730225
Mediscience Services Fax: +44 (0) 1296 738893
Twyford Manor, Twyford, E-mail: Joh...@mediscience.co.uk
Buckingham MK18 4EL, UK
----------------------------------------------------------------
Indeed. But is this really a problem that comes up?
If we are fortunate enough to have good reason to believe there will be a large effect,
then all the researchers I've dealt with would be delighted to be told that they
need only a small N.
The problem is what to do when there is no reason to suspect some particular effect size.
Usually, the dialogue between me and the person requesting a power analysis consists of
me asking for estimates of things and my client telling me to take my best guess ...
Yes, of course - and if it were me that was being asked to undertake a
sample size estimation, it would not be a problem - because I always ask
about the expected magnitude of effect as well as the "minimum clinically
significant" effect magnitude. I would hope that this is what everyone
does but, as I wrote, I have rarely seen this said/written. It seems
almost universal that anything written or taught about sample size
estimation talks in terms of the "minimum clinically significant effect",
without any mention of the possible scenario I was discussing.
As for whether the issue comes up, it certainly does in my experience. For
obvious reasons, it is most likely to arise in relation to
placebo-controlled trials, and particularly when there is no 'pre-existing
treatment' available for whatever condition is being studied. In that
latter situation, almost any degree of effect would probably be regarded as
clinically significant, but the mechanism of action of the treatment might
be such that a very high degree of effect was expected. As an example,
consider a new antibiotic for treating a currently untreatable (maybe
because of the development of resistance to all existing drugs) serious
infection; in that situation, virtually any degree of true efficacy would
probably be regarded as "clinically significant" (i.e. 'better than
nothing"), but if pre-clinical studies had indicate a high level of
activity against the pathogen involved, there would be good reason to
expect a high level of efficacy.
>The problem is what to do when there is no reason to suspect some
>particular effect size.
Of course, and that's the common situation, in which one does need the
concept of "minimum clinically significant effect" - but, as I said, that
concept, per se, has not (at least in my mind) got anything to do with the
'cost' of the treatment.
>Usually, the dialogue between me and the person requesting a power
>analysis consists of
>me asking for estimates of things and my client telling me to take my best
>guess ...
Indeed so - although that is, at least in my experience, much more common
in relation to measures of variability than to the magnitude of effect.
Kind Regards,