On Fri, 30 Apr 2021 05:42:17 -0700 (PDT), Cosine <ase...@gmail.com>
wrote:
>How do we determine if the width of the CI is adequate or too wide?
>
> The corrected data of Table I is given below:
>
>Case  Mean Difference  P-value  95% CI           N
> 1    0.15             0.001    0.05 to 0.25     2000
> 2    2.10             0.005    1.25 to 2.95     1200
> 3    1.30             0.089    -1.10 to 3.70    400
>
Here's some computation showing Cohen's d for each Case.
Cohen's d is the usual recommendation for two-group
comparisons of effect size. That seems very relevant to the
reported title of that paper.
Cohen's d = (m1-m2) / s_w, where m1 and m2 are the group means and
s_w is the pooled within-group SD.
The s_w can be recovered from the t-test. Note that the critical t is
built into the CI: the 95% CI is the difference +/- about 2
(easier than 1.96) standard errors.
The t-test is t = (m1-m2) / s_diff, where s_diff is the standard error of
the difference. I compute it from the common s_w, taking Case 3 (N = 400)
as 200+200.
The variance of a difference is equal to the sum of the variances,
thus,
s_diff= sqrt( s_w**2 /200 + s_w**2 /200)
= sqrt( 2* s_w**2 /200)
= s_w /10
Or, s_w= 10* s_diff .
The full width of a 95% CI (+/- 1.96) is about 4* s_diff.
For Case 3, the width is 3.70 - (-1.10) = 4.8, so that s_diff is 1.2.
Thus s_w is computed as 10 times that, or 12.
Cohen's d would be a "small" effect, 0.11 (from 1.3/12); but
that is less relevant than the fact that "12" is impossible as the
SD for scores between (0,15). If all scores are at 0 and
15, equally distributed, the maximum SD of 7.5 is achieved,
as you get by re-scaling a 0-1 variable to 0-15.
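
A quick check of that 7.5 bound, as a minimal Python sketch. The 200+200
split of scores is only an illustration (any even split at the two
endpoints gives the same population SD), and the variable names are mine.

```python
import statistics

# Hypothetical worst case on a 0-15 scale: half the scores at 0, half at 15.
scores = [0] * 200 + [15] * 200
max_sd = statistics.pstdev(scores)  # population SD

# An evenly split 0/1 variable has SD 0.5; re-scaling to 0-15 multiplies by 15.
rescaled_sd = 15 * statistics.pstdev([0, 1])

print(max_sd, rescaled_sd)
print("Is an SD of 12 possible?", 12 <= max_sd)
```

Both routes give 7.5, so a within-group SD of 12 cannot occur on this scale.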
The same computations for Cases 1 and 2 (taking the groups as 1000+1000
and 600+600) give s_w's of 1.12 and 7.36 (the latter nearly the max of
7.5), and Cohen's d's, respectively, of 0.13 and 0.29. Case 2 has a
moderate difference.
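
The arithmetic for all three Cases can be sketched in Python as follows.
This assumes, as above, two equal groups of N/2, a common within-group SD,
and the rough multiplier of 2 instead of 1.96; the function name sw_and_d
and its parameters are my own, not anything from the paper.

```python
import math

def sw_and_d(mean_diff, ci_low, ci_high, n_total, t_crit=2.0):
    """Recover s_diff, s_w and Cohen's d from a reported 95% CI.

    Assumes two equal groups of n_total/2 and a common within-group SD.
    """
    n_per_group = n_total / 2
    # CI width is about 2 * t_crit standard errors of the difference.
    s_diff = (ci_high - ci_low) / (2 * t_crit)
    # s_diff = s_w * sqrt(1/n + 1/n), so invert to get s_w.
    s_w = s_diff / math.sqrt(2 / n_per_group)
    return s_diff, s_w, mean_diff / s_w

# (mean difference, CI low, CI high, total N) from the quoted Table I.
cases = {
    1: (0.15, 0.05, 0.25, 2000),
    2: (2.10, 1.25, 2.95, 1200),
    3: (1.30, -1.10, 3.70, 400),
}
results = {case: sw_and_d(*values) for case, values in cases.items()}

for case, (s_diff, s_w, d) in results.items():
    print(f"Case {case}: s_diff={s_diff:.3f}  s_w={s_w:.2f}  d={d:.2f}")
```

Run as written, this gives s_w of about 1.12, 7.36 and 12.00, and d of
about 0.13, 0.29 and 0.11 for Cases 1, 2 and 3. Using 1.96 in place of 2
(t_crit=1.96) raises each s_w by about 2% and does not change the picture.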
I don't like to criticize a paper from a distance, that is, without
actually reading it. I'm using the numbers and description,
as given.
Am I all confused, and screwing up? Or is this example, as
it has been presented, totally bad?
>For the data provided by the above paper, the author wrote:
>
>Let us reconsider the above-mentioned hypothetical study. The null hypothesis states that the mean difference between females and males on the GDS-15 (scale ranging from 0 to 15) is zero. Hence, if zero is detected in the 95% CI, the null hypothesis is not rejected. Examples of possible study results, using an α of 5%, are displayed in Table I. ...
>Example 2 is not only statistically significant but also clinically relevant; the difference between females and males on the GDS-15 is approximately two whole points. Moreover, the confidence interval is quite
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>narrow, which indicates that the sample size is large enough to make a proper judgement.
>^^^^^^^^^
> What is the basis for the author to make this judgment?
Knowing the subject matter (almost) always matters.
Females rate higher on typical depression scales (U.S.)
because of non-depressive artifacts, like, TALKING more
with people about everything, including mood. Women
also see doctors more often, which is not entirely accounted
for by pregnancy or menstruation. Thus, such results as
these should be followed by showing that there are items that
/matter/ that are relevant and differ.
>
> The author also wrote:
>Example 3 is not statistically significant. The confidence interval in this example is very large (almost six
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>points), which makes it difficult to draw any firm conclusions. Since the confidence interval in this
>^^^^^^^
> Again, why could the author make this statement? What did it mean by almost 6 points?
That's what he calls 4.8: the interval runs from -1.10 to 3.70,
a width of 4.8, not "almost six". "Clumsy" makes many mistakes.
"Careless" fails to catch them.
>
>example includes both negative and positive values, it is not yet clear if there is a difference between these two groups (if females report more depressive symptoms than males or vice versa). Consequently, this study should be repeated using a larger sample size, which will decrease the width of the confidence interval.
>
They should have started with real data.
--
Rich Ulrich