On Tue, 15 Aug 2023 08:10:14 -0700 (PDT), Cosine <ase...@gmail.com> wrote:
>Well, let's consider a more classical problem.
>
> Regarding the English teaching method for high school students, we
> develop a new method (A1) and want to demonstrate if it performs
> better than other methods (A2, A3, and A4) by comparing the average
> scores of the experimental class using different methods. Each
> comparison uses paired t-test. Since each comparison is independent of
> the other, the correct significance level using the Bonferroni test is
> alpha_original/( 4-1 ).
It took me a bit to figure out how this was a classical problem,
especially with paired t-tests -- I've never read that literature
in particular. 'Paired' on individuals does not work because you
can't teach the same material to the same student in two ways
from the same starting point.

Maybe I've got it. 'Teachers' account for so much variance in
learning that the same teacher needs to teach two methods
to two different classes. 'Teachers' are the units of analysis,
comparing success for pairs of methods.

Doing this would be similar to what I've read a little more about,
testing two methods of clinical intervention. What also seems
similar for both is that the PI wants to know that the teacher/
clinician can and will properly administer the Method without too
much contamination.
>
> Suppose we want to investigate if the developed method (A1) is
> better than other methods (A2. A3. and A4) for English, Spanish, and
> German, then the correct alpha = alpha_original/( 4-1 )/3.
From my own consulting world, 'power of analysis' was always
a major concern. So I must mention that there is a very good
reason that studies usually compare only TWO methods if they
want a firm answer: more than two methods means multiple
comparisons, which require larger Ns for the same power, and
funding agencies (US, now) typically care about power. So if
cost/size is a problem, there won't be four Methods or three
Languages.
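To put rough numbers on that, here is a minimal sketch of how the
number of pairs needed grows as the Bonferroni divisor goes from 1
to (4-1) to (4-1)*3. It uses the usual normal approximation for a
two-sided paired t-test, not an exact calculation. The effect size
of 0.5 and the 80% power are assumptions picked for illustration,
and the function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison alpha under a Bonferroni correction."""
    return family_alpha / n_comparisons


def approx_pairs_needed(effect_size, alpha, power=0.80):
    """Approximate number of pairs for a two-sided paired t-test.

    Normal approximation: n = ((z_{1-alpha/2} + z_{power}) / d)^2,
    where d is the mean paired difference divided by the SD of the
    differences. The exact t-based answer is slightly larger.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(((z_alpha + z_power) / effect_size) ** 2)


FAMILY_ALPHA = 0.05
EFFECT_SIZE = 0.5  # assumed standardized difference, illustration only

scenarios = {
    "one comparison (A1 vs A2)": 1,
    "A1 vs A2, A3, A4": 4 - 1,
    "A1 vs A2, A3, A4 in three languages": (4 - 1) * 3,
}

for label, k in scenarios.items():
    alpha = bonferroni_alpha(FAMILY_ALPHA, k)
    n = approx_pairs_needed(EFFECT_SIZE, alpha)
    print(f"{label}: alpha = {alpha:.4f}, about {n} pairs")
```

With teachers as the unit of analysis, each "pair" here is one
teacher teaching two classes, so every extra pair is expensive.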
For the combined experiment, I bring up what I said before:
Are you sure you are asking the question you want? (or that
you need?)

One way to set up a simple design would be to look at the
two-way analysis of Method x Language. The main effect for
Method would matter, and an interaction of Method x Language
would say that the methods don't work the same across languages.
A main effect for Language would mainly be confusing.
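As a rough sketch of what that two-way breakdown computes, the
code below splits the sums of squares for a balanced Method x
Language layout and forms the F ratios. The scores are made up
purely for illustration, the names are hypothetical, and only two
methods are shown to keep it short. It treats observations as
independent, so it ignores the teacher pairing discussed above;
a real analysis would have to account for that.

```python
from statistics import mean

# Made-up scores: scores[method][language] -> outcomes, equal count per cell.
scores = {
    "A1": {"English": [78, 82, 80], "Spanish": [75, 79, 77], "German": [70, 74, 72]},
    "A2": {"English": [72, 76, 74], "Spanish": [74, 78, 76], "German": [69, 73, 71]},
}


def two_way_anova(cells):
    """Sums of squares, df, and F ratios for a balanced two-way layout."""
    methods = list(cells)
    languages = list(cells[methods[0]])
    a, b = len(methods), len(languages)
    n = len(cells[methods[0]][languages[0]])  # replicates per cell

    grand = mean(x for m in methods for l in languages for x in cells[m][l])
    cell = {(m, l): mean(cells[m][l]) for m in methods for l in languages}
    m_mean = {m: mean(cell[m, l] for l in languages) for m in methods}
    l_mean = {l: mean(cell[m, l] for m in methods) for l in languages}

    ss = {
        "method": b * n * sum((m_mean[m] - grand) ** 2 for m in methods),
        "language": a * n * sum((l_mean[l] - grand) ** 2 for l in languages),
        "interaction": n * sum(
            (cell[m, l] - m_mean[m] - l_mean[l] + grand) ** 2
            for m in methods for l in languages
        ),
        "error": sum(
            (x - cell[m, l]) ** 2
            for m in methods for l in languages for x in cells[m][l]
        ),
    }
    df = {
        "method": a - 1,
        "language": b - 1,
        "interaction": (a - 1) * (b - 1),
        "error": a * b * (n - 1),
    }
    ms_error = ss["error"] / df["error"]
    table = {}
    for term in ("method", "language", "interaction"):
        f_ratio = (ss[term] / df[term]) / ms_error
        table[term] = {"SS": ss[term], "df": df[term], "F": f_ratio}
    table["error"] = {"SS": ss["error"], "df": df["error"], "F": None}
    return table


table = two_way_anova(scores)
for term, row in table.items():
    f_text = "" if row["F"] is None else f", F = {row['F']:.3f}"
    print(f"{term}: SS = {row['SS']:.1f}, df = {row['df']}{f_text}")
```

The Method row and the interaction row are the ones that bear on
the question; the Language row only says that some languages score
higher overall.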
Beyond that, there is what I mentioned before: Are you sure
that family-wise alpha error deserves to be protected?

For educational methods -- or clinical ones -- being 'just as good'
may be fine if the teachers and students like it better. In fact, for
drug treatments (which I never dealt with on this level), NIH
had some (maybe confusing) prescriptions for how to 'show
equivalence'.

I say '(confusing)' because I do remember reading some criticism
and contradictory advice -- when I read about it, 20 years ago.
(I hope they've figured it out by now.)
--
Rich Ulrich