# Q: When should we adjust the significance level for multiple comparisons?

### Cosine

Apr 27, 2021, 8:28:18 PM
Hi:

Sometimes we need to perform many pairwise comparisons. When we do, we reduce the significance level to limit the Type I error. But when do we need to do so, i.e., reduce the alpha value? Or how many comparisons are too many, such that we ought to reduce the alpha value?

Thanks,

### duncan smith

Apr 28, 2021, 10:05:25 AM
It's generally done to control some global Type I error rate (the
probability of incorrectly rejecting at least one null hypothesis given
that all the nulls are true). The minimum adjustment necessary depends
on how the comparisons are related, but the easiest / most common
adjustment is to divide the desired global rate by the number of
comparisons performed. Beyond that there's no straightforward answer.
There are lots of things other than statistical significance to consider
in practice (power, observed effect size, plausibility etc.).
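
As a rough illustration of the divide-by-the-number-of-comparisons adjustment (usually called the Bonferroni correction), the sketch below computes the per-test level and the chance of at least one false rejection, with and without the adjustment. It assumes the tests are independent, which is an assumption made only for this example, and the function names are made up for illustration.

```python
def bonferroni_alpha(global_alpha: float, m: int) -> float:
    """Per-comparison level: the desired global rate divided by m."""
    return global_alpha / m


def familywise_error(per_test_alpha: float, m: int) -> float:
    """P(at least one false rejection) when all m nulls are true.

    Assumes the m tests are independent, which real comparisons often are not.
    """
    return 1.0 - (1.0 - per_test_alpha) ** m


if __name__ == "__main__":
    global_alpha = 0.05
    for m in (1, 3, 10, 45):
        unadjusted = familywise_error(global_alpha, m)
        adjusted = familywise_error(bonferroni_alpha(global_alpha, m), m)
        print(f"m={m:2d}  unadjusted={unadjusted:.3f}  Bonferroni={adjusted:.3f}")
```

With 10 independent tests each run at 0.05, the chance of at least one false rejection is about 0.40; dividing by 10 brings it back to just under 0.05.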

Duncan

### Rich Ulrich

Apr 28, 2021, 1:53:34 PM
On Tue, 27 Apr 2021 17:28:16 -0700 (PDT), Cosine <ase...@gmail.com>
wrote:

>Hi:
>
> Sometimes we need to perform many pair-comparisons.

When we do, we need to ask ourselves, WHY? What do
we intend to show with our presentation?

> When we need to do so, we would reduce the
>value of a significant level to reduce the type I error.

No, not too good, that is a sloppy statement. I prefer to think of
what we "preserve" rather than "reduce".

Or, using "reduce" - What we sometimes do is reduce the /nominal/
value of the alpha for each test in order to /preserve/ some overall
(global, as Duncan says) Type I error.

In my experience, the easiest way to "preserve" my power of
testing is to reduce the number of /important/ tests. Set up
hypotheses as "main" and subordinate (use hierarchies).
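
One common way to formalise a hierarchy of hypotheses is a fixed-sequence procedure: the hypotheses are ranked before seeing the data, each is tested at the full alpha in that order, and testing stops at the first non-rejection. The sketch below is only one possible reading of "main and subordinate"; the function name and the p-values are hypothetical.

```python
def fixed_sequence_test(ordered_p_values, alpha=0.05):
    """Test hypotheses in their pre-specified order, each at the full alpha.

    Stops at the first non-rejection; later hypotheses are not rejected,
    whatever their p-values. Returns one boolean per hypothesis.
    """
    rejected = []
    still_testing = True
    for p in ordered_p_values:
        still_testing = still_testing and p <= alpha
        rejected.append(still_testing)
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values, listed from most to least important hypothesis.
    print(fixed_sequence_test([0.01, 0.04, 0.20, 0.001]))
```

Note that the fourth hypothesis is not rejected despite its small p-value, because the third one failed first. That is the price of not splitting alpha across the tests.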

Test one variable, with one d.f. for hypothesis, for one hypothesis
that is most important to the study. Or two, or three at the most.
That worked for clinical research. Similar "indicators" could be
tested simultaneously by combining into a composite score.
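
A minimal sketch of the composite-score idea, assuming the indicators are all oriented the same way (higher means "more" on each) and that an equally weighted average of z-scores is acceptable. Both are assumptions for illustration, and `composite_scores` is a made-up name.

```python
from statistics import mean, stdev


def composite_scores(indicators):
    """Average of z-standardised indicators, one score per subject.

    `indicators` is a list of columns; each column holds one indicator's
    values for all subjects, in the same subject order.
    """
    z_columns = []
    for column in indicators:
        mu, sd = mean(column), stdev(column)
        z_columns.append([(x - mu) / sd for x in column])
    # zip(*z_columns) regroups the values subject by subject.
    return [mean(subject) for subject in zip(*z_columns)]


if __name__ == "__main__":
    # Two hypothetical indicators measured on the same three subjects.
    print(composite_scores([[1, 2, 3], [10, 20, 30]]))
```

The single composite then goes into one test, instead of one test per indicator.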

If there are a dozen separate "hypotheses", then I wonder if they
aren't all equally exploratory ... or else, perhaps, deserve equal
attention as "clear hypotheses" that each deserve the same
respect and their own 5% level. - Huge surveys are an example
of studies that may have a dozen hypotheses. By virtue of huge
N, it might be possible to "adjust" for multiple testing, but journals
(wisely) say that you should use your large sample to estimate
effect sizes, and only report the results that are /interesting/, not
the ones that are "significant" by some standard.

> But when do we need to do so, i.e., reduce the
> alpha-value? Or how many is too many that we ought to reduce
> the alpha value?
>
> Thanks,

Do you know about avoiding Type II error?

There is an approach that allows /extra/ Type I error while reducing Type II error.
That is, your nominal p-value can increase to 0.1 or 0.2 or
even 0.5 (I've noted 0.5, while convincing a PI not to use
Benjamini-Hochberg in her research report).

IIRC, that B-H procedure came from electronic communication
theory, rather than clinical research. Its application in clinical
research, as of 15 years ago, was most often a mistake (IMO),
but I can't speak to modern developments.
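
For readers who have not met it, here is a minimal sketch of the Benjamini-Hochberg step-up procedure as it is usually described: sort the p-values, find the largest rank k with p_(k) <= k*q/m, and reject the k smallest. It targets the false discovery rate rather than the global Type I error rate, under independence or certain forms of positive dependence. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per p-value: True if rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= k * q / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values from ten tests.
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(p, q=0.05))
```

On these ten values the procedure rejects the two smallest, whereas dividing 0.05 by 10 would reject only the first. Because it is a step-up rule, the largest rejected p-value can be as high as q itself when every ordered p-value up to it is carried along.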

--
Rich Ulrich