Q direct observation of statistical comparison

Cosine

Jun 12, 2023, 11:40:41 AM
Hi:

A formal way to determine whether the effect of one random variable is greater than that of another is to perform a hypothesis test: check whether the difference or ratio of the metric is greater, and whether that difference is statistically significant.

However, are there special cases in which one could determine whether the effect of a random variable is greater than that of another without performing the above formal procedure?

For example, when comparing the salaries of a domestic and a foreign group, suppose the average salaries and their standard errors are (Avg_d, Se_d) and (Avg_f, Se_f). Could we quickly answer the question of which salary is greater by directly inspecting these numbers? Say, when the confidence intervals of the two average salaries overlap greatly.
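
For the summary-statistics case described above, a minimal sketch of a quick check might look like the following. It assumes the two groups are independent and large enough for a normal approximation; the function name and the example numbers are hypothetical, not taken from any real data. It also illustrates that overlapping 95% intervals do not by themselves rule out a significant difference.

```python
import math

def compare_means_from_summary(avg_d, se_d, avg_f, se_f, z_crit=1.96):
    """Compare two independent group means from their means and standard errors.

    Assumption: normal approximation with independent groups (illustrative only).
    """
    diff = avg_d - avg_f
    # For independent groups the variances of the two means add.
    se_diff = math.sqrt(se_d ** 2 + se_f ** 2)
    z = diff / se_diff
    # Individual confidence intervals for each mean.
    ci_d = (avg_d - z_crit * se_d, avg_d + z_crit * se_d)
    ci_f = (avg_f - z_crit * se_f, avg_f + z_crit * se_f)
    overlap = ci_d[0] <= ci_f[1] and ci_f[0] <= ci_d[1]
    return {
        "diff": diff,
        "se_diff": se_diff,
        "z": z,
        "intervals_overlap": overlap,
        "significant": abs(z) > z_crit,
    }

# Hypothetical salaries in thousands: the intervals overlap, yet |z| exceeds 1.96.
result = compare_means_from_summary(50.0, 1.0, 53.2, 1.0)
print(result)
```

With these made-up numbers the two intervals overlap while the z statistic for the difference is about -2.26, so "the intervals overlap" is not a safe shortcut for "no significant difference".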

Rich Ulrich

Jun 12, 2023, 12:25:06 PM
On Mon, 12 Jun 2023 08:40:39 -0700 (PDT), Cosine <ase...@gmail.com>
wrote:
Hmm. You say "effect" a couple of times, suggesting
something more complicated, before you ask about means.

Means and their SDs are the basis of ordinary t-tests.

"Directly observing" the data? Do you want something like this?

https://www.qimacros.com/hypothesis-testing/tukey-quick-test-excel/
Tukey's Quick Test can be used when:

There are two unpaired samples of similar size that overlap each
other. Ratio of sizes should not exceed 4:3.
One sample contains the highest value, the other sample contains
the lowest value. One sample cannot contain both the highest and the
lowest value, nor can both samples have the same high or low value.

By adding the counts of the number of unmatched points on either end,
one can determine the 5%, 1% and 0.1% critical values as roughly 7,
10, and 13 points.
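
The counting rule quoted above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, ties at either extreme simply return a count of 0 (one conservative choice, not part of the quoted description), and the cutoffs 7, 10 and 13 are the rough values given above.

```python
def tukey_quick_count(x, y):
    """Count the non-overlapping points at both ends of two unpaired samples.

    Returns 0 when the test does not apply (one sample holds both extremes,
    or the samples tie at the high or low end).
    """
    hi_x, hi_y = max(x), max(y)
    lo_x, lo_y = min(x), min(y)
    if hi_x == hi_y or lo_x == lo_y:
        return 0  # tie at an extreme: treated here as "no evidence"
    if (hi_x > hi_y) == (lo_x < lo_y):
        return 0  # one sample contains both the highest and lowest value
    high, low = (x, y) if hi_x > hi_y else (y, x)
    upper = sum(1 for v in high if v > max(low))  # points above the other sample
    lower = sum(1 for v in low if v < min(high))  # points below the other sample
    return upper + lower

def tukey_quick_level(count):
    """Map a count to the rough significance level quoted above, or None."""
    for cutoff, level in ((13, 0.001), (10, 0.01), (7, 0.05)):
        if count >= cutoff:
            return level
    return None

# Hypothetical data, 8 points per sample.
a = [10, 11, 12, 13, 14, 15, 16, 17]
b = [14.5, 15.5, 18, 19, 20, 21, 22, 23]
count = tukey_quick_count(a, b)
print(count, tukey_quick_level(count))
```

Here six values of b lie above everything in a, and five values of a lie below everything in b, so the count is 11, which passes the rough 1% cutoff of 10.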

IIRC, the textbook that first showed me this test quoted Tukey
exactly. Tukey described the test AND its critical values in two
sentences. I was disappointed, a few years later, when I saw
that the newer edition of the textbook had dropped the topic.


If you want a full test on ranks, editors will prefer the K-S test
on ranks.

--
Rich Ulrich

Rich Ulrich

Jun 14, 2023, 12:43:15 AM
By the way -- I remembered the Tukey Quick Test because I
kept it in mind and used it a number of times, for my own
confirmation when browsing data.

I've seen a text book (I forget whose) that had an appendix
with different cutoffs for various pairs of sample Ns. But I
would not suggest trying to publish something relying on it.

I speculate that the "4:3" ratio of Ns (mentioned above) is a
pretty good match to where the cutoffs are exact.

Tukey's two sentences did not specify the ratio of sample sizes,
and called it 'approximate'.

--
Rich Ulrich

Bruce Weaver

Jun 28, 2023, 2:27:10 PM
I don't recall hearing about this test before. Apparently, it is sometimes called the Tukey-Duckworth (quick) test.

https://en.wikipedia.org/wiki/Tukey%E2%80%93Duckworth_test

Rich Ulrich

Jun 29, 2023, 12:15:29 AM
On Wed, 28 Jun 2023 11:27:08 -0700 (PDT), Bruce Weaver
<bwe...@lakeheadu.ca> wrote:

>I don't recall hearing about this test before. Apparently, it is sometimes called the Tukey-Duckworth (quick) test.
>
>https://en.wikipedia.org/wiki/Tukey%E2%80%93Duckworth_test

Top-posting? Okay.

Okay. It adds that Duckworth requested a simple test, usable in
the field, and this is what Tukey provided. I'm not surprised if he
gave us some other Quick tests -- so someone added Duckworth?

Tukey was a prolific statistician, with a different perspective from
most of us. I gained useful insights from reading his textbooks,
though I still wonder if they are 'simple' enough to be used in
the intro courses they are written for. I think I got much of my
perspective on the proper use of transformations from his chapters
on the subject.

There is some paper on presenting data with useful graphics (IIRC
the topic rightly) which lists Tukey, whose ideas it presented, as
author #9; a statistician friend said that his professors had referred
to it as "et al. and Tukey" .


>
>
>On Wednesday, June 14, 2023 at 12:43:15 AM UTC-4, Rich Ulrich wrote:
>> On Mon, 12 Jun 2023 12:24:53 -0400, Rich Ulrich

< snip, original problem >

David Jones

Jun 29, 2023, 7:56:57 AM
A problem seems to be in "One sample cannot contain both the highest
and the lowest value, nor can both samples have the same high or low
value."

Is a test a test, if you can't always apply it? Is there some action
advised if the test can't be applied?

Rich Ulrich

Jun 30, 2023, 2:09:19 PM
A philosophical question? "Can't" or "shouldn't, because there
is no power or useful table of p-values"?

Pragmatically -- If I have a computer program for it, my program
will give SOME answer. The table of p-values must be a problem,
but it can return '0' for the sum of counts as a safe answer when
there's a doubt. I wonder how robust the Quick test is when the
data are discrete and (therefore) can have a tie at one end, while
the other end can be counted? Pragmatically, I don't know if the
test is robust against that assumption. Monte Carlo randomization
on all the data values could provide an ad-hoc assessment of p.
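
A rough sketch of that Monte Carlo idea: shuffle the pooled values, recompute the end count each time, and see how often the shuffled count is at least as large as the observed one. The function names, the seed, and the number of shuffles are arbitrary illustrative choices, and the count returns 0 whenever there is a tie at an extreme or one sample holds both extremes, as suggested above.

```python
import random

def end_count(x, y):
    """Quick-test end count; returns 0 when the rule does not apply."""
    hi_x, hi_y, lo_x, lo_y = max(x), max(y), min(x), min(y)
    if hi_x == hi_y or lo_x == lo_y:
        return 0  # tie at an extreme
    if (hi_x > hi_y) == (lo_x < lo_y):
        return 0  # one sample holds both extremes
    high, low = (x, y) if hi_x > hi_y else (y, x)
    return (sum(1 for v in high if v > max(low))
            + sum(1 for v in low if v < min(high)))

def permutation_p_value(x, y, n_perm=5000, seed=0):
    """Ad-hoc randomization p-value for the end count (illustrative only)."""
    rng = random.Random(seed)
    observed = end_count(x, y)
    pooled = list(x) + list(y)
    n_x = len(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if end_count(pooled[:n_x], pooled[n_x:]) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical data with an observed end count of 11.
a = [10, 11, 12, 13, 14, 15, 16, 17]
b = [14.5, 15.5, 18, 19, 20, 21, 22, 23]
p = permutation_p_value(a, b)
print(end_count(a, b), p)
```

Because the reference distribution is built from the actual data values, discrete data with ties are handled by the same shuffling, whatever the tabled cutoffs would say.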

Assumptions?
The K-S rank test as a test for location has the ASSUMPTION that
the distributions are otherwise similar and differ by the location
parameter. When variances are vastly different, the KS test can
'reject' in either direction, depending on which end the counting
starts from.
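
One way to see the unequal-variance issue is to compute the two one-sided K-S distances separately. The sketch below is a plain implementation of the two-sample empirical-CDF distances (the function name and the data are hypothetical); it computes only the statistics, not p-values.

```python
from bisect import bisect_right

def ks_directional(x, y):
    """Return (D_plus, D_minus): the largest amounts by which the empirical
    CDF of x lies above, and below, the empirical CDF of y."""
    xs, ys = sorted(x), sorted(y)
    d_plus = d_minus = 0.0
    for t in sorted(set(xs) | set(ys)):
        fx = bisect_right(xs, t) / len(xs)  # fraction of x at or below t
        fy = bisect_right(ys, t) / len(ys)  # fraction of y at or below t
        d_plus = max(d_plus, fx - fy)
        d_minus = max(d_minus, fy - fx)
    return d_plus, d_minus

# Same center, very different spread (made-up numbers).
narrow = [-1, -0.5, 0, 0.5, 1]
wide = [-10, -5, 0, 5, 10]
print(ks_directional(narrow, wide))
```

For these two samples both distances come out at about 0.4: the wide sample is "ahead" in the lower tail and "behind" in the upper tail, so a large distance here reflects spread, not a shift in location.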

No Power?
I've seen a lot of t-tests and contingency tables computed when
the power is virtually nil. For contingency tables and 'exact' tests,
the power for alpha= 0.05 might be exactly nil, for Ns too small.
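
To make the "exactly nil" point concrete, here is a small sketch that enumerates every possible 2x2 table for given group sizes and finds the smallest two-sided p-value Fisher's exact test could ever return. The function names are hypothetical, and the two-sided rule used (sum the probabilities of all tables no more likely than the observed one) is one common convention, assumed here.

```python
from math import comb

def fisher_two_sided_p(a, n1, c, n2):
    """Two-sided Fisher exact p for a successes out of n1 vs c out of n2."""
    k = a + c  # total successes, held fixed along with the group sizes
    total = comb(n1 + n2, k)

    def prob(i):
        # Hypergeometric probability of i successes in group 1.
        return comb(n1, i) * comb(n2, k - i) / total

    lo, hi = max(0, k - n2), min(n1, k)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(i) for i in range(lo, hi + 1)
               if prob(i) <= p_obs * (1 + 1e-9))

def smallest_attainable_p(n1, n2):
    """Minimum two-sided p over all possible outcomes for these group sizes."""
    return min(fisher_two_sided_p(a, n1, c, n2)
               for a in range(n1 + 1) for c in range(n2 + 1))

for n in (3, 4):
    print(n, smallest_attainable_p(n, n))
```

With 3 per group the most extreme outcome (3 of 3 versus 0 of 3) still gives a two-sided p of 0.1, so rejection at alpha = 0.05 is impossible; with 4 per group the minimum drops to 2/70, about 0.029.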

I have told consultees, "You don't really have a test there, because
the N is too small."

> Is there some action
>advised if the test can't be applied?

Use a test with other assumptions?

--
Rich Ulrich


You are going to be a stickler about assumptions and the
table of p-values?


