
Sep 21, 2006, 10:00:07 PM

Ray Koopman wrote:

> Thais Minett wrote:

> > Dear All,

> >

> > Is there a rule of thumb to judge if a 95% CI is wide or narrow?

> > If hypothetically I have a 95% CI for the difference in house price

> > means of £10 to £20, I know instinctively that this interval is

> > narrow, but if I have 95% CI for the difference in Cola price means of

> > £1 to £11, I know that this interval is wide. Is there a mathematical

> > method, rather than my opinion, to assess if these £10 intervals are

> > narrow or wide?

> >

> > Thanks,

> > Thaís

Thais had posted this problem in sci.stat.consult and had received
replies over several rounds. I wonder why he is starting anew here?

>

> This is the "arbitrary units" problem, which the use of standardized

> effect size indices (Cohen's d, etc) can solve if the standardizing

> variability is in some sense a natural component of the situation,

That is as much nonsense as Cohen's d, whatever it is!

This is a problem in distinguishing the meaning of "statistically
significant" (which has a very explicit meaning in frequentist
statistics) from the term "practical USEFULNESS".

Any statistician worth his salt would know that a highly
"statistically significant" result can be completely worthless
from a practical point of view of the usefulness of the result.

Conversely, a statistical result that is not statistically significant
at some .05 or .10 level can be very useful.

The two concepts are TOTALLY different in terms of knowing how

to apply statistics sensibly and usefully.

-- Reef Fish Bob.

> as

> opposed to representing some sort of experimental error. The drawback

> is that the CI must then incorporate uncertainty in the estimate of

> variability. This both widens the CI and makes it more difficult to

> calculate.

Sep 23, 2006, 11:06:18 AM9/23/06

to

Reef Fish wrote:

> Ray Koopman wrote:

---- snip ----

>> This is the "arbitrary units" problem, which the use of standardized

>> effect size indices (Cohen's d, etc) can solve if the standardizing

>> variability is in some sense a natural component of the situation,

>

> That is as much nonsense as Cohen's d, whatever it is!

              Xbar(1) - Xbar(2)
Cohen's d = ---------------------
                  Pooled SD
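That formula can be checked with a short Python sketch; the two groups below are made-up numbers, purely for illustration, not data from any post in this thread:

```python
# Cohen's d = (mean1 - mean2) / pooled SD, for two independent samples.
import math

def cohens_d(x, y):
    """Cohen's d using the pooled standard deviation."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    # Sample variances with (n - 1) denominators.
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    pooled_sd = math.sqrt(((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2))
    return (mx - my) / pooled_sd

group1 = [5.1, 4.9, 6.0, 5.5, 5.8]   # hypothetical sample 1
group2 = [4.2, 4.8, 5.0, 4.5, 4.4]   # hypothetical sample 2
print(round(cohens_d(group1, group2), 3))
```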

--

Bruce Weaver

bwe...@lakeheadu.ca

www.angelfire.com/wv/bwhomedir

Sep 24, 2006, 12:09:47 AM9/24/06

to

Bruce Weaver wrote:

> Reef Fish wrote:

> > Ray Koopman wrote:

>

> ---- snip ----

>

> >> This is the "arbitrary units" problem, which the use of standardized

> >> effect size indices (Cohen's d, etc) can solve if the standardizing

> >> variability is in some sense a natural component of the situation,

> >

> > That is as much nonsense as Cohen's d, whatever it is!

>

>               Xbar(1) - Xbar(2)
> Cohen's d = ---------------------
>                   Pooled SD

>

>

> --

> Bruce Weaver

Thanks, Bruce, for providing the reference.

As I said, whatever Cohen used, it's IRRELEVANT to the question of
"practical significance" or "practical usefulness", which is a completely
different concept from "statistical significance".

Cohen's d appears to be nothing more than a TEST STATISTIC
used to determine the "statistical significance" of a test.

There are hundreds of thousands of such test statistics in the

subject of Statistics, but NONE tells you when a result is of any

PRACTICAL value, as in commonsense.

The above is an important AXIOM in my Data Analysis course:
to know the difference between "statistical significance" and
"practical usefulness".

The best example is that of correlation!! The one single statistic

that is abused by more users than any other.

At a .05 significance level, the Pearson correlation is statistically

significant when its ABSOLUTE value is approximately greater

than 2/sqrt(n), for large n.

That result was posted by me somewhere in sci.stat.math.

Just found it, in a November 2005 post:

http://groups.google.com/group/sci.stat.edu/msg/753acb0044a036e3?hl=en&

The 2 came from 1.96 for Z.

What that says is that a correlation coefficient that is greater

than .02 is statistically significant for n = 10,000.

===== excerpt of the PROOF of the asymptotic result

r is significant (two-tailed) if

RF> |R|* sqrt((n-2)/(1 - R*R)) > t(1-alpha/2;(n-2)).

RF> or equivalently, if |R| > t / sqrt((n-2) + t*t)

where t is the critical value at alpha/2 for t with (n-2) df.

Since sqrt((n-2) + t*t) is approximately sqrt(n) for large n,

an easy mnemonic device (using the asymptotic approx,)

is to think of the standard error of r as 1/sqrt(n).

Thus, the r is statistically significant at the 95% level if

|r| > 2/sqrt(100) = 0.2 if n = 100

and |r| > 2/sqrt(10000) = 0.02 if n = 10,000

and so on.

================== end excerpt
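The excerpt's exact cutoff and the 2/sqrt(n) mnemonic can be compared directly. A stdlib-only Python sketch; for large n the t quantile is very close to z = 1.96, so the normal quantile is used in place of t here:

```python
# Exact cutoff for a significant r (two-tailed, alpha = .05) is
# |r| > t / sqrt((n-2) + t*t).  For large n, t ~ z = 1.96, so the
# standard normal quantile is used as an approximation below.
import math
from statistics import NormalDist

z = NormalDist().inv_cdf(0.975)               # ~1.96
for n in (100, 10_000):
    cutoff = z / math.sqrt((n - 2) + z * z)   # the excerpt's formula, t ~ z
    mnemonic = 2 / math.sqrt(n)               # the 2/sqrt(n) rule of thumb
    print(f"n={n:>6}: cutoff {cutoff:.4f}, mnemonic {mnemonic:.4f}")
```

For n = 10,000 the two agree to about four decimal places, which is the point of the mnemonic.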

Has anyone seen scatterplots of correlation r = .2, or .4,
let alone .02? It is completely indistinguishable from a
RANDOM scatter with 0 correlation.

A correlation of .02 is what I call practically USELESS.

A correlation of .98 MAY be USELESS.

There is that SPSS Multiple Regression example I did from the
1975 Manual, where the Multiple R exceeded .98 I believe, but
the result was completely USELESS.

These are the ideas embedded in the notions of "statistical
significance" vs "practical significance".

-- Reef Fish Bob.


> bwe...@lakeheadu.ca

> www.angelfire.com/wv/bwhomedir

Sep 28, 2006, 5:42:15 AM

Dear Reef Fish

I agree with you on most of your answer, especially on the
distinction between significance and practical usefulness. In French, I
usually make the distinction between two quite similar words:
"Significativité" (= significance) and "Signification" (= meaning). But
please let me disagree with you on 1 or 2 points.

> As I said, whatever Cohen used, it's IRRELEVANT to the question of
> "practical significance" or "practical usefulness", which is a completely
> different concept from "statistical significance".

>

> Cohen's d appears to be nothing more than a TEST STATISTIC

> used to determine the "statistical significance" of a test.

First, your interpretation of Cohen's d seems erroneous to me.

Cohen's d is a measure of effect size, just as r, R², Partial Eta², or
Proportional Reduction in Error (Judd & McClelland, 1988). It provides
no information concerning the statistical significance of the
corresponding effects. Cohen's (1988) book on Statistical Power provides
a rule for the interpretation of what can be called a "Small", "Medium",
or "Large" effect size.

Large effect size : Cohen's d = .8 (approx. r = .38),

Medium effect size : Cohen's d = .5 (approx. r = .25),

Small effect size : Cohen's d = .2 (approx. r = .15).
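For what it's worth, the conversion commonly used for two equal-sized groups, r = d / sqrt(d² + 4), can be checked directly; it gives r of roughly .37, .24, and .10 for these three benchmarks, slightly different from the figures quoted above. A sketch, under the equal-group-size assumption:

```python
# Equal-n conversion between Cohen's d and the point-biserial r:
# r = d / sqrt(d^2 + 4).  Assumes two groups of equal size.
import math

def d_to_r(d):
    return d / math.sqrt(d * d + 4)

for label, d in [("large", 0.8), ("medium", 0.5), ("small", 0.2)]:
    print(f"{label:>6}: d = {d} -> r = {d_to_r(d):.2f}")
```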

However, although these indications of effect size remain independent
of the question of practical usefulness, they help one understand that
statistical significance is not a reliable indication of the importance
of the predictor.

> To know the difference between "statistical significance" and

> "practical usefulness".

>

> The best example is that of correlation!! The one single statistic

> that is abused by more users than any other.

>

> At a .05 significance level, the Pearson correlation is statistically

> significant when its ABSOLUTE value is approximately greater

> than 2/sqrt(n), for large n.

>

> A correlation of .02 is what I call practically USELESS.

>

> A correlation of .98 MAY be USELESS.

Second, a correlation of .02 may be useless, but, depending on the

field of research, it can be very useful. Let me quote Rosenthal (1990,

p. 775):

"The Physician's Aspirin Study

At a special meeting held December 18, 1987, it was decided to end

prematurely a randomized double blind experiment on the effects of

aspirin on reducing heart attacks (Steering Committee of the

Physician's Health Study Research Group, 1988). The reason for the

unusual termination of this experiment was that it had become so clear

that aspirin prevented heart attacks (and deaths from heart attacks)

that it would be unethical to continue to give half of the physician

research subjects a placebo. Now what do you suppose was the magnitude

of the experimental effect that was so dramatic as to call for the

termination of this research? Was r² .90, or .80, or .70, or .60, so
that the corresponding rs would have been .95, .89, .84, or .77? No. Well,
was r² .50, .40, .30, or even .20, so that the corresponding rs would
have been .71, .63, .55, or .45? No. Actually, what r² was, was .0011,
with a corresponding r of .034."
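Rosenthal's r here is the phi coefficient of the 2x2 outcome table. The counts below are the figures commonly quoted for the Physicians' Health Study; they are an assumption on the editor's part, not given in this thread, but they reproduce the quoted r and r²:

```python
# Phi coefficient (the Pearson r of two binary variables) for the
# aspirin / heart-attack 2x2 table.  Counts are the commonly quoted
# figures for the Physicians' Health Study (an assumption here).
import math

a, b = 104, 10933    # aspirin group:  MI, no MI
c, d = 189, 10845    # placebo group:  MI, no MI

phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
print(f"r (phi) = {abs(phi):.3f}, r^2 = {phi * phi:.4f}")
```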

So, you are right in insisting on the distinction between usefulness

and significance. In some research, an effect size (r²) of .50 may be

large but meaningless, whereas in another field, an effect size of .001

can be very small, but very meaningful.

Best Regards,

Fabrice.

Sep 28, 2006, 1:19:44 PM

"Fabrice" <FGab...@gmail.com> wrote in message

news:1159436535.1...@h48g2000cwc.googlegroups.com...

Dear Reef Fish

Best Regards,

Fabrice.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++

What has evolved from this "dichotomy" is not encouraging.

In psychology and related "social" sciences, effect sizes have totally
dominated the scene. The statistics textbooks used really downplay
statistics, logical thinking, and mathematics, and all focus on effect sizes.

In economics and in the natural life sciences, statistics has dominated, and
practical significance has been downplayed. So we keep seeing articles
expressing on one hand that "statistical significance" does not indicate
"practical significance", and that the value of papers should be based on
"practical significance".

Then on the other hand, "practical significance" has no "standard", so we are
deluged by papers reporting all kinds of invented tests, weird ideas, and
invented ways to get something out of a PCA, etc. Oh well, it keeps all the
periodical publishers happy with all the subscribers who pay for this
nonsense.

David Heiser

Sep 28, 2006, 8:43:30 PM

Fabrice wrote:

> Dear Reef Fish

>

> I agree with you on most of your answer, especially on the
> distinction between significance and practical usefulness. In French, I
> usually make the distinction between two quite similar words:
> "Significativité" (= significance) and "Signification" (= meaning). But
> please let me disagree with you on 1 or 2 points.

I read all newsgroups from Google. Today, Google has been constipated

since this morning and posts are coming out several hours late.

But one ADVANTAGE of such delay is that David Heiser has already

said many of the things I would have said in his post, especially his

closing paragraph:

DH> Then on the other hand, "practical significance" has no
DH> "standard", so we are deluged by papers reporting all kinds
DH> of invented tests, weird ideas, invented ways to get something
DH> out of a PCA, etc. Oh well, it keeps all the periodical publishers
DH> happy with all the subscribers who pay for this nonsense.

"Practical usefulness", by its very nature, is a highly subjective
matter that is not subject to quantification; any such "standards" are
contrived "nonsense" -- that's a good technical term for it. :-)

I don't really have much to add ... but I'll do a quick read through
just to see your disagreements.

> > Cohen's d appears to be nothing more than a TEST STATISTIC

> > used to determine the "statistical significance" of a test.

It looked like a test statistic, but I KNOW it's nonsense from general

consideration of my paragraphs and David's paragraph above.

>

> Cohen's (1988) book on Statistical Power provides
> a rule for the interpretation of what can be called a "Small",
> "Medium", or "Large" effect size.

Do you REALLY think those rules have any PRACTICAL usefulness?

> Large effect size : Cohen's d = .8 (approx. r = .38),

> Medium effect size : Cohen's d = .5 (approx. r = .25),

> Small effect size : Cohen's d = .2 (approx. r = .15).

Can I write a paper to refine that? I think if the two digits are the

same, such as .33, .44, .55, they are much more useful effect

sizes because they can be correlated to SHOE sizes of 3, 4, 5,

etc. I've never seen any shoe size of .38, have you?

Then there are PRIME sizes, such as .13, .17, .41, etc. that

are clearly more practically useful than numbers that can be

factored into other numbers.

I think Cohen has really opened up a new field, that may be

called Numerological Statistics, that goes hand-in-hand with
his other theories, in which I heard he called the Type-2 error a
probability. :-)

What have you been smoking? Are you saying some study was called

off because the correlations were like .02 and that makes .02 useful

because it stops further wasted efforts?

When it comes to correlation, ALL correlations are practically USELESS.
It doesn't really matter if it's .02, or .2, or .98; there are ALWAYS
better (more USEFUL) ways of expressing the same information by using
different measures.

For that reason, the more I look at the ABUSE in the use of
correlations, the more I appreciate Tukey's saying something to the
effect that using correlations is like "sweeping dirt under the rug
WITH A VENGEANCE" -- it is far worse than hiding dirt.

So, we'll forever disagree on our opinion about correlations.

Anything else?

> So, you are right in insisting on the distinction between usefulness

> and significance. In some research, an effect size (r²) of .50 may be

> large but meaningless, whereas in another field, an effect size of .001

> can be very small, but very meaningful.

That is saying something a little different from the size for
"practical usefulness", which has no standard and no scale.

Your .5 and .001 are in terms of an ABSOLUTE scale. In
that respect, at least correlation is bounded between -1 and 1, so
that it has a "relative scale" of sorts, but it still cannot be used to
judge practical usefulness.

On the other hand, for measures that have NO size limits, any
statement about what's big and what's small is nothing short of
INSANE. Is 100,000,000 large? It could be. But it's
negligible in the budget of the USA. Congress has just
approved the squandering of $7 BILLION for next year's budget
to fight the war in Iraq and Afghanistan -- 7,000,000,000, which
is 70 times that number, for just one tiny, tiny portion of the
national annual budget.

Is .0001 small? It could be astronomically large if it's in
cm units measuring microscopic organisms under a million-power
microscope.

But those who make up national, state, and city budgets will
know what is PRACTICALLY significant or not in THEIR
budgets, and the microbiologists will know how small is HUGE.

As for Cohen, he doesn't know ANYTHING, but contrived

nonsense, to pander to the gullible sociologists.

-- Reef Fish Bob.

Sep 29, 2006, 4:57:35 PM

On 28 Sep 2006 17:43:30 -0700, "Reef Fish"

<Large_Nass...@yahoo.com> wrote:


[snip, much]

>

> As for Cohen, he doesn't know ANYTHING, but contrived

> nonsense, to pander to the gullible sociologists.

>

A few weeks ago, Reef Fish wrote several screens of diatribe

against Jacob Cohen, whose book on the subject was the single

main impetus for statistical power analysis. I suppose that the

book built a practical conclusion from Mosteller's entertaining,

"counter-intuitive" demonstrations about how long a World Series

would have to be to show which team is a bit better.

There yet is no internal evidence that I have noticed, that Reef

Fish had ever browsed or even seen either of Cohen's

much-respected and much-cited textbooks. "Data-free analysis."

--

Rich Ulrich, wpi...@pitt.edu

http://www.pitt.edu/~wpilib/index.html

Oct 2, 2006, 11:21:13 AM

Richard Ulrich wrote:

> On 28 Sep 2006 17:43:30 -0700, "Reef Fish"

> <Large_Nass...@yahoo.com> wrote:

>

> [snip, much]

>

> >

> > As for Cohen, he doesn't know ANYTHING, but contrived

> > nonsense, to pander to the gullible sociologists.

> >

>

> A few weeks ago, Reef Fish wrote several screens of diatribe

> against Jacob Cohen, whose book on the subject was the single

> main impetus for statistical power analysis.

And apparently that's where Richard Ulrich learned his statistics.

Cohen's so-called power analysis is well within the framework of
statistics, EXCEPT when he made blunders such as calling the
Type II error a probability -- which I even defended, saying he
couldn't have done it, when Richard Ulrich said he did, and it was
Jerry Dallal who testified that Cohen did the same thing (at least
once) in his book.

That is sufficient for me to characterize the much-hoopla'd book by
Cohen (never heard of him until Ulrich mentioned him) as "contrived
nonsense" for those in the subculture of sociologists.

> I suppose that the

> book built a practical conclusion from Mosteller's entertaining,

> "counter-intuitive" demonstrations about how long a World Series

> would have to be to show which team is a bit better.

The merit of Mosteller's work stands on its own. No amount of
Richard Ulrich innuendo or smearing will matter in the least.

http://en.wikipedia.org/wiki/Frederick_Mosteller

http://www.amstat.org/about/statisticians/index.cfm?fuseaction=biosinfo&BioID=10

If Richard Ulrich had learned just a LITTLE bit from the Mosteller and

Tukey book on Regression, he wouldn't be making all those blunders

on the subject that he learned from sociologists.

> There yet is no internal evidence that I have noticed, that Reef

> Fish had ever browsed or even seen either of Cohen's

> much-respected and much-cited textbooks. "Data-free analysis."

I can tell you definitively that I have NEVER, and will not EVER,
read any of Cohen's books, nor even anything he has written,

except second hand, via Ulrich, Dallal, and a few others. Based

on what I heard there, it was more than sufficient for ME to

decide that he is no "statistician", nor even one who has any good

ideas.

Richard, clip that paragraph and you can quote me any time,

and save that "there is no internal evidence" that I ever browsed

Cohen's writing. There are so many great statisticians whose

work I haven't had time to read YET, on topics that are secondary

to my interest and to the mainstream of Statistics, that the ONLY
way I would read Cohen would be if my cruise ship sank and I
were stranded on an uninhabited Pacific island, and Cohen's
book floated ashore. Nah ... I changed my mind. I wouldn't
read it even then, in favor of using it to light a fire. :-)

I hope I didn't beat around the bush above on my assessment

of Cohen.

But to be on the serious side, if anyone had cited ANYTHING
by Cohen that's shown to have some value in statistics, I
would have at least gladly considered it. As it was, the only
things Ulrich managed to cite were errors or nonsense written by
Cohen, and the number of times his book had been cited in
Google Scholar (a piece of absolute JUNK in Google, as I
had documented -- it found more citations about my
publications on several subjects in which Tukey and Mosteller
are much better known and much better scholars than I am,
and it missed my entry on "Interactive Data
Analysis" in the Encyclopedia of Statistical Sciences altogether).

But that's the kind of "junk research" Ulrich excels in -- relying

on the worst of information from Google, while he missed his

Statistical EDUCATION from the mainstream textbooks and

papers by, and for, statisticians.
