Advanced Probability for Dummies?

nomad

Mar 30, 2005, 11:39:24 PM

Are there books that explain things like Cramer-Rao Lower Bounds, Fisher
Iteration, Measurement Theory, Bayesian stuff in a straightforward way? I'm
going to be taking a 2nd-year graduate course on the stuff in the Fall, and
I don't have a strong background in mathematical proofs.

Thanks

Herman Rubin

Mar 31, 2005, 9:00:49 PM

In article <0aL2e.16431$DW.1...@newssvr17.news.prodigy.com>,

>Thanks

Are you sure you have your terms right? None of these are
considered to be probability, but statistics. BTW, did you
mean "measure theory"? It is needed for much, and is a
part of the branch of mathematics known as analysis. It
has lots of proofs. It is really a prerequisite for any
good probability course, which is needed for a good course
in statistics.

--
This address is for information only. I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
hru...@stat.purdue.edu Phone: (765)494-6054 FAX: (765)494-0558

Reef Fish

Mar 31, 2005, 9:43:33 PM

Herman Rubin wrote:
> In article <0aL2e.16431$DW.1...@newssvr17.news.prodigy.com>,
> nomad <no...@hotmail.com> wrote:
> >Are there books that explain things like Cramer-Rao Lower Bounds, Fisher
> >Iteration, Measurement Theory, Bayesian stuff in a straightforward way? I'm
> >going to be taking a 2nd-year graduate course on the stuff in the Fall, and
> >I don't have a strong background in mathematical proofs.
>
> >Thanks

The subject qualifies as a statistical oxymoron. :)

>
> Are you sure you have your terms right? None of these are
> considered to be probability, but statistics.

And most are found in undergraduate statistics courses now.
The "Bayesian stuff" is even treated in many first courses
in statistics, typically in schools of Business.

> BTW, did you
> mean "measure theory"? It is needed for much, and is a
> part of the branch of mathematics known as analysis.

I think he meant "measure theory" too, but I strongly
disagree with you on its "need" or "usefulness" unless he
studies statistics as a branch of mathematics. For an
applied statistician, measure theory is COMPLETELY USELESS.
I can make that statement because I've studied Measure
Theory from Halmos, Kolmogorov-Fomin, and other textbooks
and I have found them to be "completely useless" in applied
statistics or good applications of statistical methods.

> It
> has lots of proofs. It is really a prerequisite for any
> good probability course, which is needed for a good course
> in statistics.

No.

It is necessary for a good course in statistics at a
Mathematical Statistics department, but not necessarily at
even the TOP applied departments of statistics such as
Harvard, Yale, Princeton, et al.

I should add that some of the "best" and "most competent" students
in mathematical statistics (such as an advanced Harvard Ph.D.
student in theoretical statistics holding coveted Fellowships)
MAY not have the slightest idea when it comes to applications
or data analysis. I have an anecdotal example of such a
student who could not come up with ANY suitable data in the
statistical literature for a "multiple regression" project,
when asked to select any data set from anywhere.

Incredible but absolutely true. But said student could prove
difficult theorems in mathematical statistics or probability
with ease.

Let the Buyer Beware.

-- Bob.

nomad

Apr 1, 2005, 3:03:09 AM

"Herman Rubin" <hru...@odds.stat.purdue.edu> wrote in message
news:d2ia0h$2s...@odds.stat.purdue.edu...

> Are you sure you have your terms right? None of these are
> considered to be probability, but statistics. BTW, did you
> mean "measure theory"? It is needed for much, and is a
> part of the branch of mathematics known as analysis. It
> has lots of proofs. It is really a prerequisite for any
> good probability course, which is needed for a good course
> in statistics.

So what are the good books for them?

Herman Rubin

Apr 1, 2005, 9:27:00 PM

In article <1112323413....@f14g2000cwb.googlegroups.com>,
Reef Fish <Large_Nass...@Yahoo.com> wrote:

>Herman Rubin wrote:
>> In article <0aL2e.16431$DW.1...@newssvr17.news.prodigy.com>,
>> nomad <no...@hotmail.com> wrote:
>> >Are there books that explain things like Cramer-Rao Lower Bounds, Fisher
>> >Iteration, Measurement Theory, Bayesian stuff in a straightforward way? I'm
>> >going to be taking a 2nd-year graduate course on the stuff in the Fall, and
>> >I don't have a strong background in mathematical proofs.

>> >Thanks

>The subject qualifies as a statistical oxymoron. :)

>> Are you sure you have your terms right? None of these are
>> considered to be probability, but statistics.

>And most are found in undergraduate statistics courses now.
>The "Bayesian stuff" is even treated in many first courses
>in statistics, typically in schools of Business.

The Bayesian stuff can be treated in first courses, but
not without an understanding (NOT the ability to prove
complicated theorems) of measure theory, which belongs
in high school, instead of all that calculation, which
will have to be done by computers for "real" problems,
anyhow. Probability theory is represented by measure
theory on spaces of total measure 1.

But can one understand why formal Bayesian procedures
may or may not be ridiculous without an understanding
of measure theory? Can one understand the idea of
robust prior Bayes, which I believe should be the type
of procedure used in all practical problems? Can one
understand the problems in a sensible use of Bayes or
classical procedures without it? Can one even see the
basic content of the Neyman-Pearson Lemma? Yes, I can
teach this with high school algebra, but with the idea
of measure, which starts out with measures on finite
spaces. However, this is limited.
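
To make the finite-space idea concrete, here is a minimal sketch (an illustration added to the thread, not taken from the post). It treats two fair dice as a measure space of total measure 1, using nothing beyond sums. The names `omega`, `weight`, `measure`, and `integrate` are hypothetical, chosen only for this example.

```python
from fractions import Fraction
from itertools import product

# Assumed example: the 36 outcomes of two fair dice, each with weight 1/36.
omega = list(product(range(1, 7), repeat=2))
weight = {w: Fraction(1, 36) for w in omega}

def measure(event):
    """Measure of an event (a collection of outcomes): the sum of its point weights."""
    return sum((weight[w] for w in event), Fraction(0))

def integrate(f):
    """Integral of f with respect to the measure, which is the expectation of f."""
    return sum((f(w) * weight[w] for w in omega), Fraction(0))

total = measure(omega)                            # total measure of the space
seven = {w for w in omega if w[0] + w[1] == 7}    # event: the sum is 7
doubles = {w for w in omega if w[0] == w[1]}      # event: both dice agree
p_seven = measure(seven)
p_union = measure(seven | doubles)                # the two events are disjoint
mean_sum = integrate(lambda w: w[0] + w[1])       # expected sum of the dice
```

Because the two events are disjoint, `p_union` equals `p_seven + measure(doubles)`, which is finite additivity stated as arithmetic. Exact fractions are used so that the identities hold without rounding.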

>> BTW, did you
>> mean "measure theory"? It is needed for much, and is a
>> part of the branch of mathematics known as analysis.

>I think he meant "measure theory" too, but I strongly
>disagree with you on its "need" or "usefulness" unless he
>studies statistics as a branch of mathematics. For an
>applied statistician, measure theory is COMPLETELY USELESS.
>I can make that statement because I've studied Measure
>Theory from Halmos, Kolmogorov-Fomin, and other textbooks
>and I have found them to be "completely useless" in applied
>statistics or good applications of statistical methods.

Have you tried to use it? See my paragraph above. There
is a big difference between formal proofs and using the
ideas; neither implies the other. Stochastic differential
equations are being used more and more; these involve
quite careful use of measure theory.

>> It
>> has lots of proofs. It is really a prerequisite for any
>> good probability course, which is needed for a good course
>> in statistics.

>No.

>It is necessary for a good course in statistics at a
>Mathematical Statistics department, but not necessarily at
>even the TOP applied departments of statistics such as
>Harvard, Yale, Princeton, et al.

Much damage has been done by the misapplication of applied
statistics; I know of too many cases where these have used
totally inappropriate methods because they did not try to
find out the real problem, nor could they have considered
it had they known it. A good applied statistician needs to
recognize when it is necessary to invent methods, and to
be aware of what is needed to justify them; I started doing
this more than 60 years ago. Even when the theorems of
measure theory are not used, the thinking is.

I am often requested to repost my five commandments. These are
posted here without exegesis.

For the client:

1. Thou shalt know that thou must make assumptions.

2. Thou shalt not believe thy assumptions.

For the consultant:

3. Thou shalt not make thy client's assumptions for him.

4. Thou shalt inform thy client of the consequences
of his assumptions.

For the person who is both (e. g., a biostatistician or psychometrician):

5. Thou shalt keep thy roles distinct, lest thou violate
some of the other commandments.

The consultant is obligated to point out how their assumptions affect
their views of their domain; this is in the 4-th commandment. But the
consultant should be very careful in the assumption-making process not
to intrude beyond possibly pointing out that certain assumptions make
large differences, while others do not. A good example here is regression
analysis, where often normality has little effect, but the linearity of
the model is of great importance. Thus, it is very important for the
client to have to justify transformations.
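
The regression remark can be checked with a small simulation. The following is a rough sketch under made-up assumptions (sample size, coefficients, and error distributions are all invented here, and `fit_line` and `mean_resid` are hypothetical helper names). Case 1 has a truly linear relationship with skewed, non-normal errors; case 2 has normal errors but a curved relationship fitted with a straight line.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (intercept, slope)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

random.seed(0)
xs = [random.uniform(0, 10) for _ in range(2000)]

# Case 1: linear truth (slope 2), errors exponential shifted to mean zero.
ys_skew = [1.0 + 2.0 * x + (random.expovariate(1.0) - 1.0) for x in xs]
a1, b1 = fit_line(xs, ys_skew)

# Case 2: quadratic truth, normal errors, but a straight line is fitted.
ys_quad = [x * x + random.gauss(0, 1) for x in xs]
a2, b2 = fit_line(xs, ys_quad)
resid = [(x, y - (a2 + b2 * x)) for x, y in zip(xs, ys_quad)]

def mean_resid(lo, hi):
    """Average residual of the case 2 fit for x in [lo, hi)."""
    vals = [r for x, r in resid if lo <= x < hi]
    return sum(vals) / len(vals)

edge = mean_resid(0, 1)     # residuals near the left end of the x range
middle = mean_resid(4, 6)   # residuals near the centre of the x range
```

In case 1 the fitted slope `b1` lands close to the true value despite the skewed errors. In case 2 the residuals are systematically positive at the edge and negative in the middle, so the fitted line misdescribes the data no matter how well behaved the errors are.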

There are, unfortunately, many fields in which much of the activity
consists of using statistical procedures without regard for any assumptions.

>I should add that some of the "best" and "most competent" students
>in mathematical statistics (such as an advanced Harvard Ph.D.
>student in theoretical statistics holding coveted Fellowships)
>MAY not have the slightest idea when it comes to applications
>or data analysis. I have an anecdotal example of such a
>student who could not come up with ANY suitable data in the
>statistical literature for a "multiple regression" project,
>when asked to select any data set from anywhere.

Could the student carry out a multiple regression given the
data and the model? And can the applied statistician know
when NOT to consider a regression?

>Incredible but absolutely true. But said student could prove
>difficult theorems in mathematical statistics or probability
>with ease.

However, for all X, to do a good job of applying X, one must
have an understanding of the concepts of X, and much of the
theory as well.

>Let the Buyer Beware.

Agreed. Let the buyer beware the applied statistician who
does not understand the considerable limitations, which may
require measure theory. Sometimes witch doctors do good,
but much of the time they do not. And the present use of
the statistical religion (there is a paper with this title)
by most of the applied statisticians in medicine does bring
this about; I have seen many questionable papers.

>-- Bob.

Reef Fish

Apr 1, 2005, 11:51:52 PM

Herman Rubin wrote:
> In article <1112323413....@f14g2000cwb.googlegroups.com>,
> Reef Fish <Large_Nass...@Yahoo.com> wrote:

Introduction and non-essential points snipped. The main point of
controversy is the role of "measure theory" in enlightened statistical
data analysis.

My view was clearly stated:

> >I think he meant "measure theory" too, but I strongly
> >disagree with you on its "need" or "usefulness" unless he
> >studies statistics as a branch of mathematics. For an
> >applied statistician, measure theory is COMPLETELY USELESS.
> >I can make that statement because I've studied Measure
> >Theory from Halmos, Kolmogorov-Fomin, and other textbooks
> >and I have found them to be "completely useless" in applied
> >statistics or good applications of statistical methods.

Professor Rubin argues otherwise.

HR> >> It is really a prerequisite for any
HR> >> good probability course, which is needed for a good course
HR> >> in statistics.
>
RF> >No.
RF>
RF> >It is necessary for a good course in statistics at a
RF> >Mathematical Statistics department, but not necessarily at
RF> >even the TOP applied departments of statistics such as
RF> >Harvard, Yale, Princeton, et al.

>
> Much damage has been done by the misapplication of applied
> statistics; I know of too many cases where these have used
> totally inappropriate methods because they did not try to
> find out the real problem, nor could they have considered
> it had they known it. A good applied statistician needs to
> recognize when it is necessary to invent methods, and to
> be aware of what is needed to justify them;

I agree with you TOTALLY on the preceding paragraph, but that
has absolutely no relevance to our debate about the need of
"measure theory" in statistical data analysis!

http://encyclopedia.laborlawtalk.com/Measure_theory

"Measure theory is that branch of real analysis which investigates
sigma algebras, measures, measurable functions, and integrals."

> I started doing this more than 60 years ago.

Indeed your age showed. :-) You should have read more carefully
John Tukey's lengthy paper in the Annals of Mathematical Statistics
on "The Future of Data Analysis" 44 years ago. As well as the papers
by George Box and other eminent "data analysts" since Tukey's paper.

> I am often requested to repost my five commandments. These are
> posted here without exegesis.
>
> For the client:
>
> 1. Thou shalt know that thou must make assumptions.

Impeccable!

> 2. Thou shalt not believe thy assumptions.

Poorly put. Thou shalt try to VALIDATE thy assumptions.

That's now standard practice in enlightened statistical data analysis.
One sponsors one's model (assumption) and then turns around and acts as
the CRITIC of what's assumed. ONLY if the assumptions turn out to be
invalid does one attempt remedies, such as transforming the original
variables, sponsoring a new model, or any of hundreds of other well-known
techniques in data analysis today. It was well expressed by George Box in
a JASA paper circa the 1970s titled "Science and Statistics" <inexact
citation from memory>.

>
> For the consultant:
>
> 3. Thou shalt not make thy client's assumptions for him.
>
> 4. Thou shalt inform thy client of the consequences
> of his assumptions.

No real disagreement here. But for the present discussion/debate, do YOU
inform your client that s/he had better have a good understanding of
"measure theory" <as in Halmos' book on the subject or in Functional
Analysis> before s/he can do the statistical analysis well?

> For the person who is both (e. g., a biostatistician or psychometrician):
>
> 5. Thou shalt keep thy roles distinct, lest thou violate
> some of the other commandments.

Statistics is STATISTICS, as far as the proper application of statistical
methods and procedures is concerned. It makes no difference whether
the practitioner is a biostatistician, psychometrician, business exec, or
a generic student of statistics. For a particular procedure, such as the
well-known regression analysis, whether one is doing the analysis
properly doesn't depend at all on what field the person is in!

>
> The consultant is obligated to point out how their assumptions affect
> their views of their domain; this is in the 4-th commandment. But the
> consultant should be very careful in the assumption-making process not
> to intrude beyond possibly pointing out that certain assumptions make
> large differences, while others do not. A good example here is regression
> analysis, where often normality has little effect, but the linearity of
> the model is of great importance. Thus, it is very important for the
> client to have to justify transformations.
>
> There are, unfortunately, many fields in which much of the activity
> consists of using statistical procedures without regard for any assumptions.

True, but that's using a strawman of those who abuse the use of statistics,
and it has no relevance to whether they know "measure theory" or not,
does it?

I might even venture to suggest that on a PROPORTIONAL basis, there are
more mathematical statisticians who know "measure theory" and who
routinely abuse the use of statistics, than applied statisticians who
know nothing about "measure theory".

Allow me to add a 6th commandment to this discussion:

6. Thou shalt not blow ill-directed hot air and get down to SPECIFICS.

For example, either tell us why "measure theory" is needed to do a
regression analysis properly, or needed to do ANY applied statistical
or data analysis procedure properly.

> >I should add that some of the "best" and "most competent" students
> >in mathematical statistics (such as an advanced Harvard Ph.D.
> >student in theoretical statistics holding coveted Fellowships)
> >MAY not have the slightest idea when it comes to applications
> >or data analysis. I have an anecdotal example of such a
> >student who could not come up with ANY suitable data in the
> >statistical literature for a "multiple regression" project,
> >when asked to select any data set from anywhere.
>
> Could the student carry out a multiple regression given the
> data and the model?

Not THAT student. But he would be perfectly at ease discussing
Lebesgue, Haar, sigma-finite, and other measures in "measure
theory" with you.

> And can the applied statistician know
> when NOT to consider a regression?

You'd better believe it! And most of them don't know anything about
"measure theory" either, but they know how to do statistical data
analysis properly, and with the appropriate models and techniques.
That's what some of us teach them, instead of teaching them how to
prove useless theorems on useless Functional Analysis concepts,
when their goal is to learn how to analyze statistical data properly.

>
> >Incredible but absolutely true. But said student could prove
> >difficult theorems in mathematical statistics or probability
> >with ease.
>
> However, for all X, to do a good job of applying X, one must
> have an understanding of the concepts of X, and much of the
> theory as well.

True. But "measure theory" is NOT one of the needed concepts
to do statistical data analysis well!

>
> >Let the Buyer Beware.
>
> Agreed. Let the buyer beware the applied statistician who
> does not understand the considerable limitations, which may
> require measure theory.

See commandment 6 above. Name ONE common problem for
the applied statistician that requires an understanding of
"measure theory", and you would have at least shown the
"existence" of the problem -- which I haven't seen, and I have
analyzed THOUSANDS of applied REAL data analysis problems
conducted by students in graduate courses or clients in consulting.

> Sometimes witch doctors do good,
> but much of the time they do not. And the present use of
> the statistical religion (there is a paper with this title)
> by most of the applied statisticians in medicine does bring
> this about; I have seen many questionable papers.

Medicine is not exactly one area that excels in the enlightened
practice of statistics and data analysis. So much so one medical
doctor was caught cheating with fake data because he didn't have
the slightest idea that real data could not possibly be as non-random
as he thought random data would look. ;-)

But knocking those who don't do statistics well, or kicking some
strawmen has added NOTHING in support of your argument that
"measure theory" is needed in the enlightened statistical analysis
of Real Data.

Let's get down to earth momentarily. Consider the tens of
thousands of APPLIED books on statistics and statistical
methodology, how many of them even MENTION "measure
theory" as lip service? Why? Because one can use statistics
and do data analysis perfectly (or as perfectly as can be done)
without ANY of the gobbledy-goop in "measure theory", as in

RF> http://encyclopedia.laborlawtalk.com/Measure_theory
RF>
RF> "Measure theory is that branch of real analysis which investigates
RF> sigma algebras, measures, measurable functions, and integrals."

Just do a POLL in this or other statistics newsgroups, and see how many
have even heard of "sigma algebras" and "Harr measure"? LOL

I rest my case.

-- Bob.

Data Matter

Apr 2, 2005, 1:04:26 AM

Measure theory is hard to grasp without a solid background in real
analysis and calculus. You might even need some metric space and
topology as well. It really isn't something you can pick up without
heavy commitment. There are many many books out there. Some
recommended ones are the books by Sidney Resnick (has detailed proofs),
Billingsley, Chung and Capinski. You might take a look at David
Williams's text "Weighing the Odds" -- this is an eclectic one that
covers some probability, measure theory and statistics.

Reef Fish

Apr 2, 2005, 8:33:07 AM

I hate making typos and the keyboard devil routinely made me do it.
But some typos are much worse than others. Here's one:

> Just do a POLL in this or other statistics newsgroups, and see how many
> have even heard of "sigma algebras" and "Harr measure"? LOL

It's "Haar measure" of course. I guess nobody heard of Harr measure
either because it was my typo that I am correcting now before someone
else does it.

-- Bob.

nomad

Apr 2, 2005, 4:05:47 PM

Thanks, I will look into these.

It seems that in most advanced math texts, the proofs are frustratingly
vague -- I'll devise my own explanation, but I'm not sure if I'm right, so
then it leads to the labor-intensive tasks of study groups or office hour
visits. It's nice when a book can just explain things as they are.

"Data Matter" <fun...@gmail.com> wrote in message
news:1112421866.0...@o13g2000cwo.googlegroups.com...

Herman Rubin

Apr 5, 2005, 12:54:28 PM

In article <1112417512.7...@f14g2000cwb.googlegroups.com>,
Reef Fish <Large_Nass...@Yahoo.com> wrote:

>Professor Rubin argues otherwise.

>RF> >No.

>http://encyclopedia.laborlawtalk.com/Measure_theory

My "definition" of measure theory is much wider. There
are the details, and there are the ideas. Probability,
at any level, is represented by measure theory at that
level. Also, I criticize, and some of my colleagues
now criticize, measure theory as starting with the real
line, rather than finite spaces. It becomes much more
understandable when one realizes that everything is
discrete or limits of discrete, rather than having what
can be called "cute" definitions. The earliest use of
integration with respect to a measure was computing a
merchant's bill. I very much object to the way that
expectation is taught in low-level courses; there is
no real difference between discrete distributions and
absolutely continuous ones, except in the tricks one
can use to obtain answers.
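
The merchant's bill and the two kinds of expectation can be put side by side in a short sketch. This is an illustration under invented assumptions: the prices, quantities, and the helper names `expect_discrete` and `expect_density` are hypothetical, and the continuous case is approximated by a plain midpoint sum rather than any library routine.

```python
import math

# Merchant's bill: integrate the price function against the measure that
# assigns each item the quantity bought. Items and numbers are made up.
price = {"flour": 2.5, "salt": 0.75, "oil": 6.0}
quantity = {"flour": 4, "salt": 2, "oil": 1}
bill = sum(price[item] * quantity[item] for item in price)

def expect_discrete(f, pmf):
    """Expectation as a weighted sum over the points of a discrete distribution."""
    return sum(f(x) * p for x, p in pmf.items())

def expect_density(f, density, lo, hi, n=100_000):
    """Expectation as a limit of weighted sums: midpoint rule for f(x) * density(x)."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += f(x) * density(x)
    return total * h

die = {k: 1 / 6 for k in range(1, 7)}
mean_die = expect_discrete(lambda x: x, die)

# Exponential distribution with rate 1, truncated at 50 where the tail is negligible.
mean_exp = expect_density(lambda x: x, lambda x: math.exp(-x), 0.0, 50.0)
```

Both expectations are the same operation, a sum of value times weight; the continuous case only replaces point weights with `density(x) * h` on small cells and takes finer and finer cells.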

Measure theory uses only high school algebra, plus limits.
One cannot avoid limits, and that should be in elementary
school, taught intelligently. Euclid's students had a
rather good understanding of limits, although they could
not define them.

>> I started doing this more than 60 years ago.

>Indeed your age showed. :-) You should have read more carefully
>John Tukey's lengthy paper in the Annals of Mathematical Statistics
>on "The Future of Data Analysis" 44 years ago. As well as the papers
>by George Box and other eminent "data analysts" since Tukey's paper.

I am quite aware of what Tukey has done; much of it had
to be redone or discarded. He was what can be called an
anti-mathematical statistician. On the other hand, Neyman,
who is usually considered an applied statistician, required
his students to learn the basic abstract mathematics he did
not know.

>> For the client:

>Impeccable!

>> For the consultant:

No; as above, I do not consider it a good presentation of the
ideas of measure theory. The user of statistics has to
understand the concepts, not the full formalism, or the proofs.

Measure theory is needed to understand what probability is.

A computer can carry out any procedure "properly", but cannot
decide whether that is the procedure called for by the problem.
Someone cannot decide what "significance tests" or "p values"
or "confidence intervals" are without this understanding.
In fact, I do not believe that these are what should be done,
but they are.

The concepts of measure theory are needed, not the details.

>> >Let the Buyer Beware.

>> Agreed. Let the buyer beware the applied statistician who
>> does not understand the considerable limitations, which may
>> require measure theory.

>See commandment 6 above. Name ONE common problem for
>the applied statistician that requires an understanding of
>"measure theory", and you would have at least shown the
>"existence" of the problem -- which I haven't seen, and I have
>analyzed THOUSANDS of applied REAL data analysis problems
>conducted by students in graduate courses or clients in consulting.

To how many of these have you applied decision theory?
I do not mean Bayesian calculations with a convenient
or intuitively simple prior, but a simultaneous consideration
of all the consequences. This can be done, and robustly.

>> Sometimes witch doctors do good,
>> but much of the time they do not. And the present use of
>> the statistical religion (there is a paper with this title)
>> by most of the applied statisticians in medicine does bring
>> this about; I have seen many questionable papers.

>Medicine is not exactly one area that excels in the enlightened
>practice of statistics and data analysis. So much so one medical
>doctor was caught cheating with fake data because he didn't have
>the slightest idea that real data could not possibly be as non-random
>as he thought random data would look. ;-)

I was not discussing cheating. Few of the numerous
meta-analyses being done are sound, and honesty has nothing
to do with the reason they are unsound.

>But knocking those who don't do statistics well, or kicking some
>strawmen has added NOTHING in support of your argument that
>"measure theory" is needed in the enlightened statistical analysis
>of Real Data.

>Let's get down to earth momentarily. Consider the tens of
>thousands of APPLIED books on statistics and statistical
>methodology, how many of them even MENTION "measure
>theory" as lip service? Why? Because one can use statistics
>and do data analysis perfectly (or as perfectly as can be done)
>without ANY of the gobbledy-goop in "measure theory", as in

Probability is represented by measure theory, and cannot
be understood otherwise. One can learn concepts without
going through the proofs, and vice versa. Too many do the
latter, and I do not endorse it.

One of my former colleagues told me that, when he was asked to
review one of these applied books, he suggested that, as he
opposed burning books, the author should be burned instead.
burned. I question whether I would consider 5% of the
applied books as even sound, in that someone reading them
would know when to use the methods. How to use methods
is generally for computers, and the present pedagogy,
with its emphasis on how, ignores the reasoning needed.

>RF> http://encyclopedia.laborlawtalk.com/Measure_theory
>RF>
>RF> "Measure theory is that branch of real analysis which investigates
>RF> sigma algebras, measures, measurable functions, and integrals."

>Just do a POLL in this or other statistics newsgroups, and see how many
>have even heard of "sigma algebras" and "Harr measure"? LOL

>I rest my case.

BTW, even in an elementary course, one can show why the
"sigma" is needed. Otherwise, one can have a positive
random variable less than any positive real number.
It is what is needed to show that likelihood ratios
exist, without which statistics is difficult.
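
One way to read the remark about the "sigma" is the following short derivation. It is offered as an interpretation, not as Rubin's own argument. Assume a random variable X that is strictly positive everywhere, yet satisfies P(X < ε) = 1 for every ε > 0.

```latex
% Assumption (for contradiction): X > 0 everywhere, and
% P(X < \varepsilon) = 1 for every \varepsilon > 0.
\[
  \{X > 0\} \;=\; \bigcup_{n=1}^{\infty} \{X \ge 1/n\},
  \qquad
  P(X \ge 1/n) \;=\; 1 - P(X < 1/n) \;=\; 0 \quad \text{for every } n .
\]
% The sets \{X \ge 1/n\} increase with n, so countable additivity
% gives continuity along the union:
\[
  1 \;=\; P(X > 0) \;=\; \lim_{n \to \infty} P(X \ge 1/n) \;=\; 0 ,
\]
% which is a contradiction.
```

Countable additivity is exactly what licenses the limit step. Finite additivity alone does not force that step, so it does not rule out such an X.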

Herman Rubin

Apr 5, 2005, 12:59:07 PM

In article <1112421866.0...@o13g2000cwo.googlegroups.com>,

This is NOT true. If you want to be able to prove the
theorems, yes, but the ideas can be easily presented at
the high school level by someone who understands them.

As for needing calculus, it is not needed at all to do
the proofs. Neither is metric spaces or topology, except
for limits of real numbers. As for detailed proofs, using
a somewhat unusual, but sound, approach, I recommend Loeve;
he is quite verbose, but does not use the cute tricks of
those other books, which impede understanding.

Reef Fish

Apr 5, 2005, 8:22:34 PM

It's about time to give this continuation of the existing thread,
"Advanced Probability for Dummies?" a new subject to reflect the
subject of the ongoing debate between Professor Rubin and myself.

Why did you start with THAT in the first place? "I, Herman Rubin
have my definition of "measure theory" which is at odds with all
known definitions and usage of measure theory, but mine is better."
Did I paraphrase you correctly?

Then I could have saved much bandwidth.

Why don't you look up the item "Measure Theory in Probability and
Statistics" in the Encyclopedia of Statistical Sciences and see
that you are completely off base in your use of "measure theory"?

< I invoked the 6th Commandment (see below) to snip some hot air>

Rubin> Measure theory uses only high school algebra, plus limits.

The Encyclopedia of Statistical Sciences article listed these six areas
as "Basic Ingredients of Measure Theory":

1. Extension Theorem.
<Measure defined on a semiring of sets extended to a measure on the
sigma-field generated by the semiring.>

2. Integration and Limit Theorems
<Monotone convergence theorems, Fatou's Lemma, etc.>

3. Transformation of Integrals

4. Inequality and Lp spaces
<Minkowski and Holder Inequalities in Banach and other spaces>

5. Absolute Continuity and Singularity; Lebesgue Decomposition and
Radon-Nikodym Theorem.

6. Iterative Evaluation of Multiple Integrals, Fubini's Theorem.

Which of the above do you consider "high school algebra" and "uses
only high school algebra"?

You were no less absurd in your criticism and argument against
the poster "Data Matter" when he correctly asserted, though somewhat
of an UNDERstatement of the mathematical background needed,

DM> Measure theory is hard to grasp without a solid background in real
DM> analysis and calculus. You might even need some metric space and
DM> topology as well.

and gave several references, which the original poster "nomad" asked for.

The Encyclopedia article had this pertinent statement about the
mathematical level required to understand "measure theory", a level
far beyond calculus, let alone high school algebra:

"The clearest distinction between what is achievable with calculus only
and what requires measure theory is contained in Feller's two volumes.
The first is restricted to discrete sample spaces, and the second
describes probability theory in general using measure theory."

I agree. I have used those two volumes in a sequence of two-semester
graduate courses in Probability Theory.

High school algebra only? LOL

The Encyclopedia also gave this recommendation, which is relevant to
Data Matter's post, relevant to "nomad"'s request (but quite clearly
well beyond his mathematical capability):

"There are several excellent texts which develop the general theory
of probability based on measure theory such as Loeve, Neveu, Feller's
second volume, Breiman, Chung, Chow and Teicher, and Billingsley."

Which of these textbooks are accessible by advanced undergraduates
in college, let alone HIGH SCHOOL students using "high school algebra
only"?

I am sure I have overkilled those silly statements of yours, Professor
Rubin. But I want to leave no room for doubt that you were talking
nonsense about a subject "measure theory" which is well-known to
most mathematicians, many mathematical statisticians, and some
applied statisticians.

> >> I started doing this more than 60 years ago.
>
> >Indeed your age showed. :-) You should have read more carefully
> >John Tukey's lengthy paper in the Annals of Mathematical Statistics
> >on "The Future of Data Analysis" 44 years ago. As well as the papers
> >by George Box and other eminent "data analysts" since Tukey's paper.
>
> I am quite aware of what Tukey has done; much of it had
> to be redone or discarded.

That's called "science", Herman (I hope you don't mind if I use it
for short instead of Professor Rubin). That's how Newtonian
Physics was discarded in favor of the physics of the 20th and 21st
century.

No one ever accused Tukey of being perfect. In fact there's much
to be said that Tukey has about 100 new ideas a year, and about
98 of them are eventually discarded. But Tukey has done more good
to the ENLIGHTENED practice of statistics (and "data analysis", a
term he introduced to accent "exploratory data analysis", as
opposed to the old-fashioned "confirmatory analysis" of hypothesis
testing).

> He was what can be called an
> anti-mathematical statistician.

No!

He was a STATISTICIAN who was anti those mathematicians who ABUSE
the application of statistics -- namely he was against those in
mathematical statistics trained in mathematics, who later called
themselves mathematical statisticians, but don't know the first
thing about APPLYING statistics to any statistical problems.

He was a mathematician himself, as were ALL of the statisticians of
his era because there was no Department of Statistics anywhere until
the latter part of the 20th century.

That made it all the more credible to have "mathematical statisticians"
criticized by knowledgeable mathematicians, John Tukey himself
included.

Your Department of Statistics at Purdue wasn't formed until 1963:
http://www.stat.purdue.edu/about_us/facts.html

Unfortunately, some old mathematical statisticians have never outgrown
their 18th century mentality about what APPLIED statistics is, and the
uselessness of "measure theory" in present-day statistical practice.

> On the other hand, Neyman,
> who is usually considered an applied statistician, required
> his students to learn the basic abstract mathematics he did
> not know.

That's because he was in a Department of Mathematics full of dinosaurs
of the 19th century whose ideas of "applied statistics" had been
obsolete for decades.

< Herman's 5 Commandments and my comments snipped for brevity. >

>
>
> >Allow me to add a 6th commandment to this discussion:
>
> >6. Thou shalt not blow ill-directed hot air and get down to
> >SPECIFICS.
>
> >For example, either tell us why "measure theory" is needed to do a
> >regression analysis properly, or needed to do ANY applied
> >statistical or data analysis procedure properly.
>
> Measure theory is needed to understand what probability is.

Repeating a falsehood ad infinitum will not make it true. See the
citation from the Encyclopedia of Statistics about Feller's volume 1.

Of course NOW we know it's Herman's own (only one in the world who uses
HIS definition of "measure theory" ... the "uses only high school
algebra" kind) definition to suit his own <absurd> argument.

> A computer can carry out any procedure "properly", but cannot
> decide whether that is the procedure called for by the problem.

Herman, I've got to give you credit for being the Champion of
Strawmen Knock Down, to make irrelevant points that are also
irrelevant to your argument about the necessity of measure theory
in doing statistics well.

> >> And can the applied statistician know
> >> when NOT to consider a regression?
>
> >You'd better believe it! And most of them don't know anything about
> >"measure theory" either, but they know how to do statistical data
> >analysis properly, and with the appropriate models and techniques.
> >That's what some of us teach them, instead of teaching them how to
> >prove useless theorems on useless Functional Analysis concepts,
> >when their goal is to learn how to analyze statistical data
> >properly.

> >> >Let the Buyer Beware.
>
> >> Agreed. Let the buyer beware the applied statistician who
> >> does not understand the considerable limitations, which may
> >> require measure theory.
>
> >See commandment 6 above. Name ONE common problem for
> >the applied statistician that requires an understanding of
> >"measure theory", and you would have at least shown the
> >"existence" of the problem -- which I haven't seen, and I have
> >analyzed THOUSANDS of applied REAL data anslysis problems
> >conducted by students in graduate courses or clients in consulting.
>
> To how many of these have you applied decision theory?

Plenty. Applied decision theory is taught and practiced in most of
the reputable schools of business in this country.

> I do not mean Bayesian calculations with a convenient
> or intuitively simple prior, but a simultaneous consideration
> of all the consequences. This can be done, and robustly.

You are knocking down your own strawmen this time. And your premise
that everything has to be done "robustly", in YOUR way of course, is
shall I charitably say, "all wet"?

> Probability is represented by measure theory, and cannot
> be understood otherwise. One can learn concepts without
> going through the proofs, and vice versa. Too many do the
> latter, and I do not endorse it.

Nobody cares whether you endorse it or not. Your notion of
measure theory and its relevance to statistical practice had
already been completely shot down, more than half a century ago,
by renowned statisticians in statistical practice.

> One of my former colleagues told me, when he was asked to
> review one of these applied books, suggested that as he
> opposed burning books, that instead the author should be
> burned.

And he would probably suggest that 97% of those who give statistical
advice in rec.stat.* newsgroups should be burnt too. I would actually
support the latter rather than his idea about burning books. :-)

> >RF> http://encyclopedia.laborlawtalk.com/Measure_theory
> >RF>
> >RF> "Measure theory is that branch of real analysis which
> >RF> investigates sigma algebras, measures, measurable functions, and integrals."
>
> >Just do a POLL in this or other statistics newsgroups, and see how
> >many have even heard of "sigma algebras" and "Haar measure"? LOL
>
> >I rest my case.

Apparently not. I was baited by Herman to show some SUBSTANTIVE
material from the Encyclopedia of Statistical Sciences to completely
debunk Herman's absurd arguments, based on his own absurd definition
of "measure theory", about the need of "measure theory" (any measure
theory, even Herman's brand) to understand the
correct/enlightened/sound practice of statistical applications.

> This address is for information only. I do not claim that these
> views are those of the Statistics Department or of Purdue University.
> Herman Rubin, Department of Statistics, Purdue University
> hru...@stat.purdue.edu Phone: (765)494-6054 FAX:
> (765)494-0558

I posted this in a newsgroup at 12:00 am 2005:

RF> I yam what I yam; and that's all that I yam, says Reef Fish
RF> the sailorman. Toot-Toot! (The toot-toot bit was from the
RF> horn I used to bring in 2005!)

-- Bob.

Data Matter

Apr 5, 2005, 9:09:22 PM

Reef Fish,

You baited me.

I think Herman can help us as well as generations of students by
pointing to a text that teaches measure theory using only high-school
algebra (and limits).

Let's try something practical: how does one explain a sigma algebra
using the above? That requires understanding what an abstract set is,
and what is a collection of subsets of a set. And then how do you
explain a sigma algebra generated by a set?
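
On a finite set, at least, the construction can be made concrete, because
a sigma algebra there is just a collection of subsets closed under
complement and union. A minimal sketch in Python (the function name and the
four-point example are illustrative assumptions, not taken from any text
mentioned in this thread):

```python
from itertools import combinations


def generated_sigma_algebra(omega, generators):
    """Smallest family of subsets of omega that contains the generators
    and is closed under complement and (finite) union.

    On a finite omega this coincides with the generated sigma algebra.
    """
    omega = frozenset(omega)
    family = {frozenset(), omega} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        # Close under complement.
        for a in current:
            comp = omega - a
            if comp not in family:
                family.add(comp)
                changed = True
        # Close under pairwise union (intersections then follow
        # from De Morgan's laws).
        for a, b in combinations(current, 2):
            union = a | b
            if union not in family:
                family.add(union)
                changed = True
    return family


omega = {1, 2, 3, 4}
sigma = generated_sigma_algebra(omega, [{1}])
for s in sorted(sigma, key=lambda s: (len(s), sorted(s))):
    print(sorted(s))
```

With the single generator {1} the family has four members: the empty set,
{1}, {2, 3, 4}, and the whole set. What a brute-force closure like this
cannot show is the infinite case, where countable unions do real work.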

Reef Fish

Apr 5, 2005, 9:47:12 PM

Data Matter wrote:
> Reef Fish,
>
> You baited me.
>
> I think Herman can help us as well as generations of students by
> pointing to a text that teaches measure theory using only high-school
> algebra (and limits).

You missed the point, I'm afraid.

Herman's measure theory is NOT what you or I or anyone else calls
or understands to be the "measure theory" underlying Probability at
the Loeve level.

Herman could just point to any high school text that discusses the
elementary probability space and call that "measure theory" --
which is more or less what he said in his post which I gave my
detailed follow-up in reply.

>
> Lets try something practical: how does one explain a sigma algebra
> using the above? That requires understanding what an abstract set
> is, and what is a collection of subsets of a set. And then how do you
> explain a sigma algebra generated by a set?

That's the notion of one of the six "basic ingredients" (Extension)
explained in the Encyclopedia of Statistical Sciences entry under
"Measure Theory in Probability and Statistics".

So far, Herman has evaded every issue of possible serious discussion,
in favor of his unilateral definition and position, while using the
tactics of striking down irrelevant strawmen without substantiating
his own position.

Since his position is that "measure theory" is needed to PRACTICE
statistics and data analysis -- a completely untenable position --
I am doubtful that anything new will come from further discussion
by Herman on this subject other than repeating his same old song.

-- Bob.

Herman Rubin

Apr 6, 2005, 5:07:35 PM

In article <1112746954.7...@l41g2000cwc.googlegroups.com>,

>> >Professor Rubin argues otherwise.

>> >RF> >No.

>> >http://encyclopedia.laborlawtalk.com/Measure_theory

Try looking up the corresponding definition of algebra.

It will look like nobody would need algebra for statistics,
either. One can do measure theory at many levels.

>Then I could have saved much bandwidth.

There is the theory of finitely additive measures and
finitely additive integrals. Obviously, sigma algebras
are not involved in that.

As for measures, every representation of probability,
i.e., probability spaces at ANY level, is a measure.
It necessarily uses fields of sets, but not necessarily
sigma fields.
When one considers conditional probability, the
collection of measurable sets and measurable functions
is reduced. Expectation is integral with respect to
a probability measure. One can do these things at
many levels, and it is not necessary to do the most
complete early.
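
To illustrate at the lowest level: on a finite sample space the integral
is a weighted sum, and conditioning on a partition gives a function that
is measurable with respect to the smaller collection of sets (it is
constant on each block). A rough sketch in Python, where the fair die and
the odd/even partition are assumptions made up for the example:

```python
from fractions import Fraction

# Hypothetical finite probability space: one roll of a fair die.
omega = [1, 2, 3, 4, 5, 6]
prob = {w: Fraction(1, 6) for w in omega}


def measure(event):
    """Probability of an event (a subset of omega)."""
    return sum(prob[w] for w in event)


def expectation(x):
    """Integral of x with respect to the probability measure."""
    return sum(x(w) * prob[w] for w in omega)


def conditional_expectation(x, partition):
    """E[x | partition] as a function on omega, constant on each block."""
    value = {}
    for block in partition:
        block_average = sum(x(w) * prob[w] for w in block) / measure(block)
        for w in block:
            value[w] = block_average
    return lambda w: value[w]


def face(w):
    """The random variable: the face shown by the die."""
    return w


parity = [{1, 3, 5}, {2, 4, 6}]
face_given_parity = conditional_expectation(face, parity)

print(expectation(face))               # 7/2
print(face_given_parity(1))            # 3 on the odd block
print(face_given_parity(2))            # 4 on the even block
print(expectation(face_given_parity))  # 7/2 again
```

The last line is the averaging property: integrating the conditional
expectation gives back the original expectation.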

>Why don't you look up the item "Measure Theory in Probability and
>Statistics" in the Encyclopedia of Statistical Sciences and see
>that you are completely off base in your use of "measure theory"?

>< I invoked the 6th Commandment (see below) to snip some hot air>

>Rubin> Measure theory uses only high school algebra, plus limits.

>The Encyclopedia of Statistical Sciences article listed these six areas
>as "Basic Ingredients of Measure Theory":

>1. Extension Theorem.
> <Measure defined on a semiring of sets extended to a measure on the

> sigma-field generated by the simiring.>

If the measure is countably additive; if not, it does not
extend. This allows the extension of Riemann to Lebesgue.

Most mathematicians do not know what a measure on a
semiring of sets is. It can be useful, even at the
high school level, as a semiring can be derived from
a lattice, and the modular law on the lattice is
equivalent to finite additivity on the semiring, or
its extension to a field.

It is still only high school algebra, plus limits.

>2. Integration and Limit Theorems
> <Monotone convergence theorems, Fatou's Lemma, etc.>

Analysis is high school algebra, plus limits. Nothing
else is used here.

>3. Transformation of Integrals

Most of this is calculus, not part of measure theory.
This is not needed early.

>4. Inequality and Lp spaces
> <Minkowski and Holder Inequalities in Banach and other spaces>

>5. Absolute Continuity and Singularity; Lebesgue Decomposition and
> Radon-Nicodym Theorem.

Without this, we do not have likelihood ratios and any
good form of statistical inference. BTW, this cannot
be done with finitely additive, although many have
carried out formal limits which are unjustified.

>6. Iterative Evaluation of Multiple Integrals, Fubini's Theorem.

>Which of the above do you consider "high school algebra" and "uses
>only high school algebra"?

The basic material in 2 can be done at that level.
This is what is needed to discuss probability and
expectation, as well as conditional probability.
There is no essential difference between expectation
for discrete and non-discrete distributions; using
the typical computational definitions makes it very
difficult to see later.
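
In symbols, a sketch of the point: expectation is one integral, and the
usual computational formulas are special cases of it whenever the integral
exists. Here (\Omega, P) is the probability space, F is the distribution
function of X, and p and f (a mass function and a density) are notation
assumed for the illustration:

```latex
E[X] \;=\; \int_\Omega X \, dP \;=\; \int_{-\infty}^{\infty} x \, dF(x)
\;=\;
\begin{cases}
  \displaystyle\sum_i x_i \, p(x_i)
    & \text{if } X \text{ is discrete with } P(X = x_i) = p(x_i), \\[1.5ex]
  \displaystyle\int_{-\infty}^{\infty} x \, f(x) \, dx
    & \text{if } X \text{ has density } f .
\end{cases}
```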

>You were no less absurd in your criticism and argument against
>the poster "Data Matter" when he correctly asserted, though somewhat
>of an UNDERstatement of the mathematical background needed,

>DM> Measure theory is hard to grasp without a solid background in real
>DM> analysis and calculus. You might even need some metric space and
>DM> topology as well.

>and gave serveral references which the original poster "nomad" asked
>for.

>The Enclopedia article had these pertinent statement about the
>mathematical level required to understand "measure theory" as that
>which is far beyond the "calculus", let alone high school algebra:

>"The clearest distinction between what is achievable with calculus only
>and what requires measure theory is contained in Feller's two volumes.
>The first is restricted to discrete sample spaces, and the second
>describes probability theory in general using measure theory."

Feller's second volume is more of an analysis book than
a probability book. I would never use such an analysis
approach which thoroughly hides the probability.

>I agree. I have used those two volumes in a sequence of two-semester
>graduate courses in Probability Theory.

>High school algebra only? LOL

>The Encyclopedia also gave this recommendation, which is relevant to
>Data Matter's post, relevant to "nomad"'s request (but quite clearly
>well beyond his mathematical capability):

>"There are several excellent texts which develop the general theory
>of probability based on measure theory such as Loeve, Neveu, Feller's
>second volume, Breiman, Chung, Chow and Teicher, and Billingsley.

>Which of these textbooks are accessible by advanced undergraduates
>in college, let alone HIGH SCHOOL students using "high school algebra
>only"?

Teach high school students the basics of real analysis,
which does NOT require a calculus course, and they will
be able to handle Loeve without too much difficulty.
Feller would be impossible. Most books use too much
complex analysis, and too many cute tricks, to be easily
understood. One should use a conceptual approach, not
using irrelevant ideas which HIDE the concepts. There
are very few theorems in probability theory not
concerning the characteristic function which need that in
their proofs, and those proofs should rarely be used.

I doubt that there are any high school students who can
understand analysis after the usual calculus courses
who can not understand it without them. I also suspect
that the reverse is false. Teaching how to calculate
without understanding is the real problem.

On the other hand, these characteristic function and
moment generating function methods should be used by
the applied statistician to get numerical answers.
I have posted some of these, of which a small number
are original with me, on the sci.stat.math newsgroup.

>I am sure I have over=killed those silly statement of yours, Professor
>Rubin. But I want to leave no room of doubt that you were talking
>nonsense about a subject "measure theory" which is well-known to
>most mathematicians, many mathematical statisticians, and some
>applied statisticians.

>> >> I started doing this more than 60 years ago.

>> >Indeed your age showed. :-) You should have read more carefully
>> >John Tukey's lengthy paper in the Annals of Mathematical Statistics
>> >on "The Future of Data Analysis" 44 years ago. As well as the
>papers
>> >by George Box and other eminent "data analysts" since Tukey's paper.

>> I am quite aware of what Tukey has done; much of it had
>> to be redone or discarded.

>That's called "science", Herman (I hope you don't mind I'll use it
>for shortl instead of Probessor Rubin). That's how Newtonian
>Physis was discarded in favor of the physics of the 20th and 21st
>century.

No; it was wrong in the first place.

>No one ever accused Tukey as being perfect. In fact there's much
>to be said that Tukey has about 100 new ideas a year, and about
>98 of them are eventually discarded. But Tukey has done more good
>to the ENLIGHTENED practice of statistics (and "data analysis", a
>term he introduced to accent "exploratory data analysis", as
>opposed to the old-fashioned "confirmatory analysis" of hypothesis
>testing).

I have seen too much of this misdone. Tukey's ideas of
"contamination" are too restricted. What was worse was
that he tried to keep mathematics out.

>> He was what can be called an
>> anti-mathematical statistician.

>No!

>He was a STATISTICIAN who is anti those mathematicians who ABUSE
>the application of statistics -- namely he was against those in
>mathematical statistics trained in mathematics, who later called
>themselves mathematical statisticians, but don't know the first
>thing about APPLYING statistics to any statistical problems.

This is false; I doubt that he would have considered the
ideas of Sewall Wright, who knew little mathematics, to
be reasonable. Sewall Wright is, AFAIK, the first to use
structural equations with many dependent variables.

>He was a mathematician himself, as were ALL of the statisticians of
>his era because there was no Department of Statistics anywhere until
>the latter part of the 20th century.

There were quite a few such departments; most of them had
no mathematicians whatever. Iowa State was a notable one,
which produced PhD's who would have difficulty with Feller
volume 1, as well as some who could use complex analysis.
Snedecor was the most well known one from there, and they
certainly had a systematic Analysis of Variance program.
I do not happen to know the others offhand, but there were
many at the time.

However, there were mathematical statistics programs in
mathematics departments, such as Princeton under Wilks.
Hotelling managed to set one up in the economics department
at Columbia, the mathematicians being unwilling to have
anything to do with it. Neyman set up the program in
the mathematics department at Berkeley, where
it remained for many years, until it became a separate
department. The program at Michigan State was in mathematics
until it broke away. At Illinois, it was in the mathematics
department until recently.

But most of these programs, started after WWII, were started
under the influence of such as Neyman and Hotelling, and they
were often in mathematics departments where mathematics was
willing AND ABLE to support them. I do not think that moving
them out as separate departments was a good idea.

>That made it all the more credible to have "mathematical statisticians"
>criticized by knowledgeable mathematicians, John Tukey himself
>included.

Many mathematicians have criticized mathematical statisticians
as weak mathematically; that was the case in some places, and
is getting worse now. Tukey was the only one I know who
claimed that applied statistics (or applied mathematics) should
be separated from theory. Neyman was extremely strong in having
students, even if they would be applied, taking strong abstract
mathematics, and Hotelling would not let them get away with much.

>Your Department of Statistics in Purdue wasn't formed until 1963:
>http://www.stat.purdue.edu/about_us/facts.html

The Department did not exist as a unit before then; it was in the
Department of Mathematics. At that time, a Division of Mathematical
Sciences was formed, with Statistics and Computer Sciences
being departments in that division. It did not become a
separate administrative unit until 1969.

>Unfortunately, some old mathematical statisticians have never outgrown
>their 18th century mentality about what APPLIED statistics is, and the
>uselessness of "measure theory" in present-day statistical practice.

I do not know of any mathematical statistician who had an
18th century mentality about applied statistics. Karl Pearson
and a few others overlapped into the 19th century, and the
bulk of the applied statistics of the 19th century was the
use of least squares for reduction of errors in physics and
engineering and astronomy and surveying. The use of statistics
in agriculture, psychology, etc., is late 19th century.

The mathematical statisticians produced were heavily exposed
to a Fisherian view of statistics. Fisher did put some sound
mathematics in, and Neyman added more; these were definitely
applied. I started out by applying my mathematical knowledge
to inference problems from systems of simultaneous stochastic
equations arising in econometrics, and have always been looking
for ways to apply theory. I have heard the chairman at Stanford
state that Charles Stein was his best applied statistician,
although someone was needed to translate.

Now some of the probabilists may have taken a negative view to
applying statistics, such as Feller. Also, there are always
those who insist on doing pure theory, but their contributions
are still used.

>> On the other hand, Neyman,
>> who is usually considered an applied statistician, required
>> his students to learn the basic abstract mathematics he did
>> not know.

>That's because he was in a Department of Mathematics full of dianosaurs
>of the 19th century whose ideas of "applied statistics" had been
>obsolete for decades.

Wrong. He required MORE than the pure mathematicians did.

Few of the other members of the department even paid much
attention to either theoretical or applied statistics.

>> >> >Let the Buyer Beware.

But in few statistics departments. The reason is that too
many are thinking in terms of statistical methods, not
concepts.

>> I do not mean Bayesian calculations with a convenient
>> or intuitively simple prior, but a simultaneous consideration
>> of all the consequences. This can be done, and robustly.

>You are knocking down your own strawmen this time. And your premise
>that everything has to be done "robustly", in YOUR way of course, is
>shall I charitably say, "all wet"?

My "definition" of the robustness of a procedure is the extent
to which its properties do not depend on the assumptions one
does not wish to make. The argument for the so-called
"robust" regressions, which I believe were started by Tukey
and his followers, was that normality was too strong an
assumption. But Gauss proved this in the Gauss-Markoff
theorem in 1823, and he did not worry about proving things
were normal after that.
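
For reference, the usual modern statement of that theorem assumes only the
first and second moments of the errors and says nothing about normality.
The notation (y, X, \beta, \varepsilon, \sigma^2) is the standard textbook
one, assumed here rather than taken from this thread:

```latex
\begin{align*}
  & y = X\beta + \varepsilon, \qquad
    E[\varepsilon] = 0, \qquad
    \operatorname{Var}(\varepsilon) = \sigma^2 I, \qquad
    X \text{ of full column rank}, \\
  & \hat\beta = (X^\top X)^{-1} X^\top y, \\
  & \operatorname{Var}(\tilde\beta) - \operatorname{Var}(\hat\beta) \succeq 0
    \quad \text{for every linear unbiased estimator } \tilde\beta = C y
    \text{ of } \beta .
\end{align*}
```

That is, least squares is best among linear unbiased estimators under these
moment conditions alone.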

>> Probability is represented by measure theory, and cannot
>> be understood otherwise. One can learn concepts without
>> going through the proofs, and vice versa. Too many do the
>> latter, and I do not endorse it.

>Nobody cares whether you endorse it or not. Your notion of
>measure theory and its relevance to statistical pracctice had
>already been completely shot down, more than half a century ago,
>by renouned statisticians in statistical practice.

>> One of my former colleagues told me, when he was asked to
>> review one of these applied books, suggested that as he
>> opposed burning books, that instead the author should be
>> burned.

>And he would probably suggest that 97% of those who give statistical

>advise in rec.stat.* newsgroups should be burnt too. I would actually
>support the latter than his idea about burning books. :-)

.......................
--

Herman Rubin

Apr 6, 2005, 5:17:43 PM

In article <1112749762.1...@l41g2000cwc.googlegroups.com>,

>You baited me.

As I have pointed out, sigma algebras are not always needed.

Take a look at Mosteller, Rourke, and Thomas. This IS at
the high school algebra level. It discusses sample spaces
from the beginning, including allowing different sample
spaces for the same probability problems. It does not
use sigma algebras, but does use algebras of sets, so the
idea of a collection of subsets of a set is there. It also
uses partitions, which are collections smaller than the
collection of all subsets.

It does not do enough on continuous distributions. One
can show the need for countable additivity without starting
with sigma algebras. It is easy enough to prove that, if
a finite measure is countably additive on a field, it can
be "Lebesgue extended" by countable approximations, and it
is trivial that this extension is a sigma field. This will
not get Borel sets or Borel measurability, but this is not
needed until MUCH later.

Reef Fish

Apr 6, 2005, 6:51:09 PM

Herman Rubin wrote:
> In article <1112746954.7...@l41g2000cwc.googlegroups.com>,
> Reef Fish <Large_Nass...@Yahoo.com> wrote:
> >It's about time to give this continuation of the existing thread,
> >"Advanced Probability for Dummies?" a new subject to reflect the
> >subject of the ongoing debate between Professor Rubin and myself.

Our debate was about the meaning of "measure theory" and its necessity
or lack thereof, in sound applications of statistical methods, as in
Applied Statistics.

I've given my view

>
> >> >> >I think he meant "measure theory" too, but I strongly
> >> >> >disagree with you on its "need" or "usefulness" unless he
> >> >> >studies statistics as a branch of mathematics. For an
> >> >> >applied statistician, measure theory is COMPLETELY USELESS.
> >> >> >I can make that statement because I've studied Measure
> >> >> >Theory from Halmos, Kolmogorov-Fomin, and other textbooks
> >> >> >and I have found them to "completely useless" in applied
> >> >> >statistics or good applications of statistical methods.
>

Herman argued otherwise.

>
> >> >HR> >> It is really a prerequisite for any
> >> >HR> >> good probability course, which is needed for a good course
> >> >HR> >> in statistics.

Then it turned out that Herman had a completely unconventional
notion (AFAIK he is the only one in the world with his view
about what "measure theory" is). Herman argued that it only takes
high school algebra level mathematics to understand measure theory,
even at the LOEVE level, he just added. :-)

Contrary to:

> >> >http://encyclopedia.laborlawtalk.com/Measure_theory
>
> >> >"Measure theory is that branch of real analysis which
> >> >investigates sigma algebras, measures, measurable functions, and integrals."

and what I cited from the Encyclopedia of Statistical Sciences.

>
> >Why don't you look up the item "Measure Theory in Probability and
> >Statistics" in the Encyclopedia of Statistical Sciences and see
> >that you are completely off base in your use of "measure theory"?

> >Rubin> Measure theory uses only high school algebra, plus limits.

At this point of the continuing discussion/argument/debate, I think
it's amply clear to anyone who was patient enough to follow through the
details, that we have both more than adequately expressed our
diametrically opposite points of view.

I thank Herman for his very elaborate, detailed, and informative
comments to my comments in the preceding post, while graciously
overlooking most of my sarcastic remarks. :)

IMNSHO, any further continuation of this "debate" will be counter-
productive as well as boring to everyone.

So, I am going to bring our debate to a close (on MY part) with a
light anecdotal tale (which is 100% true without exaggeration), about
Herman, which reflects this statement he made in this latest post:

************************************************************
HR> Teach high school students the basics of real analysis,
HR> which does NOT require a calculus course, and they will
HR> be able to handle Loeve without too much difficulty
************************************************************

I wish there were a category in the Guinness World Records for
over-statement or under-statement, so that any of us could
submit the above as a candidate entry.

Here's my related anecdote.

In 1962-3, Herman was teaching a 3-quarter (Ph.D. level) course on
Probability Theory, at Michigan State University, using Loeve as
the textbook (possibly a course or equivalent in Measure Theory
using Halmos' book as pre-requisite).

I happened to know several of the students in that course,
possibly the entire small population of those enrolled in that
class. <G>

They had independently and collectively told the story that Herman
"covered" the entire book of Loeve in the first week of the
three-quarter course, and then started to lecture on "more
interesting" material, such as spending several lectures on the
blackboard working on his own research problems, WITHOUT having
done it before or knowing where he was heading. :-) Then Herman
finally gave up after several such lectures without getting any
results and went on to OTHER "interesting problems". The words
in quotes, such as "covered", were verbatim and unexaggerated.

Mind you -- this was NOT my first hand experience about Herman,
but first-hand experience TALKING to Herman's students in that
advanced 3-quarter course, most of whom finished their Ph.D.s
in statistics, some of whom from ELSEWHERE than Michigan State,
in departments that were more "applied" than mathematical :-)

Now it all made sense. Of course the Ph.D. students in Herman's
advanced probability course knew high school algebra well. So
why bother lecturing from Loeve? I am not exactly certain if
his students said he "covered" Loeve in three lectures or three
weeks. But you get the idea -- for those who are familiar with
the level and difficulty of Loeve's probability textbook. :-)

To this day, I can even NAME some of those students from whom I
got the anecdote above. But I don't think it would serve any
useful purpose, and I am sure Herman has his own version of the
anecdotal account.

But I'll rest on that note, with respect to Herman's notion of
(a) high school level algebra and (b) Loeve's book, and (c)
how they relate to Herman's view about "measure theory".

-- Bob.

R. Martin

Apr 6, 2005, 10:55:39 PM

snip

Much of this reminds me, in a general way, of the interactions between
Tukey and a number of people at Iowa State when he gave a short course
on EDA there in the late 1970s, IIRC. I'm sure if I'd known as much
then as I do now (which is about 0.1% of what I'd like to know about
statistics), I'd have enjoyed the "discussions" even more.

Cheers,
Russell
--
All too often the study of data requires care.

Reef Fish

Apr 7, 2005, 1:35:46 AM

I don't know Oscar Kempthorne nearly as well as I knew John Tukey,
and his students and colleagues who share his general views about
"data analysis". But I can easily imagine the lively debate those
two might have, about statistics. :-)

> I'm sure if I'd known as much
> then as I do now (which is about 0.1% of what I'd like to know about
> statistics), I'd have enjoyed the "discussions" even more.
>
> Cheers,
> Russell
> --

> All too often the study of data requires care.

Don't recall seeing this sig before. Very nice!

-- Bob.

R. Martin

Apr 7, 2005, 7:11:44 AM

Reef Fish wrote:
>
> R. Martin wrote:
> > Reef Fish wrote:

snip

> > >
> > > IMNSHO, any further continuation of this "debate" will be counter-
> > > productive as well as boring to everyone.
> > >
> >
> > snip
> >
> > Much of this reminds me, in a general way, of the interactions
> between
> > Tukey and a number of people at Iowa State when he gave a short
> course
> > on EDA there in the late 1970s, IIRC.
>
> I don't know Oscar Kempthorne nearly as well as I knew John Tukey,
> and his students and colleagues who share his gneeral views about
> "data analysis". But I can easily imagine the lively debate those
> two might have, about statistics. :-)
>
> > I'm sure if I'd known as much
> > then as I do now (which is about 0.1% of what I'd like to know about
> > statistics), I'd have enjoyed the "discussions" even more.
> >
> > Cheers,
> > Russell
> > --
>
> > All too often the study of data requires care.
>
> Don't recall seeing this sig before. Very nice!
>
> -- Bob.

Thanks. I adapted it from Blackman and Tukey's book on spectral
analysis: "All too often the study of spectra requires care".

Herman Rubin

Apr 7, 2005, 2:46:08 PM

In article <1112827869....@g14g2000cwa.googlegroups.com>,
Reef Fish <Large_Nass...@Yahoo.com> wrote:

>Herman argued otherwise.

>Contrary to:

>> >> >http://encyclopedia.laborlawtalk.com/Measure_theory

Hardly. I "covered" the measure theory part of that book in
the first week, as a graduate course in measure theory from
the math department was a prerequisite. Many had just taken
such a course the previous year.

I did come up with some new results in that class, and this
is not the only time I have done research while teaching. One
of my former students from the University of Oregon told me
that he wished he had not rewritten his notes, as the methods
which did not work out easily then would almost certainly work
elsewhere.

BTW, I believe it was in that course that I tried to do some
of the sessions according to the R. L. Moore method. I am
afraid I do not have the patience to wait through an entire
class period while the students were trying to come up with
a way to do a problem they had not seen. Some mathematicians
swear by that teaching method.

>Mind you -- this was NOT my first hand experience about Herman,
>but first-hand experience TALKING to Herman's students in that
>advanced 3-quarter course, most of whom finished their Ph.D.s
>in statistics, some of whom from ELSEWHERE than Michigan State,
>in departments that were more "applied" than mathematical :-)

That is NOT that advanced a course. It is only that late
because the undergraduate and earlier material does not even
present the concepts, but is mainly methods and formulas.
But it did go into topology of measures, and this I did
quite efficiently by using general topological spaces, which
was also a prerequisite then.

>Now it all made sense. Of course the Ph.D. students in Herman's
>advanced probability course knew high school algebra well. So
>why bother lecturing from Loeve? I am not exactly certain if
>his students said he "covered" Loeve in three lectures or three
>weeks. But you get the idea -- for those who are familiar with
>level and difficulty of Loeve's probability textbook. :-)

>To this day, I can even NAME some of those students from whom I
>got the anecdote above. But I don't think it would serve any
>useful purpose, and I am sure Herman has his own version of the
>anecdotal account.

>But I'll rest on that note, with respect to Herman's notion of
>(a) high school level algebra and (b) Loeve's book, and (c)
>how they relate to Herman's view about "measure theory".

If you want to use an anecdote, here is one from one of my
colleagues. For those who do not know the level, STAT 528
at Purdue is a beginning mathematical statistics course at
the level of the second edition of Bickel and Doksum, which
is one of the few books I can recommend highly, and assumes
essentially Hoel-Port-Stone probability.

< Gentlemen, I had asked in my 528 class if the prior
<with density exp(mu^4) is a proper prior for
<a normal mean mu on (-infinity, infinity).

< 3 stat Ph.D. students answered : yes.
< ----------

<Their proof was : the calculator gave a value of 12.3
<for the integral.

< What can a stat Professor assume his students know
<in a Ph.D. level class ?

Herman Rubin

Apr 7, 2005, 3:00:38 PM

In article <425515...@wdn.com>, R. Martin <russell...@wdn.com> wrote:
>Reef Fish wrote:

>> R. Martin wrote:
>> > Reef Fish wrote:

snip

>> > All too often the study of data requires care.

>> Don't recall seeing this sig before. Very nice!

>> -- Bob.

>Thanks. I adapted it from Blackman and Tukey's book on spectral
>analysis: "All too often the study of spectra requires care".

When Tukey came up with his method for analyzing spectra, it
was clear that there was something radically wrong with it.

A student is now working with me on coming up with a robust
method under certain assumptions, and not too non-robust
even without them, by using a decision-theoretic analysis.
The idea of robust prior (restricted) Bayes had not occurred
to me then, or in fact until much later, or I would have done
it 50 years ago.

One does have to use methods from mathematical statistics, and
some of these are based on measure theory.

Reef Fish

Apr 7, 2005, 3:45:26 PM

I stand corrected! But the students did say "Loeve", perhaps as
a hyperbole of how quickly you "covered" the substance of the
course content.

OTOH, even if the students did have the measure theory course
from the math department as a prerequisite to the course, a week
would seem inadequate to REVIEW the essential contents of "measure
theory" (our main topic of discussion here) at the Halmos level,
especially in view of how little is usually covered and how much
is forgotten by the students the day after the final exam. :-)

>
> I did come up with some new results in that class, and this
> is not the only time I have done research in teaching. One
> of my former students from the University of Oregon told me
> that he wished he had not rewritten his notes, as the methods
> which did not work out easily then would almost certainly work
> elsewhere.

A good and valid point of view.

But I think the point of view of some students in the same course
was that you were using their time to do your own research, when
you sometimes came unprepared to lecture the course material, but
went off on a tangent into the uncharted territory of your own research.

>
> BTW, I believe it was in that course that I tried to do some
> of the sessions according to the R. L. Moore method. I am
> afraid I do not have the patience to wait through an entire
> class period while the students were trying to come up with
> a way to do a problem they had not seen. Some mathematicians
> swear by that teaching method.

And some students swear AT that method. :) That reminded me of
the movie "A Beautiful Mind" which dramatized the life and style
of the Princeton mathematician John Nash:

http://cepa.newschool.edu/het/profiles/nash.htm

Apart from the fact that Nash disliked textbooks and proved
major mathematical results of others on his own, rather than reading
the textbook presentation and proof of the same, he was shown in the
movie entering an advanced graduate course completely unprepared,
throwing the textbook of the course into the trash can, and
starting to talk about things that interested him.

Of course we see now that Nash's excuse was that he was already
half crazy at that time, or more scholarly put, suffering from
schizophrenia.

But that was a different story, and Nash was one of a few
mathematicians who won a Nobel Prize in some area other than math
because Nobel made sure that there will never be a Nobel Prize in
Mathematics because <legend has it> Nobel found out that some
mathematician was messing with Nobel's wife.

> >Now it all made sense. Of course the Ph.D. students in Herman's
> >advanced probability course knew high school algebra well. So
> >why bother lecturing from Loeve? I am not exactly certain if
> >his students said he "covered" Loeve in three lectures or three
> >weeks. But you get the idea -- for those who are familiar with
> >level and difficulty of Loeve's probability textbook. :-)
>
>

> If you want to use an anecdote, here is one from one of my
> colleagues. For those who do not know the level, STAT 528
> at Purdue is a beginning mathematical statistics course at
> the level of the second edition of Bickel and Doksum, which
> is one of the few books I can recommend highly, and assumes
> essentially Hoel-Port-Stone probability.
>
> < Gentlemen, I had asked in my 528 class if the prior
> <with density exp(mu^4) is a proper prior for
> <a normal mean mu on (-infinity, infinity).
>
> < 3 stat Ph.D. students answered : yes.
> < ----------
>
> <Their proof was : the calculator gave a value of 12.3
> <for the integral.

Ah yes. There are many anecdotes of this genre.
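
A minimal numerical sketch of the 528 anecdote quoted above, assuming the density is exactly exp(mu^4) as written (positive exponent). Since exp(mu^4) >= 1 everywhere, the integral over the whole line cannot be finite, so no calculator value such as 12.3 can be its total mass; the truncated integrals below simply keep growing as the range widens. The helper names `simpson` and `truncated_mass` are made up for this illustration.

```python
import math

def simpson(f, a, b, n=20_000):
    """Composite Simpson's rule on [a, b] using n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def truncated_mass(half_width):
    """Integral of exp(mu^4) over [-half_width, half_width]."""
    return simpson(lambda mu: math.exp(mu ** 4), -half_width, half_width)

# The mass on [-a, a] grows without bound as a increases,
# so the "prior" cannot be normalized to a probability density.
for a in (1.0, 1.5, 2.0, 3.0):
    print(f"integral over [-{a}, {a}] = {truncated_mass(a):.6g}")
```

Any finite number a calculator reports here reflects the finite range it integrated over, not the integral on (-infinity, infinity).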

My own anecdote about Bayesian assessment and the medical profession
was an NSF funded project in assessing the "Efficacy of Radiology
in Diagnosing Fractures". Medical docs at emergency hospitals
were asked about their personal PROBABILITY assessment of skull
fractures of the ER patients, BEFORE and AFTER seeing the x-rays.

Training sessions were conducted, both for the purpose of designing
the questions to ask the MDs, and get an idea of what feedback
the questions would get from the training samples.

When the MDs were asked to assess their SUBJECTIVE PROBABILITY of
fracture of the patient, more than just a few answered "1.96". :-)

>
> < What can a stat Professor assume his students know
> <in a Ph.D. level class ?

Very little, in my own experience.

That's all the more reason it should take more than a week to review
the necessary ingredients of "measure theory" in a 3-quarter course
in Loeve-level Probability Theory. LOL <-- that's my last laugh
at this subject.

-- Bob.

R. Martin

Apr 7, 2005, 5:32:08 PM

Herman Rubin wrote:
>
> In article <425515...@wdn.com>, R. Martin <russell...@wdn.com> wrote:
> >Reef Fish wrote:
>
> >> R. Martin wrote:
> >> > Reef Fish wrote:
>
> snip
>
> >> > All too often the study of data requires care.
>
> >> Don't recall seeing this sig before. Very nice!
>
> >> -- Bob.
>
> >Thanks. I adapted it from Blackman and Tukey's book on spectral
> >analysis: "All too often the study of spectra requires care".
>
> When Tukey came up with his method for analyzing spectra, it
> was clear that there was something radically wrong with it.

I'm curious as to what was radically wrong?

>
> A student is now working with me on coming up with a robust
> method under certain assumptions, and not too non-robust
> even without them, by using a decision-theoretic analysis.
> The idea of robust prior (restricted) Bayes had not occurred
> to me then, or in fact until much later, or I would have done
> it 50 years ago.

If you have discovered a way to analyse data without taking care
(as my advisor and fellow grad students who put the sentence on
the blackboard use the word "care") then you will put all of us
out of work, because any idiot will be able to throw data into
a program and be certain of getting clear and meaningful results
out. :-)

>
> One does have to use methods from mathematical statistics, and
> some of these are based on measure theory.

Good hunting.

Cheers,
Russell
--

Reef Fish

Apr 8, 2005, 2:02:21 AM

Time to switch subtopic from "Measure theory" to the proper use of
prior distributions in Bayesian statistical inference.

Herman Rubin wrote:
> In article <425515...@wdn.com>, R. Martin <russell...@wdn.com> wrote:
> >Reef Fish wrote:
>

> snip
>
RM > All too often the study of data requires care.
RF > Don't recall seeing this sig before. Very nice!

> >Thanks. I adapted it from Blackman and Tukey's book on spectral
> >analysis: "All too often the study of spectra requires care".

I should now add, in the light of this new subtopic,

All too often the study of Bayesian inference requires care on the
PRIOR distribution specification.

>
> When Tukey came up with his method for analyzing spectra, it
> was clear that there was something radically wrong with it.

This can be still another subtopic Herman may debate with others.
That subject is outside of my realm of interest or expertise.

> A student is now working with me on coming up with a robust
> method under certain assumptions, and not too non-robust
> even without them, by using a decision-theoretic analysis.

This almost sounds as if decision-theoretic analysis is the
panacea to all statistical estimation problems. That of course
is not true but I'll pass on any debate about it also.

> The idea of robust prior (restricted) Bayes had not occurred
> to me then, or in fact until much later, or I would have done
> it 50 years ago.

NOW you've gone too far off the limb again, Herman! :)

A PRIOR distribution, to a TRUE Bayesian, must reflect his own
PERSONAL opinion about the parameter of interest (which is the
random variable of the prior distribution).

It is a very difficult problem to assess one's own prior (personal)
belief realistically and accurately, even for very simple one-
parameter problems, such as the unknown p of a Bernoulli Process.

L.J.Savage and other Bayesians have written on the difficulties
as well as methods of eliciting one's own prior distribution, by
cross-examining oneself and checking for coherency and consistency.

It takes CARE, and much effort to do this well -- which ironically
is the biggest obstacle to being a TRUE Bayesian, in practice.

Bob Schlaifer was keenly aware of that challenging problem, and
devoted a considerable amount of his time (at the Harvard Business
School) to develop a computer program to help the assessment of
one's prior distribution (by numerical approximation of irregular
pdf shapes -- because NOBODY's true prior follows that of any
known probability distribution. :-) and then use numerical
integration to obtain the posterior distribution for Bayesian
inference and decision.

I reviewed the book by Schlaifer, "Computer Programs for Elementary
Decision Analysis," for JASA eons ago, and gave it two thumbs up
because that was back in the days before Bill Gates and Steve Jobs
dropped out of Harvard and Stanford, and Jim Goodnight's SAS had
about 3 employees, including himself, so any COMPUTER PROGRAMS of
that sort for REALISTIC application of Bayesian methods was path-
breaking and exciting.

Alas, of all the statisticians who either admit to be or would
call themselves "Bayesian", 99% of them were already thoroughly
CORRUPT before they got to first base by:

1. Resorting to "conjugate priors" for mathematical convenience,
not reality or any semblance to one's "prior belief".

2. When there is no conjugate prior for the pseudo-Bayesians to
cheat themselves, they would argue for Jeffreys' type of
"uninformative priors" -- which is the complete opposite of
what a true Bayesian would do. Again, mathematics killed
most Bayesians in Bayesian inference.

3. There is a class of problems where it is actually sensible
to use "diffuse priors" (sharp likelihood function) by a
Bayesian for posterior inference, but those problems are
few and far between.

4. Robust priors and robust Bayesian inference. There ain't no
such a thing as a robust Bayesian. There is such a thing as
a non-Bayesian mathematician and non-parametric non-Bayesian
turned infinite-parametric pseudo-Bayesian. They are no more
true-Bayesian in spirit than bin Laden is a true Christian in
spirit. I would classify Richard Savage (Jimmie's brother)
in this category (from non-parametric non-Bayesian to non-
parametric "bayesian"); and now Herman claims he was one of
those "robust bayesians" 50 years ago.

5. Then there are matrix-slingers who call themselves Bayesians.
Arnold Zellner is probably the best known in this group.
Arnold doesn't have the SLIGHTEST idea what his prior is in
any given applied problem or how he would elicit his own prior
in a multivariate context. Granted, it is almost an impossible
task to elicit one's multivariate prior, but no respectable
true Bayesian would match ONE summary statistic with a highly
multivariate and complex mathematical distribution, call
it his prior, and continue with the Bayesian version of GIGO --
Garbage In, Garbage Out.

So there you have it. A tabloid review of the world of Bayesians
and pseudo-Bayesians.

Oh, I almost forgot Jack Good, sometimes said to be the only "Good
Bayesian". :-) He is probably more Bayesian in spirit than most
of the other self-proclaimed Bayesians.

If you want to be a TRUE Bayesian and incorporate your PERSONAL
beliefs into a statistical problem, you MUST realistically assess
your PERSONAL prior. Robustness or non-robustness is not even a
valid issue for consideration! A robust-Bayesian is at best an
oxymoron -- IMHO.

Where shall the debate begin? :-)

-- Bob.

P.S. This post was typed without a single interruption by my former
Keyboard Devil. Therefore, any typos, misspellings, grammatical errors
are all mine.

Herman Rubin

Apr 8, 2005, 4:39:37 PM

In article <4255A7...@wdn.com>, R. Martin <russell...@wdn.com> wrote:
>Herman Rubin wrote:

>> In article <425515...@wdn.com>, R. Martin <russell...@wdn.com> wrote:
>> >Reef Fish wrote:

>> >> R. Martin wrote:
>> >> > Reef Fish wrote:

snip

>> >> > All too often the study of data requires care.

>> >> Don't recall seeing this sig before. Very nice!

>> >> -- Bob.

>> >Thanks. I adapted it from Blackman and Tukey's book on spectral
>> >analysis: "All too often the study of spectra requires care".

>> When Tukey came up with his method for analyzing spectra, it
>> was clear that there was something radically wrong with it.

>I'm curious as to what was radically wrong?

One thing which was radically wrong was the assumption
that an engineering smoothing method was "the way to go",
not considering alternatives. Another was the failure
to come up with any statistical justification for the
entire approach; it was eminently clear that the Fourier
coefficients did describe the situation, and that these
could be used for an analysis, but how? He gave no good
reason for choosing among them.

There has been lots since in the literature, and only
recently has there been much except asymptotic rates
for certain types of kernels. There has been some on
doing better for restricted families of spectral densities.

>> A student is now working with me on coming up with a robust
>> method under certain assumptions, and not too non-robust
>> even without them, by using a decision-theoretic analysis.
>> The idea of robust prior (restricted) Bayes had not occurred
>> to me then, or in fact until much later, or I would have done
>> it 50 years ago.

>If you have discovered a way to analyse data without taking care
>(as my advisor and fellow grad students who put the sentence on
>the blackboard use the word "care") then you will put all of us
>out of work, because any idiot will be able to throw data into
>a program and be certain of getting clear and meaningful results
>out. :-)

There are still assumptions to be made, and spectral analyses
of time series are still not the only thing to consider.
The same method will not work for density estimates, for
example, but I believe that one can get robust prior Bayes
estimates; they will not look like what is done now.

>> One does have to use methods from mathematical statistics, and
>> some of these are based on measure theory.

>Good hunting.

Look at my paper with Sethuraman in Sankhya 1965 on the
decision theoretic asymptotics of testing a point null
against a composite alternative in low dimensions. One
could follow this paper without measure theory, but not
get the probability results needed. It gives a robust
method (more robust than indicated in the paper) of
choosing the "significance level" as a function of
sample size with not too many prior assumptions. The
rate of convergence to "optimality" is not great, but
it is easy and greatly reduces the prior assumptions
which need to be made.

>Cheers,
>Russell
>--
>All too often the study of data requires care.

Herman Rubin

Apr 8, 2005, 5:24:56 PM

In article <1112940141.8...@g14g2000cwa.googlegroups.com>,

Reef Fish <Large_Nass...@Yahoo.com> wrote:
>Time to switch subtopic from "Measure theory" to the proper use of
>prior distributions in Bayesian statistical inference.

>Herman Rubin wrote:
>> In article <425515...@wdn.com>, R. Martin <russell...@wdn.com> wrote:
>> >Reef Fish wrote:

snip

>RM > All too often the study of data requires care.
>RF > Don't recall seeing this sig before. Very nice!

>> >Thanks. I adapted it from Blackman and Tukey's book on spectral
>> >analysis: "All too often the study of spectra requires care".

>I should now add, in the light of this new subtopic,

>All too often the study of Bayesian inference requires care on the
>PRIOR distribution specification.

..................

>This almost sounds as if decision-theoretic analysis is the
>panacea to all statistical estimation problems. That of course
>is not true but I'll pass on any debate about it also.

It certainly is, done intelligently. The essence of
decision theory is the one sentence

It is necessary to simultaneously consider
all consequences of the proposed action in
all states of nature.

Except for estimation, little of classical statistics
comes close to satisfying this.

>> The idea of robust prior (restricted) Bayes had not occurred
>> to me then, or in fact until much later, or I would have done
>> it 50 years ago.

>NOW you've gone too far off the limb again, Herman! :)

>A PRIOR distribution, to a TRUE Bayesian, must reflect his own
>PERSONAL opinion about the parameter of interest (which is the
>random variable of the prior distribution).

You have gotten this from the rash Bayesians; if one takes
a behavioristic Bayesian point of view, any self-consistent
procedure must consider a linear function of the expected
risk given the state of nature. Linear functions are integrals,
whence Bayes. But it is only the loss-prior combination which
is effective. It is the loss-prior combination which is needed,
and is personal. One can change either, if the corresponding
change is made in the other. One CAN look at the measure as a
multiple of a probability measure, but need not. The condition
for one to conclude that a Bayes procedure may be good per se
is that the integrated risk is finite; only then is the use of
Bayes Theorem justified. This does not require that the prior
is a probability measure.
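
A minimal sketch of the loss-prior point, under a deliberately simplified setup: a finite set of states, no data, and made-up numbers throughout (the names `loss`, `weights`, `scale` and `integrated_loss` are hypothetical, not from any paper cited here). Multiplying the loss in each state by a positive factor c(theta) and dividing the prior weight of that state by the same factor leaves every integrated loss unchanged, so only the combination is identified. The rescaled weights no longer sum to 1, which matches the remark that the prior need not be a probability measure.

```python
import math

# Hypothetical 3-state, 3-action problem; every number is illustrative only.
loss = {
    "a1": [0.0, 4.0, 9.0],
    "a2": [2.0, 1.0, 5.0],
    "a3": [6.0, 3.0, 0.5],
}
weights = [0.5, 0.3, 0.2]    # prior measure over the three states
scale = [2.0, 0.25, 10.0]    # arbitrary positive factor c(theta) per state

def integrated_loss(loss_row, measure):
    """Sum over states of loss(state, action) * measure(state)."""
    return sum(l * w for l, w in zip(loss_row, measure))

# Rescale the loss by c(theta) and the prior by 1 / c(theta).
rescaled_loss = {a: [c * l for c, l in zip(scale, row)] for a, row in loss.items()}
rescaled_weights = [w / c for w, c in zip(weights, scale)]

original = {a: integrated_loss(row, weights) for a, row in loss.items()}
rescaled = {a: integrated_loss(row, rescaled_weights)
            for a, row in rescaled_loss.items()}

for a in loss:
    print(a, original[a], rescaled[a])
print("best action:", min(original, key=original.get),
      min(rescaled, key=rescaled.get))
```

With data, the same cancellation happens inside the integral of the risk function, since the factor c(theta) does not depend on the observation.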

I first showed this in 1947 with an additional axiom allowing
the separation of loss and prior; Chernoff remembered it in
his interview. There is an abstract in 1949. The last
version, with much weaker axioms, appeared in the journal
_Statistics and Decisions_, 1987.

This is the case whatever the problem. Looking at the problem
from a basic approach, the entire state of the universe is
involved. Everything else is an approximation. And looking
at it that way, only approximate loss-prior combinations
can be made.

>It is a very difficult problem to assess one's own prior (personal)
>belief realistically and accurately, even for very simple one-
>parameter problems, such as the unknown p of a Bernoulli Process.

The point of the behavioristic approach is that one can
find loss-prior combinations in many cases which do well
with a fair amount of uncertainty, and this reduces what
has to be done. It also deals with the non-symmetric
problem of what is a good approximation when.

>L.J.Savage and other Bayesians have written on the difficulties
>as well as methods of eliciting one's own prior distribution, by
>cross-examining oneself and checking for coherency and consistency.

Look at the above. What Savage and others have done is
to take the conclusion of Fubini's Theorem and make it the
be-all and end-all. What errors are important and what are
not is totally obscured in that approach. It is the linear
combinations of losses which matter.

>It takes CARE, and much effort to do this well -- which ironically
>is the biggest obstacle to being a TRUE Bayesian, in practice.

Again, look at the risk, rather than the posterior. Much
of the time, this will enable the choice of procedure to
depend on few items from the user; for testing a point null
against a composite alternative for reasonable sized samples,
only two items are of major importance. The prior probability
of the null is NOT one of them. If we are testing that the
mean of a normal distribution is 0 against the alternative
that it is not, and we have an estimate with variance 1, the
prior probability that the mean exceeds 10 in magnitude is
almost completely irrelevant.

>Bob Schlaifer was keenly aware of that challenging problem, and
>devoted a considerable amount of his time (at the Harvard Business
>School) to develop a computer program to help the assessment of
>one's prior distribution (by numerical approximation of irregular
>pdf shapes -- because NOBODY's true prior follows that of any
>known probability distribution. :-) and then use numerical
>integration to obtain the posterior distribution for Bayesian
>inference and decision.

Again, looking at the risk function finds out what is needed,
and it might not be any of this. The decision involves the
prior-loss combination, not the prior alone, and one must
keep it in mind. Nobody has found a way to separate these,
and Seidenfeld and others have shown that this makes group
actions even worse, with common priors not enough.

>I reviewed the book by Schlaifer, "Computer Programs for Elementary
>Decision Analysis," for JASA eons ago, and gave it two thumbs up
>because that was back in the days before Bill Gates and Steve Jobs
>dropped out of Harvard and Stanford, and Jim Goodnight's SAS had
>about 3 employees, including himself, so any COMPUTER PROGRAMS of
>that sort for REALISTIC application of Bayesian methods was path-
>breaking and exciting.

>Alas, of all the statisticians who either admit to be or would
>call themselves "Bayesian", 99% of them were already thoroughly
>CORRUPT before they got to first base by:

>1. Resorting to "conjugate priors" for mathematical convenience,
> not reality or any semblance to one's "prior belief".

Agreed.

>2. When there is no conjugate prior for the pseudo-Bayesians to
> cheat themselves, they would argue for Jeffreys' type of
> "uninformative priors" -- which is the complete opposite of
> what a true Bayesian would do. Again, mathematics killed
> most Bayesians in Bayesian inference.

Agreed.

>3. There is a class of problems where it is actually sensible
> to use "diffuse priors" (sharp likelihood function) by a
> Bayesian for posterior inference, but those problems are
> few and far between.

Not as scarce as you think. Bickel and Yahav's results on
sequential parametric estimation show this, as do the
results of Gideon Schwarz and Herman Chernoff on sequential
testing of separated alternatives.

>4. Robust priors and robust Bayesian inference. There ain't no
> such a thing as a robust Bayesian. There is such a thing as
> a non-Bayesian mathematician and non-parametric non-Bayesian

Non-parametric problems are very few; most called that are
infinite-parametric, and calling them non-parametric is a
misnomer.

> turned infinite-parametric pseudo-Bayesian. They are no more
> true-Bayesian in spirit than bin Laden is a true Christian in
> spirit. I would classify Richard Savage (Jimmie's brother)
> in this category (from non-parametric non-Bayesian to non-
> parametric "bayesian"); and now Herman claims he was one of
> those "robust bayesians" 50 years ago.

You should read more of what I have written; you would know this.

>5. Then there are matrix-slingers who call themselves Bayesians.
> Arnold Zellner is probably the best known in this group.
> Arnold doesn't have the SLIGHTEST idea what his prior is in
> any given applied problem or how he would elicit his own prior
> in a multivariate context. Granted, it is almost an impossible
> task to elicit one's multivariate prior, but no respectable
> true Bayesian would match ONE summary statistic with a highly
> multivariate and complex mathematical distribution, call
> it his prior, and continue with the Bayesian version of GIGO --
> Garbage In, Garbage Out.

Agreed.

>So there you have it. A tabloid review of the world of Bayesians
>and pseudo-Bayesians.

>Oh, I almost forgot Jack Good, sometimes said to be the only "Good
>Bayesian". :-) He is probably more Bayesian in spirit than most
>of the other self-proclaimed Bayesians.

Jack Good and I have similar views.

>If you want to be a TRUE Bayesian and incorporate your PERSONAL
>beliefs into a statistical problem, you MUST realistically assess
>your PERSONAL prior. Robustness or non-robustness is not even a
>valid issue for consideration! A robust-Bayesian is at best an
>oxymoron -- IMHO.

Wrong. Consider the risk, not the posterior. It is often
possible to obtain robust procedures which are not precisely
Bayes, and this is what has been done for some problems.
Some of us have even managed to do this for infinite dimensional
problems.

>Where shall the debate begin? :-)

Consider it engaged. Robust prior Bayesian can exist and
even be quite flexible. I do not believe that anyone will
ever get a Bayes posterior for a density with a reasonable
prior, but I do believe that a robust prior Bayes estimate,
which will not even be formal Bayes, can be done.

Reef Fish

Apr 9, 2005, 2:26:40 AM

Herman Rubin wrote:
> In article <1112940141.8...@g14g2000cwa.googlegroups.com>,
> Reef Fish <Large_Nass...@Yahoo.com> wrote:
> >Time to switch subtopic from "Measure theory" to the proper use of
> >prior distributions in Bayesian statistical inference.

LIFO:

> >Where shall the debate begin? :-)
>
> Consider it engaged.

There is not that much to debate between us. You agreed so much with
my characterization of the MIS-USE of priors, and where we disagree,
there is little room for compromise and we just have to agree to
disagree.

Let's summarize where we agree.

> >Alas, of all the statisticians who either admit to be or would
> >call themselves "Bayesian", 99% of them were already thoroughly
> >CORRUPT before they got to first base by:
>
> >1. Resorting to "conjugate priors" for mathematical convenience,
> > not reality or any semblance to one's "prior belief".
>
> Agreed.
>
> >2. When there is no conjugate prior for the pseudo-Bayesians to
> > cheat themselves, they would argue for Jeffreys' type of
> > "uninformative priors" -- which is the complete opposite of
> > what a true Bayesian would do. Again, mathematics killed
> > most Bayesians in Bayesian inference.
>
> Agreed.

> >5. Then there are matrix-slingers who call themselves Bayesians.
> > Arnold Zellner is probably the best known in this group.
> > Arnold doesn't have the SLIGHTEST idea what his prior is in
> > any given applied problem or how he would elicit his own prior
> > in a multivariate context. Granted, it is almost an impossible
> > task to elicit one's multivariate prior, but no respectable
> > true Bayesian would match ONE summary statistic with a highly
> > multivariate and complex mathematical distribution, call
> > it his prior, and continue with the Bayesian version of GIGO --
> > Garbage In, Garbage Out.
>
> Agreed.

Where we disagreed:

> >A PRIOR distribution, to a TRUE Bayesian, must reflect his own
> >PERSONAL opinion about the parameter of interest (which is the
> >random variable of the prior distribution).
>
> You have gotten this from the rash Bayesians

That "rash Bayesian" was Jimmie Savage :-)
http://www.umass.edu/wsp/statistics/tales/savage.html

*> "In his development of personal probability, Savage moved more
*> and more to a proselytizing position. Personal probability was
*> not only useful and interesting to study; it became for him
*> the only sensible approach to probability and statistics ..."

Note the repeated use of "personal" for the "subjective" part of
Savage's Bayesian approach.

*> "In 1964 he moved again, to a named professorship at Yale, ..."

when Yale started its Department of (Applied) Statistics, with an
innovative program in which all students spend half a year in FULL
TIME statistical consulting in the 2nd year.

I was very privileged to have had a close association with Jimmie,
during his years at Yale until his untimely death in 1971 at the
age of 52.

He was a most remarkable man, in many different ways. I highly
recommend reading

http://froogle.google.com/froogle?q=Leonard+Jimmie+Savage&hl=en&lr=&c2coff=1&safe=off&sa=N&tab=ff&oi=froogler

the Memorial Volume of his writings, and about his personal life,
in tributes written by Allen Wallis, Fred Mosteller, and Dennis
Lindley, including a humorous autobiographical sketch of Savage,
given on the occasion of his Honorary Dr. Sc. degree in Rochester.

> >L.J.Savage and other Bayesians have written on the difficulties
> >as well as methods of eliciting one's own prior distribution, by
> >cross-examining oneself and checking for coherency and consistency.
>
> Look at the above. What Savage and others have done is
> to take the conclusion of Fubini's Theorem and make it the
> be-all and end-all. What errors are important and what are
> not is totally obscured in that approach. It is the linear
> combinations of losses which matter.

You are speaking and thinking as a mathematical statistician.
That might have been what he did in the Foundations of Statistics
book, but I am speaking of Savage as an APPLIED Bayesian, as
articulated at a very elementary level in the paper by Edwards,
Lindman, and Savage.

> >3. There is a class of problems where it is actually sensible
> > to use "diffuse priors" (sharp likelihood function) by a
> > Bayesian for posterior inference, but those problems are
> > few and far between.
>
> Not as scarce as you think. Bickel and Yahav's results on
> sequential parametric estimation show this, as do the
> results of Gideon Schwarz and Herman Chernoff on sequential
> testing of separated alternatives.

Here, we'll split 50/50 on our disagreement, about the frequency
of suitable usage of diffuse priors, which reduces the
posterior distribution to a normalized likelihood function.
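
To make that last remark concrete, here is a minimal grid sketch in Python. It is an illustration only, not something taken from the thread: the binomial experiment (7 successes, 3 failures), the grid size, and the "opinionated" prior shape are all assumptions. With a flat prior the posterior is exactly the normalized likelihood; with an opinionated prior it is pulled toward the prior's center.

```python
def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def grid_posterior(prior, likelihood):
    """Posterior on a grid: prior times likelihood, renormalized."""
    return normalize([p * l for p, l in zip(prior, likelihood)])


def grid_mean(grid, probs):
    """Mean of the grid values under the given probabilities."""
    return sum(g * p for g, p in zip(grid, probs))


# Hypothetical data: 7 successes and 3 failures in 10 binomial trials.
successes, failures = 7, 3
grid = [(i + 0.5) / 200 for i in range(200)]  # midpoints in (0, 1)
likelihood = [th ** successes * (1 - th) ** failures for th in grid]

flat_prior = [1.0] * len(grid)                           # "diffuse"
opinionated_prior = [(th * (1 - th)) ** 20 for th in grid]  # peaked at 0.5

post_flat = grid_posterior(flat_prior, likelihood)
post_opinionated = grid_posterior(opinionated_prior, likelihood)
normalized_likelihood = normalize(likelihood)

mean_flat = grid_mean(grid, post_flat)
mean_opinionated = grid_mean(grid, post_opinionated)
```

Comparing `post_flat` with `normalized_likelihood` shows they agree point by point, whereas `mean_opinionated` sits between the prior's center (0.5) and `mean_flat`. How often a flat prior is a realistic statement of opinion is exactly what the two posters disagree about.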

>
> >4. Robust priors and robust Bayesian inference. There ain't no
> > such a thing as a robust Bayesian. There is such a thing as
> > a non-Bayesian mathematician and non-parametric non-Bayesian
>
> Non-parametric problems are very few; most called that are
> infinite-parametric, and calling them non-parametric is a
> misnomer.

That's a technical nitpick. Notice I did distinguish between
a non-parametric non-Bayesian and an infinite-parametric
pseudo-Bayesian. :) Bayesian inference is all about parameters
-- which is why NON becomes INFINITE.

>
> > turn infinite-parametric pseudo-Bayesian. They are no more
> > true-Bayesian in spirit than Bin Laden is a true Christian in
> > spirit. I would classify Richard Savage (Jimmie's brother)
> > in this category (from non-parametric non-Bayesian to non-
> > parametric "bayesian"); and now Herman claims he was one of
> > those "robust bayesians" 50 years ago.
>
> You should read more of what I have written; you would know this.

Actually I don't NEED to read what you have written to know that
you were addressing your problem as a decision theoretic mathematical
problem, rather than an APPLIED problem in Bayesian statistics.
Besides, you've articulated your approach very well here. I just
don't buy it, that's all.

> >So there you have it. A tabloid review of the world of Bayesians
> >and pseudo-Bayesians.
>
> >Oh, I almost forgot Jack Good, sometimes said to be the only "Good
> >Bayesian". :-) He is probably more Bayesian in spirit than most
> >of the other self-proclaimed Bayesians.
>
> Jack Good and I have similar views.

Perhaps, perhaps not. It was a serious omission on my part not to
have mentioned David Blackwell -- a decision-theoretic mathematical
statistician who is a highly regarded THEORETICAL Bayesian. You may
share many of his views too.

> >If you want to be a TRUE Bayesian and incorporate your PERSONAL
> >beliefs into a statistical problem, you MUST realistically assess
> >your PERSONAL prior, Robustness or non-robustness is not even a
> >valid issue for consideration! A robust-Bayesian is at best an
> >oxymoron -- IMHO.
>
> Wrong. Consider the risk, not the posterior.

I stand pad on my statement. I have the option to look at the
posterior risk if I choose to for certain problems. But as
a personal Bayesian, robustness of my prior is not a valid issue
for consideration.

> It is often
> possible to obtain robust procedures which are not precisely
> Bayes, and this is what has been done for some problems.

Then it's neither Bayes nor Bayesian.

I am satisfied that we agree on some issues of Bayesian abuse,
in the callous use of priors, but have to agree to disagree on
how APPLIED Bayesian statistics should be practiced.

-- Bob.

Richard Ulrich

Apr 10, 2005, 4:35:27 PM

On 8 Apr 2005 23:26:40 -0700, "Reef Fish"
<Large_Nass...@Yahoo.com> wrote:

> Herman Rubin wrote:
[snip]

Well, I want to thank you guys for the seminar.
- in case you are wondering what someone else
thought about this dialog on Bayesian theories
and theorists.

--
Rich Ulrich, wpi...@pitt.edu
http://www.pitt.edu/~wpilib/index.html

Herman Rubin

Apr 10, 2005, 10:04:18 PM

In article <1113028000.1...@g14g2000cwa.googlegroups.com>,

>LIFO:

>> Consider it engaged.

>> Agreed.

>Where we disagreed:

He was converted by de Finetti. I do not know when, but
this was after I showed that rational behavior required
that the utility not knowing the state of nature has to
be a positive linear functional on the utilities given
the states of nature, and I believe he was aware of my
results, both of us being at Chicago at the time. If
you look at my work on this, I always stated that one
has to act AS IF there was this personal evaluation of
both utilities and prior weights. I was able to stop
there, but it seems most could not.

However, I came to it from a decision approach, where
it was clear that one starts with the full state of nature
and approximates down (Newton assumed every particle in
the universe attracted every other, and when electricity
was similarly formulated, it was every particle attracting
or repelling every other), so it was clear that it was
necessary to approximate.

Empirical Bayes COULD be beaten by Bayes methods, but
not that much. In many cases, one can approximate
Bayes methods by those which are not quite Bayes, but
which have good properties, and which are like using
least squares without assuming normality.

>*> In 1964 he moved again, to a named professorship at Yale, ..."

>when Yale started its Department of (Applied) Statistics, with an
>innovative program that all students spend half a year in FULL
>TIME statistical consulting in the 2nd year.

You might be interested to know that I proposed this (not
full time) for students who had passed their comprehensive
examinations at Purdue. Unfortunately, most now do it
before that, and this gives them a distorted view.

>I was very privileged to have had a close association with Jimmie,
>during his years at Yale until his untimely death in 1971 at the
>age of 53.

>He was a most remarkable man, in many different ways. I highly
>recommend reading

>http://froogle.google.com/froogle?q=Leonard+Jimmie+Savage&hl=en&lr=&c2coff=1&safe=off&sa=N&tab=ff&oi=froogler

>the Memorial Volume of his writings, and about his personal life,
>in tributes written by Allen Wallis, Fred Mosteller, and Dennis
>Lindley, including humorous autobiographical sketch of Savage,
>given on the occasion of his Honorary Dr. Sc. degree in Rochester.

>> >L.J.Savage and other Bayesians have written on the difficulties
>> >as well as methods of eliciting one's own prior distribution, by
>> >cross-examining oneself and checking for coherency and consistency.

However, when it comes to problems where posteriors for
reasonable priors cannot be computed, it may still be
possible to get non-Bayesian procedures which can be shown
to be not bad from the Bayesian standpoint. Prior Bayes
risks can often be at least well estimated, and they
show where the specifications of the prior matter.

One can also consider the effect of the prior if one
only knows what happens with certain types of procedures.
Prior Bayes risk is still useful, and can often be shown
to be quite good.

>> Look at the above. What Savage and others have done is
>> to take the conclusion of Fubini's Theorem and make it the
>> be-all and end-all. What errors are important and what are
>> not is totally obscured in that approach. It is the linear
>> combinations of losses which matter.

>You are speaking and thinking as a mathematical statistician.
>That might have been what he did in the Foundations of Statistics
>book, but I am speaking of Savage as an APPLIED Bayesian, as
>articulated in the very elementary paper by Edwards, Lindman,
>and Savage.

And what I am saying is that there are practical uses of
prior Bayes in situations which are common. In testing a
null hypothesis, relatively little of the prior is
important, and one can ask the user about that, instead
of about the full prior, where the important parts are likely
to be almost ignored. Asymptotically, two numbers suffice
if the point null can be assumed, or if the width of the
acceptance region is small compared to the precision of
the usual estimator, and the asymptotic regime is reached
for samples that are not too large. The prior probability
of the null is not one of them, as the far-out prior is
irrelevant.

>> >3. There is a class of problems where it is actually sensible
>> > to use "diffuse priors" (sharp likelihood function) by a
>> > Bayesian for posterior inference, but those problems are
>> > few and far between.

>> Not as scarce as you think. Bickel and Yahav's results on
>> sequential parametric estimation show this, as do the
>> results of Gideon Schwarz and Herman Chernoff on sequential
>> testing of separated alternatives.

>Here, we'll split 50/50 on our disagreement, about the frequency
>of suitable usage of diffuse priors, which reduces the
>posterior distribution to a normalized likelihood function.

>> >4. Robust priors and robust Bayesian inference. There ain't no
>> > such a thing as a robust Bayesian. There is such a thing as
>> > a non-Bayesian mathematician and non-parametric non-Bayesian

>> Non-parametric problems are very few; most called that are
>> infinite-parametric, and calling them non-parametric is a
>> misnomer.

>That's a technical nitpick. Notice I did distinguish between
>a non-parametric non-Bayesian and an infinite-parametric
>pseudo-Bayesian. :) Bayesian inference is all about parameters
>-- which is why NON becomes INFINITE.

So is classical, and the NON was always INFINITE. There
is little which is non-parametric; univariate distribution-
free testing, and little else.

>> > turn infinite-parametric pseudo-Bayesian. They are no more
>> > true-Bayesian in spirit than Bin Laden is a true Christian in
>> > spirit. I would classify Richard Savage (Jimmie's brother)
>> > in this category (from non-parametric non-Bayesian to non-
>> > parametric "bayesian"); and now Herman claims he was one of
>> > those "robust bayesians" 50 years ago.

>> You should read more of what I have written; you would know this.

>Actually I don't NEED to read what you have written to know that
>you were addressing your problem as a decision theoretic mathematical
>problem, rather than an APPLIED problem in Bayesian statistics.
>Besides, you've articulated your approach very well here. I just
>don't buy it, that's all.

I believe in applying theory, and I have done so in quite
a few cases, often having to invent procedures. I am
still doing this.

>> >So there you have it. A tabloid review of the world of Bayesians
>> >and pseudo-Bayesians.

>> >Oh, I almost forgot Jack Good, sometimes said to be the only "Good
>> >Bayesian". :-) He is probably more Bayesian in spirit than most
>> >of the other self-proclaimed Bayesians.

>> Jack Good and I have similar views.

>Perhaps, perhaps not. It was a serious omission on my part not to
>have mentioned David Blackwell -- a decision-theoretic mathematical
>statistician who is a highly regarded THEORETICAL Bayesian. You may
>share many of his views too.

We have even worked together.

>> >If you want to be a TRUE Bayesian and incorporate your PERSONAL
>> >beliefs into a statistical problem, you MUST realistically assess
>> >your PERSONAL prior, Robustness or non-robustness is not even a
>> >valid issue for consideration! A robust-Bayesian is at best an
>> >oxymoron -- IMHO.

>> Wrong. Consider the risk, not the posterior.

>I stand pad on my statement. I have the option to look at the
>posterior risk if I choose to for certain problems. But as
>a personal Bayesian, robustness of my prior is not a valid issue
>for consideration.

No, the risk involves both the loss and the prior,
in fact just the product, and is personal. Also,
in an applied problem, it is the user's evaluation
of the consequences which counts, not the statistician's,
and so we have to find out what to ask the user to
narrow down the problem. Just as we do not need to
ask the user to precisely specify the distribution
of the disturbances in a linear model if they are
not really weird, so we can inform the user that his
action will not depend on certain properties of the
loss-prior combination, but will depend on others.
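
The claim that only the product of loss and prior matters can be written out explicitly. The notation below is assumed for illustration and does not appear in the thread: loss L(θ, a), prior density π(θ), sampling density f(x | θ), and decision rule δ(x).

```latex
% Assumed notation: L = loss, \pi = prior density, f = sampling density,
% \delta = decision rule. The second equality interchanges the order of
% integration (Fubini/Tonelli, valid for non-negative loss).
r(\pi, \delta)
  = \int_{\Theta} \left[ \int_{\mathcal{X}} L(\theta, \delta(x))\, f(x \mid \theta)\, dx \right] \pi(\theta)\, d\theta
  = \int_{\mathcal{X}} \left[ \int_{\Theta} L(\theta, \delta(x))\, \pi(\theta)\, f(x \mid \theta)\, d\theta \right] dx .
```

Loss and prior enter only through the product L(θ, a) π(θ). Multiplying the loss by any positive function c(θ) and dividing the prior by the same c(θ) leaves the prior Bayes risk, and hence the ranking of procedures, unchanged. Minimizing the inner integral of the second form separately for each x gives the usual posterior-based Bayes rule, which appears to be the "conclusion of Fubini's Theorem" referred to earlier in the thread.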

>> It is often
>> possible to obtain robust procedures which are not precisely
>> Bayes, and this is what has been done for some problems.

>Then it's neither Bayes nor Bayesian.

One could call it "engineering Bayesian". This is
what is used all the time in physics and other
fields; one cannot do precise calculations taking
into account all the factors, so one decides how
to approximate.

>I am satisfied that we agree on some issues of Bayesian abuse,
>in the callous use of priors, but have to agree to disagree on
>how APPLIED Bayesian statistics should be practiced.

>-- Bob.

Reef Fish

Apr 11, 2005, 3:59:34 PM

As I said in my last post, there isn't much left for us to debate on
because whatever differences we have in opinion about Bayesian priors
are unlikely to be resolved via debate, especially in this forum.

So, for this post, I'll merely act as a "color commentator", as it is
called in live TV broadcasts of games. The "color commentator"
merely adds tidbits of trivia to make the game more interesting for
the audience to listen to.

There's no question de Finetti had much influence on Jimmie's Bayesian
practice. In Savage's Bayesian lectures, he mentioned de Finetti more
than any other Bayesian. He and de Finetti published a long (about
70-page) paper in 1962, written in Italian, with a three-page English
summary, titled

sul modo di scegliere le probabilita iniziali

which has to do with choosing prior ("initial") probabilities:

"Il presente articolo deriva da numerosi scambi di idee tra i due autori
durante il soggiorno di L. J. Savage a Roma per l'anno sabatico
(1958-59)."

< The present article derives from numerous exchanges of ideas between
the two authors during the stay of L. J. Savage in Rome for the
sabbatical year (1958-59). >

It was published in 1962. In the same year, Savage published three
other papers, authored by himself, in Italian, presumably on his work
during that sabbatical.

Savage spoke and wrote fluently in Italian, French, and German, and
published papers written in those languages. He also developed his
own unmistakable writing style in English, after, in Fred Mosteller's
words, "Milton (Friedman) explained to us that we knew little about
writing, that there were books from which we could and should learn,
and he recommended several to us." :)

Up to the late 1950s, both you and Savage were heavily involved in the
MATHEMATICS of statistics, and made no visible contribution to the
APPLICATION.

I have a pragmatic discriminant function for Theoretical vs Applied
Statisticians -- the former dwell mostly in the IMS journals and
IMS activities at the Annual JSM (Joint Statistical Meetings), while
the latter in ASA journals and activities.

That accounts for the fact that I've seen you at the Annual Meetings
for decades, but never heard or seen you in any of the major talks
sponsored by ASA, or in any talk for that matter. That's because I
almost never attend the IMS sessions because my interests are APPLIED.

Just about ALL the statisticians I know, or those whose names are
well-known, are Fellows of the ASA, with the exception of Herman Rubin
and Jimmie Savage! :-) Ronald Fisher is not a Fellow either, but for
different reasons. My explanation is that neither of you published
anything that could be considered as "applied" through the early part
(and later part too) of your professional careers to be so recognized
by the ASA.

Of the well-known statisticians *I* know personally, all are Fellows
of the ASA, though some of them I would rate no more than 5 on a scale
of 1 to 100 as an "Applied Statistician", but "at least 75" as a
Mathematical Statistician or Probabilist. These include Chernoff (1961),
Feller (1948), Persi Diaconis (1994), and Richard Savage (1961).
<year elected ASA Fellow in parentheses>

Speaking of rating other statisticians, it is certainly a subject that
involves a PERSONAL prior, is non-robust, and involves no mathematical "risk"
considerations. :-) <Just bringing some of the terms back On-Topic>

I am reading a book, "The Man Who Loved Only Numbers", about the SECOND
most prolific writer in the history of mathematics, the eccentric
mathematician, statistician, and probabilist Paul Erdos, whom I've
met at his lectures and at cocktail parties. A tale emerged in the book
about Hardy's (of Hardy and Littlewood fame) PERSONAL (subjective)
rating of mathematicians.

Hardy liked to rank mathematicians on a scale of 1 to 100, said Erdos,
according to the book's author, Paul Hoffman. He <Hardy> gave
himself 25, Littlewood 30, David Hilbert 80, and Ramanujan 100.

Since much of our discussion and debate (on measure theory and Bayesian
priors) had to do with "mathematical statistics" and "applied statistics",
I am curious, but I wouldn't be so insensitive as to ask the readership of
these newsgroups to rate Herman Rubin, on a scale of 1 to 100, as an
APPLIED statistician. :-)

I rated Chernoff 5 (no question about him being a brilliant mathematical
statistician) as an APPLIED statistician only because of the "Chernoff
faces" (actually cartoon faces) he talked about in JASA as a graphical
method to represent multivariate data. It was FUN and innovative, but
I haven't seen a "Chernoff face" (or its relatives) in any application
in recent years.

< A gigantic leap to a couple of practical points >

> >You are speaking and thinking as a mathematical statistician.
> >That might have been what he did in the Foundations of Statistics
> >book, but I am speaking of Savage as an APPLIED Bayesian, as
> >articulated in the very elementary paper by Edwards, Lindman,
> >and Savage.
>
> And what I am saying is that there are practical uses of
> prior Bayes in situations which are common. In testing a
> null hypothesis, relatively little of the prior is
> important,

Nor is there any NEED for a Bayesian to test any hypothesis, in the
manner sharp hypotheses are tested in the non-Bayesian context.

> and one can ask the user about that, instead
> about the full prior, where the important parts are likely
> to be almost ignored.

But as a true Bayesian, I AM the user! As a PERSONAL Bayesian, I am
not going to gloss over my full prior. And when that realization is
impossible (usually in the multivariate priors), then I simply abandon
the Bayesian methodology, fall back on being a non-Bayesian, and try
to minimize all the inherent defects of the "classical" approach,
and grin and bear it. That happens the vast majority of the time. :-)

So, my confession is that I am a TRUE Bayesian whenever it's possible.
But most of the time, I just do the best I can under the frequentist
strait-jacket, after I escaped from the Bayesian strait-jacket.

> >> >So there you have it. A tabloid review of the world of Bayesians
> >> >and pseudo-Bayesians.

> >> >If you want to be a TRUE Bayesian and incorporate your PERSONAL
> >> >beliefs into a statistical problem, you MUST realistically assess
> >> >your PERSONAL prior, Robustness or non-robustness is not even a
> >> >valid issue for consideration! A robust-Bayesian is at best an
> >> >oxymoron -- IMHO.
>
> >> Wrong. Consider the risk, not the posterior.
>

> >I stand pad <sic> on my statement. I have the option to look at the
> >posterior risk if I choose to for certain problems. But as
> >a personal Bayesian, robustness of my prior is not a valid issue
> >for consideration.
>
> No, the risk involves both the loss and the prior,
> in fact just the product, and is personal. Also,
> in an applied problem, it is the user's evaluation
> of the consequences which counts, not the statistician's,

Again, I am both the user AND the statistician. If I can't do it
realistically for myself, what hope is there to do it PROPERLY for
the client USER?

The trouble with the risk-function mathematical construct is that
to be PERSONAL about it, you need to know what "widgets" your LOSS
is measured in, what is your utility function of those widgets, and
from my point of view, that's just a lot of smoke screen to divert
one's attention from the REAL problem in the Real World context of
making statistical sense, and making inference by incorporating one's
prior information that is relevant AND non-negligible and clearly
non-robust.

> >> It is often
> >> possible to obtain robust procedures which are not precisely
> >> Bayes, and this is what has been done for some problems.
>
> >Then it's neither Bayes nor Bayesian.
>
> One could call it "engineering Bayesian". This is
> what is used all the time in physics and other
> fields; one cannot do precise calculations taking
> into account all the factors, so one decides how
> to approximate.

I have no problem with this point of view about science, nor am
I arguing against approximations. What I am arguing against is
GROSS approximations that alter a real, sensible problem into
one that bears no resemblance to how the problem should be
approached, as a Bayesian!

We are just haggling about the admissible degree of approximation.

-- Bob.

Herman Rubin

Apr 12, 2005, 3:32:22 PM

In article <1113249574.5...@o13g2000cwo.googlegroups.com>,
Reef Fish <Large_Nass...@Yahoo.com> wrote:

>Herman Rubin wrote:
>> In article <1113028000.1...@g14g2000cwa.googlegroups.com>,
>> Reef Fish <Large_Nass...@Yahoo.com> wrote:
>> >Herman Rubin wrote:
>> >> In article <1112940141.8...@g14g2000cwa.googlegroups.com>,
>> >> Reef Fish <Large_Nass...@Yahoo.com> wrote:
>> >> >Time to switch subtopic from "Measure theory" to the proper use of
>> >> >prior distributions in Bayesian statistical inference.

....................

>> >> >A PRIOR distribution, to a TRUE Bayesian, must reflect his own
>> >> >PERSONAL opinion about the parameter of interest (which is the
>> >> >random variable of the prior distribution).

>> >> You have gotten this from the rash Bayesians

>> >That "rash Bayesian" was Jimmie Savage :-)
>> >http://www.umass.edu/wsp/statistics/tales/savage.html

>> >*> "In his development of personal probability, Savage moved more
>> >*> and more to a proselytizing position. Personal probability was
>> >*> not only useful and interesting to study; it became for him
>> >*> the only sensible approach to probability and statistics ..."

However, in applying statistics, it is the user's assessment
of the situation which is relevant. The client benefits or
loses, and the statistician is a consultant.

You should check my papers more carefully. I am
responsible for quite a few applications, from methods of
computing estimates to simulations to aspects of applied
prior Bayes to density estimation.

>I have a pragmatic discriminant function for Theoretical vs Applied
>Statisticians -- the former dwell mostly in the IMS journals and
>IMS activities at the Annual JSM (Joint Statistical Meetings), while
>the latter in ASA journals and activities.

>That accounts for the fact that I've seen you at the Annual Meetings
>for decades, but never heard or seen you in any of the major talks
>sponsored by ASA, or in any talk for that matter. That's because I
>almost never attend the IMS sessions because my interests are APPLIED.

To apply something correctly, one should understand the
underlying theory. There is much applied material there.

>Just about ALL the statisticians I know, or those whose names are
>well-known, are Fellows of the ASA, with the exception of Herman Rubin
>and Jimmie Savage! :-) Ronald Fisher is not a Fellow either, but for
>different reasons. My explanation is that neither of you published
>anything that could be considered as "applied" through the early part
>(and later part too) of your professional careers to be so recognized
>by the ASA.

>Of the well-known statisticians *I* know personally, all are Fellows
>of the ASA, though some of them I would rate no more than 5 on a scale
>of 1 to 100 as an "Applied Statistician", but "at least 75" as a
>Mathematical Statistician or Probabilist. These include Chernoff (1961),
>Feller (1948), Persi Diaconis (1994), and Richard Savage (1961).
><year elected ASA Fellow in parentheses>

To be a Fellow of the ASA, one must be a member. I have been
a Fellow of the IMS since 1952.

>Speaking of rating other statisticians, it is certainly a subject that
>involves a PERSONAL prior, is non-robust, and involves no mathematical "risk"
>considerations. :-) <Just bringing some of the terms back On-Topic>

>I am reading a book, "The Man Who Loved Only Numbers", about the SECOND
>most prolific writer in the history of mathematics, the eccentric
>mathematician, statistician, and probabilist Paul Erdos, whom I've
>met at his lectures and at cocktail parties. A tale emerged in the book
>about Hardy's (of Hardy and Littlewood fame) PERSONAL (subjective)
>rating of mathematicians.

>Hardy liked to rank mathematicians on a scale of 1 to 100, said Erdos,
>according to the book's author, Paul Hoffman. He <Hardy> gave
>himself 25, Littlewood 30, David Hilbert 80, and Ramanujan 100.

>Since much of our discussion and debate (on measure theory and Bayesian
>priors) had to do with "mathematical statistics" and "applied statistics",
>I am curious, but I wouldn't be so insensitive as to ask the readership of
>these newsgroups to rate Herman Rubin, on a scale of 1 to 100, as an
>APPLIED statistician. :-)

Just yesterday, some students asked me about how to estimate
the parameters in a particular applied problem. Global
maximum likelihood does not exist, and finding a starting
point for local maximum likelihood is difficult. Today,
one of them told me it was successful to fit a few points
on the characteristic function to the sample characteristic
function, which I suggested. Theory should be applied.
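
A minimal Python sketch of that idea follows, under loudly stated assumptions. The thread does not say which model the students had, so the Cauchy location-scale family is used here purely because its characteristic function, exp(i*x0*t - gamma*|t|), has a simple closed form; the function names, the evaluation points `ts`, and the simulated data are all hypothetical.

```python
import cmath
import math
import random


def empirical_cf(sample, t):
    """Sample characteristic function: the average of exp(i*t*x)."""
    return sum(cmath.exp(1j * t * x) for x in sample) / len(sample)


def fit_cauchy_by_cf(sample, ts=(0.25, 0.5, 1.0)):
    """Match exp(i*x0*t - gamma*|t|) to the sample CF at a few t > 0.

    For this family, -log|phi(t)| = gamma*t and arg phi(t) = x0*t are
    linear in t, so each parameter is a least-squares slope through the
    origin. The points must be small enough that x0*t stays in (-pi, pi).
    """
    phis = [empirical_cf(sample, t) for t in ts]
    denom = sum(t * t for t in ts)
    gamma = sum(-t * math.log(abs(p)) for t, p in zip(ts, phis)) / denom
    x0 = sum(t * cmath.phase(p) for t, p in zip(ts, phis)) / denom
    return x0, gamma


def demo(n=20000, x0=1.0, gamma=2.0, seed=0):
    """Simulate Cauchy(x0, gamma) data by inversion and refit it."""
    rng = random.Random(seed)
    sample = [x0 + gamma * math.tan(math.pi * (rng.random() - 0.5))
              for _ in range(n)]
    return fit_cauchy_by_cf(sample)
```

Calling `demo()` returns estimates that should land near the simulated location and scale. In a model without a closed-form solution, one would instead minimize the squared distance between the model and sample characteristic functions at the chosen points numerically, possibly using the result only as a starting point for local maximum likelihood.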

>I rated Chernoff 5 (no question about him being a brilliant mathematical
>statistician) as an APPLIED statistician only because of the "Chernoff
>faces" (actually cartoon faces) he talked about in JASA as a graphical
>method to represent multivariate data. It was FUN and innovative, but
>I haven't seen a "Chernoff face" (or its relatives) in any application
>in recent years.

Chernoff has done more important applied statistics than that.
Not all of it has shown up. My paper with Chernoff on estimating
the location of a discontinuity in density has set off quite a
bit of work on estimating when a change in process occurred.

>< A gigantic leap to a couple of practical points >

>> >You are speaking and thinking as a mathematical statistician.
>> >That might have been what he did in the Foundations of Statistics
>> >book, but I am speaking of Savage as an APPLIED Bayesian, as
>> >articulated in the very elementary paper by Edwards, Lindman,
>> >and Savage.

>> And what I am saying is that there are practical uses of
>> prior Bayes in situations which are common. In testing a
>> null hypothesis, relatively little of the prior is
>> important,

>Nor is there any NEED for a Bayesian to test any hypothesis, in the
>manner sharp hypotheses are tested in the non-Bayesian context.

There is a need from the standpoint of what action to take.
I do have a paper on testing an imprecise null in one dimension
from a Bayesian standpoint, and there are some simple conclusions
which can be drawn. The only situation where the user has to specify
a lot is when the acceptance region and the variance are comparable.
Otherwise, use the method in my paper with Sethuraman if the region
is small, and see whether the estimate is in the region if the region
is large. The intermediate part is sensitive, and taking the union
test does not accept often enough.

>> and one can ask the user about that, instead
>> about the full prior, where the important parts are likely
>> to be almost ignored.

>But as a true Bayesian, I AM the user! As a PERSONAL Bayesian, I am
>not going to gloss over my full prior. And when that realization is
>impossible (usually in the multivariate priors), then I simply abandon
>the Bayesian methodology, fall back on being a non-Bayesian, and try
>to minimize all the inherent defects of the "classical" approach,
>and grin and bear it. That happens the vast majority of the time. :-)

In applied statistics, it is the person with the problem who is
the user. What has to be done is to identify the parts of the
loss-prior combination which affect the solution, and ignore the
rest of it.

>So, my confession is that I am a TRUE Bayesian whenever it's possible.
>But most of the time, I just do the best I can under the frequentist
>strait-jacket, after I escaped from the Bayesian strait-jacket.

>> >> >So there you have it. A tabloid review of the world of Bayesians
>> >> >and pseudo-Bayesians.

>> >> >If you want to be a TRUE Bayesian and incorporate your PERSONAL
>> >> >beliefs into a statistical problem, you MUST realistically assess
>> >> >your PERSONAL prior, Robustness or non-robustness is not even a
>> >> >valid issue for consideration! A robust-Bayesian is at best an
>> >> >oxymoron -- IMHO.

>> >> Wrong. Consider the risk, not the posterior.

>> >I stand pad <sic> on my statement. I have the option to look at the
>> >posterior risk if I choose to for certain problems. But as
>> >a personal Bayesian, robustness of my prior is not a valid issue
>> >for consideration.

>> No, the risk involves both the loss and the prior,
>> in fact just the product, and is personal. Also,
>> in an applied problem, it is the user's evaluation
>> of the consequences which counts, not the statistician's,

>Again, I am both the user AND the statistician. If I can't do it
>realistically for myself, what hope is there to do it PROPERLY for
>the client USER?

Rubin's 5th Commandment: If thou art both the client and
the consultant, keep thy roles distinct, lest thou violate
some of the other commandments.

>The trouble with the risk-function mathematical construct is that
>to be PERSONAL about it, you need to know what "widgets" your LOSS
>is measured in, what is your utility function of those widgets, and
>from my point of view, that's just a lot of smoke screen to divert
>one's attention from the REAL problem in the Real World context of
>making statistical sense, and making inference by incorporating one's
>prior information that is relevant AND non-negligible and clearly
>non-robust.

No, the risk is often more important than the posterior.
You want to be much more careful about letting in someone
with Marburg's disease than with smallpox.

>> >> It is often
>> >> possible to obtain robust procedures which are not precisely
>> >> Bayes, and this is what has been done for some problems.

>> >Then it's neither Bayes nor Bayesian.

>> One could call it "engineering Bayesian". This is
>> what is used all the time in physics and other
>> fields; one cannot do precise calculations taking
>> into account all the factors, so one decides how
>> to approximate.

>I have no problem with this point of view about science, nor am
>I arguing against approximations. What I am arguing against is
>GROSS approximations that alter a real, sensible problem into
>one that bears no resemblance to how the problem should be
>approached, as a Bayesian!

>We are just haggling about the admissible degree of approximation

What I am telling you is that posterior robustness is rarely
attainable, but prior robustness is often easily attainable.

Reef Fish

Apr 12, 2005, 7:07:55 PM

Herman Rubin wrote:
> In article <1113249574.5...@o13g2000cwo.googlegroups.com>,
> Reef Fish <Large_Nass...@Yahoo.com> wrote:
>

< snipping much of background discussion for brevity >
< for details consult the archives or read thread from groups.google >

> ....................
>
> >> >> >A PRIOR distribution, to a TRUE Bayesian, must reflect his own
> >> >> >PERSONAL opinion about the parameter of interest (which is the
> >> >> >random variable of the prior distribution).
>
> >> >> You have gotten this from the rash Bayesians
>
> >> >That "rash Bayesian" was Jimmie Savage :-)
> >> >http://www.umass.edu/wsp/statistics/tales/savage.html
>
> >> >*> "In his development of personal probability, Savage moved more
> >> >*> and more to a proselytizing position. Personal probability was
> >> >*> not only useful and interesting to study; it became for him
> >> >*> the only sensible approach to probability and statistics ..."

< big snip >

> >Up to the late 1950s, both you and Savage were heavily involved in the
> >MATHEMATICS of statistics, and made no visible contribution to the
> >APPLICATION.
>
> You should check my papers more carefully.

Perhaps. Perhaps not. An application of Gordon Sande's "duck test"
would suffice. :-) A "duck test" for a non-applied statistician.

Never seen you present any application talk or paper in any JSM Annual
Meeting. Never seen an application paper of yours. Never seen anyone
REFERENCE any application papers of yours ... see also my pragmatic
"discriminant function" below.

>
> >I have a pragmatic discriminant function for Theoretical vs Applied
> >Statisticians -- the former dwell mostly in the IMS journals and
> >IMS activities in the Annual JSM (Joint Statistical Meetings), while
> >the latter in ASA journals and activities.
>
> >That accounts for the fact that I've seen you at the Annual Meetings
> >for decades, but never heard or seen you in any of the major talks
> >sponsored by ASA, or in any talk for that matter. That's because I
> >almost never attend the IMS sessions because my interests are APPLIED.
>
> To apply something correctly, one should understand the
> underlying theory. There is much applied material there.

I always agreed totally on this aspect of applied statistics. There's
plenty of good statistical practice in the ASA sessions and journals.

> >Just about ALL the statisticians I know, or those whose names are
> >well-known, are Fellows of the ASA, with the exception of Herman Rubin
> >and Jimmie Savage! :-) Ronald Fisher is not a Fellow either, but for
> >different reasons. My explanation is that neither of you published
> >anything that could be considered as "applied" through the early part
> >(and later part too) of your professional careers to be so recognized
> >by the ASA.
>
>
> >Of the well-known statisticians *I* know personally, all are Fellows
> >of the ASA, though some of them I would rate no more than 5 on a scale
> >of 1 to 100 as an "Applied Statistician", but "at least 75" as a
> >Mathematical Statistician or Probabilist. These include Chernoff
> >(1961), Feller (1948), Persi Diaconis (1994), and Richard Savage (1961)
> ><year elected ASA Fellow in parentheses>.
>
> To be a Fellow of the ASA, one must be a member.

That is only ONE of the pre-requisites. (2) You must be nominated.
(3) You must be elected by the Committee on Fellows based on the
documentation of the sponsor on WHY you deserve to be elected Fellow.

The fact that you're not even a member of the ASA is still another
"duck test" that you're not at all interested in the application of
statistics, IMHO of course, even though you call your own
non-applied ideas applications of statistics, such as fitting data
to a characteristic function.

> I am a Fellow of IMS since 1952.

Of course. So is EVERY ONE of those Fellows of ASA I know personally
who are well-known in applied statistics, Data Analysis, and the ASA.
Of those I explicitly rated less than 5 on a scale of 1 to 100 on
"applied statistics", Chernoff, Diaconis, Feller, and I.R.Savage --
they are all Fellows of the IMS, and with even more prestigious elected
titles than merely being Fellows of both the ASA and IMS.

>
> >Speaking of rating other statisticians, it is certainly a subject that
> >is a PERSONAL prior, non-robust, and involves no mathematical "risk"
> >considerations. :-) <Just bringing some of the terms back On-Topic>
>
> >I am reading a book, "The Man Who Loved Only Numbers", about the SECOND
> >most prolific writer in the history of mathematics, an eccentric
> >mathematician, statistician, and probabilist Paul Erdos, whom I've
> >met in his lectures and cocktail parties. A tale emerged in the book
> >about Hardy's (of Hardy and Littlewood fame) PERSONAL (subjective)
> >rating of mathematicians.
>
> >Hardy liked to rank mathematicians on a scale of 1 to 100, said Erdos,
> >according to the book's author, Paul Hoffman. He <Hardy> gave
> >himself 25, Littlewood 30, David Hilbert 80, and Ramanujan 100.

> Just yesterday, some students asked me about how to estimate
> the parameters in a particular applied problem. Global
> maximum likelihood does not exist, and finding a starting
> point for local maximum likelihood is difficult. Today,
> one of them told me it was successful to fit a few points
> on the characteristic function to the sample characteristic
> function, which I suggested. Theory should be applied.

Only a non-applied statistician who is a mathematical statistician
could even think of suggesting such a needless tool of the
characteristic function and empirical FITTING to it, rather than
the DATA ANALYTIC diagnostic tools that are as theoretical and
have been successfully and effectively used for over half a
century!

Your paragraph, IMHO again, is just your self-indictment that you
are NOT an applied statistician.

It should be quite clear to everyone by now what YOU call application
is quite different from what I, and most other applied statisticians
or data analysts call "application" of statistical METHODS.

>
> >< A gigantic leap to a couple of practical points >
>
> >> >You are speaking and thinking as a mathematical statistician.
> >> >That might have been what he did in the Foundation of Statistics
> >> >book, but I am speaking of Savage as an APPLIED Bayesian,
> >> >articulated at the very elementary level paper by Edwards, Lindman,
> >> >and Savage.
>
> >> And what I am saying is that there are practical uses of
> >> prior Bayes in situations which are common. In testing a
> >> null hypothesis, relatively little of the prior is
> >> important,
>
> >Nor is there any NEED for a Bayesian to test any hypothesis, in the
> >manner sharp hypotheses are tested in the non-Bayesian context.
>
> There is a need from the standpoint of what action to take.
> I do have a paper on testing an imprecise null in one dimension
> from a Bayesian standpoint, and there are some simple conclusions
> which can be drawn, but when the acceptance region and the variance
> are comparable is the only situation where the user has to specify
> a lot. The others are to use the method in my paper with Sethuraman
> if the region is small, and to see if the estimate is in the region
> if the region is large. The intermediate part is sensitive, and
> taking the union test does not accept often enough.

I said there is no NEED for a Bayesian to test any hypothesis -- even
if you wish to add the condition of knowing what action to take.

Jimmie Savage talked about how a Bayesian MAY use the posterior ODDS
of one region against another to mimic the notion of a test of
hypothesis the non-Bayesians do (which is what you are doing), but it's
not an everyday diet for a Bayesian. Inference and action do NOT
require any test of any hypothesis.
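
As a rough illustration of the posterior odds of one region against another, here is a minimal sketch. It assumes a normal posterior for the parameter purely for convenience; the function names, the interval, and the numbers are hypothetical and not taken from Savage or from this thread.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def posterior_odds_interval(post_mean, post_sd, lo, hi):
    """Posterior odds that theta lies in [lo, hi] versus outside it,
    assuming (for illustration) a normal posterior for theta."""
    p_in = (normal_cdf((hi - post_mean) / post_sd)
            - normal_cdf((lo - post_mean) / post_sd))
    return p_in / (1.0 - p_in)

# Hypothetical posterior: mean 0.3, sd 0.2; region of interest is [-0.1, 0.1].
odds = posterior_odds_interval(post_mean=0.3, post_sd=0.2, lo=-0.1, hi=0.1)
print(round(odds, 3))
```

The odds are simply a summary of the posterior; no accept/reject step is involved unless a loss structure is added on top.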

>
> >> and one can ask the user about that, instead
> >> about the full prior, where the important parts are likely
> >> to be almost ignored.
>
> >But as a true Bayesian, I AM the user! As a PERSONAL Bayesian, I am
> >not going to gloss over my full prior. And when that realization is
> >impossible (usually in the multivariate priors), then I simply abandon
> >the Bayesian methodology, fall back on being a non-Bayesian, and try
> >to minimize all the inherent defects of the "classical" approach,
> >and grin and bear it. That happens the vast majority of the time. :-)

I was merely making the simple point that I AM both the user and
the statistician, so no distinction needs to be made. And if I don't
know how to do it properly for myself, how could there be any hope
that I do it correctly and properly for ANY client/user?

> >Again, I am both the user AND the statistician. If I can't do it
> >realistically for myself, what hope is there to do it PROPERLY for
> >the client USER?
>
> Rubin's 5th Commandment: If thou art both the client and
> the consultant, keep thy roles distinct, lest thou violate
> some of the other commandments.

Nah, that's NOT Rubin's 5th Commandment. You're Rubin MIS-applying
Rubin's 5th Commandment to suit your argument this time.

This is Rubin's 5th Commandment:

HR >> For the person who is both (e. g., a biostatistician or
HR >> psychometrician):
HR >> 5. Thou shalt keep thy roles distinct, lest thou violate
HR >> some of the other commandments.

The context is that the statistician and the client are two DISTINCT
persons, and for one whose field of substantive application is outside
that of statistics, such as biostat and psychometrics, they should
keep the roles (whether client vs consultant or statistician vs
non-stat subject matter) distinct.

In our present discussion, there is only ONE entity, ME, the
applied statistician working on my own statistical problem.

See, Herman, you can't even follow your OWN Commandment correctly. ;^)

>
> >The trouble with the risk-function mathematical construct is that
> >to be PERSONAL about it, you need to know what "widgets" your LOSS
> >is measured in, what is your utility function of those widgets, and
> >from my point of view, that's just a lot of smoke screen to divert
> >one's attention from the REAL problem in the Real World context of
> >making statistical sense, and making inference by incorporating one's
> >prior information that is relevant AND non-negligible and clearly
> >non-robust.
>
> No, the risk is often more important than the posterior.
> You want to be much more careful about letting in someone
> with Marburg's disease than with smallpox.

But your unrealistic loss function and callous utility function
assessment coupled with callous use of prior information all add up
to a self-deception smoke screen that fools mostly yourself!

KISS (Keep It Simple Stupid) is by far the realistic approach.

> >> One could call it "engineering Bayesian". This is
> >> what is used all the time in physics and other
> >> fields; one cannot do precise calculations taking
> >> into account all the factors, so one decides how
> >> to approximate.
>
> >I have no problem with this point of view about science, nor am
> >I arguing against approximations. What I am arguing against is
> >GROSS approximations that alters a real, sensible problem into
> >one that bears no resemblance to how the problem should be
> >approached, as a Bayesian!
>
> >We are just haggling about the admissible degree of approximation

-- Bob.

Herman Rubin

Apr 13, 2005, 2:52:21 PM

In article <1113347275....@o13g2000cwo.googlegroups.com>,

....................

>< big snip >

I have given applied talks, although you might not
recognize them as applied, as you restrict "applied"
to using a recipe from a cookbook.

>> >I have a pragmatic discriminant function for Theoretical vs Applied
>> >Statisticians -- the former dwell mostly in the IMS journals and
>> >IMS activities in the Annual JSM (Joint Statistical Meetings), while
>> >the latter in ASA journals and activities.

JASA is not the only journal where applied papers appear.

Also, I tend not to publish as much as I should. What
you might think an important applied procedure, I might
well consider trivial.

>> >That accounts for the fact that I've seen you at the Annual Meetings
>> >for decades, but never heard or seen you in any of the major talks
>> >sponsored by ASA, or in any talk for that matter. That's because I
>> >almost never attend the IMS sessions because my interests are APPLIED.

>> To apply something correctly, one should understand the
>> underlying theory. There is much applied material there.

>I always agreed totally on this aspect of applied statistics. There's
>plenty of good statistical practice in the ASA sessions and journals.

And plenty of bad.

Why is fitting data to a characteristic function any
more unreasonable than fitting moments or quantiles?

Just as linear combinations of order statistics can be
arbitrarily good (application of theory), so can
adjustments of fits of estimated characteristic functions.

>> I am a Fellow of IMS since 1952.

>Of course. So is EVERYONE of those Fellows of ASA I know personally
>who are well-known in applied statistics, Data Analysis, and the ASA.
>Of those I explicitly rated less than 5 in a scale of 1 to 100 on
>"applied statistics", Chernoff, Diaconis, Feller, and I.R.Savage --
>they are all Fellows of the IMS, and with even more prestigious elected
>titles than merely being Fellows of both the ASA and IMS.

>> >Speaking of rating other statisticians, it is certainly a subject that
>> >is a PERSONAL prior, non-robust, and involves no mathematical "risk"
>> >considerations. :-) <Just bringing some of the terms back On-Topic>

....................

>> Just yesterday, some students asked me about how to estimate
>> the parameters in a particular applied problem. Global
>> maximum likelihood does not exist, and finding a starting
>> point for local maximum likelihood is difficult. Today,
>> one of them told me it was successful to fit a few points
>> on the characteristic function to the sample characteristic
>> function, which I suggested. Theory should be applied.

>Only a non-applied statistician who is a mathematical statistician
>could even think of suggesting such a needless tool of the
>characteristic function and empirical FITTING to it, rather than
>the DATA ANALYTIC diagnostic tools that are as theoretical and
>have been successfully and effectively used for over half a
>century!

I see no reason why this should not be used just as readily
as fitting moments. Fourier analysis is considered applied
by many in many fields. Someone who knows how to apply
statistics must be willing and able to develop methods
"on the spot".
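
A minimal sketch of fitting a few points of the characteristic function to the sample characteristic function follows. The students' actual model is not stated in the thread, so the Cauchy family is used here only as a stand-in: it has no moments to fit, yet its characteristic function exp(i*mu*t - gamma*|t|) is simple. The function names, the choice of t values, and the simulated data are all assumptions made for the example.

```python
import cmath
import math
import random

def empirical_cf(xs, t):
    """Sample characteristic function at t: average of exp(i*t*x)."""
    return sum(cmath.exp(1j * t * x) for x in xs) / len(xs)

def fit_cauchy_by_cf(xs, t_points=(0.1, 0.2, 0.3)):
    """Estimate (location mu, scale gamma) of a Cauchy sample by matching
    exp(i*mu*t - gamma*|t|) to the sample characteristic function."""
    phis = [empirical_cf(xs, t) for t in t_points]
    denom = sum(t * t for t in t_points)
    # For t > 0: log|phi(t)| = -gamma*t and arg(phi(t)) = mu*t (while |mu*t| < pi),
    # so each parameter is a least-squares slope through the origin.
    gamma_hat = -sum(t * math.log(abs(p)) for t, p in zip(t_points, phis)) / denom
    mu_hat = sum(t * cmath.phase(p) for t, p in zip(t_points, phis)) / denom
    return mu_hat, gamma_hat

# Simulated data with known parameters (mu = 2.0, gamma = 1.5), drawn by
# inverting the Cauchy distribution function.
rng = random.Random(0)
sample = [2.0 + 1.5 * math.tan(math.pi * (rng.random() - 0.5)) for _ in range(20000)]
mu_hat, gamma_hat = fit_cauchy_by_cf(sample)
print(mu_hat, gamma_hat)
```

The t values must stay small enough that the phase does not wrap around; in practice one would also weight the points by their sampling variability, which this sketch ignores.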

Consider the paper of Neyman and Scott, who would be considered
applied statisticians, on estimating a regression line with
both variables subject to error. This uses lots of sample
characteristic functions.

BTW, Neyman had as strict theoretical standards for applied
students as most theorists have for theoretical students.

>Your paragraph, IMHO again, is just your self-indictment that you
>are NOT an applied statistician.

>It should be quite clear to everyone by now what YOU call application
>is quite different from what I, and most other applied statisticians
>or data analysts call "application" of statistical METHODS.

A statistical method is an application of theory to a
particular problem.

I heard one of his talks in which he considers odds in bets of
microcents. Now such bets may eliminate nonlinearity, but NOT
the loss function. It is not the case that an additional microcent
in state A is equivalent to the same amount in state B. This
means that the loss must be taken into account together with the
prior, and they cannot be separated. Get stuck on belief, and
you are now really stuck and cannot get off it. Application is
about action.

"Both" refers to "client and consultant". It is not a
misapplication.

>The context is that the statistician and the client are two DISTINCT
>persons, and for one whose field of substantive application is outside
>that of statistics, such as biostat and psychometrics, they should
>keep the roles (whether client vs consultant or statistician vs
>non-stat subject matter) distinct.

No, the context of the fifth commandment refers to the case
where they are the same.

>In our present discussion, there is only ONE entity, ME, the
>applied statistician applying my own statistical problem.

>See, Herman, you can't even follow your OWN Commandment correctly. ;^)

>> >The trouble with the risk-function mathematical construct is that
>> >to be PERSONAL about it, you need to know what "widgets" your LOSS
>> >is measured in, what is your utility function of those widgets, and
>> >from my point of view, that's just a lot of smoke screen to divert
>> >one's attention from the REAL problem in the Real World context of
>> >making statistical sense, and making inference by incorporating one's
>> >prior information that is relevant AND non-negligible and clearly
>> >non-robust.

Utilities are subject to scale factors, and so one cannot
use "widgets" to measure them across states of nature. But
one can be asked, instead of the usual types of bets, what
probability of making this type of error in these states of
nature matches that probability of making that type of
error in those states of nature. These questions determine
the loss-prior combination. Bets cannot determine the prior,
unless the loss is constant across states.
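
One way to read the claim that loss and prior cannot be separated is sketched below: if each state's loss is multiplied by a state-specific factor and that state's prior weight is divided by the same factor, every action keeps the same relative expected loss, so observed choices cannot tell the two descriptions apart. The states, actions, numbers, and function names are hypothetical.

```python
def expected_losses(prior, loss):
    """Expected loss of each action; loss[s][a] is the loss in state s under action a."""
    n_actions = len(loss[0])
    return [sum(p * row[a] for p, row in zip(prior, loss)) for a in range(n_actions)]

def best_action(prior, loss):
    """Index of the action with the smallest expected loss."""
    risks = expected_losses(prior, loss)
    return min(range(len(risks)), key=risks.__getitem__)

# Hypothetical problem: three states of nature, two actions.
prior = [0.7, 0.2, 0.1]
loss = [[0.0, 4.0], [3.0, 1.0], [9.0, 0.0]]

# Rescale: multiply the loss in state s by c[s], divide the prior weight by c[s].
c = [2.0, 0.5, 5.0]
raw = [p / k for p, k in zip(prior, c)]
prior2 = [w / sum(raw) for w in raw]          # renormalised to sum to 1
loss2 = [[k * v for v in row] for k, row in zip(c, loss)]

print(best_action(prior, loss), best_action(prior2, loss2))
```

Only when the scale factors are equal across states does the prior come out unchanged, which matches the remark about the loss being constant across states.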

>> No, the risk is often more important than the posterior.
>> You want to be much more careful about letting in someone
>> with Marburg's disease than with smallpox.

>But your unrealistic loss function and callous utility function
>assessment coupled with callous use of prior information all add up
>to a self-deception smoke screen that fools mostly yourself!

>KISS (Keep It Simple Stupid) is by far the realistic approach.

>> >> One could call it "engineering Bayesian". This is
>> >> what is used all the time in physics and other
>> >> fields; one cannot do precise calculations taking
>> >> into account all the factors, so one decides how
>> >> to approximate.

>> >I have no problem with this point of view about science, nor am
>> >I arguing against approximations. What I am arguing against is
>> >GROSS approximations that alters a real, sensible problem into
>> >one that bears no resemblance to how the problem should be
>> >approached, as a Bayesian!

>> >We are just haggling about the admissible degree of approximation

You are forcing comparisons which cannot be made. I am trying
to come up with usable procedures which will be close approximations
to Bayes procedures, knowing that a precise Bayes procedure requires
an infinitely large infinitely fast computer with zero cost.

Reef Fish

Apr 13, 2005, 4:33:45 PM

I am hardly the cookbook type, though I have used Cook's book. :-)
That's Dennis Cook, who will be presenting the Invited 2005 Fisher
Lecture this year. I don't think you were ever invited by the
ASA to present anything, were you?

Actually, I am the complete OPPOSITE of what you described when you
said I restrict "applied" to using a recipe from a cookbook. I had
one doctoral student who did a dissertation on regression diagnostics,
which was very applied, but hardly "cookbookish".

There was a "mathematical statistician" on the committee (who wrote
many more applied papers than you did) but who made a lot of noise
because he didn't really understand anything about APPLIED statistics.
So, the department, at my suggestion, actually invited Dennis Cook
to be on the doctoral dissertation committee and to come to the
department to give a seminar talk as well as to appear in the
doctoral student's "Ph.D. defense" session.

>
> >> >I have a pragmatic discriminant function for Theoretical vs Applied
> >> >Statisticians -- the former dwell mostly in the IMS journals and
> >> >IMS activities in the Annual JSM (Joint Statistical Meetings), while
> >> >the latter in ASA journals and activities.
>
> JASA is not the only journal where applied papers appear.
>
> Also, I tend not to publish as much as I should. What
> you might think an important applied procedure, I might
> well consider trivial.

That may very well be true. By the same token, what you think is an
important application may be considered by me as irrelevant.

>
> >> >That accounts for the fact that I've seen you at the Annual Meetings
> >> >for decades, but never heard or seen you in any of the major talks
> >> >sponsored by ASA, or in any talk for that matter. That's because I
> >> >almost never attend the IMS sessions because my interests are APPLIED.
>
> >> To apply something correctly, one should understand the
> >> underlying theory. There is much applied material there.
>
> >I always agreed totally on this aspect of applied statistics. There's
> >plenty of good statistical practice in the ASA sessions and journals.
>
> And plenty of bad.

I agree on that too. But those are your STRAWMEN <tm>. :-)

"Fitting" moments? (as opposed to estimating?) Not sure I
know what you mean or what you're getting at.

>
> Just as linear combinations of order statistics can be
> arbitrarily good (application of theory), so can
> adjustments of fits of estimated characteristic functions.

I don't see any need for the moment generating function (when it
exists) in APPLIED problems. Why should I mess with characteristic
functions that add nothing except some useless complex sqrt(-1)?

NOW I see what you mean by your earlier "fitting moments". Let's
just say I do many many types of applied statistics which you don't
do; and Fourier analysis is not that I DON'T do.

>
> Consider the paper of Neyman and Scott, who would be considered
> applied statisticians, on estimating a regression line with
> both variables subject to error. This uses lots of sample
> characteristic functions.

That is only because they addressed it from a mathematical statistics
point of view of doing MATHEMATICS.

99% of the applied regression analyses are done on the CONDITIONAL
model of Y given X, so their approach would not be of interest anyway.

>
> BTW, Neyman had as strict theoretical standards for applied
> students as most theorists have for theoretical students.

I am not here to judge Neyman's theoretical standards for applied
students. If Neyman were here to discuss applied problems as we
do, then I'd tell him whether his views about APPLIED statistics
are obsolete or not.

>
> >Your paragraph, IMHO again, is just your self-indictment that you
> >are NOT an applied statistician.
>
> >It should be quite clear to everyone by now what YOU call application
> >is quite different from what I, and most other applied statisticians
> >or data analysts call "application" of statistical METHODS.
>
> A statistical method is an application of theory to a
> particular problem.

Not interested in getting into rhetoric. I am much more interested
in how SPECIFIC statistical problems are addressed, from an applied
and practical point of view.

One such view is that "measure theory" is absolutely NOT necessary.

I'll add now that the estimation or fitting of the "characteristic
function" is not necessary, in any of the APPLIED problems I've
seen -- and I've seen quite a few of those done by reputable
applied statisticians, not your strawmen. :)

Then there would have been no distinction between client and user.
See below:

You can use "widgets" or "widget equivalents" or "robust
widgets" or "engineering widgets" ... in the end, you are in
the same needless soup that you got yourself into in the
first place.

>
> >> No, the risk is often more important than the posterior.
> >> You want to be much more careful about letting in someone
> >> with Marburg's disease than with smallpox.
>
> >But your unrealistic loss function and callous utility function
> >assessment coupled with callous use of prior information all add up
> >to a self-deception smoke screen that fools mostly yourself!
>
> >KISS (Keep It Simple Stupid) is by far the realistic approach.
>
> >> >> One could call it "engineering Bayesian". This is
> >> >> what is used all the time in physics and other
> >> >> fields; one cannot do precise calculations taking
> >> >> into account all the factors, so one decides how
> >> >> to approximate.
>
> >> >I have no problem with this point of view about science, nor am
> >> >I arguing against approximations. What I am arguing against is
> >> >GROSS approximations that alters a real, sensible problem into
> >> >one that bears no resemblance to how the problem should be
> >> >approached, as a Bayesian!
>
> >> >We are just haggling about the admissible degree of approximation
>
> You are forcing comparisons which cannot be made. I am trying
> to come up with usable procedures which will be close approximations
> to Bayes procedures, knowing that a precise Bayes procedure requires
> an infinitely large infinitely fast computer with zero cost.

I disagree. Here's another of those opinions/philosophies or wotnots
on which we simply have to "agree to disagree".

Thanks for your very detailed response. Trust me that I understood
EVERY WORD of yours (even in the "fitting moment" reference which
threw me off temporarily), but in the end, I simply have as strong
a conviction about "applied statistics" as you have about your
brand of statistics being "applied".

On those issues, we simply have to stand "AGREED TO DISAGREE".

-- Bob.
