Induction of statistical models

Ross Clement

Dec 18, 2004, 6:13:11 AM

Hi. I'm curious to know where I would find research on induction of
statistical models.

As an analogy, a Bayesian net can be viewed as a probabilistic model
of some domain that can be used for prediction (of the probability of
some event given known evidence). There are algorithms that can induce
a likely structure of a Bayesian network from training data. Hence,
this could be viewed as induction of a probabilistic model, no? The
same description could apply to [Hidden] Markov models and similar.
However, these models include probabilities, not statistical
distributions as their component parts.

Where can I find research that deals with the induction of statistical
models? As a rough example of what I'd like to read about, imagine a
model of the time taken to get from place A to place B, modelling the
time taken to walk to the train station as a normally distributed
number of minutes, then the time to wait for the train as Poisson,
etc. In the easiest case, learning such a model could be simply a case
of choosing distributions for each of a number of listed variables,
and selecting a subset of these variables for inclusion in a linear
equation. More complex approaches would make fewer assumptions about
the underlying model. What is this sort of thing called? (Searchable
key-words particularly appreciated.)
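
As a rough illustration of the "easiest case" described above, the
sketch below chooses a distribution for each listed variable by fitting
a normal and a Poisson by maximum likelihood and comparing AIC. The
variable names, the sample data, and the use of AIC are assumptions made
for this example, not something from the original question. Comparing a
density with a probability mass function is only a rough heuristic; it
is workable here because the times are recorded in whole minutes.

```python
import math
from statistics import mean, pvariance


def normal_loglik(xs):
    """Log-likelihood of xs under a normal fitted by maximum likelihood."""
    mu = mean(xs)
    var = pvariance(xs, mu)  # ML variance estimate; assumes var > 0
    return -0.5 * len(xs) * (math.log(2 * math.pi * var) + 1.0)


def poisson_loglik(xs):
    """Log-likelihood of non-negative integer xs under a fitted Poisson."""
    lam = mean(xs)  # ML rate estimate; assumes lam > 0
    return sum(x * math.log(lam) - lam - math.lgamma(x + 1) for x in xs)


def aic(loglik, n_params):
    """Akaike information criterion; smaller is better."""
    return 2 * n_params - 2 * loglik


def choose_distribution(xs):
    """Return (best family name, AIC per family) for one variable."""
    scores = {
        "normal": aic(normal_loglik(xs), 2),
        "poisson": aic(poisson_loglik(xs), 1),
    }
    return min(scores, key=scores.get), scores


# Hypothetical observations, in whole minutes.
walk_minutes = [12, 11, 13, 12, 12, 11, 13, 12, 12, 12]
wait_minutes = [0, 3, 1, 5, 2, 0, 4, 1, 7, 2, 1, 3]

for name, data in [("walk", walk_minutes), ("wait", wait_minutes)]:
    family, _ = choose_distribution(data)
    print(name, "->", family)
```

With this made-up data the tightly clustered walking times favour the
normal and the more dispersed waiting times favour the Poisson. A real
treatment would consider more families and check the fit rather than
rely on AIC alone.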

I've tried searching for this, but get flooded by uses of statistical
models FOR induction. Searching for "induction of statistical models"
brought up an interesting-looking paper related to machine translation,
but as far as I saw nothing on the generalised induction of statistical
models.

Thanks in anticipation,

Ross-c

Graham Jones

Dec 18, 2004, 8:39:14 AM

In article <a719d9ee.04121...@posting.google.com>, Ross
Clement <RossC...@gmail.com> writes

[...]

>What is this sort of thing called (searchable
>key-words particularly appreciated).

I think "model selection" could be a good key phrase.

Graham
--
Graham Jones
http://www.visiv.co.uk
Emails to gra...@visiv.co.uk may be deleted as spam
Please add a j just before the @ to ensure delivery

George Kahrimanis

Dec 18, 2004, 3:48:44 PM

Ross Clement wrote:
> Hi. I'm curious to know where I would find research on induction of
> statistical models. [...] What is this sort of thing called
> (searchable key-words particularly appreciated).

Try including the term "model building" and variations.

If I understand you correctly your query relates to both cognitive
psychology and artificial intelligence. Therefore I imagine a search
like "cognitive processes in AI".

I would be interested to learn your top-20 list of related papers,
because I am a bit short in free time these days.

~ George Kahrimanis

RossC...@gmail.com

Dec 23, 2004, 10:20:36 AM

Hi. Thanks for the answers in this thread.

I looked up "Model Selection" in various sources, and what I've found
(so far) is very similar to what we call "model selection" in Machine
Learning. I.e. of two alternative models, which is better. In
particular, Issue 44(1) of "The Journal of Mathematical Psychology" is
much more similar to what would be found in a Machine Learning book
such as "Elements of Statistical Learning" than I expected.

Unfortunately, "model selection" doesn't quite answer my question. If
there is a good method for saying that one model is better than
another, then this could be used for model induction (shall we call
this "automated model building") if we could generate a large number of
potential models, and select the best one. However, for more
interesting (and complex) problems, the number of potential models is
likely to be too large for simple generation and test. So, there needs
to be guidance in generating models.
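
To see why plain generate-and-test runs out of steam, consider a sketch
in which each variable is either left out of the model or included with
one of a fixed list of distribution families. The function names and
the families are hypothetical, chosen only for this illustration.

```python
from itertools import product


def candidate_models(variables, families):
    """Yield every model: each variable is excluded or given one family."""
    options = [None] + list(families)  # None means "variable left out"
    for choice in product(options, repeat=len(variables)):
        yield {v: f for v, f in zip(variables, choice) if f is not None}


def count_models(n_variables, n_families):
    """Size of the search space enumerated by candidate_models."""
    return (n_families + 1) ** n_variables


small = list(candidate_models(["walk", "wait", "ride"], ["normal", "poisson"]))
print(len(small))            # every model can still be listed and scored
print(count_models(30, 4))   # far too many to generate and test one by one
```

Three variables and two families can be enumerated exhaustively, but
thirty variables and four families already give 5**30 candidates, and
that is before allowing any structure richer than a linear sum.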

A large number of Machine Learning methods are based on some sort of
search guided by some numerical method of evaluating partially complete
models. E.g. we might "add bits" to an incomplete decision tree*1 based
upon a heuristic measuring the improvement in the model, or might perform
a search for a suitable structure for a Bayesian network as per
Heckerman's descriptions*2, by assuming that a good way of finding good
structures is to look amongst variations of the best performing
structures we have so far.

*1 Elements of Statistical Learning (Hastie, Tibshirani et al) will do
as a ref here.
*2 http://citeseer.ist.psu.edu/heckerman96tutorial.html
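
The kind of search described above can be sketched as a generic greedy
hill-climb: start from an incomplete model, score every one-step
variation, and move to the best one while the score improves. This is a
minimal sketch under stated assumptions, not the algorithm from either
reference; the GAINS table and the per-variable penalty are invented
stand-ins for a real evaluation function.

```python
def greedy_search(start, neighbours, score):
    """Hill-climb from start, moving to the best neighbour while it improves."""
    current, current_score = start, score(start)
    while True:
        candidates = [(score(m), m) for m in neighbours(current)]
        if not candidates:
            return current, current_score
        best_score, best = max(candidates, key=lambda c: c[0])
        if best_score <= current_score:
            return current, current_score  # local optimum reached
        current, current_score = best, best_score


# Hypothetical "improvement in fit" contributed by each variable.
GAINS = {"walk": 5.0, "wait": 3.0, "ride": 8.0, "weather": 0.2}
PENALTY = 1.0  # hypothetical complexity cost per included variable


def add_one_variable(model):
    """Neighbours of a model: the same model with one more variable."""
    return [model | {v} for v in GAINS if v not in model]


def toy_score(model):
    """Fit gained by the included variables minus a complexity penalty."""
    return sum(GAINS[v] for v in model) - PENALTY * len(model)


best_model, best_score = greedy_search(frozenset(), add_one_variable, toy_score)
print(sorted(best_model), best_score)
```

Because this toy score is additive, the greedy search happens to find
the global optimum. With a realistic evaluation function it can stop at
a local optimum, which is one reason such searches may fail on less
restricted model classes.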

In both these cases, there is a simple, fairly generic, search
algorithm guided by an evaluation function. These methods of "model
building" will work for some applications where simple "generate and
test" will be impractical. However, I am guessing that human
statisticians will still be able to solve problems requiring less
restricted models where these search algorithms will not work.

George mentioned that I was probably looking at this from an AI/Cog.
Sci. viewpoint. From my viewpoint as an AI weenie, I see AI and Cog.
Sci. as being quite different. The point of Cognitive Science is to
understand human intelligence and reasoning. IMHO the point of AI is to
build working "intelligent" systems, and whether these are similar to
human reasoning or not is a minor point. However, AI people and Cog.
Sci. people often end up doing very similar things because (as any
intro to knowledge engineering textbook will say) frequently the best
way for a computer program to approach the level of performance of a
human expert in some field is to understand their knowledge and
reasoning, and code this in a program.

This (finally) becomes relevant to the point of my original posting.
After George's comment, I'd like to rephrase my question to be: "Can
anyone recommend papers, books, or other resources that describe model
building, either from the point of view of somebody training to be a
statistician, or from the point of view of building systems that can
automate model building? Particularly for problems where the model
isn't just a matter of choosing a single distribution for some data
(where 'goodness of fit' tests should be a good method of model
selection), but where a more complex statistical model needs to be
built such that it is impossible to enumerate a small number of
potential models that are likely to include the best model." Returning
to the knowledge engineering viewpoint, it's often said that there is a
lot of knowledge that experts need that cannot be obtained from books.
The stats books I've seen seem to do a good job of describing the
suitable applications for method X. But, in real life, we have an
application Y for which we need to select a model. And, in situations
where the desired statistical model is likely to be complex, there must
be some sort of (possibly implicit) method by which an expert
statistician finds a good (or the best) model. It's this type of
knowledge I'd
be interested in reading about. There's no immediate project that I
wish to embark upon that requires this, but I think that in the long
run this is the kind of thing I want to be looking at.

I think that searching further on "model building" is still the way
ahead for me at the moment, but thought that I would reply here with
what I have here in case anyone's interested in commenting further. I
don't yet have a top 20 list of references, and it is likely to be some
time before I have them as I'm not sure what I'm searching for yet.
Both Graham's and George's answers were very useful in prompting me to
think more carefully about what it is that I'm asking.

Apologies if this is too pretentious a posting for a sci.math.*
newsgroup :-)

Cheers,

Ross-c

Aleks Jakulin

Dec 23, 2004, 11:13:38 AM

Ross asked:

> After George's comment, I'd like to rephrase my question to be: "Can
> anyone recommend papers, books, or other resources that describe model
> building, either from the point of view of somebody training to be a
> statistician, or from the point of view of building systems that can
> automate model building?

This question reminds me of someone asking for good books on
mathematics. :) Model building is a tremendously broad topic, with
several different schools of thought.

We've had a few discussions recently, though:

1. Why Occam's razor "If there are several hypotheses equally
consistent with the data, pick the simplest one". Malcolm Forster's
writing on the topic is quite lucid:
http://philosophy.wisc.edu/forster/ In a similar vein, there is a
whole book with "model selection" in the title. I've just read it, and
it's quite reasonable:
Model Selection and Multi-Model Inference
http://www.springeronline.com/sgw/cda/frontpage/0,11855,5-10129-22-2009034-0,00.html

2. However, Bayesians disagree with Occam's razor, and with model
selection in general:
http://www.stat.columbia.edu/~cook/movabletype/archives/2004/12/against_parsimo.html

3. On the Epicurean principle "It would be unscientific to choose an
arbitrary hypothesis if several are consistent with the data", and a
possible synthesis with Occam's razor:
http://www.stat.columbia.edu/~cook/movabletype/archives/2004/12/wacky_computer_1.html

--
mag. Aleks Jakulin
http://www.ailab.si/aleks/
Artificial Intelligence Laboratory,
Faculty of Computer and Information Science,
University of Ljubljana, Slovenia.

Herman Rubin

Dec 23, 2004, 11:45:51 AM

In article <cqeqrl$j5k$1...@planja.arnes.si>,

Aleks Jakulin <a_jakulin@@hotmail.com> wrote:
>Ross asked:
>> After George's comment, I'd like to rephrase my question to be: "Can
>> anyone recommend papers, books, or other resources that describe model
>> building, either from the point of view of somebody training to be a
>> statistician, or from the point of view of building systems that can
>> automate model building?

>This question reminds me of someone asking for good books on
>mathematics. :) Model building is a tremendously broad topic, with
>several different schools of thought.

>We've had a few discussions recently, though:

>1. Why Occam's razor "If there are several hypotheses equally
>consistent with the data, pick the simplest one". Malcolm Forster's
>writing on the topic is quite lucid:
>http://philosophy.wisc.edu/forster/ In a similar vein, there is a
>whole book with "model selection" in the title. I've just read it, and
>it's quite reasonable:
>Model Selection and Multi-Model Inference
>http://www.springeronline.com/sgw/cda/frontpage/0,11855,5-10129-22-2009034-0,00.html

>2. However, Bayesians disagree with Occam's razor, and with model
>selection in general:
>http://www.stat.columbia.edu/~cook/movabletype/archives/2004/12/against_parsimo.html

There is no basic disagreement between Bayesian inference
and Occam's razor, if the Bayesian inference is done
properly, not rashly. One will never get the true model,
nor could one use the true model if by some miracle it can
be found, so the real question is, what somewhat wrong
model minimizes the combined aspects of the loss? The
behavioral Bayes approach does not look at the probability
of the right action, nor does it even require prior
probability in the usual sense, but minimization of the
Bayes risk, and even this can only be approximated in
practice. Some of the aspects of risk are the complexity
of the model, the error in predicting from the model,
computational costs, whether the model can improve
understanding, etc.
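
As a hedged illustration of choosing a "somewhat wrong" model by
minimizing risk rather than by its probability of being right, the
sketch below picks the action with the smallest weighted average loss.
The states, weights, and loss values are invented for the example; in
practice the loss would combine model complexity, prediction error,
computational cost and so on, and could only be approximated.

```python
def bayes_action(actions, weights, loss):
    """Return (action minimizing weighted average loss, risk per action)."""
    risks = {
        a: sum(w * loss(a, state) for state, w in weights.items())
        for a in actions
    }
    return min(risks, key=risks.get), risks


# Hypothetical weights over states of nature (not necessarily a proper prior).
WEIGHTS = {"effect_small": 0.7, "effect_large": 0.3}

# Hypothetical losses: the complex model pays for its complexity in every
# state, while the simple model predicts badly when the effect is large.
LOSS_TABLE = {
    ("simple", "effect_small"): 1.0,
    ("simple", "effect_large"): 4.0,
    ("complex", "effect_small"): 2.5,
    ("complex", "effect_large"): 2.0,
}


def table_loss(action, state):
    """Look up the loss of using a model when a given state holds."""
    return LOSS_TABLE[(action, state)]


chosen, risks = bayes_action(["simple", "complex"], WEIGHTS, table_loss)
print(chosen, risks)
```

With these numbers the simple model is chosen even though it is the
worse model whenever the effect is large. No posterior probability of
either model being "true" is computed at any point.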

>3. On the Epicurean principle "It would be unscientific to choose an
>arbitrary hypothesis if several are consistent with the data", and a
>possible synthesis with Occam's razor:
>http://www.stat.columbia.edu/~cook/movabletype/archives/2004/12/wacky_computer_1.html

--
This address is for information only. I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
hru...@stat.purdue.edu Phone: (765)494-6054 FAX: (765)494-0558

George Kahrimanis

Dec 23, 2004, 7:02:33 PM

Herman Rubin wrote:
>There is no basic disagreement between Bayesian inference
>and Occam's razor, if the Bayesian inference is done
>properly, not rashly.

Being a nonexpert, I just suspect that the point of view in the
following article is in that ballpark.

"Statistical Inference, Occam's Razor and Statistical Mechanics
on The Space of Probability Distributions"
by Vijay Balasubramanian, in <http://www.arXiv.org>
(either in the "physics" or in the "math" archive)
and in other websites.

>>3. On the Epicurean principle "It would be unscientific to choose an
>>arbitrary hypothesis if several are consistent with the data", and a
>>possible synthesis with Occam's razor:
>>http://www.stat.columbia.edu/~cook/movabletype/archives/2004/12/wacky_computer_1.html

It seems to me that I am an Epicurean wrt this principle. Yet I
detest infinities; therefore I am happiest with a small bunch of
models, to minimize foreseeable trouble (all things considered).

(In a similar vein Aristotle instructs that maximum happiness
requires that you keep a very small number of friends.)

Many thanks for the references. I am sure that I shall enjoy
reading them, whether on model building or model selection.
~ George Kahrimanis

Herman Rubin

Dec 24, 2004, 3:26:02 PM

In article <1103846553.6...@f14g2000cwb.googlegroups.com>,

George Kahrimanis <anak...@hol.gr> wrote:
>Herman Rubin wrote:
>>There is no basic disagreement between Bayesian inference
>>and Occam's razor, if the Bayesian inference is done
>>properly, not rashly.

>Being a nonexpert, I just suspect that the point of view in the
>following article is in that ballpark.

>"Statistical Inference, Occam's Razor and Statistical Mechanics
>on The Space of Probability Distributions"
>by Vijay Balasubramanian, in <http://www.arXiv.org>
>(either in the "physics" or in the "math" archive)
>and in other websites.

>>>3. On the Epicurean principle "It would be unscientific to choose an
>>>arbitrary hypothesis if several are consistent with the data", and a
>>>possible synthesis with Occam's razor:
>>>http://www.stat.columbia.edu/~cook/movabletype/archives/2004/12/wacky_computer_1.html

>It seems to me that I am an Epicurean wrt this principle. Yet I
>detest infinities; therefore I am happiest with a small bunch of
>models, to minimize foreseeable trouble (all things considered).

One must distinguish between the model which is accepted
and the model which is true. Remember that the model must
explain the observations, not just what is happening in
nature. This MAY be simple enough that the difference
can be ignored, or simplified, or it may not.

There are many examples where a Bayesian approach can
be taken to the real problem, even though a posterior
distribution is beyond calculation. This may even
result in accepting a model which is usually rejected
on the grounds of statistical significance. It is
not necessary to compute posterior probabilities to
compute the decision to be taken, fortunately.

This is the position faced in the social sciences, in
the biological sciences, and even to some extent in the
physical sciences. Do not oversimplify the problem.