To use an informative prior for a multinomial distribution, does the
following make sense?
1. Start with a uniform prior.
2. Compute the posterior.
3. The posterior is filtered by a signal processing method. The
outcome of the method is used to build a new prior, which is NOT
uniform.
4. Go to step 2.
Thanks!
bahoo
If you want an informative prior, sit down _before you see the data_,
and work out what a sensible prior looks like. The whole point of a
prior is that it is the distribution you would ascribe to the parameters
prior to seeing the data.
Bob
--
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland
Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax: +358-9-191 51400
WWW: http://www.RNI.Helsinki.FI/~boh/
Blog: http://deepthoughtsandsilliness.blogspot.com/
Journal of Negative Results - EEB: www.jnr-eeb.org
Thanks Bob. But what is wrong with using the data twice? If every time
we are getting closer to the "truth" by cleverly using the data, isn't
that the right thing to do?
Flip a coin three times, and consider the posterior distribution if
two heads and a tail come up. Use the data again, and again, and
again, and again, and again, ... and you will eventually convince
yourself that the probability of heads must be extremely close to 2/3,
when a sample of three is insufficient to come to anything like that
conclusion.
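The coin example can be checked numerically. The following is a minimal sketch, assuming a conjugate Beta(1, 1) (uniform) prior on the probability of heads; the function names `reuse_data` and `beta_mean_sd` are made up for illustration. Each "pass" feeds the same two heads and one tail back in as if they were new observations.

```python
import math

def reuse_data(heads, tails, times, a=1.0, b=1.0):
    """Update a Beta(a, b) prior with the SAME counts `times` times over."""
    for _ in range(times):
        a += heads   # each pass pretends the heads are fresh observations
        b += tails   # likewise for the tails
    return a, b

def beta_mean_sd(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)

for times in (1, 10, 100, 1000):
    a, b = reuse_data(2, 1, times)
    mean, sd = beta_mean_sd(a, b)
    print(f"{times:5d} passes: Beta({a:.0f}, {b:.0f})  mean={mean:.4f}  sd={sd:.4f}")
```

Only the first pass is a legitimate posterior. The later ones have the same three flips behind them, yet the standard deviation keeps shrinking and the mean creeps toward 2/3.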
The prior should come from the user's ASSUMPTIONS,
and not from the distribution of the data. However,
there is the idea of robustness if one is unsure of
the prior, and this is a tricky subject. Robustness
is not merely a function of how close the assumed
prior is to the true prior as a distribution, and it
is highly asymmetric.
--
This address is for information only. I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
hru...@stat.purdue.edu Phone: (765)494-6054 FAX: (765)494-0558
However, if the prior comes from another model, is it OK then?
That is, one model uses the data to compute a posterior, then supplies
it as a prior to another model. This seems to avoid the situation
where a model uses the data twice.
How else should one combine experiments or observations?
Something which often helps is to use unnormalized
measures. Any conditional measure is a probability
measure, because one divides by the measure of the whole
space. But for most purposes, it is not necessary to make
the division, and for computing prior risks, it is
generally unwise, as one has to multiply by exactly what
one is using to divide.
There are other advantages to not normalizing, but one
should know what is being done. Also, for Bayes actions,
only the product of the prior and the loss matters, and
coherence arguments do not even require that the two be
separated.
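The point about normalization can be illustrated on a small grid. This is only a sketch under stated assumptions: the grid of parameter values, the flat prior weights, the squared-error loss, the coin data (two heads, one tail) borrowed from earlier in the thread, and the helper name `bayes_action` are all illustrative choices, not anything prescribed above.

```python
def bayes_action(actions, thetas, weights, loss):
    """Pick the action minimizing sum over theta of loss(a, theta) * weight(theta)."""
    def risk(a):
        return sum(loss(a, t) * w for t, w in zip(thetas, weights))
    return min(actions, key=risk)

# Grid of candidate values for the probability of heads (assumption).
thetas = [i / 100 for i in range(1, 100)]

prior = [1.0 for _ in thetas]                    # flat, deliberately unnormalized
likelihood = [t ** 2 * (1 - t) for t in thetas]  # two heads, one tail
unnormalized = [p * l for p, l in zip(prior, likelihood)]

# The same weights after dividing by their total.
total = sum(unnormalized)
normalized = [w / total for w in unnormalized]

def squared_error(a, t):
    return (a - t) ** 2

act_unnormalized = bayes_action(thetas, thetas, unnormalized, squared_error)
act_normalized = bayes_action(thetas, thetas, normalized, squared_error)
print(act_unnormalized, act_normalized)
```

Dividing by `total` rescales every action's risk by the same positive constant, so the minimizing action does not change. The action depends on the prior and the loss only through the products `loss(a, t) * w`.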
I wonder if the last post is being addressed to the original post or
perhaps a different thread?
Could you explain what the definition of a true prior is?
It's still using the data twice (if you're using the same data to inform
the second model).
Duncan
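For the multinomial case in the original question, the difference can be sketched with a conjugate Dirichlet prior. The counts below are invented for illustration, and `update` is a hypothetical helper; it relies only on the standard result that a Dirichlet prior plus multinomial counts gives a Dirichlet posterior with the counts added to the parameters.

```python
def update(alpha, counts):
    """Dirichlet(alpha) prior + multinomial counts -> Dirichlet posterior parameters."""
    return [a + c for a, c in zip(alpha, counts)]

uniform = [1, 1, 1]   # Dirichlet(1, 1, 1), the uniform prior
first = [4, 3, 1]     # made-up counts from one experiment
second = [5, 2, 3]    # made-up counts from an INDEPENDENT experiment

# Model A computes a posterior from the first data set.
post_a = update(uniform, first)

# Model B takes that posterior as its prior but sees the same data again.
same_data = update(post_a, first)

# Model B takes that posterior as its prior and sees new data.
new_data = update(post_a, second)

# Analysing both experiments together in one step.
pooled = update(uniform, [x + y for x, y in zip(first, second)])

print("same data twice:", same_data)   # behaves like 16 observations, not 8
print("new data:       ", new_data)
print("pooled analysis:", pooled)
```

Passing a posterior along as a prior is fine when the second model sees independent data; the result matches the pooled analysis. With the same data, the second model behaves as if the sample were twice as large as it really is, whichever model does the second update.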
The true prior is the one you should be using according to
your assumptions. Unfortunately, the computing power of
the human mind, or in fact of the largest computers, is
usually quite incapable of calculating this. So one has to
use an assumed prior; the same holds for the loss function
in general. In addition, the cost of computing can be
quite high for a "complicated" loss-prior combination.
One cannot get around this merely by adding the cost of
computing to the loss, as the cost of computing the
cost of computing is usually FAR greater than the cost of
computing. But one might be able to find a convenient
prior for which the prior risk can be analyzed, and shown
to be close to what could be obtained. That is, a good
approximation can be shown in many cases.