[off issue and on dev list]
I'm trying to stop the GitHub issues from straying off into
long philosophical/mathematical discussions.
My understanding (pending revision, as always) is that the frequentists
insist on p(y ; theta) specifically because they don't treat theta as a
random variable and they are perfectly happy to write p(y | theta)
if both y and theta are random variables. Actually, the careful ones would
write
p_{Y | Theta}(y | theta)
when Y and Theta are random variables --- y and theta are just
arbitrary locally scoped variables written with the same sloppy
notation for binding as mathematicians use everywhere else when
talking about functions (boy were my eyes ever opened about that
dx notation in calc when I learned Mathematica and saw it spelled
out in lambda calculus). Many Bayesians are even sloppier (or more
concise if you want to put a more positive spin on it),
writing p(y | theta) and leaving the actual random
variables Y and Theta unspoken, or sometimes overloaded and written
as y and theta. That's why it's so hard to write the usual CDF
F_Y(y) = Pr[Y <= y]
in BDA-style notation. I found this all very confusing when first trying
to understand stats because the intro math stats books never precisely
define random variables.
My understanding (also pending revision) is that a (real-valued)
random variable Y is a total function Y : Omega -> R, where Omega
is the sample space and R is the set of real numbers. I don't
see how context of use changes the notion of what a random variable
is.
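
To make that definition concrete, here's a minimal sketch on a finite
sample space. The two-coin-flip setup and the names omega, prob, Y, and
cdf are just my assumptions for illustration, not anything from Stan or BDA.
The point is that Y is an ordinary function on Omega, and the y in
F_Y(y) is just a bound argument.

```python
from fractions import Fraction

# Hypothetical sample space Omega: outcomes of two fair coin flips.
omega = [("H", "H"), ("H", "T"), ("T", "H"), ("T", "T")]

# Probability measure on Omega (uniform, by assumption).
prob = {w: Fraction(1, 4) for w in omega}


def Y(w):
    """Random variable Y : Omega -> R, here the number of heads in w."""
    return sum(1 for flip in w if flip == "H")


def cdf(rv, y):
    """F_rv(y) = Pr[rv <= y], summing the measure over {w : rv(w) <= y}."""
    return sum(prob[w] for w in omega if rv(w) <= y)


# The uppercase Y is the function; the lowercase y is just an argument.
for y in (0, 1, 2):
    print(y, cdf(Y, y))
```

Nothing about Y changes depending on which side of a conditioning bar it
shows up on; it's the same function either way.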
- Bob
> On Aug 22, 2016, at 8:50 PM, Michael Betancourt <
notifi...@github.com> wrote:
>
> That’s not quite the argument.
>
> The semicolon is purely frequentist and is meant to avoid
> interpreting the likelihood as a conditional probability distribution,
> and I agree that this has no place in Bayesian inference (in some
> sense the difference between | and ; distills the fundamental
> differences between frequentist and Bayesian inference).
>
> Instead the difference is in the definition and use of a random
> variable. When using a conditional probability distribution, any
> variables to the left of the | are having their distribution _defined_
> by the conditional distribution. Any variables to the right of the
> |, however, are just being queried by value independent of their
> distribution. Sequential generative modeling with conditional
> probability distributions is just a way to isolate how a random
> variable is defined and how it is used to influence other random
> variables.
>
> In other words, the interpretation of what is a random variable
> and what isn’t is being locally scoped just to the definition of
> that conditional probability distribution. I understand how this
> would be confusing to users not familiar with probability, and
> even to those who are as the error messages would be
> referring to this very local scope and not the global scope of
> the program where everything is a random variable. It was just
> a suggestion. Ultimately anything we choose is either going to
> be ambiguous at some level, inconsistent with the actual math,
> or confusing to most users.
>
> On Aug 22, 2016, at 11:03 AM, Bob Carpenter <
notifi...@github.com> wrote:
>
> > Understood, but it's the "anymore" that's problematic. They are
> > random variables from the Stan program's perspective. And if you
> > talk to Andrew (or read BDA), then everything's a random variable
> > in Bayesian stats, even the predictors in a regression that come in
> > as data and get no distribution. That's why he refuses to let us use
> > the semicolon notation (the frequentists use that precisely to distinguish
> > the random variables from other variables). Going against BDA, even with
> > technical correctness on our side, seems like a mug's game --- it'll just
> > confuse users and annoy Andrew without much upside.