> On Jun 18, 2015, at 10:03 PM, Erikson Kaszubowski <
erik...@gmail.com> wrote:
>
> Dear Bob,
>
> I'm interested in querying a Bayesian network in the expert-systems sense (like the 'Asia' BUGS model, HUGIN-style, or the classical Sprinkler-Rain graphical model), as discussed in Koller and Friedman's "Probabilistic Graphical Models", chapter 9.
> In my case, instead of using conditional distribution tables, as usual in those models, I'm modeling conditional distributions as GLMs, as suggested by the 'additive bayesian network' package.
I'm afraid I don't know any of this material.
> I'm using Stan to represent and fit the parameters of the DAG discovered by the 'abn' package. I want to use the fitted parameters to query the marginal probabilities of specific nodes, given the evidence in other nodes.
> E.g., using a sub-graph from the attached model:
>
> Speciesversicolor(SVE) -> Petal.Width(PW) <- Speciesvirginica(SVI)
I don't understand this notation. What does Speciesversicolor() do
as a wrapper?
> In this simple sub-graph, the joint distribution is given by:
>
> P(SVE, PW, SVI) = P(SVE)P(SVI)P(PW|SVE,SVI)
> If I have a piece of evidence about petal width, I would like to know the updated probabilities of the parent nodes. Suppose I observe a petal width at the mean value (0 for standardized data). What is the probability it's an Iris versicolor?
I'm a bit confused, though, because SVE and SVI look like
species indicators. How can they both be variables?
I'd think there'd be one boolean variable z[i] that would
indicate with z[i] = 0 if item i is plant type 1 and z[i] = 1
if item i is plant type 2. Then you'd condition things like
the observables on the type of plant. Then the problem's
a simple classification.
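With one indicator and the observable conditioned on it, the posterior over the plant type is just Bayes' rule. Here's a minimal sketch with made-up means, scales, and prior (none of these numbers come from the thread):

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of Normal(mu, sigma) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def p_type2_given_width(w, prior=0.5, mu=(-0.8, 1.1), sigma=(0.3, 0.3)):
    """P(z = 1 | petal width = w) by Bayes' rule, modeling the petal
    width as Normal(mu[z], sigma[z]) given the plant type z.
    All numeric values here are illustrative placeholders."""
    num0 = (1 - prior) * normal_pdf(w, mu[0], sigma[0])  # z = 0 term
    num1 = prior * normal_pdf(w, mu[1], sigma[1])        # z = 1 term
    return num1 / (num0 + num1)
```

A width near mu[1] pushes the posterior toward type 2; at the midpoint between the two means (with equal scales and a flat prior), the posterior is exactly 1/2.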
> The priors on SVE and SVI
I don't know what that means, either.
> are given by the first-level logistic regressions. As parent nodes, those regressions have just an intercept: -.49 and -.72 in one version of the fitted model, or P(SVI=1) = inv_logit(-.49) = 0.38; P(SVE=0) = 1 - inv_logit(-.72) = 0.67.
> The conditional probability for the child node is given by the linear regression PW = Intercept + B1*SVE + B2*SVI + error, so: P(PW=0|SVE=0,SVI=1) = dnorm(0, mean=-1.26 + 2.37*1 + 1.44*0, sd=0.27). So the joint is:
>
> P(SVE=1, SVI, PW=0) = (0.38 * 0.67 * 0.00031) + (0.38 * 0.33 * 0) ### Marginalizing over SVI, normalize for P(SVE=1|PW=0)
> I don't know if I made it clearer now.
A bit, but I'm still very confused. Is it consistent to have
SVE=1 and SVI=1? or SVE=0 and SVI=0? And what's this B1 and B2 variable
that gets introduced?
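For what it's worth, the individual numbers in the quoted calculation do check out arithmetically; here's a quick sketch of them (inv_logit and the normal density written out by hand):

```python
import math

def inv_logit(x):
    """Logistic function: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def normal_pdf(x, mu, sigma):
    """Density of Normal(mu, sigma) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

# Intercept-only logistic regressions for the parent nodes:
p_svi = inv_logit(-0.49)         # ~0.38
p_sve0 = 1.0 - inv_logit(-0.72)  # ~0.67

# Linear-regression likelihood for the child node evaluated at PW = 0:
dens = normal_pdf(0.0, -1.26 + 2.37 * 1 + 1.44 * 0, 0.27)  # ~0.00031
```

So the confusion is about what the quantities mean (which indicator multiplies which coefficient), not about the numbers themselves.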
> Thanks for the modeling tips! I'm using exp(log()) because I'm summing over various log-probabilities and only then exponentiating. I want to compute the marginals for the nodes that represent the species, using the evidence from the other nodes.
No matter what your intent, you never ever ever want to use exp(log())
or log(exp()) or any other pair of inverse functions applied to each other.
It'll only hurt your speed and arithmetic precision and robustness.
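The usual pattern is to stay on the log scale the whole way: sum log-probabilities and only exponentiate (if at all) at the very end. A quick illustration of why, using nothing beyond the standard library:

```python
import math

probs = [1e-5] * 100  # 100 independent event probabilities

# Multiplying on the natural scale underflows double precision:
direct = math.prod(probs)  # 1e-500 is far below the smallest double, so this is 0.0

# Summing on the log scale keeps the value representable:
log_total = sum(math.log(p) for p in probs)  # 100 * log(1e-5) ~ -1151.3
```

When a marginalization genuinely requires adding probabilities, the numerically stable tool is a log-sum-exp, which Stan provides as the built-in log_sum_exp().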
> I understand that it would be easier to just use a classifier for this purpose (in this example, a simple linear discriminant analysis or a naive Bayes), but I'm doing it this way to understand the 'additive bayesian networks' better.
I just meant build the classifier in Stan. I think that's
what you're trying to do; we're just not communicating notationally.
What I want to see is:
data y: variable types and dimensions and constraints
parameters theta: ditto
joint probability function: p(y, theta)
At that point, all the inference you want to do is turn-the-crank.
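Concretely, those three ingredients reduce to one function, log p(y, theta); in Stan the model block plays exactly this role. Here's a plain-Python stand-in with a made-up structure (one Bernoulli species indicator z and one real-valued measurement y; none of this is the model from the thread):

```python
import math

def log_joint(y, z, theta):
    """log p(y, z | theta) for a toy model:
         z ~ Bernoulli(theta['phi'])              # species indicator
         y | z ~ Normal(theta['mu'][z], theta['sigma'])
       Flat (improper) priors on theta are assumed for this sketch."""
    phi, mu, sigma = theta['phi'], theta['mu'], theta['sigma']
    lp = math.log(phi) if z == 1 else math.log(1.0 - phi)
    lp += -0.5 * ((y - mu[z]) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))
    return lp
```

Once this function is written down, conditioning on an observed y and summing out z to get P(z = 1 | y) is mechanical; summing exp(log_joint(y, z, theta)) over z recovers the marginal density of y.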
- Bob