Soft evidence

Bill Raynor

Apr 4, 2013, 10:38:32 AM
to tetrad-us...@googlegroups.com
Hello Joe,
I'd like to revisit a question that we weren't able to cover in last March's tutorial (CCASA 2012) on what I think is called "soft" or "virtual" evidence. I mentioned this and interventions at the same time. You and Richard handled the inefficient intervention question nicely, and we put the soft evidence issue off till later.
 
If you recall, my data comes from a (pooled) sample and I would like to project it to a different population where I have marginal information on some of the variables.
Suppose, for example, the model is entirely discrete, with binary or ternary variables, and the base population is centered at, say, 50/50 for the binary variables and 40/20/40 for the ternary ones. Example variables could be Gender (binary) and Education, Race, or Age (ternary). The unobserved variables are habits, practices, and outcomes.
 
I would like to "move" the graph based on marginal information I have on several nodes in the network and determine the marginals on other nodes for which I do not have evidence. How do I do that in TETRAD? A brute-force statistician's approach is to do what Deming suggested back in the '40s: take the raw data and rake it (run an IPF) until convergence, then push the result through TETRAD to study interventions and influence. I would use the original network structure discovered on the original, much larger sample, and possibly treat the original parameters as a prior...
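For readers who have not seen raking, here is a minimal sketch of iterative proportional fitting (IPF) on a two-way table. It is only an illustration of the idea described above, not anything TETRAD provides; the function name `rake_table`, the toy Gender-by-Education table, and the target marginals are all made up for the example.

```python
def rake_table(table, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Rake a 2-way table so its row and column sums match the targets.

    Assumes every cell is positive and both target vectors sum to the
    same total; otherwise IPF may not converge.
    """
    fitted = [[float(v) for v in row] for row in table]
    for _ in range(max_iter):
        # Step 1: rescale each row to its target total.
        for i, target in enumerate(row_targets):
            total = sum(fitted[i])
            if total > 0:
                fitted[i] = [v * target / total for v in fitted[i]]
        # Step 2: rescale each column to its target total.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in fitted)
            if total > 0:
                for row in fitted:
                    row[j] *= target / total
        # Columns are exact after step 2, so only the rows need checking.
        gap = max(abs(sum(fitted[i]) - t) for i, t in enumerate(row_targets))
        if gap < tol:
            break
    return fitted


# Hypothetical base table: Gender (rows, 50/50) by Education (cols, 40/20/40).
base = [[0.25, 0.10, 0.15],
        [0.15, 0.10, 0.25]]
# Hypothetical target marginals for the new population.
raked = rake_table(base, row_targets=[0.6, 0.4], col_targets=[0.3, 0.5, 0.2])
for row in raked:
    print([round(v, 4) for v in row])
```

The raked table keeps the association structure (odds ratios) of the base table while matching the new marginals, which is why it is a natural way to carry a fitted sample over to a different population.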
 
Is there a "Causal Network" way to do this?
 
Thanks
Bill

cg...@andrew.cmu.edu

Apr 4, 2013, 11:39:34 AM
to tetrad-us...@googlegroups.com
Bill,

If your marginal data is a single value for each of a set of nodes and you have a
prior model (graph and joint distribution), then just use the update program in the
update box to give you the conditional marginal distributions for nodes not in
evidence. If your prior model does not include a distribution for some variable not
in evidence, but does include its graphical relations with variables in evidence,
then use the estimator function in the estimator box.

This stuff should maybe be more convenient in TETRAD--for example permitting you to
update on a bunch of values for evidence variables.

Hope I understood the question.

Clark

Bill Raynor

Apr 4, 2013, 2:16:48 PM
to tetrad-us...@googlegroups.com
Clark,
Thanks for your reply. My understanding of the updater box was that it allowed you to add a single (possibly multivariate) datum to the model and to study hard interventions. Looking at the example on page 72 of the new manual, one could set X1 to 2, X2 to 0 to represent a single case where X1 is in state 2 and X2 is in state 0.
 
The case I'm looking at is one where the large sample used to fit the model has X1 = (0, 1, 2) with probabilities (0.5, 0.25, 0.25), and I wish to specialize that to a new sample that will have an X1 marginal of, say, (0.3, 0.5, 0.2). Likewise, I have (independent) marginal information that the new sample should have, on another variable, a marginal of, say, (0.1, 0.2, 0.3, 0.4). If this were a sample survey, the main sample could come from a number of regions, and the "new" data would be a small sample from some small area. The raking would rescale the interior of the larger table to match the marginals of the small area, yielding a set of weights to apply to the data to get the small-area estimates. In my case, I want to estimate the effects of manipulations on this reweighted sample.
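The reweighting described above can be sketched at the record level, so that each case in the large sample gets a weight. This is a rough illustration under stated assumptions: the records are synthetic, the four-state variable is given the hypothetical name `Z` (the post does not name it), and `rake_weights` and `weighted_marginal` are helper names invented for this example.

```python
def weighted_marginal(records, weights, var):
    """Weighted share of each state of `var` across the records."""
    shares = {}
    for rec, w in zip(records, weights):
        shares[rec[var]] = shares.get(rec[var], 0.0) + w
    return shares


def rake_weights(records, targets, tol=1e-9, max_iter=500):
    """Return one weight per record so the weighted marginals hit `targets`.

    `targets` maps a variable name to {state: target probability}.
    Assumes every state seen in the data appears in the target and every
    state with a positive target occurs in the data.
    """
    n = len(records)
    weights = [1.0 / n] * n
    for _ in range(max_iter):
        worst = 0.0
        for var, target in targets.items():
            current = weighted_marginal(records, weights, var)
            gap = max(abs(current.get(s, 0.0) - p) for s, p in target.items())
            worst = max(worst, gap)
            # Scale each record by target share / current share of its state.
            weights = [w * target[rec[var]] / current[rec[var]]
                       for rec, w in zip(records, weights)]
        if worst < tol:
            break
    return weights


# Synthetic "large sample": every (X1, Z) combination occurs at least once.
records = [{"X1": a, "Z": b}
           for a in range(3) for b in range(4)
           for _ in range(1 + (a + b) % 3)]
targets = {"X1": {0: 0.3, 1: 0.5, 2: 0.2},
           "Z": {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}}
weights = rake_weights(records, targets)
print(weighted_marginal(records, weights, "X1"))
print(weighted_marginal(records, weights, "Z"))
```

The resulting weights could then be used to build a weighted or resampled data set for re-estimating the parameters on the fixed graph; whether a given tool accepts case weights directly is something to check rather than assume.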
 
So I think the difference is that the Updater box allows me to enter a single case, whereas I am looking to match marginal distributions.
Is this distinction correct?
 
Bill

cg...@andrew.cmu.edu

Apr 4, 2013, 3:32:46 PM
to tetrad-us...@googlegroups.com
Correct. Updater would have to be reprogrammed either to take a whole bunch of
values or a probability distribution.

Clark
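As a sketch of what updating on a probability distribution rather than a single value could look like, one common reading of soft evidence is Jeffrey's rule: P'(Y) = sum over x of q(x) * P(Y | X = x), where q is the new marginal on X. The code below is an assumption-laden illustration and not TETRAD's Updater; the function name `jeffrey_update` and the joint table are invented, with the X1 prior (0.5, 0.25, 0.25) and new marginal (0.3, 0.5, 0.2) borrowed from the example in the thread.

```python
def jeffrey_update(joint, new_x_marginal):
    """Apply Jeffrey's rule to a joint table joint[x][y] = P(X=x, Y=y).

    Returns the updated marginal of Y when the marginal of X is replaced
    by `new_x_marginal`, holding P(Y | X) fixed.
    Assumes every state of X has positive prior probability.
    """
    n_y = len(joint[0])
    updated = [0.0] * n_y
    for x, q in enumerate(new_x_marginal):
        p_x = sum(joint[x])  # prior P(X = x)
        for y in range(n_y):
            updated[y] += q * joint[x][y] / p_x  # q(x) * P(Y=y | X=x)
    return updated


# Hypothetical joint of X1 (3 states, prior 0.5/0.25/0.25) and a binary Y.
joint = [[0.400, 0.100],
         [0.100, 0.150],
         [0.125, 0.125]]
updated = jeffrey_update(joint, [0.3, 0.5, 0.2])
print([round(p, 4) for p in updated])
```

With a constraint on only one variable, a single raking step gives the same answer as Jeffrey's rule; the two differ once several marginals have to be matched at the same time, which is where iterating becomes necessary.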

Bill Raynor

Apr 4, 2013, 5:03:08 PM
to tetrad-us...@googlegroups.com
Clark,
Thanks. So I'll do it using the survey raking approach. I found a reference, Peng et al. (2012), "Bayesian Network Revision with Probabilistic Constraints," in Intl. J. of Uncertainty, Fuzziness and Knowledge-Based Systems, that seems to address this too, using some variants of raking.
 
Bill