Hi, I'm working on a project for implementing Bayesian networks on financial data
My goal is to create a framework where you can construct a directed graph (the Bayesian network) and fit it to historical data in the form of events. Think cloudy/sprinkler/wet-grass. Rather than defining the probabilities up front, each node would use its last posterior as the prior for the next update, similar in spirit to MCMC but driven by real observations rather than generated samples, hopefully converging when given informative conditions.
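To picture the posterior-as-prior loop for a single Boolean node, here is a minimal sketch using a conjugate Beta-Bernoulli model (the function name and the observation sequence are illustrative, not from my POC):

```python
def update(alpha: float, beta: float, observed: bool) -> tuple:
    """Fold one real observation into the Beta posterior; the
    returned (alpha, beta) acts as the prior for the next one."""
    return (alpha + 1, beta) if observed else (alpha, beta + 1)

alpha, beta = 1.0, 1.0  # flat Beta(1, 1) prior
for obs in [True, True, False, True]:
    alpha, beta = update(alpha, beta, obs)

p_true = alpha / (alpha + beta)  # posterior mean: 4 / 6
```

With enough informative observations the posterior mean settles toward the empirical frequency, which is the convergence behaviour I'm hoping for.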
I came across TFP, and it seems like a perfect fit for implementing this, right?
I'm not sure what the best way is to model this. My POC uses an event/node class, which contains a CPT for itself and its parents. Each observation adds a count to the matching row and updates the probability based on the counts.
Here is a POC of what I want and would like to convert to TFP: a truth table with 2^(n_parents+1) rows, denoting all possible states of the parents and the node itself.
     par1   par2   self  count      prob
0    True   True   True   5028  0.900591
1    True   True  False    555  0.099409
2    True  False   True    729  0.931034
3    True  False  False     54  0.068966
4   False   True   True     62  0.089985
5   False   True  False    627  0.910015
6   False  False   True    620  0.145677
7   False  False  False   3636  0.854323
....
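For reference, the counting POC boils down to something like the sketch below (class and method names are my own shorthand, not the original code):

```python
import itertools

class Node:
    """One CPT over all Boolean states of (parents..., self),
    with a count per row, as in the truth table above."""

    def __init__(self, name, parents):
        self.name = name
        self.parents = parents
        # 2^(n_parents + 1) rows covering every Boolean assignment
        self.states = list(itertools.product([True, False],
                                             repeat=len(parents) + 1))
        self.counts = [0] * len(self.states)

    def observe(self, parent_values, self_value):
        # Each real observation adds a count to the matching row
        row = self.states.index(tuple(parent_values) + (self_value,))
        self.counts[row] += 1

    def prob(self, parent_values, self_value):
        # P(self | parents): normalise counts within one parent configuration
        num = self.counts[self.states.index(tuple(parent_values) + (self_value,))]
        alt = self.counts[self.states.index(tuple(parent_values) + (not self_value,))]
        return num / (num + alt) if num + alt else 0.5
```

For example, observing wet grass three times and dry grass once under (rain=True, sprinkler=True) gives P(wet | rain, sprinkler) = 0.75.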
Can someone give me some guidance on achieving this within TFP?
Should I create a joint distribution for each node, or a batch of Bernoullis?
My goal is to cut down on the number of parameters (as opposed to a joint distribution over all features, or a fully connected NN) by constraining the network with known dependencies in the environment.
I hope I was clear, thanks in advance.
If the TF team is somewhat interested, I'd like to stay in contact.
Let me know here or at
Thanks!