Could you please explain why naive Bayes is not a Bayesian model?
On Thursday, 11 April 2013 at 04:07:19 UTC+2, John Salvatier wrote: Also, it would be somewhat non-trivial because naive Bayes is not a Bayesian model (despite the name, IIRC), so the implementation would not be completely straightforward.
However, you might try doing logistic regression, which is somewhat similar to naive Bayes and which is a Bayesian model.
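To illustrate the distinction: a Bayesian model places a prior on its parameters and infers a posterior over them. Below is a minimal sketch of Bayesian logistic regression using plain NumPy and a hand-rolled random-walk Metropolis sampler (this is not the PyMC API; the toy data, prior scale, and step size are all made-up choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (made up): one feature, true intercept 0, true slope 2
X = rng.normal(size=100)
y_obs = (rng.random(100) < 1.0 / (1.0 + np.exp(-2.0 * X))).astype(float)

def log_posterior(b0, b1):
    """Log prior (Normal(0, 10) on each coefficient) plus Bernoulli log likelihood."""
    logit = b0 + b1 * X
    log_lik = np.sum(y_obs * logit - np.log1p(np.exp(logit)))
    log_prior = -(b0**2 + b1**2) / (2 * 10.0**2)
    return log_lik + log_prior

# Random-walk Metropolis over (b0, b1)
samples = []
b = np.zeros(2)
lp = log_posterior(*b)
for _ in range(5000):
    prop = b + rng.normal(scale=0.3, size=2)
    lp_prop = log_posterior(*prop)
    if np.log(rng.random()) < lp_prop - lp:  # accept with prob min(1, ratio)
        b, lp = prop, lp_prop
    samples.append(b.copy())

posterior = np.array(samples[1000:])  # discard burn-in
b1_mean = posterior[:, 1].mean()      # posterior mean of the slope
```

The point is that the slope gets a full posterior distribution rather than a single fitted value; naive Bayes, by contrast, just plugs in frequency estimates of its conditional probabilities, with no prior or posterior in sight.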
As Chris said, PyMC is not necessary for naïve Bayes, but I agree with Tommy that this is no reason not to use it. Here is a simple model that does the trick:
import pymc as mc
from numpy import empty

p = data['X_train'].shape[1]  # number of binary features
n_test = len(data['X_test'])

# Unknown class label for each test point, with a uniform prior
y = [mc.Bernoulli('y_%d' % i, .5) for i in range(n_test)]

alpha = empty(p)
beta = empty(p)
for j in range(p):
    # alpha[j] is Pr[X_j = 1 | y = 1] in the training data
    alpha[j] = (data['X_train'][:, j] * data['y_train']).sum() / data['y_train'].sum()
    # beta[j] is Pr[X_j = 1 | y = 0] in the training data
    beta[j] = (data['X_train'][:, j] * (1 - data['y_train'])).sum() / (1 - data['y_train']).sum()

# Observed test features, conditionally independent given the class label
X = [mc.Bernoulli('X_%d_%d' % (i, j), alpha[j] * y[i] + beta[j] * (1 - y[i]),
                  value=data['X_test'][i, j], observed=True)
     for i in range(n_test) for j in range(p)]
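For comparison, the same Bernoulli naive Bayes prediction can be computed in closed form with NumPy alone, which is why, as Chris noted, PyMC isn't strictly required. The names `alpha`, `beta`, and `p` mirror the model above; the training and test arrays here are a made-up toy example:

```python
import numpy as np

# Toy data (made up for illustration): 6 training samples, p = 2 binary features
X_train = np.array([[1, 0], [1, 1], [1, 0], [0, 1], [0, 0], [0, 1]])
y_train = np.array([1, 1, 1, 0, 0, 0])
X_test = np.array([[1, 0], [0, 1]])

# Class-conditional feature frequencies, as in the model above:
# alpha[j] = Pr[X_j = 1 | y = 1], beta[j] = Pr[X_j = 1 | y = 0]
alpha = (X_train * y_train[:, None]).sum(axis=0) / y_train.sum()
beta = (X_train * (1 - y_train)[:, None]).sum(axis=0) / (1 - y_train).sum()

def predict_proba(x, prior=0.5):
    """Posterior Pr[y = 1 | x] under the naive Bayes factorization.

    Note: raw frequencies can be exactly 0 or 1; in practice you would
    add Laplace smoothing to avoid degenerate probabilities.
    """
    like1 = prior * np.prod(alpha**x * (1 - alpha)**(1 - x))
    like0 = (1 - prior) * np.prod(beta**x * (1 - beta)**(1 - x))
    return like1 / (like1 + like0)

probs = np.array([predict_proba(x) for x in X_test])
```

The MCMC version gives the same point predictions in expectation; its advantage is only that the Bernoulli `y` nodes slot into a larger PyMC graph if you later extend the model.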
And here is an IPython notebook that takes it for a test drive:
Abraham D. Flaxman
Institute for Health Metrics and Evaluation | University of Washington
2301 5th Avenue, Suite 600 | Seattle, WA 98121| USA
Tel: +1-206-897-2800 | Fax: +1-206-897-2899
Thank you for your answer Chris.
I realize that there are other tools that are much easier to use for implementing NB. My idea, however, was to implement it partly as an exercise and partly as a baseline model to compare future models against. I intend to improve the model later on.
Reading the PyMC tutorial and skimming the wiki, I still don't understand how to "bind" observations together. Do I need to make something like an observation function that takes a random number representing a sample and sets the individual attribute-observation variables? Is there a smarter way?