Apologies in advance if a post like this is not welcome on this forum. I wanted to thank the developers of tensorflow-probability for the great tool that they have built. My team leaned on TFP as the workhorse behind our entry in the CMS AI Challenge, which is about predicting hospital readmission and other adverse events using Medicare billing data. Over the past year or so, the rest of the team and I have had a bit of a crash course in using TFP and TensorFlow. I thought I would share our experience, as it might be useful as a case study.
First, some background: our team is composed mostly of people who are most accurately classified as applied mathematicians. We do not come from the machine learning world, though most of us had worked on data-driven projects in the past. Over a year ago, we started putting together an entry for the CMS AI Challenge based on replicating some properties of artificial neural networks in more-structured Bayesian hierarchical models, with the goal of building expressive yet fully interpretable models. Our proposal was one of the 25 chosen to compete in the contest. Of the 25, our team was the only non-entity (mederrata, we still only barely exist).
Our solution necessitated a scalable and flexible modeling framework. I'm a big fan of Stan personally, but I didn't think it was flexible enough for our purposes. Additionally, Stan's implementation of ADVI is a bit of an uncustomizable black box. Finally, I wanted to find a solution that was fully Pythonic. So, I went looking for other frameworks and came across PyMC and eventually TFP because of PyMC4.
So, we adopted TFP for our project, and there were some hiccups along the way. A lot of our difficulties related to the changeover between TF 1.x and 2.x and to the TFP/TF documentation being sort of a mess. However, I greatly enjoy the elegance of the TFP approach (despite it being quite verbose). I also like how easy it is to customize variational approximations by simply creating a JointDistribution object of any desired structure.
I like TFP overall, but I think it is worthwhile to comment on where things got hairy:
Saving models: This was probably the biggest headache for us. At the commencement of the contest we had to send a saved model to the organizers. We thought it would be as easy as inheriting from tf.Module and using TensorFlow's SavedModel capabilities. We were wrong. We relied heavily on JointDistributionNamed objects, and these objects completely refused to be saved. To get our model object to save at all, I had to exclude every JointDistributionNamed object using NoDependency, which in essence meant that our model wasn't saved. We started developing the project before JointDistributionCoroutine was included in TFP, so I have no idea whether it has the same issue. Additionally, other non-TensorFlow attributes, such as lists and dictionaries holding model attributes, didn't get saved by SavedModel. Eventually, we developed our own serialization using pickle.
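For anyone hitting the same wall, here is a minimal sketch of the general pattern behind a pickle-based workaround. It is not our actual code: the class name, the attribute names, and the toy scoring function are all made up for illustration, and it uses only the standard library. The idea is to pickle nothing but plain Python containers (parameter values and metadata) and to rebuild anything that refuses to serialize when the model is loaded.

```python
import pickle


class ReadmissionModel:
    """Hypothetical stand-in for a model class (illustrative names only)."""

    def __init__(self, params, metadata):
        # Plain Python containers. In a TF model these would be variable
        # values exported to lists or arrays (an assumption of this sketch).
        self.params = {k: list(v) for k, v in params.items()}
        self.metadata = dict(metadata)
        self._build()

    def _build(self):
        # Rebuild whatever cannot be serialized directly. A closure stands
        # in here for objects such as a JointDistributionNamed.
        loc = self.params["loc"]
        self.score = lambda x: -0.5 * sum((xi - m) ** 2 for xi, m in zip(x, loc))

    def to_bytes(self):
        # Only builtin types go into the pickle, never the object itself.
        state = {"params": self.params, "metadata": self.metadata}
        return pickle.dumps(state)

    @classmethod
    def from_bytes(cls, blob):
        # The constructor re-runs _build(), recreating the unpicklable parts.
        state = pickle.loads(blob)
        return cls(state["params"], state["metadata"])


model = ReadmissionModel({"loc": [1.0, 2.0]}, {"feature_names": ["age", "visits"]})
restored = ReadmissionModel.from_bytes(model.to_bytes())
print(restored.metadata["feature_names"], restored.score([2.0, 2.0]))
```

Keeping the pickle limited to builtin types means the file does not depend on the model class being importable under the same module path at load time, which is one of the usual pickle pitfalls.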
Distributed Programming: We were able to get this working, though we had to modify the TransformedVariable class to pass in the name scope used for MirroredStrategy. After training, we still ran into the saved model issue.
Some basic operations such as bucketizing: The issue here is with the TF API documentation and not TFP. We often had to scour StackExchange to figure out how to do various things. Many common operations are either missing from the official documentation or not easily accessible within the API (for instance, having to use things from math_ops).
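To make "bucketizing" concrete for readers who have not needed it: the operation maps each value to the index of the interval it falls into, given a sorted list of boundaries. Below is a small pure-Python sketch of that behavior using the standard library's bisect module. The function name and the example data are illustrative. I believe TensorFlow exposes a comparable op as tf.raw_ops.Bucketize, but treat that as an assumption and check the API docs for your version.

```python
from bisect import bisect_right


def bucketize(values, boundaries):
    """Return, for each value, the index of the bucket it falls into.

    With sorted boundaries [b0, b1, ...], bucket 0 is (-inf, b0),
    bucket 1 is [b0, b1), and the last bucket is [b_last, +inf).
    """
    boundaries = list(boundaries)
    if boundaries != sorted(boundaries):
        raise ValueError("boundaries must be sorted in ascending order")
    # bisect_right counts how many boundaries are <= v, which is exactly
    # the bucket index under the convention in the docstring.
    return [bisect_right(boundaries, v) for v in values]


ages = [17, 18, 42, 65, 90]
print(bucketize(ages, [18, 45, 65]))
```

A value equal to a boundary lands in the bucket to its right under this convention, so 18 goes to bucket 1 and 65 goes to bucket 3.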
Anyways, I am very thankful to the developers and also to the user community who very quickly answered questions that I had posted on this forum. We couldn't have developed our contest entry without this tool and I look forward to using TFP in the future.
--
You received this message because you are subscribed to the Google Groups "TensorFlow Probability" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tfprobabilit...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/tfprobability/6be10f50-d48c-4bfd-884f-f891343aeab6o%40tensorflow.org.
Paige Bailey, Product Manager (TensorFlow), @DynamicWebPaige
+David Smalling +Paige Bailey +Colin Carroll

Super interesting. David or Paige or Colin, any thoughts on how we can best take advantage of Josh's awesome feedback and work?

rif