Hi all,
My name is William Lyon. I am a graduate student in the Computer Science department at the University of Montana. I would like to introduce myself and express my interest in working with the Center for Computational Medicine for Google Summer of Code 2014.
Specifically, I was drawn to the suggested project “Big Data backend for MedSavant”. The description mentions that you are considering a NoSQL solution to replace the current relational database backend. I’m curious whether you’ve considered using a graph database. My thesis research focuses on the use of graph databases (specifically Neo4j and Titan, a distributed graph database) for data mining and machine learning applications on massive data sets. The data in MedSavant appears to be highly connected, which seems well suited to a graph data model, and data mining can be very efficient when it uses local graph traversals in a graph database.
I'm still learning about MedSavant. Is there a publication list or other documentation you could point me to that highlights some of the specific use cases relevant to this project? Is there also a sample dataset available that could help me better understand the typical data model for MedSavant users?
I look forward to learning more about the project.
Thanks,
William Lyon