Blog Post: GraphConnect SFO Talk from Daniel Himmelstein


Michael Hunger

Jun 24, 2017, 5:06:08 PM

to neo4j-biotech

https://neo4j.com/blog/integrating-biology-public-neo4j-database

Includes full transcript, slides and video recording of the talk.

Have a look, really impressive work.

My personal favorite is how Daniel generated Neo4j-Browser guides for each of the proteins in the database on hetionet: http://het.io/

Cheers, Michael

Summary

Himmelstein started his PhD research with the question: How do you teach a computer biology? He found the answer in a heterogeneous network (a.k.a., “HetNet”), which turned out to be another term for a labelled property graph.

After an attempt to create his own Python package for querying HetNets, Himmelstein turned to Neo4j. By importing open source drug and genetic information, he has developed a graph with more than 2 million relationships that can be mined for drug repurposing – in other words, finding new treatment uses for drugs that are already on the market – via a growing dataset of matching compound-disease pairs.

For each of the current 200,000 compound-disease pairs, his project computes the prevalence of many different types of paths and then uses a machine learning classifier to identify the patterns of the network, or the paths, that are predictive of treatment or efficacy. As an example, Himmelstein shows you how his HetNet project helped identify bupropion as a drug that not only treats depression but also nicotine dependence.
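
To make the path-counting idea above concrete, here is a minimal sketch in Python. It is not the project's code: the toy edges, the relationship type names, the two metapaths, and the helper functions are all hypothetical, and raw counts stand in for whatever prevalence measure the project actually uses.

```python
from collections import defaultdict

# Toy hetnet: (source, relationship type, target). Every name is a placeholder,
# not real biomedical data.
EDGES = [
    ("compound_x", "BINDS", "gene_1"),
    ("compound_x", "BINDS", "gene_2"),
    ("gene_1", "ASSOCIATES", "disease_y"),
    ("gene_2", "ASSOCIATES", "disease_y"),
    ("gene_2", "ASSOCIATES", "disease_z"),
    ("compound_x", "RESEMBLES", "compound_w"),
    ("compound_w", "TREATS", "disease_y"),
]

# Hypothetical metapaths, each an ordered list of relationship types.
METAPATHS = {
    "binds>associates": ["BINDS", "ASSOCIATES"],
    "resembles>treats": ["RESEMBLES", "TREATS"],
}


def build_index(edges):
    """Map (node, relationship type) to the list of nodes reachable in one step."""
    index = defaultdict(list)
    for source, rel_type, target in edges:
        index[(source, rel_type)].append(target)
    return index


def count_paths(index, start, end, metapath):
    """Count walks from start to end whose relationship types match metapath in order."""
    frontier = {start: 1}  # node -> number of matching walks that reach it
    for rel_type in metapath:
        next_frontier = defaultdict(int)
        for node, n_walks in frontier.items():
            for neighbor in index.get((node, rel_type), []):
                next_frontier[neighbor] += n_walks
        frontier = next_frontier
    return frontier.get(end, 0)


def path_features(index, compound, disease, metapaths):
    """One feature per metapath: how many matching walks link the pair."""
    return {
        name: count_paths(index, compound, disease, rel_types)
        for name, rel_types in metapaths.items()
    }


index = build_index(EDGES)
print(path_features(index, "compound_x", "disease_y", METAPATHS))
print(path_features(index, "compound_x", "disease_z", METAPATHS))
```

Each compound-disease pair becomes one row of per-metapath counts, which is the kind of feature table a classifier could then be trained on.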

Daniel Himmelstein

Jun 25, 2017, 3:50:05 AM

to Michael Hunger, neo4j-biotech

Greetings List,

Thanks Michael. Since I'm new to this mailing list, I thought I'd introduce myself. I'm a data scientist at UPenn in the Greene Lab. I started using Neo4j when working on Project Rephetio to predict new uses for existing drugs. For this project, we created an integrative network of biomedical knowledge called Hetionet. We host the Neo4j instance publicly at https://neo4j.het.io.

I saw Michael's earlier email about the proceedings from the Life Sciences Meetup in Berlin. Was really excited to see so many applications of Neo4j for biomedical hetnets.

Going forward, I'm hoping to transition a bit from constructing hetnets to algorithm development. As many here know, most traditional graph algorithms are oblivious to node/relationship types, rendering them useless for hetnets. One current project we have along these lines is called hetmech, where we aim to translate between nodes of different types. For example, a user could provide a set of disease symptoms and we would translate those to biological pathways.

Anyways I'm sure many of us will cross paths in the future if we haven't already. Glad to be on this list!

Best,

Daniel

--
You received this message because you are subscribed to the Google Groups "neo4j-biotech" group.
To unsubscribe from this group and stop receiving emails from it, send an email to neo4j-biotec...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Davide Mottin

Jun 25, 2017, 2:22:36 PM

to Daniel Himmelstein, Michael Hunger, neo4j-biotech

Hi Daniel,

Very interesting work.

In terms of graph algorithms, there is actually a vast literature on heterogeneous information networks from the data mining, database, and web communities, although most of the applications do not target biological networks.

People have used different names for the same kind of graph, e.g. attributed graphs, labeled graphs, colored graphs, and so on, so you could search for these as well.

In terms of "translations", what we typically do in graph mining is abstract the concept of similarity and then apply techniques that work on multi-dimensional spaces to graphs (e.g., clustering on embedded spaces). There is quite a bit of work in this regard, but again I'm not sure it will fit your case.

Other approaches use metapaths as a measure of similarity among nodes; however, with two different node types this might not work.
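
As an illustration of that last point, here is a minimal Python sketch of one common metapath-based measure, in the style of PathSim. It is not necessarily the approach meant above. The measure counts paths along a symmetric metapath (here a hypothetical disease-gene-disease pattern) and normalizes by each node's paths back to itself, so both endpoints must be of the same type. All node names and the `pathsim` helper are assumptions made for the example.

```python
# Hypothetical disease -> associated genes mapping (placeholder names only).
GENES_OF = {
    "disease_a": {"gene_1", "gene_2", "gene_3"},
    "disease_b": {"gene_2", "gene_3"},
    "disease_c": {"gene_4"},
}


def pathsim(neighbors, x, y):
    """PathSim-style similarity along the symmetric metapath disease-gene-disease.

    With set-valued neighbors, the number of x-gene-y paths is the number of
    shared genes, and the number of x-gene-x paths is the number of genes of x.
    """
    shared = len(neighbors[x] & neighbors[y])
    self_paths = len(neighbors[x]) + len(neighbors[y])
    return 2 * shared / self_paths if self_paths else 0.0


print(pathsim(GENES_OF, "disease_a", "disease_b"))
print(pathsim(GENES_OF, "disease_a", "disease_c"))
```

Because the normalization relies on paths from a node back to itself along the same metapath, there is no direct analogue when the two endpoints have different types (say, a symptom and a pathway), which is one way to read the caveat above.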

It seems like an interesting problem. Do you have an idea of how to formulate the problem you want to solve? There might be other solutions out there.

Best,

Davide

Dr. Davide Mottin, PhD

Postdoc @ Knowledge Discovery and Data Mining Group

Hasso-Plattner-Institut an der Universität Potsdam

Prof.-Dr.-Helmert-Str. 2-3, 14482 Potsdam

Tel +49 331 5509 1374

Hasso-Plattner-Institut für Softwaresystemtechnik GmbH, Potsdam

Amtsgericht Potsdam, HRB 12184, Geschäftsführung: Prof. Dr. Christoph Meinel

