Modeling hierarchies for relationships


Benny Kneissl

Apr 16, 2014, 8:09:48 AM
to ne...@googlegroups.com

Hi,

as far as I know, the smartest way to store hierarchies for node entities is to use the new label feature. Let's suppose an entity is of type B, where B is a subclass of A. Then the node is labeled with both A and B, right?

But what about hierarchies for relationships? Should several relationships be stored between two entities to model hierarchies for relationships? Should the type of the relationship differ or is it more meaningful to have the same type but different properties?

A possible example is that "isDaughterOf", "isSonOf" are subtypes of "isChildOf" when modeling a family tree. Or from biology when having a BiochemicalReaction you might want to model "isParticipantOf", "isEductOf", "isProductOf".

In this simple hierarchy I think it is sufficient, when asking for all children, to traverse both relationship types. But the hierarchy might become more complex, and then it is likely that you forget one relationship type in Cypher, as in (x)-[r:IS_DAUGHTER_OF|IS_SON_OF]->(y). If you use only one type, (x)-[r:IS_CHILD_OF]->(y), you have to add a daughter/son property to ask only for daughters or sons. So what is a good way (in terms of performance and of complexity in formulating a query) to do it in Neo4j? Adding more relationships, or adding more properties?

Currently I don't know what the advantages of the different approaches are, in particular with respect to formulating queries afterwards.
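To make the two options concrete, here is a minimal sketch that only builds the Cypher text as Python strings (nothing is sent to a database). The property name `kind` and its value `daughter` are assumptions for illustration; the thread does not fix them.

```python
# Sketch: the two modelling options from the question, as Cypher strings.
# The property name `kind` and its values are hypothetical.

CHILD_SUBTYPES = ["IS_DAUGHTER_OF", "IS_SON_OF"]


def union_pattern(rel_types):
    """Return a relationship pattern that matches any of the given types."""
    return "(x)-[r:" + "|".join(rel_types) + "]->(y)"


def property_query(kind=None):
    """Return a query over one IS_CHILD_OF type, optionally filtered by property."""
    query = "MATCH (x)-[r:IS_CHILD_OF]->(y)"
    if kind is not None:
        # Inlined for readability; real code should pass a query parameter.
        query += " WHERE r.kind = '" + kind + "'"
    return query + " RETURN x, y"


# Option 1: one relationship type per subtype; all children via a union.
all_children_by_type = "MATCH " + union_pattern(CHILD_SUBTYPES) + " RETURN x, y"

# Option 2: a single relationship type; the subtype lives in a property.
all_children_by_property = property_query()
daughters_by_property = property_query("daughter")
```

Generating the union from one shared list is one way to avoid forgetting a type as the hierarchy grows; the property variant keeps the pattern short but moves the subtype check into a `WHERE` clause.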

Thank you for some ideas you have in mind,

Benny

Lundin

Apr 19, 2014, 12:18:49 PM
to ne...@googlegroups.com
Hi Benny,

In your examples, which seem to have a small, finite number of relationship types, I would go for adding relationship types rather than properties. The traversal can then be done cheaply, without involving properties in the lookup. This is the best design performance-wise. But of course, if your domain model involves nodes that become dense, with millions of outgoing relationships, and the number of relationships cannot easily be foreseen, and you want to query from that node, I would think adding properties makes sense.

Here is actually a good blog post on the topic:

It is very hard without further insight to say exactly how to model your domain.

And don't forget that you can also limit the search result by type, as in

(x)-[r]->(y) WHERE type(r) = "IS_DAUGHTER_OF"

Maybe you could take some CSV data from a known domain, import it, try some models and find out? I would be happy to read such a report.

Benny Kneissl

Apr 22, 2014, 7:40:57 AM
to ne...@googlegroups.com
Hi Lundin,

thanks for your suggestion and the linked article.

You're right, the example I posted is relatively easy, and I guess different relationship types make sense for it. To give you an example of a more complex ontology, have a look at BioPAX (http://www.biopax.org/owldoc/Level3/).

Let me explain a little bit more: BioPAX is an ontology, normally used to exchange biological network data, which consists of several biological Interaction classes such as Conversion, TemplateReaction, Catalysis, BiochemicalReaction, Transport, and so on. Depending on the Interaction class, entities (a protein, a gene) are attached by different relationship types, e.g. in a Conversion by LEFT and RIGHT, in a TemplateReaction by TEMPLATE and PRODUCT, or in a Catalysis by CONTROLLER, CONTROLLED and COFACTOR.

Let's suppose you want to find a path between two biological entities (e.g. a protein or gene). Since you don't know if the path consists of BiochemicalReactions or Transports or Conversions etc., you have to allow all relationship types mentioned above, which is a list of at least 8 different types.

At this stage, I thought about hierarchies for relationship types. For example, the relationship superclass could be the type PARTICIPANT. The first subclasses might be EDUCT, which itself has the two subclasses LEFT (in case of a Conversion) and TEMPLATE (in case of a TemplateReaction), and PRODUCT (Conversion: RIGHT, TemplateReaction: PRODUCT). This would make it possible to find paths between entities by traversing all relationships of type PARTICIPANT.

It is not possible to say "allow all directions and types", since there are a lot of other relationship types, not mentioned here, that should not be traversed.

I think for this case it might be easier to have only the relationship type PARTICIPANT and use the property map for the subclasses in case you want to specify the path more precisely.

But my question is, is there a more elegant way to do this? I heard of an approach that stores the relationship types as nodes in the graph itself. A query then takes two steps: first, get all relationship types that are subclasses of the requested type; second, use the retrieved list as input for the actual query.

It would be nice if you (or someone else) could comment on this.
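The two-step idea can be sketched as follows. This is only an illustration under assumptions: the hierarchy is held in a Python dict instead of in the graph (there it might be stored as hypothetical `(:RelType)-[:SUBTYPE_OF]->(:RelType)` nodes), and the `name` property, the `$source`/`$target` parameters, the hop limit and the undirected pattern are all invented for the example.

```python
# Sketch of the two-step approach, with the type hierarchy kept in a dict.
# In the graph it could be stored as (:RelType)-[:SUBTYPE_OF]->(:RelType);
# both of those names are hypothetical.

SUBTYPES = {
    "PARTICIPANT": ["EDUCT", "PRODUCT"],
    "EDUCT": ["LEFT", "TEMPLATE"],
    "PRODUCT": ["RIGHT"],  # PRODUCT is also used directly by TemplateReaction
}


def expand(rel_type, subtypes=SUBTYPES):
    """Step 1: return rel_type plus all transitive subtypes, depth-first."""
    result = []
    seen = set()
    stack = [rel_type]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        result.append(current)
        # reversed() keeps the declared left-to-right order when popping.
        stack.extend(reversed(subtypes.get(current, [])))
    return result


def path_query(rel_type, max_hops=6):
    """Step 2: build a path query restricted to the expanded type list."""
    types = "|".join(expand(rel_type))
    return (
        "MATCH p = (a)-[:" + types + "*.." + str(max_hops) + "]-(b) "
        "WHERE a.name = $source AND b.name = $target RETURN p"
    )
```

With this, `path_query("PARTICIPANT")` covers every subtype without listing the types by hand in each query, while `path_query("EDUCT")` narrows the traversal to the educt side only.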

Thanks,
Benny

Benny Kneissl

Jul 14, 2014, 4:48:24 AM
to ne...@googlegroups.com
Hi Lundin,


I have now found a real-world example, and maybe you can comment on it. In this gene-disease association ontology you have to connect nodes representing genes with nodes representing diseases by their association type. Let's suppose gene A has a GeneticVariationAssociation to disease B; then I have to add 4 relationships (GeneticVariationAssociation, BiomarkerAssociation, GeneDiseaseAssociation, Association) between A and B. Is it recommended to do it this way, or are there smarter possibilities?
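For reference, the "one relationship per supertype" variant described above can be sketched like this. The parent chain is an assumption read off the order in which the four types are listed, and the variables `a` and `b` stand for gene and disease nodes that are assumed to be matched already; only strings are built, no database is involved.

```python
# Sketch: derive every relationship type to create for one association.
# The parent chain is assumed from the order of the four listed types.

PARENT = {
    "GeneticVariationAssociation": "BiomarkerAssociation",
    "BiomarkerAssociation": "GeneDiseaseAssociation",
    "GeneDiseaseAssociation": "Association",
}


def with_supertypes(assoc_type, parent=PARENT):
    """Return assoc_type followed by each supertype up to the root."""
    chain = [assoc_type]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def create_clauses(assoc_type):
    """Build one CREATE clause per type, for already-matched nodes a and b."""
    return ["CREATE (a)-[:" + t + "]->(b)" for t in with_supertypes(assoc_type)]
```

The alternative is to store only the most specific type and expand a supertype into its subtypes at query time, which trades extra relationships in the store for longer type lists in the queries.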

Michael Hunger

Jul 14, 2014, 9:03:19 AM
to ne...@googlegroups.com
Hi Benny, 

Perhaps it makes sense to cross-post this model to the neo4j-biotech group, whose members are more involved with biological models?

Cheers,

Michael
