Datamodeling problem when combination of edges matter

33 views
Skip to first unread message

Rasik Fulzele

unread,
Nov 6, 2015, 6:49:13 PM11/6/15
to Neo4j
Hi,

I'm new to neo4j and don't know what this problem is called as. So posting here without much exploration of previous posts.

I'm modeling data in graph but the combination of edges in paths become significant. How to model in such scenario?

for example,
create 
(node1)
,(node2)
,(node3)
,(node4)
,(node5)
,(node6)
,(node7)
,(node1)-[:precedes]->(node5)
,(node2)-[:precedes]->(node5)
,(node3)-[:precedes]->(node5)
,(node4)-[:precedes]->(node5)
,(node5)-[:precedes]->(node6)
,(node5)-[:precedes]->(node7)

when I try to find out list of all paths I'll get total 8 paths. But for my data, only few paths are significant and that should be my output.
ie. only 4 paths should be in output because data (of nodes) has dependencies. 
(node1)-[:precedes]->(node5)-[:precedes]->(node6)
(node3)-[:precedes]->(node5)-[:precedes]->(node6)
(node3)-[:precedes]->(node5)-[:precedes]->(node7)
(node4)-[:precedes]->(node5)-[:precedes]->(node7)

whereas path like (node1)-[:precedes]->(node5)-[:precedes]->(node7) is invalid combination.

Question is how to model such cases so that I always get proper combination of edges.

Thanks,
Rasik

Michael Hunger

unread,
Nov 6, 2015, 7:17:57 PM11/6/15
to ne...@googlegroups.com
perhaps you can be a bit more concrete?

which data has which dependencies

You can also specify predicates on node and relationship-properties for your path

e.g. where a.time < b.time < c.time

Michael

--
You received this message because you are subscribed to the Google Groups "Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email to neo4j+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Rasik Fulzele

unread,
Nov 7, 2015, 1:18:52 AM11/7/15
to Neo4j
Thanks for your reply.
Sorry, if I was not clear.
I'll try to elaborate again with other example.

There are 3 under-graduate courses (A,B,C) and 2 graduate courses (M and N). The courses are nodes. also, the transition from undergrad to grad happens through an application process E which is also a node. 

So here are relationships

A precedes E
B precedes E
C precedes E
E precedes N
E precedes M

Now the problem is graduate course N is allowed only for those who did under-grad courses A and B. similarly graduate course M is allowed for under-grad courses B and C. so following paths are invalid and would return wrong results.
A precedes E precedes M
C precedes E precedes N

I think, rather taking care in query, the data modeling should be correct. How to arrange above data in graph so that I get valid paths?

Thanks,
R

Liliana Ziolek

unread,
Nov 7, 2015, 4:21:20 PM11/7/15
to Neo4j
It's hard to say what kind of queries do you want to ask of such graph - from what you said so far I don't see the need for the node "application process". If you want to know from which under-graduate course to which graduate course a student can apply to, then the simplest representation would be:
nodes are A,B,C, M, N as you said, the relationship is pointing directly from A-> N (e.g. A -[:FOLLOWED_BY]->N, B->N, B->M and C->M (then you can ask "which courses allow me to apply for grad course N).

Can you explain a bit more on the place of the application process in your queries? It might also turn out to be a node, but you'll probably still want to keep the direct relationship from under-grad courses into grad courses, simply to indicate which transitions are allowed and which aren't.

Rasik Fulzele

unread,
Nov 7, 2015, 8:52:16 PM11/7/15
to Neo4j
Application process (node E) represent a stage between courses (courses also being stages). The problem is in my app, I can't build a generic query to show series of stages because 90% of times, the transition stages like Application Process (node E) are without any specifics but in 10% of times, selective cases like certain undergrad transition to grad pop up.

Thanks,
R

Liliana Ziolek

unread,
Nov 8, 2015, 6:56:49 AM11/8/15
to Neo4j
Hi Rasik

Assuming that you indeed need to represent the application itself (not just the possible transitions from one stage of education to the next) then remember that relations in a graph are cheap. It might very well be that having A->E (representing that someone applies), E->M (representing that someone is accepted) as well as A->M (representing that students of course A are allowed to apply to course M) might be totally fine. Then to find paths you can do something like MATCH (m:GradCourse) <-- (a:UndergradCourse) --> (x:Application) --> (m) 
This would make sure that you get your x Application node and path (if you indeed need some special properties of the application process) but it would filter out anything where such transition isn't correct.

If your concern is more around "90% of cases don't have restrictions" and the fact that you don't want to link every undergrad course to every grad course then maybe rather than thinking of "application" node being in the middle, you should represent the "any (grad/undergrad) course" as a node, and then link on both sides as appropriate (i.e. link to all undergrad courses which have no restrictions where you can go afterwards, and to all grad courses that have no restrictions on who can apply). On top of that, the 10% you'd link directly for any allowed transitions. If you use the same relationship between that special node and the restricted cases then you should be able to fairly easily find all paths with [*1..2] cardinality for the relationship.
If the restriction is one-sided e.g. undegrad A can generally go anywhere they want, but grad M requires someone to finish A, then there'd be a relation from A -> "any" node (which then links onto all unrestricted courses, i.e. it would NOT link to M), as well as a relation from A -> M directly to represent that special case.

If the application process itself is indeed important, then you could apply a similar concept there - rather have a single "application" node between stages, have "unrestricted application" node and restricted ones, which are specific to whatever limitations you have, and then only appropriate nodes are linked to them.

Hope this helps a little bit with your thinking. :)


You received this message because you are subscribed to a topic in the Google Groups "Neo4j" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/neo4j/ZjS2Eu1LRns/unsubscribe.
To unsubscribe from this group and all its topics, send an email to neo4j+un...@googlegroups.com.

Michael Hunger

unread,
Nov 9, 2015, 2:53:44 AM11/9/15
to ne...@googlegroups.com
I think your model is a good start.

I'd probably add a label for :Graduate and :UnderGraduate to :Course nodes.

I think you have two models, the requirement model with a :REQUIRES relationship.
And the application model which involves the student their courses

(:Student)-[:TOOK]->(:Course)
(:Student)-[:APPLIES]->(:Application)-[:APPLIES_FOR]->(:Course)
(:Course)-[:REQUIRES]->(:Course)

I did the example here, hope it helps

Rasik Fulzele

unread,
Nov 11, 2015, 6:55:53 PM11/11/15
to Neo4j
Thanks Michael for putting this thing in lucid way. It would be interesting when I start implementing queries to get my desired paths.

Thanks,
R

Rasik Fulzele

unread,
Nov 11, 2015, 6:58:45 PM11/11/15
to Neo4j
Thanks Liliana for putting the 3 models here. Which one would be more satisfying will depend upon my reporting requirements and next I'll start writing relevant queries. 

Thanks,
R
Reply all
Reply to author
Forward
0 new messages