Query performance on distinct paths


tomasz....@closeit.cz

Jan 16, 2017, 9:16:48 PM
to Neo4j
Hi, I have a very expensive query that I'm trying to figure out how to optimise.

match (ea:nodeType1 {name:"something1"})<-[:maps_to*]-(eb:nodeType1) with distinct ea, eb match (eb) where eb.fullname starts with "something" return ea.name, eb.name;

I've used the profiler and, as expected, the VarLengthExpand(All) step is the most expensive part of the operation:

+Filter               | 128 | 334812480 |  669618290 | |
+VarLengthExpand(All) | 128 | 669618290 | 1494585385 | | (ea)<-[:maps_to*]-(eb)
                                                                                                       
With the profiler's help I brought the running time down from 3 hours to 2,15. I then tried the Enterprise edition, hoping the query would use all the available processors, but it used only one, so I'm assuming that Cypher does not parallelize a single query.

Is there some quick win to speed up a distinct path query, or do I have to write my own using the Java API?

P.S. The query might not compile as written, since I've only tried to summarise the gist of the issue.

Michael Hunger

Jan 16, 2017, 9:31:59 PM
to ne...@googlegroups.com, Andres Taylor
I presume you have an index / constraint on nodeType1.fullname  AND nodeType1.name?!

Not sure why you do the pattern match first and only then the property lookup.

What version are you using? In Neo4j 3.1 you should see much better performance with the query below:

match (ea:nodeType1 {name:"something1"})<-[:maps_to*]-(eb:nodeType1) 
where eb.fullname starts with "something"
with distinct ea, eb 
return ea.name, eb.name;

I would also think that shortest-path is much more sensible here:

match (ea:nodeType1 {name:"something1"})
match (eb:nodeType1)  where eb.fullname starts with "something"
match shortestPath( (ea)<-[:maps_to*]-(eb) )
with distinct ea, eb 
return ea.name, eb.name;

--
You received this message because you are subscribed to the Google Groups "Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email to neo4j+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

tomasz....@closeit.cz

Jan 17, 2017, 1:59:52 AM
to Neo4j, andres...@neotechnology.com
Yes, there are indexes.

I was wondering about the shortest path, but I need all the paths that end at specific nodes, and usually they will not be the shortest ones. I will try to see if the results are the same with this modification.

tomasz....@closeit.cz

Jan 17, 2017, 4:46:33 AM
to Neo4j, andres...@neotechnology.com
Did the change and the time went down to 20 seconds!

Incredible. 

Thanks for the hint!

Michael Hunger

Jan 17, 2017, 8:47:55 AM
to ne...@googlegroups.com, Andres Taylor
Which change did it, and would you be able to share the query plan for that fast query with us?

Michael


tomasz....@closeit.cz

Jan 17, 2017, 11:51:51 AM
to Neo4j, andres...@neotechnology.com
The shortestPath one: shortestPath( (ea)<-[:maps_to*]-(eb) )

before:
+VarLengthExpand(All) |   128 | 669618290 | 1494585385 | | (ea)<-[:maps_to*]-(eb)

after:
| +ShortestPath     | 28884702 |    678 | 169955 |
| +CartesianProduct | 28884702 | 169955 |      0 |

Since I have a lot of duplicate paths, I assume the shortest path only traverses a path once, giving me a distinct path. Before, I was traversing all the different path combinations (some of them multiple times) and then filtering them out with a distinct.

I think I misunderstood what the shortest path algorithm does. 
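
Editor's sketch of the reasoning above, in Python rather than Cypher so that it can run without a database. It assumes a toy graph (a chain of "diamonds", where each stage offers two parallel routes) and uses a plain breadth-first search as a stand-in for shortestPath. The thread does not show how Neo4j implements either operator, so the function names and the graph shape are hypothetical. The point is only that the distinct end nodes depend on reachability, while the number of paths walked to find them can grow exponentially.

```python
from collections import deque


def diamond_chain(k):
    """Build k diamonds in a row: n0 -> (t0 | b0) -> n1 -> (t1 | b1) -> n2 ...

    Each key maps a node to the nodes one hop further from the start node.
    This is a hypothetical stand-in for data with many duplicate paths.
    """
    graph = {}
    for i in range(k):
        graph[f"n{i}"] = [f"t{i}", f"b{i}"]
        graph[f"t{i}"] = [f"n{i + 1}"]
        graph[f"b{i}"] = [f"n{i + 1}"]
    graph[f"n{k}"] = []
    return graph


def enumerate_all_paths(graph, source):
    """Walk every path from source, as a variable-length expand would.

    Returns (distinct end nodes, number of paths walked). Cypher forbids
    reusing a relationship within one path; this sketch guards on nodes
    instead, which makes no difference for the acyclic example graph.
    """
    ends, paths = set(), 0
    stack = [(source, frozenset([source]))]
    while stack:
        node, seen = stack.pop()
        for nxt in graph[node]:
            if nxt in seen:
                continue
            paths += 1          # one more path produced
            ends.add(nxt)       # DISTINCT happens only after the fact
            stack.append((nxt, seen | {nxt}))
    return ends, paths


def reachable_by_bfs(graph, source):
    """Breadth-first search: every node is expanded at most once.

    Returns (distinct end nodes, number of edges followed).
    """
    ends, edges = set(), 0
    visited = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            edges += 1
            if nxt not in visited:
                visited.add(nxt)
                ends.add(nxt)
                queue.append(nxt)
    return ends, edges
```

For `diamond_chain(10)` both functions report the same 30 end nodes, but the enumeration walks 4092 paths while the breadth-first search follows 40 edges. If the real requirement is all paths rather than distinct end nodes, this shortcut does not apply.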

Marwa Elabri

Jan 28, 2017, 2:25:20 PM
to Neo4j, andres...@neotechnology.com
Hi, please, can I not use shortestPath when I want to return relationship properties?

For example, I wrote this:
MATCH shortestPath((n:movie)<-[r:rating]-(m:user))
with distinct n,m,r
return n.genre, m.age,r.likes

but I got an error:
Expected `r` to be a Map but it was a Collection<Relationship>
Thanks in advance.

