Very slow cypher query, how to optimize it ?

70 views
Skip to first unread message

Michael Azerhad

unread,
Mar 15, 2016, 5:30:57 PM3/15/16
to Neo4j
Using Neo4J 2.1.5.

Data: 
  • 2000 Persons
  • KNOWS relationships between some of them

Goal of the query: 

For each person, display her fullname + amount of friends + amount of friends' friends +  amount of friends' friends' friends. 

MATCH (person:Person)
WITH person
OPTIONAL MATCH person
-[:KNOWS]-(p:Person)
WITH person
, count(p) as f1
OPTIONAL MATCH path
= shortestPath(person-[:KNOWS*..2]-(f2:Person))
WHERE length
(path) = 2
WITH count
(nodes(path)[-1]) AS f2, person, f1
OPTIONAL MATCH path
= shortestPath(person-[:KNOWS*..3]-(f3:Person))
WHERE length
(path) = 3
WITH count
(nodes(path)[-1]) AS f3, person, f2, f1
RETURN person
._firstName + " " + person._lastName, f1, f2, f3


This query lasts long ! : 60 seconds ! 

Is there a tips to optimize it?

Thanks,

Michael

Michael Hunger

unread,
Mar 15, 2016, 7:03:52 PM3/15/16
to ne...@googlegroups.com
try this:

MATCH (person:Person)
OPTIONAL MATCH (person)-[:KNOWS]-(f1:Person)-[:KNOWS]-(f2:Person)-[:KNOWS]-(f3:Person)
WITH person, count(distinct f1) as f1, count(distinct f2) as f2,count(distinct f3) as f3

RETURN person._firstName + " " + person._lastName, f1, f2, f3

alternatively

MATCH (person:Person)
OPTIONAL MATCH (person)-[:KNOWS]-(f1:Person)-[:KNOWS]-(f2:Person)
WITH person, count(distinct f1) as f1, count(distinct f2) as f2,collect(distinct f2) as p2
WITH person, f1,f2, reduce(x=0, p in p2 | x + size((p2)-[:KNOWS]-()) ) as f3

RETURN person._firstName + " " + person._lastName, f1, f2, f3

--
You received this message because you are subscribed to the Google Groups "Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email to neo4j+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Michael Azerhad

unread,
Mar 15, 2016, 7:25:49 PM3/15/16
to Neo4j
Hi Michael,

I think your query would result wrong results. 

Michael - KNOWS - Julia - KNOWS - Robert - KNOWS - Michael (same Michael as previous one)

It would consider Robert as a friend's friend for Michael (Michael - Julia - Robert). 
However, it is his direct friend ... (Robert - Michael)
I think your solution can't work with cyclic graph ;) 

That's why I used shortestPath function, that clearly avoids that.

What do you think?

Michael Hunger

unread,
Mar 15, 2016, 7:28:44 PM3/15/16
to ne...@googlegroups.com
your shortest-path doesn't imply that it's not the same person.

Robert would also be counted as direct friend.
Message has been deleted

Michael Azerhad

unread,
Mar 15, 2016, 7:30:51 PM3/15/16
to Neo4j
Yes, but it would never be counted as a friend of friend. 

Michael Azerhad

unread,
Mar 15, 2016, 7:42:23 PM3/15/16
to Neo4j
shortestPath returns the right results. 
Without it, wrong results.

Michael Hunger

unread,
Mar 22, 2016, 9:25:55 PM3/22/16
to ne...@googlegroups.com
you can also use the path directly then you can specify a minimum length too and you should use a newer version of Neo4j.

MATCH (person:Person)
OPTIONAL MATCH (person)-[:KNOWS]-(p:Person)
WITH person, count(p) as f1
OPTIONAL MATCH (
person)-[:KNOWS*2]-(f2:Person)
WITH count(distinct f2) as
f2, person, f1
OPTIONAL MATCH
(person)-[:KNOWS*3]-(f3:Person)
WITH count
(distinct f3) AS f3, person, f2, f1
RETURN person
._firstName + " " + person._lastName, f1, f2, f3

you should also profile your query, then you see how much data it touches.


Something else that should work too is:

MATCH (person:Person)
OPTIONAL MATCH path = (person)-[:KNOWS*..3]-(:Person)
WITH person, count(distinct nodes(path)[1]) as f1, count(distinct nodes(path)[2]) as f2, count(distinct nodes(path)[3]) as f3
RETURN person._firstName + " " + person._lastName,f1,f2,f3

Michael

Am 16.03.2016 um 00:42 schrieb Michael Azerhad <michael...@gmail.com>:

shortestPath returns the right results. 
Without it, wrong results.

Michael Azerhad

unread,
Mar 22, 2016, 9:29:12 PM3/22/16
to ne...@googlegroups.com
Thanks for your suggestion Michael.

However, I managed to do the trick with this query: 

MATCH (person:Person)
WITH person
OPTIONAL MATCH (person)-[:KNOWS]-(p1:Person)
WITH person, COALESCE(COLLECT(p1),[]) AS p1s 
WITH person, CASE p1s WHEN [] THEN [NULL] ELSE p1s END AS p1s
UNWIND p1s AS p1
OPTIONAL MATCH (p1)-[:KNOWS]-(p2:Person)
WHERE NOT ((p2 = person) OR (p2 IN p1s))
WITH person, p1s, COALESCE(COLLECT(distinct p2),[]) AS p2s
WITH person, p1s, CASE p2s WHEN [] THEN [NULL] ELSE p2s END AS p2s 
UNWIND p2s AS p2
OPTIONAL MATCH (p2)-[:KNOWS]-(p3:Person)
WHERE NOT ((p3 = person) OR (p3 IN p1s) OR (p3 IN p2s))
WITH person,
  CASE p1s WHEN [NULL] THEN 0 ELSE SIZE(p1s) END AS f1,
  CASE p2s WHEN [NULL] THEN 0 ELSE SIZE(p2s) END AS f2,
  COUNT(distinct p3) AS f3
RETURN person._firstName + " " + person._lastName, f1, f2, f3, f1+f2+f3 AS total
ORDER BY f1 desc

Very fast :) 

You received this message because you are subscribed to a topic in the Google Groups "Neo4j" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/neo4j/5YkOeCylHEY/unsubscribe.
To unsubscribe from this group and all its topics, send an email to neo4j+un...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages