Very slow cypher query, how to optimize it ?

Michael Azerhad

unread,

Mar 15, 2016, 5:30:57 PM3/15/16

to Neo4j

Using Neo4J 2.1.5.

Data:

2000 Persons
KNOWS relationships between some of them

Goal of the query:

For each person, display her fullname + amount of friends + amount of friends' friends + amount of friends' friends' friends.


MATCH (person:Person) 
WITH person 
OPTIONAL MATCH person-[:KNOWS]-(p:Person) 
WITH person, count(p) as f1 
OPTIONAL MATCH path = shortestPath(person-[:KNOWS*..2]-(f2:Person)) 
WHERE length(path) = 2 
WITH count(nodes(path)[-1]) AS f2, person, f1 
OPTIONAL MATCH path = shortestPath(person-[:KNOWS*..3]-(f3:Person)) 
WHERE length(path) = 3 
WITH count(nodes(path)[-1]) AS f3, person, f2, f1 
RETURN person._firstName + " " + person._lastName, f1, f2, f3

This query lasts long ! : 60 seconds !

Is there a tips to optimize it?

Thanks,

Michael

Michael Hunger

unread,

Mar 15, 2016, 7:03:52 PM3/15/16

to ne...@googlegroups.com

try this:

MATCH (person:Person)
OPTIONAL MATCH (person)-[:KNOWS]-(f1:Person)-[:KNOWS]-(f2:Person)-[:KNOWS]-(f3:Person)
WITH person, count(distinct f1) as f1, count(distinct f2) as f2,count(distinct f3) as f3

RETURN person._firstName + " " + person._lastName, f1, f2, f3

alternatively

MATCH (person:Person)
OPTIONAL MATCH (person)-[:KNOWS]-(f1:Person)-[:KNOWS]-(f2:Person)
WITH person, count(distinct f1) as f1, count(distinct f2) as f2,collect(distinct f2) as p2

WITH person, f1,f2, reduce(x=0, p in p2 | x + size((p2)-[:KNOWS]-()) ) as f3

RETURN person._firstName + " " + person._lastName, f1, f2, f3

--
You received this message because you are subscribed to the Google Groups "Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email to neo4j+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Michael Azerhad

unread,

Mar 15, 2016, 7:25:49 PM3/15/16

to Neo4j

Hi Michael,

I think your query would result wrong results.

Michael - KNOWS - Julia - KNOWS - Robert - KNOWS - Michael (same Michael as previous one)

It would consider Robert as a friend's friend for Michael (Michael - Julia - Robert).

However, it is his direct friend ... (Robert - Michael)

I think your solution can't work with cyclic graph ;)

That's why I used shortestPath function, that clearly avoids that.

What do you think?

Michael Hunger

unread,

Mar 15, 2016, 7:28:44 PM3/15/16

to ne...@googlegroups.com

your shortest-path doesn't imply that it's not the same person.

Robert would also be counted as direct friend.

Message has been deleted

Michael Azerhad

unread,

Mar 15, 2016, 7:30:51 PM3/15/16

to Neo4j

Yes, but it would never be counted as a friend of friend.

Michael Azerhad

unread,

Mar 15, 2016, 7:42:23 PM3/15/16

to Neo4j

shortestPath returns the right results.

Without it, wrong results.

Michael Hunger

unread,

Mar 22, 2016, 9:25:55 PM3/22/16

to ne...@googlegroups.com

you can also use the path directly then you can specify a minimum length too and you should use a newer version of Neo4j.

MATCH (person:Person)
OPTIONAL MATCH (person)-[:KNOWS]-(p:Person)

WITH person, count(p) as f1
OPTIONAL MATCH (person)-[:KNOWS*2]-(f2:Person)
WITH count(distinct f2) as f2, person, f1
OPTIONAL MATCH (person)-[:KNOWS*3]-(f3:Person)
WITH count(distinct f3) AS f3, person, f2, f1
RETURN person._firstName + " " + person._lastName, f1, f2, f3

you should also profile your query, then you see how much data it touches.

Something else that should work too is:

MATCH (person:Person)

OPTIONAL MATCH path = (person)-[:KNOWS*..3]-(:Person)

WITH person, count(distinct nodes(path)[1]) as f1, count(distinct nodes(path)[2]) as f2, count(distinct nodes(path)[3]) as f3

RETURN person._firstName + " " + person._lastName,f1,f2,f3

Michael

Am 16.03.2016 um 00:42 schrieb Michael Azerhad <michael...@gmail.com>:

shortestPath returns the right results.
Without it, wrong results.

Michael Azerhad

unread,

Mar 22, 2016, 9:29:12 PM3/22/16

to ne...@googlegroups.com

Thanks for your suggestion Michael.

However, I managed to do the trick with this query:

MATCH (person:Person)

WITH person

OPTIONAL MATCH (person)-[:KNOWS]-(p1:Person)

WITH person, COALESCE(COLLECT(p1),[]) AS p1s 

WITH person, CASE p1s WHEN [] THEN [NULL] ELSE p1s END AS p1s

UNWIND p1s AS p1

OPTIONAL MATCH (p1)-[:KNOWS]-(p2:Person)

WHERE NOT ((p2 = person) OR (p2 IN p1s))

WITH person, p1s, COALESCE(COLLECT(distinct p2),[]) AS p2s

WITH person, p1s, CASE p2s WHEN [] THEN [NULL] ELSE p2s END AS p2s 

UNWIND p2s AS p2

OPTIONAL MATCH (p2)-[:KNOWS]-(p3:Person)

WHERE NOT ((p3 = person) OR (p3 IN p1s) OR (p3 IN p2s))

WITH person,

  CASE p1s WHEN [NULL] THEN 0 ELSE SIZE(p1s) END AS f1,

  CASE p2s WHEN [NULL] THEN 0 ELSE SIZE(p2s) END AS f2,

  COUNT(distinct p3) AS f3

RETURN person._firstName + " " + person._lastName, f1, f2, f3, f1+f2+f3 AS total

ORDER BY f1 desc

Very fast :)

You received this message because you are subscribed to a topic in the Google Groups "Neo4j" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/neo4j/5YkOeCylHEY/unsubscribe.
To unsubscribe from this group and all its topics, send an email to neo4j+un...@googlegroups.com.

Reply all

Reply to author

Forward