Let's break down this query a bit. You want Neo4j to find all the paths
between one node and all other nodes in your graph that are connected by
up to 5 steps. The same node might appear multiple times, because you are
asking for all the paths to all the nodes up to 5 steps away. How connected
are your nodes? Take that number and raise it to the power of 5. If your
nodes each have ~10 connections, we're quickly building up a lot of paths
to look at.
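To put a rough number on that, here is a small Python sketch of the arithmetic. It assumes every node has the same number of outgoing relationships and it ignores any pruning Neo4j does, so treat the result as an upper bound, not a measurement. The function name max_paths is made up for this example.

```python
def max_paths(branching: int, max_depth: int) -> int:
    """Upper bound on the number of paths of length 1..max_depth that start
    at one node, assuming every node has `branching` outgoing relationships
    (a simplifying assumption, not something measured on a real graph)."""
    return sum(branching ** depth for depth in range(1, max_depth + 1))


if __name__ == "__main__":
    # ~10 connections per node, looking 1 to 5 steps out
    for depth in range(1, 6):
        print(depth, max_paths(10, depth))
```

With ~10 connections per node the last step dominates: the 10^5 paths of length 5 make up most of the total.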
Next, you use the aggregate function min(), which forces your query to be
eager rather than lazy, so all of those paths have to be kept in the heap
until the query is finished.
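The lazy versus eager difference can be illustrated outside Neo4j. The sketch below is plain Python, not Cypher internals: path_rows is a hypothetical stream of (end node, path length) rows, and the counter only records how many rows were pulled from it. A plain "limit 5" can stop after five rows, while "order by min(length)" cannot return anything until every row has been seen.

```python
from itertools import islice


def path_rows(total, counter):
    """Hypothetical lazy stream of (end_node, path_length) rows.
    counter[0] records how many rows have actually been produced."""
    for i in range(total):
        counter[0] += 1
        yield (i % 100, 1 + i % 5)


def first_rows(rows, limit):
    """Lazy: stop pulling rows as soon as `limit` rows have been returned."""
    return list(islice(rows, limit))


def shortest_per_node(rows, limit):
    """Eager: the minimum length per node is only known once every row
    has been consumed, so nothing can be returned before then."""
    best = {}
    for node, length in rows:
        if node not in best or length < best[node]:
            best[node] = length
    return sorted(best.items(), key=lambda item: item[1])[:limit]


if __name__ == "__main__":
    lazy_counter = [0]
    first_rows(path_rows(100_000, lazy_counter), 5)

    eager_counter = [0]
    shortest_per_node(path_rows(100_000, eager_counter), 5)

    print("rows pulled without aggregation:", lazy_counter[0])
    print("rows pulled with min():", eager_counter[0])
```

This only shows when results can start flowing; it says nothing about how Neo4j actually lays out memory for the aggregation.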
Cypher is doing what you asked it to do, and you asked for something that
takes a whole lot of work. No surprise there, right?
> I think the issue comes from the 4th line with the question mark "?".
> Indeed, if I try to execute the following query it works. (Except that the
> results does not fit my need) :
> n = node:names(email='t...
> p = n-[r:connect*1..5]->m,
> m <> n return distinct m,
> order by min(length(p))
> skip 0 limit 5
What's happening here is that, since you removed the question mark, Cypher
finds all nodes that are between 1 and 5 relationships away, and throws out
any that aren't directly connected to n. Much less stuff to keep in heap,
and so much less to aggregate on.
> So I wonder if it is something which can be fixed on the Neo4j side, or my
> query is simply bad designed ?
We're working on Cypher performance, but this is a heavy query, and will
probably never be very fast. Do you really need to look five steps out?
There's a reason Facebook and LinkedIn and the like don't look so many