Count(*) is very slow on a big database

matias...@brinqa.com

Apr 19, 2016, 6:41:32 PM
to Neo4j
I'm using Neo4j 2.3.3.

The query is very simple:

neo4j-sh (?)$ MATCH (n:`label1`:`label2`), n-[r1:rel1]->n1 WHERE (n1.`name` IN ["name1","name2"] OR NOT(r1 IS NOT NULL)) RETURN COUNT(n) AS count;
+--------+
| count  |
+--------+
| 462059 |
+--------+
1 row
2894 ms


Any ideas on how to improve it?

Michael Hunger

Apr 20, 2016, 2:57:47 AM
to ne...@googlegroups.com
Double Negation ?
r1 is never null
Add a label to n1
Use a Union
Look at the query plan with profile

Try this:


MATCH (n:`label1`:`label2`) WHERE size( (n)-[:rel1]->() ) = 0
RETURN COUNT(n) AS count
UNION
MATCH (n:`label1`:`label2`)-[r1:rel1]->(n1:Label3) WHERE n1.`name` IN ["name1","name2"]
RETURN COUNT(n) AS count;
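
For the "look at the query plan with profile" step, a minimal sketch: prefixing a query with PROFILE runs it and prints the executed plan with rows and db hits per operator. The query below is just the second half of the rewrite without the extra label (Label3 is a placeholder name, not a label from the original data), so the two plans can be compared.

```cypher
// PROFILE executes the query and reports rows and db hits for each operator.
// Run this once as-is and once with a label on n1 to see how the plan changes.
PROFILE
MATCH (n:`label1`:`label2`)-[r1:rel1]->(n1)
WHERE n1.`name` IN ["name1","name2"]
RETURN COUNT(n) AS count;
```

The operator with the most db hits is usually the place to start looking.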



matias...@brinqa.com

Apr 20, 2016, 9:55:59 AM
to Neo4j
Thanks, Michael.
Yes, the double negation is there because the query comes out of more complex logic that produces that result, but the query is slow even with that condition removed.
We'll try your proposal anyway.

mvi...@brinqa.com

Apr 22, 2016, 5:33:32 AM
to Neo4j
Michael,

Adding the label to the relationship's target sped up the query a little, but counting still looks slow. Even a simple count by label takes too long, and running it several times does not improve the execution time, as if the query were not cached at all.

neo4j-sh (?)$ MATCH (n:`label`) RETURN count(n) as count;
+---------+
| count   |
+---------+
| 1122727 |
+---------+
1 row
760 ms

mvi...@brinqa.com

Apr 22, 2016, 5:33:32 AM
to Neo4j
Michael,

Including the label on the relationship's target makes the Cypher query run faster, but a count over just the relationship takes 496 ms, which seems like too much.

neo4j-sh (?)$ MATCH (n:`label1`)-[r1:rel1]->(n1:`label2`) RETURN COUNT(n);
+----------+
| COUNT(n) |
+----------+
| 462059   |
+----------+
1 row
496 ms

Michael Hunger

Apr 22, 2016, 5:48:20 AM
to ne...@googlegroups.com
These will take single-digit milliseconds in Neo4j 3.0:

MATCH (:`label1`)-[:rel1]->() RETURN COUNT(*);
MATCH ()-[:rel1]->(:`label2`) RETURN COUNT(*);
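
One way to check whether a given installation answers these from stored counts rather than by traversing relationships is to profile them. This is only a sketch; the operator name in the comment is an assumption about what 3.x plans display, not something confirmed in this thread.

```cypher
// If the count is served from stored statistics, the plan should be very short
// (assumption: an operator along the lines of RelationshipCountFromCountStore)
// and report almost no db hits, instead of an expand over every relationship.
PROFILE
MATCH (:`label1`)-[:rel1]->() RETURN COUNT(*);
```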