Hi,
I am experimenting with Neo4j 2.0.0 and have noticed that adding nodes is roughly an order of magnitude slower when a unique constraint is defined than when there is only an index on the same label and property. I've attached a simple program that demonstrates the behavior: it adds increasing numbers of simple, unconnected nodes, with either an index or a constraint.
I'm pretty sure the index and constraint are working properly (in the sense that they improve query performance), based on other tests I've done.
I'm getting org.neo4j:neo4j:2.0.0 from Maven Central and running on JDK 1.7.0_40 on Fedora with no particular flags (though the max heap is apparently 3 GB).
Here's the output of the test program:
indexing...
waiting...
adding 1,000 nodes...
added...
committing...
...done; 0.8s
adding constraint...
adding 1,000 nodes...
added...
committing...
...done; 3.3s
indexing...
waiting...
adding 10,000 nodes...
added...
committing...
...done; 2.9s
adding constraint...
adding 10,000 nodes...
added...
committing...
...done; 44.6s
indexing...
waiting...
adding 100,000 nodes...
added...
committing...
...done; 6.3s
adding constraint...
adding 100,000 nodes...
It runs for quite a while before even getting to the commit on the last transaction, at 100% CPU and with the heap climbing to about 1 GB.
Maybe 100k is too many nodes in a single transaction? But going by the timings above, the constraint is already about 4x slower at 1,000 nodes and about 15x slower at 10,000. I also tried adding the same number of nodes spread across many smaller transactions, and it's still much slower with the constraint.
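
For reference, here is a minimal sketch in plain Java of how the slowdown can be worked out from the completed runs in the output above. It is not part of the attached test program; the class and method names are made up for illustration, and the only inputs are the timings copied from the output.

```java
public class SlowdownRatios {
    // Node counts and timings (in seconds) copied from the test output above.
    // The 100,000-node constraint run never finished, so it is left out.
    static final int[] NODE_COUNTS = {1_000, 10_000};
    static final double[] INDEX_SECONDS = {0.8, 2.9};
    static final double[] CONSTRAINT_SECONDS = {3.3, 44.6};

    // How many times slower the constraint run was than the index run.
    public static double ratio(double indexSeconds, double constraintSeconds) {
        return constraintSeconds / indexSeconds;
    }

    // Average wall-clock cost per node, in milliseconds.
    public static double millisPerNode(double seconds, int nodes) {
        return seconds * 1000.0 / nodes;
    }

    public static void main(String[] args) {
        for (int i = 0; i < NODE_COUNTS.length; i++) {
            System.out.printf(
                "%,d nodes: constraint is %.1fx slower (%.2f ms/node vs %.2f ms/node)%n",
                NODE_COUNTS[i],
                ratio(INDEX_SECONDS[i], CONSTRAINT_SECONDS[i]),
                millisPerNode(CONSTRAINT_SECONDS[i], NODE_COUNTS[i]),
                millisPerNode(INDEX_SECONDS[i], NODE_COUNTS[i]));
        }
    }
}
```

With the index, the per-node cost drops as the batch grows. With the constraint it goes up, so the slowdown looks like it gets worse with transaction size rather than being a fixed overhead.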
Hopefully I'm doing something dumb here. Can anyone suggest a fix or confirm that this isn't working the way it should?
Thanks,
- moss