Speed of Neo4jGraph

Nicolas Delsaux

Feb 17, 2012, 3:15:04 AM

to Gremlin-users

Hi,

For an open-source project I'm working on (a kind of "lightweight" JPA
without JPA), I've written a graph persistence layer relying on
implementations of Blueprints' IndexableGraph.
For each object, I write one node per property, and one edge for each
link between a property and its parent object.
To test my persistence layer, I've created a small class that gets
persisted using three implementations of IndexableGraph. I obtained the
following durations (all in seconds):

Implementation                               Durations (s)
TinkerGraph                                   0.226   0.218   0.219   0.219   0.237
OrientGraph                                   0.66    0.674   0.647   0.709   0.67
Neo4jGraph (default)                         13.114  13.71   15.664  14.405  13.79
Neo4jGraph (max buffer size set to 1000000)   3.225   2.774   2.789   2.996   2.784
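
To make the gap easier to read, here is a minimal sketch that averages the five figures per implementation and expresses each mean relative to TinkerGraph. The class and method names (DurationSummary, mean, figures) are made up for illustration; only the numbers come from the post.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of gaedo or Blueprints.
final class DurationSummary {

    /** Arithmetic mean of the given durations, in seconds. */
    static double mean(double... seconds) {
        double sum = 0.0;
        for (double s : seconds) {
            sum += s;
        }
        return sum / seconds.length;
    }

    /** The durations quoted in the post, keyed by implementation. */
    static Map<String, double[]> figures() {
        Map<String, double[]> runs = new LinkedHashMap<>();
        runs.put("TinkerGraph", new double[] {0.226, 0.218, 0.219, 0.219, 0.237});
        runs.put("OrientGraph", new double[] {0.66, 0.674, 0.647, 0.709, 0.67});
        runs.put("Neo4jGraph (default)", new double[] {13.114, 13.71, 15.664, 14.405, 13.79});
        runs.put("Neo4jGraph (max buffer size 1000000)", new double[] {3.225, 2.774, 2.789, 2.996, 2.784});
        return runs;
    }

    public static void main(String[] args) {
        Map<String, double[]> runs = figures();
        double baseline = mean(runs.get("TinkerGraph"));
        for (Map.Entry<String, double[]> entry : runs.entrySet()) {
            double m = mean(entry.getValue());
            // Print the mean and how many times slower it is than TinkerGraph.
            System.out.printf("%-40s mean %.3f s, %.1fx TinkerGraph%n",
                    entry.getKey(), m, m / baseline);
        }
    }
}
```

With these figures the means are about 0.22 s, 0.67 s, 14.1 s and 2.9 s, so the default Neo4jGraph run is roughly 63 times slower than TinkerGraph, and about 13 times slower with the larger buffer.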

Obviously, these figures can't be understood without access to
the code.
All the source code is available here: https://svn.origo.ethz.ch/gaedo/
as part of the gaedo project (http://gaedo.origo.ethz.ch/). The exact
test I'm using can be seen here:
http://svn.origo.ethz.ch/wsvn/gaedo/trunk/gaedo-blueprints/src/test/java/com/dooapp/gaedo/tag/GraphBackedTagFinderServiceTest.java

Given those elements, can someone explain the performance differences to me?
(I mean ... how can one see neo4j as an efficient DB if all tests show
it is the slowest implementation?)

Thanks

Peter Neubauer

Feb 17, 2012, 3:31:45 AM

to gremli...@googlegroups.com

Nicolas,
are you doing several runs? Neo4j is very conservative in its
loading behavior and does not prefetch by default ... I can take a look
at it if you want ...

Cheers,

/peter neubauer

G: neubauer.peter
S: peter.neubauer
P: +46 704 106975
L: http://www.linkedin.com/in/neubauer
T: @peterneubauer

Neo4j 1.6 released - dzone.com/6S4K
The Neo4j Heroku Challenge - http://neo4j-challenge.herokuapp.com/

Nicolas Delsaux

Feb 17, 2012, 3:46:26 AM

to Gremlin-users

On Feb 17, 09:31, Peter Neubauer <peter.neuba...@neotechnology.com>
wrote:

> Nicolas,
> are you doing several runs? Neo4j is very conservative in its
> loading behavior and does not prefetch by default ... I can take a look
> at it if you want ...

Time to detail a few things, it seems.

These tests (and the gaedo-blueprints project itself) were created
because of previous issues (about which we already exchanged mails on the
neo4j-users ML, I think) with Empire-RDF running on top of Sesame/
SailGraph/.../Blueprints/neo4j. As we had very little control over that
stack, we tried (with this project) to attack the problem at a lower
level, which revealed these figures.

Each test run loads a graph instance, performs the test, then shuts down
the graph and deletes its folder.

The goal here is to perform all tests in a fresh environment. I
perfectly understand it may not sound sensible, as I guess neo4j may
be optimized for higher loads.

However, I find the order of magnitude in speed difference very
disturbing.

Nevertheless, I think I'll try to add a more realistic test, loading
hundreds/thousands of objects to see what happens.

Do you think it could make any difference?

Peter Neubauer

Feb 17, 2012, 6:41:30 AM

to gremli...@googlegroups.com

Yes,
also, if you do just one run, as mentioned, Neo4j will not eagerly
load data into its caches, so you are essentially testing cold
performance.
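
A minimal sketch of what measuring cold versus warm runs could look like, in plain Java. It assumes the store stays open between runs (unlike a test that creates a fresh graph and deletes its folder each time), and the HashMap workload is only a stand-in for "persist some objects"; WarmupBenchmark, timeMillis and measure are hypothetical names, not part of Neo4j or Blueprints.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical timing harness, standard library only.
final class WarmupBenchmark {

    /** Runs the workload once and returns the elapsed time in milliseconds. */
    static double timeMillis(Runnable workload) {
        long start = System.nanoTime();
        workload.run();
        return (System.nanoTime() - start) / 1_000_000.0;
    }

    /** Runs the workload warmupRuns times without timing, then times measuredRuns runs. */
    static double[] measure(Runnable workload, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            workload.run();
        }
        double[] durations = new double[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            durations[i] = timeMillis(workload);
        }
        return durations;
    }

    public static void main(String[] args) {
        // Placeholder workload: writes into an in-memory map that is kept
        // across runs, standing in for a graph that is not reopened each time.
        Map<Integer, String> store = new HashMap<>();
        Runnable workload = () -> {
            for (int i = 0; i < 10_000; i++) {
                store.put(i, "value-" + i);
            }
        };

        double cold = timeMillis(workload);          // very first run, nothing warmed up
        double[] warm = measure(workload, 5, 5);     // 5 untimed runs, then 5 timed runs

        System.out.println("cold run (ms):  " + cold);
        System.out.println("warm runs (ms): " + Arrays.toString(warm));
    }
}
```

Reporting the first run separately from the later ones keeps one-off startup costs from being mistaken for steady-state write speed.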

Cheers,

/peter neubauer

Peter Neubauer

Feb 20, 2012, 4:21:01 PM

to gremli...@googlegroups.com

Just as an update,

running the test with higher loads and more runs DOES make a difference. Also, Lucene is (as usual) not the fastest index in micro-benchmarks, but it is one of the most consistent performers, which is why we often use it with Neo4j.

I think Nicolas will come back with more results as things are progressing.

Cheers,

/peter neubauer

Nicolas Delsaux

Feb 24, 2012, 9:28:10 AM

to Gremlin-users

On Feb 20, 22:21, Peter Neubauer <peter.neuba...@neotechnology.com>
wrote:

> Just as an update,
> running the test with higher loads and more runs DOES make a difference.
> Also, Lucene is (as usual) not the fastest index in micro-benchmarks,
> but it is one of the most consistent performers, which is why we
> often use it with Neo4j.
>
> I think Nicolas will come back with more results as things are progressing.
>

Yes, indeed.

With the help of Peter and Michael, I discovered (running some tests
with 10k objects and more) that writes can in fact go as fast as
1 ms per vertex, which is kinda cool.

So thanks to Michael and Peter for their help.
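
For a rough sense of scale, here is a back-of-the-envelope sketch combining the 1 ms per vertex figure with the "one node per property" layout described at the start of the thread. It assumes one extra vertex for the object itself, and the 5 properties per object is an invented number purely for illustration; edge and index writes are counted but not timed, since the thread gives no figure for them.

```java
// Hypothetical estimate, not a measurement.
final class WriteEstimate {

    /** Vertices written: one per property plus (assumed) one for the object itself. */
    static long vertexWrites(long objects, int propertiesPerObject) {
        return objects * (1L + propertiesPerObject);
    }

    /** Edges written: one per link between a property and its parent object. */
    static long edgeWrites(long objects, int propertiesPerObject) {
        return objects * propertiesPerObject;
    }

    /** Time spent on vertex writes alone, in seconds. */
    static double estimatedSeconds(long vertexWrites, double millisPerVertex) {
        return vertexWrites * millisPerVertex / 1000.0;
    }

    public static void main(String[] args) {
        long objects = 10_000;          // "10k objects", from the post
        int propertiesPerObject = 5;    // assumption, not from the post
        double millisPerVertex = 1.0;   // best case reported in the post

        long vertices = vertexWrites(objects, propertiesPerObject);
        long edges = edgeWrites(objects, propertiesPerObject);

        System.out.println("vertex writes: " + vertices);
        System.out.println("edge writes:   " + edges);
        System.out.println("vertex time (s), best case: "
                + estimatedSeconds(vertices, millisPerVertex));
    }
}
```

Under those assumptions, 10k objects mean 60,000 vertex writes, or about a minute of vertex writes at the best-case rate.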
