Benchmarking


Nigel Small

Apr 28, 2012, 7:37:59 PM
to ne...@googlegroups.com
Since there has been quite a lot of discussion around performance recently, particularly with regard to REST, I have pushed a few scripts to GitHub that I am currently using to help tune py2neo. The tests I'm currently running compare batch insertion with Cypher insertion. If anyone's interested in having a look, the repository is at:

https://github.com/nigelsmall/neobench

Nige

P.S. the results from these tests are also hopefully going to form the basis of an upcoming blog article!

Michael Hunger

Apr 28, 2012, 8:13:32 PM
to ne...@googlegroups.com
Nigel,

nice work.

Could you please change the following things for cypher:

#1 submit node data through parameters
#2 as cypher caches its parsed queries you shouldn't create a new gdb, and with that a new execution-engine, for each run
#3 try to use a foreach operation, submitting a collection of maps as a parameter: foreach (p in {all} : create {p})
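
As a rough sketch of what #1 and #3 ask for on the client side, the parameter map could be built as below. The class and method names (`ForeachParams`, `buildParams`) are made up for illustration, and only the query text comes from the message above; whether that query runs as-is on a given Neo4j version is not confirmed here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of neobench or Neo4j.
public class ForeachParams {
    // Query text exactly as quoted in the message above.
    static final String QUERY = "foreach (p in {all} : create {p})";

    // Builds {"all": [{"id": 0}, {"id": 1}, ...]} so that one query
    // receives every node's properties through a single parameter.
    static Map<String, Object> buildParams(int count) {
        List<Map<String, Object>> all = new ArrayList<Map<String, Object>>(count);
        for (int i = 0; i < count; i++) {
            Map<String, Object> props = new HashMap<String, Object>();
            props.put("id", i);
            all.add(props);
        }
        Map<String, Object> params = new HashMap<String, Object>();
        params.put("all", all);
        return params;
    }
}
```

The resulting map would be passed together with `QUERY` to whatever executes the Cypher statement, so the query string stays constant no matter how many nodes are created.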

Cheers

Michael

Michael Hunger

Apr 28, 2012, 8:39:36 PM
to ne...@googlegroups.com
Ok,

there seems to be an issue with the foreach, but node creation with parameters seems to be pretty fast:

+-------------------+
| No data returned. |
+-------------------+
Nodes created: 1000
Properties set: 1000
440 ms

Creating 1000 nodes took 467 ms.

here is the (ugly) test code. It creates the query once and reuses it.

create parameters as often as you'd like.

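    @Test
    public void testCreateNodePerformance2() throws Exception {
        // Build one CREATE statement with COUNT nodes; each node takes its properties from a parameter.
        Map<String, Object> params = new HashMap<String, Object>(COUNT);
        StringBuilder query = new StringBuilder("create ");
        for (int i = 0; i < COUNT; i++) {
            params.put("_" + i, map("id", i));
            query.append(" n = {_" + i + "} ");
            if (i < COUNT - 1) query.append(", ");
        }
        final ExecutionEngine executionEngine = new ExecutionEngine(gdb);
        // warmup: the first execution parses the query so the timed run can reuse it
        executionEngine.execute(query.toString(), params);
        long time = System.currentTimeMillis();
        final ExecutionResult result = executionEngine.execute(query.toString(), params);
        // stop the clock before printing so output does not count towards the measurement
        long elapsed = System.currentTimeMillis() - time;
        System.out.println(result);
        System.out.println("Creating " + COUNT + " nodes took " + elapsed + " ms.");
    }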


On 29.04.2012 at 01:37, Nigel Small wrote:

Nigel Small

Apr 29, 2012, 4:29:44 AM
to ne...@googlegroups.com
Hi Michael

#1 Yep, no problem - will use your sample code
#2 I'm deliberately *not* doing this. Each test run creates exactly the same number of nodes and I want that to happen under identical conditions. If I allow caching to affect the results, later tests will be skewed by the tests which ran before them
#3 I'll wait to hear more on the issue you found

Nige

Michael Hunger

Apr 29, 2012, 4:34:00 AM
to ne...@googlegroups.com
#2 will hit you with a massive performance impact then, as cypher has to parse the query every time

Nigel Small

Apr 29, 2012, 4:54:11 AM
to ne...@googlegroups.com
Is there any documentation available on how and when Cypher caching kicks in? If I understand exactly what I'm aiming for, I can try to come up with a fair test which exploits it.
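
One possible shape for such a test, sketched here purely as an assumption and not taken from neobench: run the same task a fixed number of unmeasured warm-up times, then time each measured run separately, so every test starts from the same cache state. `BenchHarness` and `timeRuns` are hypothetical names.

```java
// Hypothetical timing harness; the task stands in for "execute the Cypher query".
public class BenchHarness {
    // Runs the task `warmup` times without timing, then `measured` times,
    // returning the elapsed nanoseconds of each measured run.
    static long[] timeRuns(Runnable task, int warmup, int measured) {
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long[] nanos = new long[measured];
        for (int i = 0; i < measured; i++) {
            long start = System.nanoTime();
            task.run();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }
}
```

With `warmup` set to 0 the first measured run includes whatever one-off cost the task carries, which makes it possible to report cold and warm figures separately rather than letting one skew the other.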

Michael Hunger

Apr 29, 2012, 5:14:13 AM
to ne...@googlegroups.com
I'm not sure. It caches the last 100 parsed queries in an LRU cache. This happens per execution engine instance.
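
To illustrate what LRU eviction means here, the following is a minimal sketch built on `java.util.LinkedHashMap`. It is not Neo4j's actual cache implementation; the class name `LruQueryCache` is invented, and the capacity of 100 mentioned above would simply be passed to the constructor.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a generic LRU map, not the class Cypher uses internally.
public class LruQueryCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruQueryCache(int capacity) {
        // accessOrder = true keeps entries ordered from least to most recently used
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; drop the least recently used entry when over capacity
        return size() > capacity;
    }
}
```

Under this model, a query string that is re-submitted before 100 other distinct queries pass through the same engine is found in the cache, while creating a new engine starts with an empty cache.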
