BatchInserter Deprecated?


Marko Rodriguez

Apr 28, 2012, 1:35:47 PM
to ne...@googlegroups.com
Hello,

I noticed that BatchInserter and BatchInserterIndexProvider are both deprecated. What are the new classes for this functionality?

Thank you,
Marko.

http://markorodriguez.com

Peter Neubauer

Apr 28, 2012, 3:54:41 PM
to ne...@googlegroups.com
Marko,
since the BatchInserter is more of an unsafe utility for initial
imports and has far fewer guarantees than the full API, we moved it to
https://github.com/neo4j/community/tree/master/kernel/src/main/java/org/neo4j/unsafe/batchinsert
where it better signals the risks you are taking ;)

Cheers,

/peter neubauer

G:  neubauer.peter
S:  peter.neubauer
P:  +46 704 106975
L:   http://www.linkedin.com/in/neubauer
T:   @peterneubauer

If you can write, you can code - @coderdojomalmo
If you can sketch, you can use a graph database - @neo4j

Pablo Pareja

Apr 28, 2012, 4:20:10 PM
to ne...@googlegroups.com
Does that mean it might no longer be supported after a few more releases?
Cheers,

Pablo

Michael Hunger

Apr 28, 2012, 4:23:52 PM
to ne...@googlegroups.com
Right now that's not on the horizon. The change only deprecated the code in the old package, which now delegates to its counterpart in the new package.

Michael

Pablo Pareja

Apr 28, 2012, 4:32:04 PM
to ne...@googlegroups.com
Ok.
By the way, do you have any numbers on how large the difference in insertion time actually is between the standard API and batch insertion mode?

Pablo

Michael Hunger

Apr 28, 2012, 4:39:47 PM
to ne...@googlegroups.com
depends on many factors:

- tx size
- do you add to indexes
- do you query indexes during insert
- available memory
- disk performance (for tx-flushes)

- transactions have to be kept in memory (thread-local), they use much more concurrency-aware data structures (e.g. ConcurrentHashMap, CHM), and they must be flushed to disk during commit

I had no issues inserting 3M nodes/s with the batch inserter on my Mac.

Pablo Pareja

Apr 28, 2012, 4:48:21 PM
to ne...@googlegroups.com
In my case:

- tx size --> (It would be very variable mainly because of the index retrieving/flushing operations)
- do you add to indexes --> YES
- do you query indexes during insert --> YES
- available memory --> I could have up to 32 GB RAM
- disk performance (for tx-flushes) --> I don't know....

Pablo

Michael Hunger

Apr 28, 2012, 5:01:17 PM
to ne...@googlegroups.com
Pablo,

perhaps you can find the time to write a quick data generator (in memory, no file reading), test it with both the batch inserter and the core API, and publish it as a GitHub project (and blog post)?

That would be awesome.
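
A minimal sketch of what such a generator and timing harness might look like, in plain Java with no Neo4j dependency. Everything here is an assumption for illustration: the `GraphSink` interface, the `CountingSink` stand-in, and the `run` method are hypothetical names, not part of any Neo4j API. For a real comparison, one adapter would wrap the batch inserter and another the core API, with `commit()` mapping to whatever "finish this transaction" means for each.

```java
import java.util.Random;

public class InsertBenchmark {

    /** Hypothetical abstraction over "something that stores a graph". */
    interface GraphSink {
        long createNode(String name);                  // returns the id of the new node
        void createRelationship(long from, long to);
        void commit();                                 // marks the end of one batch ("tx")
    }

    /** Stand-in sink that only counts calls; swap in real adapters to measure. */
    static final class CountingSink implements GraphSink {
        long nodes;
        long relationships;
        long commits;

        public long createNode(String name) { return nodes++; }
        public void createRelationship(long from, long to) { relationships++; }
        public void commit() { commits++; }
    }

    /**
     * Generates nodeCount nodes in memory, links each one to a random
     * earlier node (the first node links to itself), and calls commit()
     * every txSize nodes. Returns the elapsed time in nanoseconds.
     */
    static long run(GraphSink sink, int nodeCount, int txSize, long seed) {
        Random random = new Random(seed);              // fixed seed: same data for every sink
        long[] ids = new long[nodeCount];
        long start = System.nanoTime();
        for (int i = 0; i < nodeCount; i++) {
            ids[i] = sink.createNode("node-" + i);
            sink.createRelationship(ids[i], ids[random.nextInt(i + 1)]);
            if ((i + 1) % txSize == 0) {
                sink.commit();
            }
        }
        if (nodeCount % txSize != 0) {
            sink.commit();                             // flush the last partial batch
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        CountingSink sink = new CountingSink();
        long nanos = run(sink, 1_000_000, 10_000, 42L);
        System.out.printf("%d nodes, %d relationships, %d commits in %.1f ms%n",
                sink.nodes, sink.relationships, sink.commits, nanos / 1e6);
    }
}
```

Using the same seed for both adapters keeps the generated data identical, and varying `txSize` gives a way to explore the transaction-size factor listed earlier in the thread.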

Cheers

Michael

Pablo Pareja

Apr 28, 2012, 5:53:30 PM
to ne...@googlegroups.com
Hi Michael,

Yeah, that'd be a good idea. I'll get into it if I can find some time.

Cheers,

Pablo