I am importing 2.3 billion relationships from a table. The import is not very fast: I am getting about 5 million relationships per hour, so the migration would take roughly 20 days. I have heard about the Neo4j batch inserter and the batch import utility. The utility does interesting things by importing from a CSV file, but the latest code is somehow broken and does not run.
I already have about 100M relationships in Neo4j, and I have to check them all so that no duplicate relationship is created.
How can I speed things up in Neo4j?
My current code looks like this:
begin transaction
for 50K relationships
    create or get user node for user A
    create or get user node for user B
    check whether a KNOW relationship exists from A to B; if not, create the relationship
end transaction

I have also read the following:
http://docs.neo4j.org/chunked/milestone/batchinsert.html
http://stackoverflow.com/questions/13686850/how-to-speed-up-insertion-in-neo4j-from-mysql
--
--
--
java -server -Xmx30G -jar ../batch-import/target/batch-import-jar-with-dependencies.jar neo4j/data/graph.db nodes.csv rels.csv node_index index exact nodes_index.csv

dump_configuration=false
cache_type=none
use_memory_mapped_buffers=true
neostore.propertystore.db.index.keys.mapped_memory=5M
neostore.propertystore.db.index.mapped_memory=5M
neostore.nodestore.db.mapped_memory=5G
neostore.relationshipstore.db.mapped_memory=20G
neostore.propertystore.db.mapped_memory=5G
neostore.propertystore.db.strings.mapped_memory=100M
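Since the importer reads the relationships from rels.csv, one possible way to avoid the per-relationship "does it already exist" check is to remove duplicate (A, B) pairs before the file is written. The following is only a rough sketch of that idea, not part of the batch-import tool. The class name RelDedup and its methods are made up for illustration, it assumes the user ids are non-negative and fit in a 32-bit int, and the tab-separated start/end/type header is an assumption about the layout the importer expects, so check it against the tool's own documentation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: dedupes (A, B) pairs in memory before rels.csv is written.
class RelDedup {

    // Pack a directed pair (a -> b) into one long.
    // Assumption: both ids are non-negative and fit in 32 bits.
    static long pack(int a, int b) {
        return ((long) a << 32) | (b & 0xFFFFFFFFL);
    }

    static int start(long packed) {
        return (int) (packed >>> 32);
    }

    static int end(long packed) {
        return (int) packed;
    }

    // Sort the packed pairs, then keep only the first value of each run of equals.
    static long[] dedupe(long[] packed) {
        long[] sorted = packed.clone();
        Arrays.sort(sorted);
        int n = 0;
        for (int i = 0; i < sorted.length; i++) {
            if (i == 0 || sorted[i] != sorted[i - 1]) {
                sorted[n++] = sorted[i];
            }
        }
        return Arrays.copyOf(sorted, n);
    }

    // Render tab-separated lines; the header layout is an assumption.
    static List<String> toRelLines(long[] unique, String type) {
        List<String> lines = new ArrayList<>();
        lines.add("start\tend\ttype");
        for (long p : unique) {
            lines.add(start(p) + "\t" + end(p) + "\t" + type);
        }
        return lines;
    }

    public static void main(String[] args) {
        // The pair 1 -> 2 appears twice; 2 -> 1 is a different direction and is kept.
        long[] rows = { pack(1, 2), pack(3, 4), pack(1, 2), pack(2, 1) };
        for (String line : toRelLines(dedupe(rows), "KNOW")) {
            System.out.println(line);
        }
    }
}
```

This treats A to B and B to A as different relationships; if direction does not matter, the smaller id could be put first before packing. An in-memory sort like this only works for a chunk that fits in the heap, so for the full 2.3 billion rows an external sort of the source file is probably more practical.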
--
--