Hello,
I am following the advice of Rik,
and it is really promising!
I still have issues when using my own custom CSV with the batch importer:
Exception in thread "main" org.neo4j.graphdb.NotFoundException: id=39
at org.neo4j.unsafe.batchinsert.BatchInserterImpl.getNodeRecord(BatchInserterImpl.java:1215)
at org.neo4j.unsafe.batchinsert.BatchInserterImpl.createRelationship(BatchInserterImpl.java:777)
at org.neo4j.batchimport.Importer.importRelationships(Importer.java:154)
at org.neo4j.batchimport.Importer.doImport(Importer.java:232)
at org.neo4j.batchimport.Importer.main(Importer.java:83)
I think I am getting closer to this mega-import! (I really hope so :P)
Could you please help me figure out what the problem may be?
My hypotheses:
1. I thought it was because the importer cannot find a node that is listed as the start or end of a relationship.
So I checked my nodes.csv and rels.csv and made trivial files with two nodes and one relationship, but I still got the error.
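A check along these lines can confirm or rule out hypothesis 1. It is only a minimal sketch in Python: it assumes tab-separated files, a header row, node ids in the first column of nodes.csv, and start/end ids in the first two columns of rels.csv. The label `person` and the type `KNOWS` in the sample data are hypothetical placeholders.

```python
import csv
import io


def find_missing_endpoints(nodes_file, rels_file, delimiter="\t"):
    """Return (start, end) pairs whose start or end id is not a known node id."""
    node_reader = csv.reader(nodes_file, delimiter=delimiter)
    next(node_reader)  # skip the header row
    node_ids = {row[0] for row in node_reader if row}

    rel_reader = csv.reader(rels_file, delimiter=delimiter)
    next(rel_reader)  # skip the header row
    missing = []
    for row in rel_reader:
        if not row:
            continue
        start, end = row[0], row[1]
        if start not in node_ids or end not in node_ids:
            missing.append((start, end))
    return missing


# Tiny in-memory example; "person" and "KNOWS" are made-up values.
nodes_csv = io.StringIO(
    "id:int\tmynamelabel:label\tname:string:mynodeindex\n"
    "25\tperson\tmark\n"
    "39\tperson\tjulie\n"
)
rels_csv = io.StringIO(
    "id:int\tid:int\ttype\tproximity\tcounter:int\n"
    "25\t39\tKNOWS\t0.5\t1\n"
    "25\t40\tKNOWS\t0.1\t2\n"
)
missing = find_missing_endpoints(nodes_csv, rels_csv)
print(missing)
```

With real files, the two `io.StringIO` objects would be replaced by `open("nodes.csv", newline="")` and `open("rels.csv", newline="")`. An empty result means every relationship endpoint appears in nodes.csv, so the error must come from somewhere else.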
2. In the batch importer documentation, it is written that
- have to know max # of rels per node, properties per node and relationship
Where and how should this be specified? In nodes.csv or rels.csv?
Should the number of relationships be specified in the 'rels' column of nodes.csv, as in the test.db example?
That is not mentioned in the documentation example on git. I'm confused!
3. The documentation paragraph about the schema index is not clear to me: does it mean I can use the nodes.csv and rels.csv files used for test.db, and modify the header and the batch.properties file according to my own custom structure?
What does the counter:int property refer to?
Here is what I've done:
1. headers in nodes.csv and rels.csv
Nodes.csv headers:
id:int mynamelabel:label name:string:mynodeindex
Rels.csv headers:
id:int id:int type proximity counter:int
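To make the intended reading of these headers explicit, here is a minimal Python sketch that splits each header field into name, type, and index name. The `name:type:index` interpretation and the tab separator are assumptions taken from the headers above, not from the importer's source code, and `parse_header_field` is a hypothetical helper.

```python
def parse_header_field(field):
    """Split 'name[:type[:index]]' into a (name, type, index) tuple.

    Missing parts are returned as None; no default type is assumed.
    """
    parts = field.split(":")
    name = parts[0]
    field_type = parts[1] if len(parts) > 1 else None
    index_name = parts[2] if len(parts) > 2 else None
    return name, field_type, index_name


# The nodes.csv header from above, assumed to be tab-separated.
header = "id:int\tmynamelabel:label\tname:string:mynodeindex"
fields = [parse_header_field(f) for f in header.split("\t")]
for name, field_type, index_name in fields:
    print(name, field_type, index_name)
```

Under this reading, only the `name` column is tied to an index (`mynodeindex`); the `id` column carries a type but no index name.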
2. Indexes
I want to use my own indexes:
the node ids (specified as int) are unique
but not in sequential order. Is that an issue?
E.g. my list of nodes looks like:
25 mark
39 julie
What is the difference between an exact index and a fulltext index?
3. My batch.properties
dump_configuration=false
cache_type=none
use_memory_mapped_buffers=true
neostore.propertystore.db.index.keys.mapped_memory=5M
neostore.propertystore.db.index.mapped_memory=5M
# 14 bytes per node
neostore.nodestore.db.mapped_memory=200M
# 33 bytes per relationships
neostore.relationshipstore.db.mapped_memory=4G
# 38 bytes per property
neostore.propertystore.db.mapped_memory=200M
neostore.propertystore.db.strings.mapped_memory=500M
batch_array_separator=,
#batch_import.csv.quotes=true
#batch_import.csv.delim=,
batch_import.keep_db=true
#
batch_import.node_index.mynodeindex=exact
batch_import.node_index.id=exact
batch_import.node_index.node_auto_index=exact
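The comments in the configuration above give a size per record (14 bytes per node, 33 per relationship, 38 per property), so the mapped_memory values can be sanity-checked against the expected record counts. A rough sketch in Python; the counts below are hypothetical placeholders, not real figures for this dataset.

```python
# Record sizes taken from the comments in the configuration above.
BYTES_PER_NODE = 14
BYTES_PER_RELATIONSHIP = 33
BYTES_PER_PROPERTY = 38

MEGABYTE = 1024 * 1024


def mapped_memory_mb(record_count, bytes_per_record):
    """Return the store size in whole megabytes, rounded up."""
    total_bytes = record_count * bytes_per_record
    return -(-total_bytes // MEGABYTE)  # ceiling division


# Hypothetical record counts; replace with the real ones.
node_count = 10_000_000
relationship_count = 100_000_000
property_count = 5_000_000

node_mb = mapped_memory_mb(node_count, BYTES_PER_NODE)
rel_mb = mapped_memory_mb(relationship_count, BYTES_PER_RELATIONSHIP)
prop_mb = mapped_memory_mb(property_count, BYTES_PER_PROPERTY)

print(f"neostore.nodestore.db.mapped_memory={node_mb}M")
print(f"neostore.relationshipstore.db.mapped_memory={rel_mb}M")
print(f"neostore.propertystore.db.mapped_memory={prop_mb}M")
```

If the computed values are far above the configured ones, the stores will not fit in the mapped memory; far below, the settings are simply generous.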
P.S.
Once the db is loaded in Neo4j, are the constraints on properties and the indexes already present?
I was trying to match a node in test.db, but could not find it with a simple query:
MATCH (a {label:254782})-[r]-b Return r Limit 25
The query also takes a very long time to run, which makes me suspect that the indexes were not properly created.
Thank you very much for your help!