batch importer arrays + 2.0 VS 2.1


gg4u

Aug 22, 2014, 10:59:14 AM
to ne...@googlegroups.com
Hello,

I am following the documentation for batch importer 2.0 and 2.1.

I get different errors in the two versions, and I cannot interpret them.

I am also trying to load arrays of strings (comma-separated) as node properties, along with a full description.

Here are some results:
V. 2.0

The importer does not start and fails with:

Total import time: 1 seconds
Exception in thread "main" java.lang.ClassCastException: [Ljava.lang.String; cannot be cast to java.lang.Comparable
    at org.mapdb.Fun$Tuple2.compareTo(Fun.java:75)
    at org.mapdb.Utils$1.compare(Utils.java:44)
    at org.mapdb.Utils$1.compare(Utils.java:41)
    at org.mapdb.BTreeMap.findChildren(BTreeMap.java:543)
    at org.mapdb.BTreeMap.put2(BTreeMap.java:652)
    at org.mapdb.BTreeMap.put(BTreeMap.java:607)
    at org.mapdb.BTreeMap$KeySet.add(BTreeMap.java:1588)
    at org.neo4j.batchimport.index.MapDbCachingIndexProvider$CachingBatchInserterIndex.add(MapDbCachingIndexProvider.java:76)
    at org.neo4j.batchimport.Importer.importNodes(Importer.java:109)
    at org.neo4j.batchimport.Importer.doImport(Importer.java:228)
    at org.neo4j.batchimport.Importer.main(Importer.java:83)

V. 2.1 (superfast)

The importer does start, with errors, but it runs incredibly slowly, going line by line:
Exception in thread "main" java.lang.ClassCastException: [Ljava.lang.String; cannot be cast to java.lang.Comparable
at org.mapdb.Fun$Tuple2.compareTo(Fun.java:75)
at org.mapdb.Utils$1.compare(Utils.java:44)
at org.mapdb.Utils$1.compare(Utils.java:41)
at org.mapdb.BTreeMap.findChildren(BTreeMap.java:543)
at org.mapdb.BTreeMap.put2(BTreeMap.java:652)
at org.mapdb.BTreeMap.put(BTreeMap.java:607)
at org.mapdb.BTreeMap$KeySet.add(BTreeMap.java:1588)
at org.neo4j.batchimport.index.MapDbCachingIndexProvider$CachingBatchInserterIndex.add(MapDbCachingIndexProvider.java:76)
at org.neo4j.batchimport.Importer.importNodes(Importer.java:109)
at org.neo4j.batchimport.Importer.doImport(Importer.java:228)
at org.neo4j.batchimport.Importer.main(Importer.java:83)
Luigi-Assoms-MacBook-Pro:batch-import-20 gg4u$ cd ..
Luigi-Assoms-MacBook-Pro:TD gg4u$ cd batch_importer_21/
Luigi-Assoms-MacBook-Pro:batch_importer_21 gg4u$ ./import.sh taste_it2.db -nodes node_td2.csv -rels rels_td.csv
Neo4j Data Importer
Importer -db-directory <graph.db> -nodes <nodes.csv> -rels <rels.csv> -debug <debug config>

Using Existing Configuration File
[Current time:2014-08-22 16:54:38.716][Compile Time:Importer $ batch-import-2.1.0 $ 31/05/2014 04:12:24]
Node Import: [1] Property[22417] Node[14632] Relationship[0] Label[1] Disk[1 mb,
Node Import: [2] Property[28339] Node[14632] Relationship[0] Label[1] Disk[1 mb,
Node Import: [3] Property[32107] Node[16537] Relationship[0] Label[1] Disk[1 mb,
Node Import: [4] Property[32107] Node[16537] Relationship[0] Label[1] Disk[1 mb, 0 mb/sec] FreeMem[3468 mb]
[2014-08-22 16:54:45.597] Node file [node_td2.csv] imported in 6 secs - [Property[32107] Node[16537] Relationship[0] Label[1]]
[2014-08-22 16:54:45.597]Node Import complete in 6 secs - [Property[32107] Node[16537] Relationship[0] Label[1]]
org.neo4j.kernel.api.Exceptions.BatchImportException: Index out of bounds:16537
at org.neo4j.batchimport.importer.structs.ExtendableLongCache.get(ExtendableLongCache.java:40)
at org.neo4j.batchimport.importer.structs.NodesCache.getField(NodesCache.java:47)
at org.neo4j.batchimport.importer.structs.NodesCache.changeCount(NodesCache.java:57)
at org.neo4j.batchimport.importer.structs.NodesCache.incrementCount(NodesCache.java:65)
at org.neo4j.unsafe.batchinsert.BatchInserterImplNew.accumulateNodeCount(BatchInserterImplNew.java:746)
at org.neo4j.batchimport.importer.stages.NodeStatsAccumulatorStage$2.execute(NodeStatsAccumulatorStage.java:24)
at org.neo4j.batchimport.importer.stages.ImportWorker.processData(ImportWorker.java:144)
at org.neo4j.batchimport.importer.stages.ImportWorker.run(ImportWorker.java:196)
Invoke stage method failed:ImportNode_Stage1:[Error in accumulateNodeCount - Index out of bounds:16537]:1
org.neo4j.kernel.api.Exceptions.BatchImportException: [Error in accumulateNodeCount - Index out of bounds:16537]
at org.neo4j.batchimport.importer.stages.ImportWorker.processData(ImportWorker.java:152)
at org.neo4j.batchimport.importer.stages.ImportWorker.run(ImportWorker.java:196)
Import worker:ImportNode_Stage1:[Error in accumulateNodeCount - Index out of bounds:16537]
Uncaught exception: java.lang.RuntimeException: [Error in accumulateNodeCount - Index out of bounds:16537]
Exception in getBuffer:java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:389)null
Exception in getBuffer:java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:389)null
org.neo4j.kernel.api.Exceptions.BatchImportException: Exception in getBuffer:java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:389)null
at org.neo4j.batchimport.importer.stages.ImportWorker.readData(ImportWorker.java:127)
at org.neo4j.batchimport.importer.stages.ImportWorker.run(ImportWorker.java:194)
Import worker:ImportNode_Stage4:Exception in getBuffer:java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:389)null
org.neo4j.kernel.api.Exceptions.BatchImportException: Exception in getBuffer:java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:389)null

Uncaught exception: java.lang.RuntimeException: Exception in getBuffer:java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:389)null
at org.neo4j.batchimport.importer.stages.ImportWorker.readData(ImportWorker.java:127)
at org.neo4j.batchimport.importer.stages.ImportWorker.run(ImportWorker.java:194)
Import worker:ImportNode_Stage4:Exception in getBuffer:java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:389)null
Uncaught exception: java.lang.RuntimeException: Exception in getBuffer:java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:389)null
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:321)
at org.neo4j.batchimport.importer.structs.DataBufferBlockingQ.putBuffer(DataBufferBlockingQ.java:243)
at org.neo4j.batchimport.importer.stages.ImportWorker.writeData(ImportWorker.java:160)
at org.neo4j.batchimport.importer.stages.ImportWorker.run(ImportWorker.java:198)
Import worker:ImportNode_Stage0:null
Uncaught exception: java.lang.RuntimeException
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:321)
at org.neo4j.batchimport.importer.structs.DataBufferBlockingQ.putBuffer(DataBufferBlockingQ.java:243)
at org.neo4j.batchimport.importer.stages.ImportWorker.writeData(ImportWorker.java:160)
at org.neo4j.batchimport.importer.stages.ImportWorker.run(ImportWorker.java:198)
Import worker:ImportNode_Stage0:null
Uncaught exception: java.lang.RuntimeException
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:321)
at org.neo4j.batchimport.importer.structs.DataBufferBlockingQ.putBuffer(DataBufferBlockingQ.java:243)
at org.neo4j.batchimport.importer.stages.ImportWorker.writeData(ImportWorker.java:160)
at org.neo4j.batchimport.importer.stages.ImportWorker.run(ImportWorker.java:198)
Import worker:ImportNode_Stage0:null
Uncaught exception: java.lang.RuntimeException
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:321)
at org.neo4j.batchimport.importer.structs.DataBufferBlockingQ.putBuffer(DataBufferBlockingQ.java:243)
at org.neo4j.batchimport.importer.stages.ImportWorker.writeData(ImportWorker.java:160)
at org.neo4j.batchimport.importer.stages.ImportWorker.run(ImportWorker.java:198)
Import worker:ImportNode_Stage0:null
Uncaught exception: java.lang.RuntimeException
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:321)
at org.neo4j.batchimport.importer.structs.DataBufferBlockingQ.putBuffer(DataBufferBlockingQ.java:243)
at org.neo4j.batchimport.importer.stages.ImportWorker.writeData(ImportWorker.java:160)
at org.neo4j.batchimport.importer.stages.ImportWorker.run(ImportWorker.java:198)
Import worker:ImportNode_Stage0:null
Uncaught exception: java.lang.RuntimeException
Relationship Prescan: [1] Property[32107] Node[16537] Relationship[0] Label[1] D
Relationship Prescan: [2] Property[32107] Node[16537] Relationship[0] Label[1] D
Relationship Prescan: [3] Property[32107] Node[16537] Relationship[0] Label[1] D
Relationship Prescan: [4] Property[32107] Node[16537] Relationship[0] Label[1] D
Relationship Prescan: [5] Property[32107] Node[16537] Relationship[0] Label[1] Disk[1 mb, 0 mb/sec] FreeMem[3024 mb]



About properties: my header looks like this:

id:int:myindex M24:label name:string:mynodename   properties1:string_array:myindexforproperties1 properties2: string:myindexforproperties2


Can you help me figure out what I am doing wrong?

Michael Hunger

Aug 28, 2014, 6:12:11 PM
to ne...@googlegroups.com
It seems you have an index on an array property, which is not recommended and is currently not supported.

You can work around the current limitation by setting:
batch_import.mapdb_cache.disable=true
Why do you index everything?
id:int:myindex<tab>M24:label<tab>name:string:mynodename<tab>properties1:string_array:myindexforproperties1<tab>properties2: string:myindexforproperties2
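
As a rough illustration of the first point, here is a minimal Python sketch that drops the index name from array-typed columns in a tab-separated header while leaving the other columns alone. The helper `strip_array_indexes` is hypothetical and not part of batch-import; it assumes each column has the form `name:type:index` seen in the header above, and that array types end in `_array`.

```python
# Hypothetical helper, not part of batch-import: removes the index name
# from array-typed columns (types ending in "_array") in a header line.
ARRAY_SUFFIX = "_array"


def strip_array_indexes(header_line):
    """Return the tab-separated header with indexes removed from array columns."""
    cleaned = []
    for column in header_line.split("\t"):
        parts = column.split(":")
        # parts is [name], [name, type] or [name, type, index]
        if len(parts) == 3 and parts[1].endswith(ARRAY_SUFFIX):
            parts = parts[:2]  # keep name and type, drop the index name
        cleaned.append(":".join(parts))
    return "\t".join(cleaned)


# Header modelled on the one in the question (assumed to be tab-separated).
header = "\t".join([
    "id:int:myindex",
    "M24:label",
    "name:string:mynodename",
    "properties1:string_array:myindexforproperties1",
    "properties2:string:myindexforproperties2",
])

print(strip_array_indexes(header))
```

Only the `properties1` column changes; whether the remaining indexes are actually needed is a separate question.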

Can you also share a few lines of your data?

Please use the current branch of the 2.1 batch-importer; the one you used was an experimental version, not meant for public consumption.


The superfast importer doesn't support external IDs or indexes, afaik, so it's probably not for you.

Cheers,

Michael
