USING PERIODIC COMMIT error

cui r

Mar 19, 2015, 10:46:12 AM

to ne...@googlegroups.com

Hi,

I am trying to use USING PERIODIC COMMIT to load a CSV file with 0.5 million rows, but I cannot get it to work. The statement and the error are:

USING PERIODIC COMMIT 1000 load csv with headers from "file:///.../data.csv" as csvline

merge (h:Holding{holdingId:csvline.holdingId}) set ...

The statement has been closed.

Neo.DatabaseError.Statement.ExecutionFailure

I couldn't find any errors in the logs (messages.log, stderr/stdout, or the HTTP log). I do see this statement in the HTTP log with return code 200.

Without using periodic commit, it takes about 50 seconds to load the file. With the periodic commit, the error returns within 10 seconds.

Thanks.

Rick

Michael Hunger

Mar 20, 2015, 8:43:19 AM

to ne...@googlegroups.com

Hey Rick, which version did you use for this?

I think it was fixed in a release after the version you have.

Michael


cui r

Mar 23, 2015, 3:45:45 PM

to ne...@googlegroups.com

Hi Michael,

I was using M04. I just tried RC01 as you suggested, and it worked.

However, it still does not solve my real problem. (Sorry for asking more, :-)).

I have 5.3 million records that I need to load into Neo4j (a cluster of 3 in dev and 6 in prod). A single file is too much, so I split it into 11 files of half a million records each.
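
For reference, the chunking step described above could look like the following Python sketch. It is not the script used in this thread: the function name `split_csv`, the `chunk` file stem, and buffering one chunk in memory at a time are all assumptions made for illustration. The header row is repeated in every chunk so that each file still works with "load csv with headers". At 500,000 rows per file, 5.3 million records produce 11 files.

```python
import csv
from itertools import count, islice
from pathlib import Path


def split_csv(source, out_dir, rows_per_chunk=500_000, stem="chunk"):
    """Split `source` into numbered CSV files, repeating the header in each.

    Returns the list of files written (hypothetical names: chunk1.csv, ...).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with open(source, newline="") as src:
        reader = csv.reader(src)
        header = next(reader, None)
        if header is None:  # empty input: nothing to split
            return written
        for index in count(1):
            # Take the next block of rows; an empty block means we are done.
            rows = list(islice(reader, rows_per_chunk))
            if not rows:
                break
            target = out_dir / f"{stem}{index}.csv"
            with open(target, "w", newline="") as dst:
                writer = csv.writer(dst)
                writer.writerow(header)
                writer.writerows(rows)
            written.append(target)
    return written
```

Using `csv.reader` rather than counting raw lines keeps quoted fields that contain line breaks intact.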

Each cluster node has 24 GB of memory.

With or without periodic commit, LOAD CSV quickly uses up all the memory and fails around the 7th or 8th file.

If I reboot the cluster after the 6th file, the rest load fine. (On startup, Neo4j uses around 3 GB of memory.)

At first sight, it looks like a memory leak?

Also, once it goes bad, the database seems to be corrupted: after a reboot, if I run another LOAD CSV, the error is:

Unable to begin transaction.

Neo.DatabaseError.Statement.ExecutionFailure

From the log, it seems that the server thinks its data is out of sync and then copies the store from the master.

Any suggestions?

Thanks in advance.

Rick

Michael Hunger

Mar 23, 2015, 4:00:13 PM

to ne...@googlegroups.com

Just send it to the master or to a single machine only.

Don't send writes to slaves.

Sent from my iPhone

cui r

Mar 23, 2015, 5:22:43 PM

to ne...@googlegroups.com

Yes, I did send them to the master node. I am monitoring memory with "top", so I can see the CPU usage jump when the request comes in.

I can see the memory steadily going up; once it reaches the 24 GB ceiling I set, it throws the error. That happens when loading of the 7th or 8th file starts.

Once the error is thrown, the Neo4j master somehow notices that its data store is corrupted and stops being the master, then copies the data from another node. Here is the relevant printout:

2015-03-23 21:04:03.238+0000 INFO [Cluster] The store is inconsistent. Will treat it as branched and fetch a new one from the master

2015-03-23 21:04:03.246+0000 INFO [Cluster] Instance 822760924 (this server) is unavailable as backup

2015-03-23 21:04:08.365+0000 INFO [Cluster] ServerId 822760924, moving to slave for master ha://10.49.220.86:6001?serverId=822761180

2015-03-23 21:04:08.426+0000 INFO [Cluster] Copying store from master

2015-03-23 21:04:08.938+0000 INFO [Cluster] Copying index.db

2015-03-23 21:04:08.938+0000 INFO [Cluster] Copied index.db 239.00 B

If you need more info, let me know.

Appreciate it.

Rick

Michael Hunger

Mar 23, 2015, 7:05:23 PM

to ne...@googlegroups.com

Can you also check messages.log and share the LOAD CSV statement?

The memory should not grow to exceed the heap, especially with periodic commit.

M

Sent from my iPhone

cui r

Mar 24, 2015, 9:27:15 AM

to ne...@googlegroups.com

I can see the memory increase with or without periodic commit: it starts at 4 GB and eventually reaches 24 GB.

The JVM settings are:

-server
-XX:NewSize=2g -XX:MaxNewSize=2g
-Xms12g -Xmx24g
-XX:PermSize=512M -XX:MaxPermSize=512M
-verbose:gc -Xloggc:${NEO4J_LOG}/gc.log
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime
-XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:CMSInitiatingOccupancyFraction=70

The LOAD CSV statement is:

USING PERIODIC COMMIT load csv with headers from "file:///home/rcui/plexus/plexus-holdings8.csv" as csvline

merge (h:Holding{holdingId:csvline.holdingId}) set h.reportDate=csvline.reportDate,h.fundId=csvline.fundId,h.fundName=csvline.fundName,h.firmName=csvline.companyName,h.cusip=csvline.cusip,h.secGroup=csvline.secGroup,h.positionDate=csvline.positionDate,h.marketValue=toFloat(csvline.marketValue),h.netChange=toFloat(csvline.marketValueChange),h.originalFace=toFloat(csvline.par)

where the number 8 is replaced by each file number from 1 to 11.

The messages log is clean up to file number 7; then this error is printed:
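
The eleven per-file statements could also be generated instead of edited by hand. Below is a minimal Python sketch, assuming the file naming shown above. `PROPERTIES` and `load_statement` are names made up for this illustration, and the script only builds the statement text; it does not send anything to Neo4j.

```python
# Property name on the :Holding node -> Cypher expression over the CSV row,
# copied from the statement quoted above.
PROPERTIES = {
    "reportDate": "csvline.reportDate",
    "fundId": "csvline.fundId",
    "fundName": "csvline.fundName",
    "firmName": "csvline.companyName",
    "cusip": "csvline.cusip",
    "secGroup": "csvline.secGroup",
    "positionDate": "csvline.positionDate",
    "marketValue": "toFloat(csvline.marketValue)",
    "netChange": "toFloat(csvline.marketValueChange)",
    "originalFace": "toFloat(csvline.par)",
}


def load_statement(file_number):
    """Build the LOAD CSV statement for one of the chunk files."""
    url = f"file:///home/rcui/plexus/plexus-holdings{file_number}.csv"
    assignments = ",".join(f"h.{prop}={expr}" for prop, expr in PROPERTIES.items())
    return (
        f'USING PERIODIC COMMIT load csv with headers from "{url}" as csvline\n'
        f"merge (h:Holding{{holdingId:csvline.holdingId}}) set {assignments}"
    )


# One statement per chunk file, numbered 1 to 11.
statements = [load_statement(n) for n in range(1, 12)]
```

Keeping the property mapping in one place makes it harder for the statements to drift apart between files.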

2015-03-23 20:58:48.220+0000 INFO [o.n.k.i.s.StoreFactory]: Successfully rotated counts store at transaction 51881 to [/local/kozo/neo4j/data/graph.db/neostore.counts.db.b], from [/local/kozo/neo4j/data/graph.db/neostore.counts.db.a].

2015-03-23 20:58:48.749+0000 INFO [o.n.k.LogRotationImpl]: Log Rotation [59]: Preparing new log file...

2015-03-23 20:58:48.749+0000 INFO [o.n.k.NeoStoreDataSource]: Opened logical log [/local/kozo/neo4j/data/graph.db/neostore.transaction.db.60] version=60, lastTxId=51881 (clean)

2015-03-23 20:58:48.750+0000 INFO [o.n.k.NeoStoreDataSource]: Finished rotating log version:59

2015-03-23 20:58:48.750+0000 INFO [o.n.k.i.t.l.p.LogPruning]: Log Rotation [59]: [84:qtp1788679549-84] Starting log pruning.

2015-03-23 20:58:48.751+0000 INFO [o.n.k.i.t.l.p.LogPruning]: Log Rotation [59]: [84:qtp1788679549-84] Log pruning complete.

2015-03-23 20:58:54.573+0000 WARN [o.n.k.h.HighlyAvailableGraphDatabase]: GC Monitor: Application threads blocked for 1418ms.

2015-03-23 20:58:59.654+0000 WARN [o.n.k.h.HighlyAvailableGraphDatabase]: GC Monitor: Application threads blocked for 576ms.

2015-03-23 20:59:09.621+0000 WARN [o.n.k.h.HighlyAvailableGraphDatabase]: GC Monitor: Application threads blocked for 458ms.

2015-03-23 20:59:12.236+0000 DEBUG [o.n.k.h.c.m.SlaveClient]: Thread[154, Neo4j HighlyAvailableGraphDatabase[/local/kozo/neo4j/data/graph.db]-3] Opened a new channel to /10.49.220.86:6001

2015-03-23 20:59:12.236+0000 DEBUG [o.n.k.h.c.m.SlaveClient]: Thread[153, Neo4j HighlyAvailableGraphDatabase[/local/kozo/neo4j/data/graph.db]-2] Opened a new channel to /10.49.220.87:6001

2015-03-23 20:59:12.236+0000 DEBUG [o.n.k.h.c.m.SlaveClient]: ResourcePool create resource ChannelContext{channel=[id: 0x216997a3, /10.49.220.85:54146 => /10.49.220.86:6001], output=DynamicChannelBuffer(ridx=0, widx=0, cap=256), input=java.nio.HeapByteBuffer[pos=0 lim=1048576 cap=1048576]}

2015-03-23 20:59:12.236+0000 DEBUG [o.n.k.h.c.m.SlaveClient]: ResourcePool create resource ChannelContext{channel=[id: 0x3424a938, /10.49.220.85:46193 => /10.49.220.87:6001], output=DynamicChannelBuffer(ridx=0, widx=0, cap=256), input=java.nio.HeapByteBuffer[pos=0 lim=1048576 cap=1048576]}

2015-03-23 20:59:15.494+0000 WARN [o.n.k.h.HighlyAvailableGraphDatabase]: GC Monitor: Application threads blocked for 2370ms.

2015-03-23 20:59:21.293+0000 WARN [o.n.k.h.HighlyAvailableGraphDatabase]: GC Monitor: Application threads blocked for 2569ms.

2015-03-23 20:59:22.493+0000 INFO [o.n.k.i.a.i.s.OnlineIndexSamplingJob]: Sampled index :Holding(firmName) with 6108 unique values in sample of avg size 6108 taken from index containing 6108 entries

2015-03-23 20:59:29.135+0000 WARN [o.n.k.h.HighlyAvailableGraphDatabase]: GC Monitor: Application threads blocked for 335ms.

2015-03-23 20:59:30.527+0000 INFO [o.n.k.i.a.i.s.OnlineIndexSamplingJob]: Sampled index :Holding(fundName) with 81048 unique values in sample of avg size 81048 taken from index containing 81048 entries

2015-03-23 20:59:32.238+0000 ERROR [o.n.k.h.HighlyAvailableGraphDatabase]: Slave 822761180: Replication commit threw communication exception:

org.neo4j.com.ComException: org.jboss.netty.handler.queue.BlockingReadTimeoutException

at org.neo4j.com.DechunkingChannelBuffer.readNext(DechunkingChannelBuffer.java:75) ~[neo4j-com-2.2.0-RC01.jar:2.2.0-RC01]

at org.neo4j.com.DechunkingChannelBuffer.readNextChunk(DechunkingChannelBuffer.java:93) ~[neo4j-com-2.2.0-RC01.jar:2.2.0-RC01]

at org.neo4j.com.DechunkingChannelBuffer.<init>(DechunkingChannelBuffer.java:59) ~[neo4j-com-2.2.0-RC01.jar:2.2.0-RC01]

at org.neo4j.com.Protocol.deserializeResponse(Protocol.java:234) ~[neo4j-com-2.2.0-RC01.jar:2.2.0-RC01]

at org.neo4j.com.Client.sendRequest(Client.java:225) ~[neo4j-com-2.2.0-RC01.jar:2.2.0-RC01]

at org.neo4j.com.Client.sendRequest(Client.java:207) ~[neo4j-com-2.2.0-RC01.jar:2.2.0-RC01]

at org.neo4j.kernel.ha.com.master.SlaveClient.pullUpdates(SlaveClient.java:69) ~[neo4j-ha-2.2.0-RC01.jar:2.2.0-RC01]

at org.neo4j.kernel.ha.transaction.CommitPusher.askSlaveToPullUpdates(CommitPusher.java:192) ~[neo4j-ha-2.2.0-RC01.jar:2.2.0-RC01]

at org.neo4j.kernel.ha.transaction.CommitPusher.access$000(CommitPusher.java:39) ~[neo4j-ha-2.2.0-RC01.jar:2.2.0-RC01]

at org.neo4j.kernel.ha.transaction.CommitPusher$1.run(CommitPusher.java:160) ~[neo4j-ha-2.2.0-RC01.jar:2.2.0-RC01]

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_25]

at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) ~[na:1.7.0_25]

at java.util.concurrent.FutureTask.run(FutureTask.java:166) ~[na:1.7.0_25]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_25]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_25]

at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_25]

Caused by: org.jboss.netty.handler.queue.BlockingReadTimeoutException: null

at org.jboss.netty.handler.queue.BlockingReadHandler.readEvent(BlockingReadHandler.java:232) ~[netty-3.6.3.Final.jar:na]

at org.jboss.netty.handler.queue.BlockingReadHandler.read(BlockingReadHandler.java:162) ~[netty-3.6.3.Final.jar:na]

at org.neo4j.com.DechunkingChannelBuffer.readNext(DechunkingChannelBuffer.java:66) ~[neo4j-com-2.2.0-RC01.jar:2.2.0-RC01]

... 16 common frames omitted

2015-03-23 20:59:32.238+0000 DEBUG [o.n.k.h.HighlyAvailableGraphDatabase]: Transaction 51882 couldn't commit on enough slaves, desired 2, but could only commit at 0

This is from the master's messages.log. There is nothing comparable in the slave logs, but I do see the same line there too:

[/local/kozo/neo4j/data/graph.db/neostore.transaction.db.60] version=60, lastTxId=51881 (clean)
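
To see how much time the master spent blocked in garbage collection before the replication timeout, the "GC Monitor" warnings in the excerpt above can be pulled out and totalled. The following is a rough Python sketch, assuming the line format shown in the log excerpt. The names `GC_PAUSE`, `gc_pauses` and `summarize` are made up for illustration and are not part of any Neo4j tooling.

```python
import re

# Matches lines such as:
# 2015-03-23 20:58:54.573+0000 WARN [...]: GC Monitor: Application threads blocked for 1418ms.
GC_PAUSE = re.compile(
    r"^(?P<timestamp>\S+ \S+) WARN .*"
    r"GC Monitor: Application threads blocked for (?P<ms>\d+)ms\."
)


def gc_pauses(lines):
    """Return (timestamp, pause_ms) for every GC Monitor warning in `lines`."""
    pauses = []
    for line in lines:
        match = GC_PAUSE.match(line.strip())
        if match:
            pauses.append((match.group("timestamp"), int(match.group("ms"))))
    return pauses


def summarize(pauses):
    """Return (count, total_ms, longest_ms) for a list from gc_pauses()."""
    durations = [ms for _, ms in pauses]
    return len(durations), sum(durations), max(durations, default=0)
```

For the six GC Monitor warnings quoted in the master log above, this gives 7726 ms of blocked time in total, with a longest pause of 2569 ms.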

In the slave logs, a few minutes later, I saw this:

2015-03-23 21:03:02.725+0000 INFO [o.n.c.p.h.HeartbeatContext]: 822761436(me) is now suspecting 822760924

2015-03-23 21:03:02.725+0000 INFO [o.n.c.p.h.HeartbeatContext]: 822761180 is now suspecting 822760924

2015-03-23 21:03:02.726+0000 WARN [o.n.c.p.e.HeartbeatReelectionListener]: instance 822760924 is being demoted since it failed

2015-03-23 21:03:02.726+0000 INFO [o.n.k.h.HighAvailabilityConsoleLogger]: Instance 822760924 has failed

2015-03-23 21:03:02.728+0000 DEBUG [o.n.k.h.c.HighAvailabilityMemberStateMachine]: Got memberIsFailed(822760924)

2015-03-23 21:03:02.729+0000 INFO [o.n.c.p.c.ClusterConfiguration]: Removed role coordinator from instance 822760924

The master log prints:

2015-03-23 21:04:03.072+0000 INFO [o.n.c.p.h.HeartbeatContext]: 822760924(me) is now suspecting 822761436

2015-03-23 21:04:03.076+0000 INFO [o.n.c.p.h.HeartbeatContext]: 822760924(me) is now suspecting 822761180

2015-03-23 21:04:03.077+0000 WARN [o.n.k.h.c.m.MasterServer]: Exception from Netty

java.nio.channels.ClosedChannelException: null

at org.jboss.netty.channel.socket.nio.AbstractNioWorker.cleanUpWriteBuffer(AbstractNioWorker.java:409) ~[netty-3.6.3.Final.jar:na]

at org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:349) ~[netty-3.6.3.Final.jar:na]

at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:81) ~[netty-3.6.3.Final.jar:na]

at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:36) ~[netty-3.6.3.Final.jar:na]

at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:54) ~[netty-3.6.3.Final.jar:na]

at org.jboss.netty.channel.Channels.close(Channels.java:812) ~[netty-3.6.3.Final.jar:na]

at org.jboss.netty.channel.AbstractChannel.close(AbstractChannel.java:197) ~[netty-3.6.3.Final.jar:na]

at org.neo4j.com.ChunkingChannelBuffer.operationComplete(ChunkingChannelBuffer.java:590) ~[neo4j-com-2.2.0-RC01.jar:2.2.0-RC01]

at org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:427) ~[netty-3.6.3.Final.jar:na]

at org.jboss.netty.channel.DefaultChannelFuture.notifyListeners(DefaultChannelFuture.java:413) ~[netty-3.6.3.Final.jar:na]

at org.jboss.netty.channel.DefaultChannelFuture.setFailure(DefaultChannelFuture.java:380) ~[netty-3.6.3.Final.jar:na]

at org.jboss.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:245) ~[netty-3.6.3.Final.jar:na]

at org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromSelectorLoop(AbstractNioWorker.java:157) ~[netty-3.6.3.Final.jar:na]

at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:113) ~[netty-3.6.3.Final.jar:na]

at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312) ~[netty-3.6.3.Final.jar:na]

at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88) ~[netty-3.6.3.Final.jar:na]

at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) ~[netty-3.6.3.Final.jar:na]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_25]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_25]

at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_25]

at org.neo4j.helpers.NamedThreadFactory$2.run(NamedThreadFactory.java:99) ~[neo4j-kernel-2.2.0-RC01.jar:2.2.0-RC01]

2015-03-23 21:04:03.084+0000 WARN [o.n.k.h.c.m.MasterServer]: Exception from Netty

java.io.IOException: Broken pipe

at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.7.0_25]

at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.7.0_25]

at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94) ~[na:1.7.0_25]

at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[na:1.7.0_25]

at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:466) ~[na:1.7.0_25]

at org.jboss.netty.channel.socket.nio.SocketSendBufferPool$UnpooledSendBuffer.transferTo(SocketSendBufferPool.java:203) ~[netty-3.6.3.Final.jar:na]

at org.jboss.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:198) ~[netty-3.6.3.Final.jar:na]