Neo4j 4.0.3 timeout with 4,000 transaction size sometimes

341 views
Skip to first unread message

rfc...@gmail.com

unread,
Sep 2, 2017, 9:37:13 PM9/2/17
to Neo4j
Hi,

Our Python 3 program using version 3.1.2 py2neo on Neo4j Community Edition version 3.0.3. with transaction size about 4000, i.e. we do a transaction commit after we put 4,000 Cypher statements to a transaction.

We open the py2neo Graph as this

from py2neo import Graph

g = Graph(bolt = False, user = .., password = ..)


Once in a long while we will get timeout error from Neo4j. Any idea why this kind of timeout happens? Are there any way to prevent this timeout errors from happening? 


(If I increase transaction size to 8000 or 9000, timeout occurs more often, at transaction size 4000 timeout happens only occasionally. But it we use transaction size less than 4000, the insertion throughput is affected, and we are not sure if timeout will not happen with transaction size 3000.)


Stack trace of py2neo, exception log from neo4j.log and debug.log of when the timeout happened are included below. 


Note that our Python log  time zone is 7 hours behind UTC.


Python 3 log including py2neo stack trace:

Sep  1 12:01:34 ddna-sjc 2017-09-01 12:01:34,886 - read_qosmos - INFO - /opt/trendmicro/ddna/engine/flow_processor.py, line 603: stack_list = traceback.extract_stack(limit=10) + \

Sep  1 12:01:34 ddna-sjc 2017-09-01 12:01:34,886 - read_qosmos - INFO - /opt/trendmicro/ddna/engine/flow_processor.py, line 4048: tx_b.commit()

Sep  1 12:01:34 ddna-sjc 2017-09-01 12:01:34,886 - read_qosmos - INFO - /opt/trendmicro/ddna/engine/venv-flow-processor/lib/python3.4/site-packages/py2neo/database/__init__.py, line 1059: self._post(commit=True)

Sep  1 12:01:34 ddna-sjc 2017-09-01 12:01:34,886 - read_qosmos - INFO - /opt/trendmicro/ddna/engine/venv-flow-processor/lib/python3.4/site-packages/py2neo/database/__init__.py, line 1220: rs = resource.post({"statements": self.statements})

Sep  1 12:01:34 ddna-sjc 2017-09-01 12:01:34,886 - read_qosmos - INFO - /opt/trendmicro/ddna/engine/venv-flow-processor/lib/python3.4/site-packages/py2neo/database/http.py, line 203: response = self.__base.post(body, headers, **kwargs)

Sep  1 12:01:34 ddna-sjc 2017-09-01 12:01:34,887 - read_qosmos - INFO - /opt/trendmicro/ddna/engine/venv-flow-processor/lib/python3.4/site-packages/py2neo/packages/httpstream/http.py, line 984: return rq.submit(**kwargs)

Sep  1 12:01:34 ddna-sjc 2017-09-01 12:01:34,887 - read_qosmos - INFO - /opt/trendmicro/ddna/engine/venv-flow-processor/lib/python3.4/site-packages/py2neo/packages/httpstream/http.py, line 362: raise SocketError(code, description, host_port=uri.host_port)

Sep  1 12:01:34 ddna-sjc 2017-09-01 12:01:34,887 - read_qosmos - WARNING - mp 0, ctx 1 Got an exception from commit(): timed out

Sep  1 12:01:34 ddna-sjc 2017-09-01 12:01:34,887 - read_qosmos - INFO - Type of Exception is <class 'py2neo.packages.httpstream.http.SocketError'>



from neo4j.log

2017-09-01 18:38:38.733+0000 WARN  badMessage: java.lang.IllegalStateException: too much data after closed for HttpChannelOverHttp@4b2cc33{r=5,c=false,a=IDLE,uri=-}

2017-09-01 19:01:35.111+0000 ERROR Failed to generate JSON output. java.util.concurrent.TimeoutException: Idle timeout expired: 30000/30000 ms

java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 30000/30000 ms

at org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:234)

at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:136)

at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:350)

at com.sun.jersey.spi.container.servlet.WebComponent$Writer.write(WebComponent.java:300)

at com.sun.jersey.spi.container.ContainerResponse$CommittingOutputStream.write(ContainerResponse.java:135)

at org.neo4j.server.rest.web.TransactionalService$InterruptingOutputStream.write(TransactionalService.java:283)

at org.codehaus.jackson.impl.Utf8Generator._flushBuffer(Utf8Generator.java:1754)

at org.codehaus.jackson.impl.Utf8Generator._writeFieldName(Utf8Generator.java:432)

at org.codehaus.jackson.impl.Utf8Generator.writeFieldName(Utf8Generator.java:282)

at org.codehaus.jackson.JsonGenerator.writeNumberField(JsonGenerator.java:956)

at org.neo4j.server.rest.transactional.ExecutionResultSerializer.writeStats(ExecutionResultSerializer.java:209)

at org.neo4j.server.rest.transactional.ExecutionResultSerializer.statementResult(ExecutionResultSerializer.java:115)

at org.neo4j.server.rest.transactional.TransactionHandle.executeStatements(TransactionHandle.java:316)

at org.neo4j.server.rest.transactional.TransactionHandle.commit(TransactionHandle.java:150)

at org.neo4j.server.rest.web.TransactionalService.lambda$executeStatementsAndCommit$28(TransactionalService.java:202)

at org.neo4j.server.rest.web.TransactionalService$$Lambda$169/499200393.write(Unknown Source)

at com.sun.jersey.core.impl.provider.entity.StreamingOutputProvider.writeTo(StreamingOutputProvider.java:71)

at com.sun.jersey.core.impl.provider.entity.StreamingOutputProvider.writeTo(StreamingOutputProvider.java:57)

at com.sun.jersey.spi.container.ContainerResponse.write(ContainerResponse.java:302)

at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1510)

at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419)

at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409)

at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409)

at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558)

at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733)

at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)

at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:808)

at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)

at org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)

at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)

at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)

at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221)

at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)

at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)

at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)

at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)

at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)

at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)

at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)

at org.eclipse.jetty.server.Server.handle(Server.java:497)

at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)

at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)

at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)

at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)

at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)

at java.lang.Thread.run(Thread.java:745)

Caused by: java.util.concurrent.TimeoutException: Idle timeout expired: 30000/30000 ms

at org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:161)

at org.eclipse.jetty.io.IdleTimeout$1.run(IdleTimeout.java:50)

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

at java.util.concurrent.FutureTask.run(FutureTask.java:266)

at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)

at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

... 1 more



from debug.log


2017-09-01 18:58:33.634+0000 INFO  [o.n.k.i.t.l.p.LogPruningImpl] Log Rotation [8720]:  Starting log pruning.

2017-09-01 18:58:33.635+0000 INFO  [o.n.k.i.t.l.p.LogPruningImpl] Log Rotation [8720]:  Log pruning complete.

2017-09-01 19:03:33.641+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by scheduler for time threshold [100911780]:  Starting check pointing...2017-09-01 19:03:33.641+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by scheduler for time threshold [100911780]:  Starting store flush...2017-09-01 19:03:33.778+0000 INFO  [o.n.k.i.s.c.CountsTracker] About to rotate counts store at transaction 100911780 to [/data/ddna/neo4j/data/databases/graph.db/neostore.counts.db.a], from [/data/ddna/neo4j/data/databases/graph.db/neostore.counts.db.b].2017-09-01 19:03:33.780+0000 INFO  [o.n.k.i.s.c.CountsTracker] Successfully rotated counts store at transaction 100911780 to [/data/ddna/neo4j/data/databases/graph.db/neostore.counts.db.a], from [/data/ddna/neo4j/data/databases/graph.db/neostore.counts.db.b].2017-09-01 19:03:37.653+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by scheduler for time threshold [100911780]:  Store flush completed2017-09-01 19:03:37.653+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by scheduler for time threshold [100911780]:  Starting appending check point entry into the tx log...2017-09-01 19:03:37.654+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by scheduler for time threshold [100911780]:  Appending check point entry into the tx log completed2017-09-01 19:03:37.654+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by scheduler for time threshold [100911780]:  Check pointing completed2017-09-01 19:03:37.654+0000 INFO  [o.n.k.i.t.l.p.LogPruningImpl] Log Rotation [8720]:  Starting log pruning.2017-09-01 19:03:37.655+0000 INFO  [o.n.k.i.t.l.p.LogPruningImpl] Log Rotation [8720]:  Log pruning complete.

2017-09-01 19:08:37.662+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by scheduler for time threshold [100911792]:  Starting check pointing...2017-09-01 19:08:37.662+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by scheduler for time threshold [100911792]:  Starting store flush...

2017-09-01 19:08:37.747+0000 INFO  [o.n.k.i.s.c.CountsTracker] About to rotate counts store at transaction 100911792 to [/data/ddna/neo4j/data/databases/graph.d

b/neostore.counts.db.b], from [/data/ddna/neo4j/data/databases/graph.db/neostore

.counts.db.a].

2017-09-01 19:08:37.749+0000 INFO  [o.n.k.i.s.c.CountsTracker] Successfully rotated counts store at transaction 100911792 to [/data/ddna/neo4j/data/databases/graph.db/neostore.counts.db.b], from [/data/ddna/neo4j/data/databases/graph.db/neostore.counts.db.a].


Thanks!

Ray

Michael Hunger

unread,
Sep 3, 2017, 12:22:39 AM9/3/17
to ne...@googlegroups.com
Would you mind raising an issue at github.com/neo4j/neo4j ?
Seems like a server side http issue.

Can you try to use bolt instead of http?

Michael


--
You received this message because you are subscribed to the Google Groups "Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email to neo4j+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

rfc...@gmail.com

unread,
Sep 5, 2017, 10:23:52 PM9/5/17
to Neo4j, michael...@neo4j.com
> Would you mind raising an issue at github.com/neo4j/neo4j ?
Just raised an issue at github.com/neo4j/neo4j .

> Can you try to use bolt instead of http?
Trying bolt in our environment now. Will report back if timeout happens with bolt or does not happen after a couple of days.

Thanks Michael.

Ray
To unsubscribe from this group and stop receiving emails from it, send an email to neo4j+un...@googlegroups.com.

Ray Chengr

unread,
Sep 6, 2017, 10:52:34 PM9/6/17
to Neo4j, michael...@neo4j.com
Three findings:
1) Tried to use bolt instead of http. The same error still happened.
2) Using bolt resulted in slower Neo4j insertion by three times than http in our busy insertion Neo4j-py2neo Python 3 case, i.e. the time period for exactly the same input data took three times longer than using http. So, we went back to http.
3) See similar error message from neo4j.log, the part above neo4j.server is different from the one I posted, but the part from neo4j.server is the same:
2017-09-07 00:24:50.554+0000 ERROR Failed to generate JSON output.
org.eclipse.jetty.io.EofException
    at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:192)
    at org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:408)
    at org.eclipse.jetty.io.WriteFlusher.write(WriteFlusher.java:302)
    at org.eclipse.jetty.io.AbstractEndPoint.write(AbstractEndPoint.java:129)
    at org.eclipse.jetty.server.HttpConnection$SendCallback.process(HttpConnection.java:684)
    at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:246)
    at org.eclipse.jetty.util.IteratingCallback.iterate(IteratingCallback.java:208)
    at org.eclipse.jetty.server.HttpConnection.send(HttpConnection.java:480)
    at org.eclipse.jetty.server.HttpChannel.sendResponse(HttpChannel.java:768)
    at org.eclipse.jetty.server.HttpChannel.write(HttpChannel.java:801)
    at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:142)
    at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:135)

    at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:350)
    at com.sun.jersey.spi.container.servlet.WebComponent$Writer.write(WebComponent.java:300)
    at com.sun.jersey.spi.container.ContainerResponse$CommittingOutputStream.write(ContainerResponse.java:135)
    at org.neo4j.server.rest.web.TransactionalService$InterruptingOutputStream.write(TransactionalService.java:283)
    at org.codehaus.jackson.impl.Utf8Generator._flushBuffer(Utf8Generator.java:1754)
    at org.codehaus.jackson.impl.Utf8Generator._writeFieldName(Utf8Generator.java:420)

    at org.codehaus.jackson.impl.Utf8Generator.writeFieldName(Utf8Generator.java:282)
    at org.codehaus.jackson.JsonGenerator.writeNumberField(JsonGenerator.java:956)
    at org.neo4j.server.rest.transactional.ExecutionResultSerializer.writeStats(ExecutionResultSerializer.java:209)
    at org.neo4j.server.rest.transactional.ExecutionResultSerializer.statementResult(ExecutionResultSerializer.java:115)
    at org.neo4j.server.rest.transactional.TransactionHandle.executeStatements(TransactionHandle.java:316)
    at org.neo4j.server.rest.transactional.TransactionHandle.commit(TransactionHandle.java:150)
    at org.neo4j.server.rest.web.TransactionalService.lambda$executeStatementsAndCommit$28(TransactionalService.java:202)
    at org.neo4j.server.rest.web.TransactionalService$$Lambda$164/1266049947.write(Unknown Source)
Caused by: java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.writev0(Native Method)
    at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51)
    at sun.nio.ch.IOUtil.write(IOUtil.java:148)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:504)
    at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:172)
    ... 55 more
2017-09-07 00:24:50.838+0000 ERROR Unable to reply to request, because the client has closed the connection (Broken pipe).
2017-09-07 00:24:50.839+0000 ERROR Unable to reply to request, because the client has closed the connection (Broken pipe).

Any idea?

Thanks,
Ray

Michael Hunger

unread,
Sep 12, 2017, 7:17:01 PM9/12/17
to Ray Chengr, Neo4j, Nigel Small
Can you try to run the same workload with a different driver? e.g. Javascript or Java ?

It seems that the client aborts the connection.

Michael

Ray Chengr

unread,
Sep 14, 2017, 4:33:41 AM9/14/17
to Neo4j
Our program/script is written in Python 3 using py2neo v3.1.2 to access Neo4j. Converting it to Java is possible but takes a lot of re-coding.

(Before I modify our Python 3 program to Java, I'll ask py2neo creator Nigel to take a look at this problem.)

Ray
Reply all
Reply to author
Forward
0 new messages