Unit test randomly fails: dropwizard-core/src/test/java/io/dropwizard/server/DefaultServerFactoryTest.java

342 views
Skip to first unread message

Brent Ryan

unread,
Jul 7, 2014, 12:10:12 PM7/7/14
to dropwiz...@googlegroups.com

If I run this test manually it works fine, but if I run it with "mvn release:prepare" then it fails randomly.  I can't figure out why yet... anyone hit this?



testGracefulShutdown(io.dropwizard.server.DefaultServerFactoryTest)  Time elapsed: 0.759 sec  <<< ERROR!

    java.util.concurrent.ExecutionException: java.net.SocketException: Connection reset

    at java.util.concurrent.FutureTask.report(FutureTask.java:122)

    at java.util.concurrent.FutureTask.get(FutureTask.java:188)

    at io.dropwizard.server.DefaultServerFactoryTest.testGracefulShutdown(DefaultServerFactoryTest.java:153)

    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

    at java.lang.reflect.Method.invoke(Method.java:606)

    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)

    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)

    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)

    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)

    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)

    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)

    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)

    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)

    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)

    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)

    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)

    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)

    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)

    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)

    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)

    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)

    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)

    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

    at java.lang.reflect.Method.invoke(Method.java:606)

    at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)

    at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)

    at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)

    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)

    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)

    Caused by: java.net.SocketException: Connection reset

    at java.net.SocketInputStream.read(SocketInputStream.java:196)

    at java.net.SocketInputStream.read(SocketInputStream.java:122)

    at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)

    at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)

    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)

    at sun.net.www.http.ChunkedInputStream.readAheadBlocking(ChunkedInputStream.java:552)

    at sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:609)

    at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:696)

    at java.io.FilterInputStream.read(FilterInputStream.java:133)

    at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3053)

    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)

    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)

    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)

    at java.io.InputStreamReader.read(InputStreamReader.java:184)

    at java.io.Reader.read(Reader.java:100)

    at com.google.common.io.CharStreams.copy(CharStreams.java:182)

    at com.google.common.io.CharStreams.toStringBuilder(CharStreams.java:229)

    at com.google.common.io.CharStreams.toString(CharStreams.java:200)

    at io.dropwizard.server.DefaultServerFactoryTest$2.call(DefaultServerFactoryTest.java:125)

    at io.dropwizard.server.DefaultServerFactoryTest$2.call(DefaultServerFactoryTest.java:119)

    at java.util.concurrent.FutureTask.run(FutureTask.java:262)

    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)

    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)

    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

    at java.lang.Thread.run(Thread.java:744)

Brent Ryan

unread,
Jul 7, 2014, 2:19:12 PM7/7/14
to dropwiz...@googlegroups.com
Sometimes it takes a while to repro, but you can run something like this for a while and you should see some failures:

bryan:dropwizard-core bryan$ while true; do mvn test | grep FAIL; done;

Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.123 sec <<< FAILURE!

[INFO] BUILD FAILURE

It took several minutes for me to see the failure this last time...

arte...@gmail.com

unread,
Jul 7, 2014, 6:44:30 PM7/7/14
to dropwiz...@googlegroups.com
I confirm this issue. It happens about 7-8 times per 100 tests. The rate increases if debug logging is enabled.
As I believe there some race condition between a server shutdown and a thread waiting for a response to a submitted request.

02:38:35.919 [dw-844] DEBUG org.eclipse.jetty.io.WriteFlusher - write exception
org.eclipse.jetty.io.EofException: null
    at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:186) ~[jetty-io-9.0.7.v20131107.jar:9.0.7.v20131107]
    at org.eclipse.jetty.io.WriteFlusher.write(WriteFlusher.java:335) ~[jetty-io-9.0.7.v20131107.jar:9.0.7.v20131107]
    at org.eclipse.jetty.io.AbstractEndPoint.write(AbstractEndPoint.java:125) [jetty-io-9.0.7.v20131107.jar:9.0.7.v20131107]
    at org.eclipse.jetty.server.HttpConnection$ContentCallback.process(HttpConnection.java:804) [jetty-server-9.0.7.v20131107.jar:9.0.7.v20131107]
    at org.eclipse.jetty.util.IteratingCallback.iterate(IteratingCallback.java:81) [jetty-util-9.0.7.v20131107.jar:9.0.7.v20131107]
    at org.eclipse.jetty.server.HttpConnection.send(HttpConnection.java:373) [jetty-server-9.0.7.v20131107.jar:9.0.7.v20131107]
    at org.eclipse.jetty.server.HttpChannel.sendResponse(HttpChannel.java:654) [jetty-server-9.0.7.v20131107.jar:9.0.7.v20131107]
    at org.eclipse.jetty.server.HttpChannel.write(HttpChannel.java:684) [jetty-server-9.0.7.v20131107.jar:9.0.7.v20131107]
    at org.eclipse.jetty.server.HttpOutput.close(HttpOutput.java:119) [jetty-server-9.0.7.v20131107.jar:9.0.7.v20131107]
    at org.eclipse.jetty.server.Response.closeOutput(Response.java:797) [jetty-server-9.0.7.v20131107.jar:9.0.7.v20131107]
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:342) [jetty-server-9.0.7.v20131107.jar:9.0.7.v20131107]
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:246) [jetty-server-9.0.7.v20131107.jar:9.0.7.v20131107]
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.run(AbstractConnection.java:358) [jetty-io-9.0.7.v20131107.jar:9.0.7.v20131107]
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:601) [jetty-util-9.0.7.v20131107.jar:9.0.7.v20131107]
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:532) [jetty-util-9.0.7.v20131107.jar:9.0.7.v20131107]
    at java.lang.Thread.run(Thread.java:722) [na:1.7.0_10-ea]
Caused by: java.nio.channels.ClosedChannelException: null
    at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:249) ~[na:1.7.0_10-ea]
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:440) ~[na:1.7.0_10-ea]
    at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:165) ~[jetty-io-9.0.7.v20131107.jar:9.0.7.v20131107]

And latches don't help for some reason.

Artem

arte...@gmail.com

unread,
Jul 9, 2014, 12:22:45 PM7/9/14
to dropwiz...@googlegroups.com
Figure it out.
Seems to be a bug in Jetty. https://bugs.eclipse.org/bugs/show_bug.cgi?id=420142
Current Jetty version in Dropwizard 9.0.7.v20131107 so it affects this test.
Jetty 9 doesn't respect requests awaiting for execution and just closes connectors immediately. Graceful period is used only for responses in process.
Jetty 8 had another policy, and this test was written at time when Dropwizard 0.6.2 was the last version (it used Jetty 8).
https://github.com/dropwizard/dropwizard/commit/787988fe8d95579349ae590a26ff4747bb7f0740

So in a current situation test success is really depends on who will win the race (a shutdown thread or a Jetty worker thread, is waiting on shutdownInvoked latch). 

Success test:
https://gist.github.com/anonymous/77142133a02b39a4f5e4
See the lines 157-167. Worker thread won a race and the request was serviced

Fail test:
https://gist.github.com/anonymous/e113d6c5b92958d96233
See the line 166. Shutdown thread won the race. So when the response is writing to a socket, it's already closed as we see at the 186 line.

I think, it's best to remove this test from a suite while this bug is not be fixed in Jetty and Dropwizard is not update to an appropriate Jetty version.
I will try to create a pull request to the Dropwizard team for this.


Artem

On Monday, July 7, 2014 8:10:12 PM UTC+4, Brent Ryan wrote:

Brent Ryan

unread,
Jul 13, 2014, 1:23:02 PM7/13/14
to dropwiz...@googlegroups.com
Time to upgrade to Jetty Stable 9.2.1.v20140609.  Woot!


--
You received this message because you are subscribed to a topic in the Google Groups "dropwizard-user" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/dropwizard-user/D1dXRyeeceM/unsubscribe.
To unsubscribe from this group and all its topics, send an email to dropwizard-us...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Mark Heimann

unread,
Oct 29, 2014, 3:01:27 PM10/29/14
to dropwiz...@googlegroups.com
Do you guy see any way this could cause a running Dropwizard process to die? Because I think I have seen this twice now in the last couple of days, and the DEBUG logging is cranked up all the way.

The exception before it went silent was:

DEBUG [2014-10-29 15:45:01,503] [dw-27-acceptor-1-application@1c529aa0{HTTP/1.1}{xx.xx.xxx.xxx:8888}] org.eclipse.jetty.server.ServerConnector:  java.nio.channels.ClosedChannelException: null
    at sun
.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:226)
    at org
.eclipse.jetty.server.ServerConnector.accept(ServerConnector.java:321)
    at org
.eclipse.jetty.server.AbstractConnector$Acceptor.run(AbstractConnector.java:460)
    at org
.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:601)
    at org
.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:532)
    at java
.lang.Thread.run(Thread.java:745)



I saw a couple of those from different acceptor threads as well as a few AsynchronousCloseExceptions also from acceptor threads:

DEBUG [2014-10-29 15:45:01,504] [dw-28-acceptor-2-application@1c529aa0{HTTP/1.1}{xx.xx.xxx.xxx:8888}] org.eclipse.jetty.server.ServerConnector:  java.nio.channels.AsynchronousCloseException: null
    at java
.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:205)
    at sun
.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:248)
    at org
.eclipse.jetty.server.ServerConnector.accept(ServerConnector.java:321)
    at org
.eclipse.jetty.server.AbstractConnector$Acceptor.run(AbstractConnector.java:460)
    at org
.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:601)
    at org
.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:532)
    at java
.lang.Thread.run(Thread.java:745)


After those exceptions I see the following shutdown logs in the log file:
 
DEBUG [2014-10-29 15:45:01,504] [Thread-12] org.eclipse.jetty.util.component.AbstractLifeCycle: stopping org.eclipse.jetty.server.ServerConnector$ServerConnectorManager@45636429  
DEBUG
[2014-10-29 15:45:01,504] [Thread-12] org.eclipse.jetty.util.component.AbstractLifeCycle: stopping org.eclipse.jetty.io.SelectorManager$ManagedSelector@16cf2c2f keys=1 selected=0  
DEBUG
[2014-10-29 15:45:01,504] [Thread-12] org.eclipse.jetty.io.SelectorManager: Stopping org.eclipse.jetty.io.SelectorManager$ManagedSelector@16cf2c2f keys=1 selected=0  
DEBUG
[2014-10-29 15:45:01,505] [Thread-12] org.eclipse.jetty.io.SelectorManager: Queued change org.eclipse.jetty.io.SelectorManager$ManagedSelector$Stop@3cd2640  
... it goes on ...


I have another instance of this service running with the same Dropwizard version (0.7.1) and it has not yet crashed a single time, but it's not running with DEBUG logging.

Any thoughts on this?

Cheers,
Mark
Reply all
Reply to author
Forward
0 new messages