500 error code from /v1/task REST endpoint without http request log

642 views
Skip to first unread message

Changshu Liu

unread,
Jun 16, 2016, 1:06:03 AM6/16/16
to Presto
we are on version 1.43, from time to time, we see the following like errors

com.facebook.presto.operator.PageTransportErrorException:

but after checking the http-request log on 10.1.47.157
it seems that it never receives this request: v1/task/20160616_042548_00024_iyd5a.2.114/results/20160616_042548_00024_iyd5a.1.66/650,
the most recent related request is v1/task/20160616_042548_00024_iyd5a.2.114/results/20160616_042548_00024_iyd5a.1.66/649 which seems handled correctly.

what could be potential reason on why some other nodes get 500 error when accessing http://10.1.47.157:8080/v1/task/20160616_042548_00024_iyd5a.2.114/results/20160616_042548_00024_iyd5a.1.66/650?

Thanks,
Changshu

Changshu Liu

unread,
Jun 16, 2016, 1:20:12 AM6/16/16
to Presto
Just found a even more strange case, where the server received the http request and returned with 200 code, but the client still said it got 500 code:

The final error:
com.facebook.presto.operator.PageTransportErrorException: Expected response code to be 200, but was 500 Server Error: http://10.11.139.131:8080/v1/task/20160616_051012_00025_iyd5a.2.103/results/20160616_051012_00025_iyd5a.1.37/753

the http request log on 10.11.139.131:

2016-06-16T05:11:22.035Z        10.11.133.15      GET     /v1/task/20160616_051012_00025_iyd5a.2.103/results/20160616_051012_00025_iyd5a.1.37/753 null    null    200     0       17680   10      79dbee4a-3cce-4dd2-acdb-843152946a73

Changshu Liu

unread,
Jun 16, 2016, 2:49:10 AM6/16/16
to Presto

I retried the query a couple of times and the failure seems a little bit random (on when it happened), but it seems always happening on the same place: Jetty throws no http version exception.


Anyone has any insights on this?


Here is the call stack:


2016-06-16T06:01:22.542Z        ERROR   async-http-response-55  org.glassfish.jersey.server.ServerRuntime$Responder     An I/O error has occurred while writing a response message entity to the container output stream.

org.glassfish.jersey.server.internal.process.MappableException: io.airlift.slice.RuntimeIOException: java.io.IOException: org.eclipse.jetty.http.BadMessageException: 500: No version

        at org.glassfish.jersey.server.internal.MappableExceptionWrapperInterceptor.aroundWriteTo(MappableExceptionWrapperInterceptor.java:91)

        at org.glassfish.jersey.message.internal.WriterInterceptorExecutor.proceed(WriterInterceptorExecutor.java:162)

        at org.glassfish.jersey.message.internal.MessageBodyFactory.writeTo(MessageBodyFactory.java:1154)

        at org.glassfish.jersey.server.ServerRuntime$Responder.writeResponse(ServerRuntime.java:621)

        at org.glassfish.jersey.server.ServerRuntime$Responder.processResponse(ServerRuntime.java:377)

        at org.glassfish.jersey.server.ServerRuntime$Responder.process(ServerRuntime.java:367)

        at org.glassfish.jersey.server.ServerRuntime$AsyncResponder$3.run(ServerRuntime.java:828)

        at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)

        at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)

        at org.glassfish.jersey.internal.Errors.process(Errors.java:315)

        at org.glassfish.jersey.internal.Errors.process(Errors.java:297)

        at org.glassfish.jersey.internal.Errors.process(Errors.java:267)

        at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:297)

        at org.glassfish.jersey.server.ServerRuntime$AsyncResponder.resume(ServerRuntime.java:858)

        at org.glassfish.jersey.server.ServerRuntime$AsyncResponder.resume(ServerRuntime.java:820)

        at io.airlift.http.server.AsyncResponseHandler.lambda$toCompletionCallback$3(AsyncResponseHandler.java:101)

        at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)

        at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)

        at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)

        at io.airlift.concurrent.BoundedExecutor.executeOrMerge(BoundedExecutor.java:69)

        at io.airlift.concurrent.BoundedExecutor.access$000(BoundedExecutor.java:28)

        at io.airlift.concurrent.BoundedExecutor$1.run(BoundedExecutor.java:40)

        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

        at java.lang.Thread.run(Thread.java:745)

Caused by: io.airlift.slice.RuntimeIOException: java.io.IOException: org.eclipse.jetty.http.BadMessageException: 500: No version

        at io.airlift.slice.OutputStreamSliceOutput.writeBytes(OutputStreamSliceOutput.java:168)

        at com.facebook.presto.spi.block.VariableWidthBlockEncoding.writeBlock(VariableWidthBlockEncoding.java:59)

        at com.facebook.presto.block.BlockSerdeUtil.writeBlock(BlockSerdeUtil.java:50)

        at com.facebook.presto.block.PagesSerde$PagesWriter.append(PagesSerde.java:83)

        at com.facebook.presto.block.PagesSerde.writePages(PagesSerde.java:54)

        at com.facebook.presto.block.PagesSerde.writePages(PagesSerde.java:47)

        at com.facebook.presto.server.PagesResponseWriter.writeTo(PagesResponseWriter.java:91)

        at com.facebook.presto.server.PagesResponseWriter.writeTo(PagesResponseWriter.java:41)

        at org.glassfish.jersey.message.internal.WriterInterceptorExecutor$TerminalWriterInterceptor.invokeWriteTo(WriterInterceptorExecutor.java:265)

        at org.glassfish.jersey.message.internal.WriterInterceptorExecutor$TerminalWriterInterceptor.aroundWriteTo(WriterInterceptorExecutor.java:250)

        at org.glassfish.jersey.message.internal.WriterInterceptorExecutor.proceed(WriterInterceptorExecutor.java:162)

        at org.glassfish.jersey.server.internal.JsonWithPaddingInterceptor.aroundWriteTo(JsonWithPaddingInterceptor.java:106)

        at org.glassfish.jersey.message.internal.WriterInterceptorExecutor.proceed(WriterInterceptorExecutor.java:162)

        at org.glassfish.jersey.server.internal.MappableExceptionWrapperInterceptor.aroundWriteTo(MappableExceptionWrapperInterceptor.java:85)

        ... 24 more

Caused by: java.io.IOException: org.eclipse.jetty.http.BadMessageException: 500: No version

        at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:173)

        at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:415)

        at io.airlift.http.server.TimingFilter$TimedServletOutputStream.write(TimingFilter.java:162)

        at org.glassfish.jersey.servlet.internal.ResponseWriter$NonCloseableOutputStreamWrapper.write(ResponseWriter.java:298)

        at org.glassfish.jersey.message.internal.CommittingOutputStream.write(CommittingOutputStream.java:233)

        at org.glassfish.jersey.message.internal.WriterInterceptorExecutor$UnCloseableOutputStream.write(WriterInterceptorExecutor.java:299)

        at io.airlift.slice.CountingOutputStream.write(CountingOutputStream.java:56)

        at java.io.DataOutputStream.write(DataOutputStream.java:107)

        at io.airlift.slice.LittleEndianDataOutputStream.write(LittleEndianDataOutputStream.java:57)

        at io.airlift.slice.Slice.getBytes(Slice.java:546)

        at io.airlift.slice.OutputStreamSliceOutput.writeBytes(OutputStreamSliceOutput.java:165)

        ... 37 more

Caused by: org.eclipse.jetty.http.BadMessageException: 500: No version

        at org.eclipse.jetty.http.HttpGenerator.generateResponse(HttpGenerator.java:348)

        at org.eclipse.jetty.server.HttpConnection$SendCallback.process(HttpConnection.java:681)

        at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:241)

        at org.eclipse.jetty.util.IteratingCallback.iterate(IteratingCallback.java:224)

        at org.eclipse.jetty.server.HttpConnection.send(HttpConnection.java:512)

        at org.eclipse.jetty.server.HttpChannel.sendResponse(HttpChannel.java:668)

        at org.eclipse.jetty.server.HttpChannel.write(HttpChannel.java:722)

        at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:179)

        at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:163)

        ... 47 more

Changshu Liu

unread,
Jun 20, 2016, 11:56:02 PM6/20/16
to Presto
I Reported this to Jetty community: https://github.com/eclipse/jetty.project/issues/650

It seems to be a know issue that not fixed yet:

"On closer inspection, the problem appears to be an issue with jerseys asynchronous error handling. We have seen this kind of thing before from Jersey/presto before where they call asynchronous APIs after a request has been completed and recycled. The thread they have spawned needs to be excluded from writing to the response because something else has already completed the response and request cycle. We are actually working with the presto developers trying to work out the exact sequence that is causing this and improve the error handling."

Martin/Dan or other Presto developer, do you see this in Facebook use case? Any clue on how to mitigate this problem? It's a pretty often random errors in our cluster. Any clue on how to fix it is appreciated.

Thanks,
Changshu

Christopher Berner

unread,
Jun 21, 2016, 1:16:26 AM6/21/16
to Presto
Ya, I've been working with Greg from the Jetty team to try and track this down. I'm creating a simpler test case now, that I hope to be able to share later in the week.

--
You received this message because you are subscribed to the Google Groups "Presto" group.
To unsubscribe from this group and stop receiving emails from it, send an email to presto-users...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Changshu Liu

unread,
Jun 21, 2016, 1:23:07 AM6/21/16
to Presto
Christopher, thanks a lot, let me know if there is anything I can help with this problem.

Thanks,
Changshu

Martin Traverso

unread,
Jun 21, 2016, 2:50:32 PM6/21/16
to Presto
We believe this is an issue in Jersey. We've filed https://java.net/jira/browse/JERSEY-3125

Martin

Changshu Liu

unread,
Jun 21, 2016, 3:59:13 PM6/21/16
to Presto
Thanks Martin and Christopher for the quick followup. Do you have any suggestion and temp workaround to mitigate this problem before we have it fixed? 

Thanks,
Changshu

Christopher Berner

unread,
Jun 21, 2016, 4:18:14 PM6/21/16
to Presto
If you want a really ugly workaround you can probably hack PageResponseHandler.handle() to ignore 500 errors and blindly retry. I'm looking into replacing Jersey with RESTEasy which will hopefully fix this

chang...@gmail.com

unread,
Jun 21, 2016, 5:37:40 PM6/21/16
to Presto
Great, thanks a lot.

Thanks,
Changshu

Changshu Liu

unread,
Jun 26, 2016, 9:22:30 PM6/26/16
to Presto
> I'm looking into replacing Jersey with RESTEasy which will hopefully fix this.

Christopher, this seems like a big change. Did we make the decision to go this way? If so, when do we plan to ship Presto with RESTEasy?

Thanks,
Changshu

David Phillips

unread,
Jun 26, 2016, 10:50:59 PM6/26/16
to presto...@googlegroups.com
Chris wrote a test program to reproduce the error which led to the Jetty team finding and fixing the issue (it turned out to be a Jetty bug). The fix will be in the next Presto release.

Changshu Liu

unread,
Jun 27, 2016, 3:21:15 AM6/27/16
to presto...@googlegroups.com
Nice, thanks for the update David!


Thanks,
Changshu

--
You received this message because you are subscribed to a topic in the Google Groups "Presto" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/presto-users/AMm4bIUizTQ/unsubscribe.
To unsubscribe from this group and all its topics, send an email to presto-users...@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.



--
Thanks,
- Changshu


Reply all
Reply to author
Forward
0 new messages