java.io.IOException: Unknown error. response: RPCFileReadResponse...

Aaquib Khwaja

Mar 16, 2017, 4:50:08 AM
to Alluxio Users
I'm running Presto over Alluxio. I have certain scheduled queries that generally run fine, but sometimes fail with the following error.
I looked into the Alluxio code, at the class NettyUnderFileSystemFileReader.java that throws this error. It looks like it's related to some timeout, but I haven't been able to reach a conclusion.
Any help on this would be great!

2017-03-16T07:27:22.745Z        ERROR   remote-task-callback-5219       com.facebook.presto.execution.StageStateMachine Stage 20170316_071409_00062_qxzzt.3 failed
com.facebook.presto.spi.PrestoException: HDFS error reading from alluxio://alluxiomaster:19998/s3/d=14/h=05/min=30/part-r-00014-2d91cc29-35ab-4fef-988a-e495a3a35276.zlib.orc at position 32322522
        at com.facebook.presto.hive.orc.HdfsOrcDataSource.readInternal(HdfsOrcDataSource.java:61)
        at com.facebook.presto.orc.AbstractOrcDataSource.readFully(AbstractOrcDataSource.java:94)
        at com.facebook.presto.orc.AbstractOrcDataSource.readFully(AbstractOrcDataSource.java:85)
        at com.facebook.presto.orc.OrcReader.<init>(OrcReader.java:91)
        at com.facebook.presto.hive.orc.OrcPageSourceFactory.createOrcPageSource(OrcPageSourceFactory.java:159)
        at com.facebook.presto.hive.orc.OrcPageSourceFactory.createPageSource(OrcPageSourceFactory.java:105)
        at com.facebook.presto.hive.HivePageSourceProvider.createHivePageSource(HivePageSourceProvider.java:133)
        at com.facebook.presto.hive.HivePageSourceProvider.createPageSource(HivePageSourceProvider.java:88)
        at com.facebook.presto.spi.connector.classloader.ClassLoaderSafeConnectorPageSourceProvider.createPageSource(ClassLoaderSafeConnectorPageSourceProvider.java:44)
        at com.facebook.presto.split.PageSourceManager.createPageSource(PageSourceManager.java:56)
        at com.facebook.presto.operator.ScanFilterAndProjectOperator.getOutput(ScanFilterAndProjectOperator.java:222)
        at com.facebook.presto.operator.Driver.processInternal(Driver.java:378)
        at com.facebook.presto.operator.Driver.processFor(Driver.java:301)
        at com.facebook.presto.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:622)
        at com.facebook.presto.execution.TaskExecutor$PrioritizedSplitRunner.process(TaskExecutor.java:555)
        at com.facebook.presto.execution.TaskExecutor$Runner.run(TaskExecutor.java:691)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: java.io.IOException: Unknown error. response: RPCFileReadResponse{tempUfsFileId=1712055275208847097, offset=16777216, length=0, status=UFS_READ_FAILED}
        at alluxio.client.netty.NettyUnderFileSystemFileReader.read(NettyUnderFileSystemFileReader.java:127)
        at alluxio.client.file.UnderFileSystemFileInStream.directRead(UnderFileSystemFileInStream.java:200)
        at alluxio.client.file.UnderFileSystemFileInStream.updateBuffer(UnderFileSystemFileInStream.java:224)
        at alluxio.client.file.UnderFileSystemFileInStream.read(UnderFileSystemFileInStream.java:133)
        at alluxio.client.block.UnderStoreBlockInStream.read(UnderStoreBlockInStream.java:149)
        at alluxio.client.file.FileInStream.read(FileInStream.java:220)
        at alluxio.client.file.FileInStream.readCurrentBlockToPos(FileInStream.java:703)
        at alluxio.client.file.FileInStream.seekInternalWithCachingPartiallyReadBlock(FileInStream.java:648)
        at alluxio.client.file.FileInStream.seek(FileInStream.java:303)
        at alluxio.hadoop.HdfsFileInputStream.readWithoutPacketStreaming(HdfsFileInputStream.java:278)
        at alluxio.hadoop.HdfsFileInputStream.read(HdfsFileInputStream.java:224)
        at alluxio.hadoop.HdfsFileInputStream.readFully(HdfsFileInputStream.java:342)
        at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:95)
        at com.facebook.presto.hive.orc.HdfsOrcDataSource.readInternal(HdfsOrcDataSource.java:50)
        ... 18 more
Caused by: java.io.IOException: Unknown error. response: RPCFileReadResponse{tempUfsFileId=1712055275208847097, offset=16777216, length=0, status=UFS_READ_FAILED}
        at alluxio.client.netty.NettyUnderFileSystemFileReader.read(NettyUnderFileSystemFileReader.java:110)
        ... 31 more


Thanks,
Aaquib

Gene Pang

Mar 16, 2017, 9:29:56 AM
to Alluxio Users
Hi Aaquib,

How many Alluxio workers are you running? Could you go into the worker logs and see if there are any log messages there?

Thanks,
Gene

Aaquib Khwaja

Mar 18, 2017, 1:59:57 AM
to Alluxio Users
Hi Gene,

Thanks for the response. These are the logs that I'm getting in the worker.log file. I've attached 3 different files.

-
Aaquib
log1.log
log2.log
log3.log

Aaquib Khwaja

Mar 20, 2017, 2:38:31 AM
to Alluxio Users
Also, I'm using a 4-node cluster with one master and 3 workers.

Aaquib Khwaja

Mar 20, 2017, 8:04:48 AM
to Alluxio Users
We also run into the following error on the workers; it appears on subsequent runs of the query, after the first failed run.

ERROR logger.type (UnderFileSystemDataServerHandler.java:handleFileReadRequest) - Failed to read ufs file, may have been closed due to a client timeout.
javax.net.ssl.SSLProtocolException: Data received in non-data state: 6
        at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1109)
        at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930)
        at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
        at org.apache.commons.httpclient.ContentLengthInputStream.read(ContentLengthInputStream.java:170)
        at java.io.FilterInputStream.read(FilterInputStream.java:133)
        at org.apache.commons.httpclient.AutoCloseInputStream.read(AutoCloseInputStream.java:108)
        at alluxio.org.jets3t.service.io.InterruptableInputStream.read(InterruptableInputStream.java:78)
        at alluxio.org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream.read(HttpMethodReleaseInputStream.java:136)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
        at alluxio.underfs.s3.S3InputStream.read(S3InputStream.java:101)
        at com.google.common.io.CountingInputStream.read(CountingInputStream.java:62)
        at alluxio.underfs.ObjectUnderFileInputStream.read(ObjectUnderFileInputStream.java:75)
        at alluxio.worker.netty.UnderFileSystemDataServerHandler.handleFileReadRequest(UnderFileSystemDataServerHandler.java:83)
        at alluxio.worker.netty.DataServerHandler.channelRead0(DataServerHandler.java:78)
        at alluxio.worker.netty.DataServerHandler.channelRead0(DataServerHandler.java:43)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
        at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:831)
        at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:346)
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:254)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at java.lang.Thread.run(Thread.java:745)

Yupeng Fu

Mar 24, 2017, 4:13:07 PM
to Aaquib Khwaja, Alluxio Users
Hi Aaquib,

What's the size of your file in S3?

One possibility is that the default value of Presto's split size is too small and does not match Alluxio's block size. That may lead to multiple concurrent writes into the same block and therefore create conflicts.
Can you try increasing the value of "hive.max-split-size" in Presto to match Alluxio's block size (described here)?

Best,


--
You received this message because you are subscribed to the Google Groups "Alluxio Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to alluxio-users+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Aaquib Khwaja

Mar 27, 2017, 2:41:08 AM
to Alluxio Users, khwaja...@gmail.com
Hi Yupeng,

The average file size in S3 is around 45 MB. In Alluxio, the default block size (alluxio.user.block.size.bytes.default) is 512 MB.
Should I increase "hive.max-split-size", or should I bring down the default block size in Alluxio?
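
To make the split-size question concrete, here is a rough Java sketch of the arithmetic, under a deliberately simplified model: a file is cut into fixed-size splits starting at offset 0, and every split that overlaps an Alluxio block is a potential concurrent reader of that block. The class and method names (SplitMath, splitCount, splitsOnFirstBlock) are hypothetical, the 32 MB and 64 MB split sizes are illustrative values rather than confirmed Presto settings, and real split generation in Presto involves more factors than this model covers.

```java
// Simplified model only: fixed-size splits aligned at offset 0.
// All names here are hypothetical and exist just for this illustration.
final class SplitMath {
    static final long MB = 1024L * 1024L;

    /** Number of fixed-size splits needed to cover fileBytes (ceiling division). */
    static long splitCount(long fileBytes, long maxSplitBytes) {
        if (fileBytes < 0 || maxSplitBytes <= 0) {
            throw new IllegalArgumentException("fileBytes must be >= 0 and maxSplitBytes > 0");
        }
        return (fileBytes + maxSplitBytes - 1) / maxSplitBytes;
    }

    /** Number of splits that overlap the first Alluxio block of the file. */
    static long splitsOnFirstBlock(long fileBytes, long blockBytes, long maxSplitBytes) {
        // Only the part of the file that lives in the first block matters here.
        long bytesInFirstBlock = Math.min(fileBytes, blockBytes);
        return splitCount(bytesInFirstBlock, maxSplitBytes);
    }

    public static void main(String[] args) {
        long fileBytes = 45 * MB;   // average file size reported in this thread
        long blockBytes = 512 * MB; // Alluxio block size reported in this thread
        for (long splitMb : new long[] {32, 64, 512}) {
            long n = splitsOnFirstBlock(fileBytes, blockBytes, splitMb * MB);
            System.out.printf("max split = %d MB -> %d split(s) on the first block%n", splitMb, n);
        }
    }
}
```

In this model, a 45 MB file that fits in one 512 MB block is read by a single split once the split size is at least the file size, while a smaller split size (32 MB in the loop above) puts two splits on the same block.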

Thanks,
Aaquib

Aaquib Khwaja

Mar 31, 2017, 6:24:48 AM
to Alluxio Users, khwaja...@gmail.com
I tried increasing "hive.max-split-size" to 512MB (equal to the default block size in Alluxio), and the issue appears to be resolved.
Thanks.
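
For anyone landing here later, a minimal sketch of the two settings involved. The property names and the 512MB value come from this thread; the file locations are assumptions about a typical installation and may differ in yours.

```properties
# Presto Hive connector catalog file (assumed location: etc/catalog/hive.properties)
# Raised to match the Alluxio block size, as described above.
hive.max-split-size=512MB

# Alluxio site properties (assumed location: conf/alluxio-site.properties)
# Shown only for reference; 512MB is the value reported in this thread.
alluxio.user.block.size.bytes.default=512MB
```

Presto typically needs a restart to pick up a change to a catalog properties file.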

Yupeng Fu

Mar 31, 2017, 4:35:34 PM
to Aaquib Khwaja, Alluxio Users
Great. Glad to hear you solved the problem :)