I am able to set up Hive with S3 using the "s3://" scheme without trouble. However, with Presto 0.178, org.apache.http.HttpException is thrown with the message "s3 protocol is not supported".
So, question #1: Does Presto 0.178+ officially support the "s3://" scheme, or not?
I've seen a number of conflicting reports on this and would like to know what's really supported and what's not.
I've attempted the same with the "s3a://" scheme in Hive. However, getting this working in Hive has been elusive: I keep hitting "Scheme 's3a' not registered", and I have not found anything that explains this error.
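For context, my understanding (from the Hadoop docs, not something I have verified end to end) is that s3a requires the hadoop-aws jar and its AWS SDK dependency on the classpath, plus core-site.xml entries along these lines. This is only a sketch; the values in brackets are placeholders:

```xml
<!-- core-site.xml: sketch only; credential values are placeholders -->
<property>
  <name>fs.s3a.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>
<property>
  <name>fs.s3a.access.key</name>
  <value>[aws-bucket-user-access-key]</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>[aws-bucket-user-secret-key]</value>
</property>
```

If the jar is missing, the filesystem implementation class can't be loaded, which seems consistent with a "scheme not registered" style of failure, though I can't confirm that's the cause here.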
So, question #2: Has anyone successfully set up Presto with Hive and S3 using the "s3a://" scheme?
Any help or insight you can provide is greatly appreciated.
I also tested with "s3n://", and the same org.apache.http.HttpException is thrown with "s3 protocol is not supported".
Has anyone seen this before?
com.facebook.presto.spi.PrestoException: Unable to execute HTTP request: null
at com.facebook.presto.hive.HiveSplitSource.propagatePrestoException(HiveSplitSource.java:139)
at com.facebook.presto.hive.HiveSplitSource.isFinished(HiveSplitSource.java:117)
at com.facebook.presto.split.ConnectorAwareSplitSource.isFinished(ConnectorAwareSplitSource.java:72)
at com.facebook.presto.split.BufferingSplitSource.fetchSplits(BufferingSplitSource.java:67)
at com.facebook.presto.split.BufferingSplitSource.lambda$fetchSplits$1(BufferingSplitSource.java:73)
at com.google.common.util.concurrent.AbstractTransformFuture$AsyncTransformFuture.doTransform(AbstractTransformFuture.java:211)
at com.google.common.util.concurrent.AbstractTransformFuture$AsyncTransformFuture.doTransform(AbstractTransformFuture.java:200)
at com.google.common.util.concurrent.AbstractTransformFuture.run(AbstractTransformFuture.java:130)
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:399)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:902)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:813)
at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:655)
at com.google.common.util.concurrent.AbstractTransformFuture$TransformFuture.setResult(AbstractTransformFuture.java:245)
at com.google.common.util.concurrent.AbstractTransformFuture.run(AbstractTransformFuture.java:177)
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:399)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:902)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:813)
at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:655)
at com.google.common.util.concurrent.SettableFuture.set(SettableFuture.java:48)
at io.airlift.concurrent.MoreFutures.lambda$toListenableFuture$10(MoreFutures.java:445)
at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561)
at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:580)
at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.amazonaws.AmazonClientException: Unable to execute HTTP request: null
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:724)
at com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:466)
at com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:427)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:376)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4039)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3976)
at com.amazonaws.services.s3.AmazonS3Client.listObjects(AmazonS3Client.java:776)
at com.facebook.presto.hive.PrestoS3FileSystem.listPrefix(PrestoS3FileSystem.java:474)
at com.facebook.presto.hive.PrestoS3FileSystem.access$000(PrestoS3FileSystem.java:108)
at com.facebook.presto.hive.PrestoS3FileSystem$1.<init>(PrestoS3FileSystem.java:266)
at com.facebook.presto.hive.PrestoS3FileSystem.listLocatedStatus(PrestoS3FileSystem.java:264)
at com.facebook.presto.hadoop.HadoopFileSystem.listLocatedStatus(HadoopFileSystem.java:30)
at com.facebook.presto.hive.HadoopDirectoryLister.list(HadoopDirectoryLister.java:32)
at com.facebook.presto.hive.util.HiveFileIterator.getLocatedFileStatusRemoteIterator(HiveFileIterator.java:113)
at com.facebook.presto.hive.util.HiveFileIterator.computeNext(HiveFileIterator.java:86)
at com.facebook.presto.hive.util.HiveFileIterator.computeNext(HiveFileIterator.java:41)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:145)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:140)
at com.facebook.presto.hive.BackgroundHiveSplitLoader.loadSplits(BackgroundHiveSplitLoader.java:235)
at com.facebook.presto.hive.BackgroundHiveSplitLoader.access$300(BackgroundHiveSplitLoader.java:86)
at com.facebook.presto.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:187)
at com.facebook.presto.hive.util.ResumableTasks.safeProcessTask(ResumableTasks.java:45)
at com.facebook.presto.hive.util.ResumableTasks.lambda$submit$1(ResumableTasks.java:33)
... 4 more
Caused by: org.apache.http.client.ClientProtocolException
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:186)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:875)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:715)
... 26 more
Caused by: org.apache.http.HttpException: s3 protocol is not supported
at org.apache.http.impl.conn.DefaultRoutePlanner.determineRoute(DefaultRoutePlanner.java:88)
at org.apache.http.impl.client.InternalHttpClient.determineRoute(InternalHttpClient.java:124)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:183)
... 31 more
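Digging into the bottom of the trace: the failure comes from Apache HttpClient's DefaultRoutePlanner, which only knows how to route "http" and "https". So if the client is handed an endpoint URI whose scheme is literally "s3", it fails before any request is sent. A tiny illustration (the bucket name is a placeholder, not my real config):

```java
import java.net.URI;

public class SchemeCheck {
    public static void main(String[] args) {
        // An endpoint configured as "s3://..." keeps "s3" as its URI scheme,
        // which HttpClient's route planner cannot resolve to a connection.
        URI bad = URI.create("s3://example-bucket");
        System.out.println("scheme = " + bad.getScheme());  // "s3" -> unroutable

        // An http(s) endpoint parses to a scheme HttpClient does support.
        URI ok = URI.create("http://example-bucket.s3-us-west-2.amazonaws.com");
        System.out.println("scheme = " + ok.getScheme());   // "http" -> routable
    }
}
```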
I found the problem: my catalog file had

connector.name=hive-hadoop2
hive.metastore.uri=thrift://[metastore-ip-addr]:9083
hive.s3.endpoint=s3://[bucket-name]
hive.s3.aws-access-key=[aws-bucket-user-access-key]
hive.s3.aws-secret-key=[aws-bucket-user-secret-key]
instead of
connector.name=hive-hadoop2
hive.metastore.uri=thrift://[metastore-ip-addr]:9083
hive.s3.endpoint=http://[bucket-name].s3-us-west-2.amazonaws.com
hive.s3.aws-access-key=[aws-bucket-user-access-key]
hive.s3.aws-secret-key=[aws-bucket-user-secret-key]
After making that correction, I get the following error:
Query 20170628_012604_00003_qcg43 failed: The specified bucket does not exist (Service: Amazon S3; Status Code: 404; Error Code: NoSuchBucket; Request ID: 29EAB48EA367F1BE)
com.facebook.presto.spi.PrestoException: The specified bucket does not exist (Service: Amazon S3; Status Code: 404; Error Code: NoSuchBucket; Request ID: 29EAB48EA367F1BE)
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: The specified bucket does not exist (Service: Amazon S3; Status Code: 404; Error Code: NoSuchBucket; Request ID: 29EAB48EA367F1BE)
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1387)
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:940)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:715)
at com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:466)
at com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:427)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:376)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4039)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3976)
at com.amazonaws.services.s3.AmazonS3Client.listObjects(AmazonS3Client.java:776)
at com.facebook.presto.hive.PrestoS3FileSystem.listPrefix(PrestoS3FileSystem.java:474)
at com.facebook.presto.hive.PrestoS3FileSystem.access$000(PrestoS3FileSystem.java:108)
at com.facebook.presto.hive.PrestoS3FileSystem$1.<init>(PrestoS3FileSystem.java:266)
at com.facebook.presto.hive.PrestoS3FileSystem.listLocatedStatus(PrestoS3FileSystem.java:264)
at com.facebook.presto.hadoop.HadoopFileSystem.listLocatedStatus(HadoopFileSystem.java:30)
at com.facebook.presto.hive.HadoopDirectoryLister.list(HadoopDirectoryLister.java:32)
at com.facebook.presto.hive.util.HiveFileIterator.getLocatedFileStatusRemoteIterator(HiveFileIterator.java:113)
at com.facebook.presto.hive.util.HiveFileIterator.computeNext(HiveFileIterator.java:86)
at com.facebook.presto.hive.util.HiveFileIterator.computeNext(HiveFileIterator.java:41)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:145)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:140)
at com.facebook.presto.hive.BackgroundHiveSplitLoader.loadSplits(BackgroundHiveSplitLoader.java:235)
at com.facebook.presto.hive.BackgroundHiveSplitLoader.access$300(BackgroundHiveSplitLoader.java:86)
at com.facebook.presto.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:187)
at com.facebook.presto.hive.util.ResumableTasks.safeProcessTask(ResumableTasks.java:45)
at com.facebook.presto.hive.util.ResumableTasks.lambda$submit$1(ResumableTasks.java:33)
... 4 more
--
You received this message because you are subscribed to the Google Groups "Presto" group.
To unsubscribe from this group and stop receiving emails from it, send an email to presto-users+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Try removing the endpoint property entirely. I think you only need it for special circumstances like talking to non-US buckets (possibly only those in China since it's separate).
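In that case the catalog file should only need something like this (a sketch; the bracketed values are placeholders for your own metastore address and credentials):

```properties
connector.name=hive-hadoop2
hive.metastore.uri=thrift://[metastore-ip-addr]:9083
hive.s3.aws-access-key=[aws-bucket-user-access-key]
hive.s3.aws-secret-key=[aws-bucket-user-secret-key]
```

With no hive.s3.endpoint set, the S3 client should derive the endpoint from the bucket itself.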
Thanks. I have my queries working with 's3://' and 's3n://' since removing that property. I am still seeing problems with 's3a://' in Hive. I will post that on a separate thread.

On Tue, Jun 27, 2017 at 7:34 PM, David Phillips <da...@acz.org> wrote:
> Try removing the endpoint property entirely. I think you only need it for special circumstances like talking to non-US buckets (possibly only those in China since it's separate).