Using Presto to query data located in S3

stefan.sc...@gmail.com

Nov 29, 2013, 11:54:10
to presto...@googlegroups.com
Hi,

I would like to use Presto running on an Amazon EMR cluster to query data in S3. The idea is to define Hive tables pointing to S3 and then have Presto do the work.

While Presto works fine on AWS with data in the local HDFS, I can't get it to work nicely with S3.

After trying to get Presto to handle S3 locations in the Hive metadata, I changed fs.default.name in core-site.xml to "s3://bucket-name".
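
Concretely, the change was along these lines (a sketch; the bucket name is a placeholder):

<property>
    <name>fs.default.name</name>
    <value>s3://bucket-name</value>
</property>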

"hadoop fs -ls /" now shows the S3 bucket and Hive works fine with LOCATION '/directory-in-s3-bucket/'.


However, when I then query that table in Presto, the SqlStageExecution executes in some environment where the "fs.s3.awsAccessKeyId" and "fs.s3.awsSecretAccessKey" properties are not available.

I tried to supply these properties with -D in jvm.config, and also in node.properties, config.properties, and catalog/hive.properties.

That, however, doesn't help, or even leads to "properties not used" errors.
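
For example, one variant I tried in jvm.config (actual key values elided):

-Dfs.s3.awsAccessKeyId=...
-Dfs.s3.awsSecretAccessKey=...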

Does anyone have an idea how I can get these property values to the right process within Presto?


Thanks a lot!
Stefan

2013-11-29T16:43:31.016+0000 ERROR Stage-20131129_164330_00003_4aw9g.1-129 com.facebook.presto.execution.SqlStageExecution Error while starting stage 20131129_164330_00003_4aw9g.1
java.lang.RuntimeException: java.io.IOException: java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3 URL, or by setting the fs.s3.awsAccessKeyId or fs.s3.awsSecretAccessKey properties (respectively).
at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-15.0.jar:na]
at com.facebook.presto.hive.HiveSplitIterable$HiveSplitQueue.computeNext(HiveSplitIterable.java:433) ~[na:na]
at com.facebook.presto.hive.HiveSplitIterable$HiveSplitQueue.computeNext(HiveSplitIterable.java:392) ~[na:na]
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) ~[guava-15.0.jar:na]
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) ~[guava-15.0.jar:na]
at com.facebook.presto.execution.SqlStageExecution.startTasks(SqlStageExecution.java:463) [presto-main-0.54.jar:0.54]
at com.facebook.presto.execution.SqlStageExecution.access$300(SqlStageExecution.java:80) [presto-main-0.54.jar:0.54]
at com.facebook.presto.execution.SqlStageExecution$5.run(SqlStageExecution.java:435) [presto-main-0.54.jar:0.54]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_40]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_40]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_40]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_40]
at java.lang.Thread.run(Thread.java:724) [na:1.7.0_40]
Caused by: java.io.IOException: java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3 URL, or by setting the fs.s3.awsAccessKeyId or fs.s3.awsSecretAccessKey properties (respectively).
at com.facebook.presto.hive.FileSystemCache$1$1.getFileSystem(FileSystemCache.java:71) ~[na:na]
at com.facebook.presto.hive.ForwardingPath.getFileSystem(ForwardingPath.java:47) ~[na:na]
at com.facebook.presto.hive.FileSystemWrapper$1.getFileSystem(FileSystemWrapper.java:78) ~[na:na]
at com.facebook.presto.hive.HiveSplitIterable.loadPartitionSplits(HiveSplitIterable.java:181) ~[na:na]
at com.facebook.presto.hive.HiveSplitIterable.access$100(HiveSplitIterable.java:73) ~[na:na]
at com.facebook.presto.hive.HiveSplitIterable$2.call(HiveSplitIterable.java:154) ~[na:na]
at com.facebook.presto.hive.HiveSplitIterable$2.call(HiveSplitIterable.java:149) ~[na:na]
... 4 common frames omitted
Caused by: java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3 URL, or by setting the fs.s3.awsAccessKeyId or fs.s3.awsSecretAccessKey properties (respectively).
at org.apache.hadoop.fs.s3.S3Credentials.initialize(S3Credentials.java:66) ~[na:na]
at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.initialize(Jets3tFileSystemStore.java:82) ~[na:na]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.7.0_40]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[na:1.7.0_40]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_40]
at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_40]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85) ~[na:na]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62) ~[na:na]
at com.sun.proxy.$Proxy151.initialize(Unknown Source) ~[na:na]
at org.apache.hadoop.fs.s3.S3FileSystem.initialize(S3FileSystem.java:77) ~[na:na]
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1446) ~[na:na]
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67) ~[na:na]
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464) ~[na:na]
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263) ~[na:na]
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187) ~[na:na]
at com.facebook.presto.hive.FileSystemCache$2.call(FileSystemCache.java:92) ~[na:na]
at com.facebook.presto.hive.FileSystemCache$2.call(FileSystemCache.java:87) ~[na:na]
at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4724) ~[guava-15.0.jar:na]
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3522) ~[guava-15.0.jar:na]
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2315) ~[guava-15.0.jar:na]
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2278) ~[guava-15.0.jar:na]
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2193) ~[guava-15.0.jar:na]
at com.google.common.cache.LocalCache.get(LocalCache.java:3932) ~[guava-15.0.jar:na]
at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4721) ~[guava-15.0.jar:na]
at com.facebook.presto.hive.FileSystemCache$1$1.getFileSystem(FileSystemCache.java:68) ~[na:na]
... 10 common frames omitted

stefan.sc...@gmail.com

Dec 2, 2013, 12:38:55
to presto...@googlegroups.com, stefan.sc...@gmail.com
Hi!

I debugged this a bit further and could get the credentials in by simply using the federation hint in the GitHub repo, i.e. adding the following to jvm.config:

-Dhive.config.resources=/home/hadoop/conf/core-site.xml,/home/hadoop/conf/hdfs-site.xml

Note that you need to change /etc/hadoop to /home/hadoop on EMR.
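
With hive.config.resources pointing at core-site.xml, Presto reads the S3 credentials from the usual entries there (a sketch, key values elided):

<property>
    <name>fs.s3.awsAccessKeyId</name>
    <value>...</value>
</property>
<property>
    <name>fs.s3.awsSecretAccessKey</name>
    <value>...</value>
</property>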

However, the next issue is that the JetS3t libs are missing.

I can copy jets3t-0.9.0.jar from the EMR instance to the Presto lib or plugin dirs, but that version is incompatible with the filesystem driver in the Hadoop package. Using the version from before the JetS3t API changes, i.e. jets3t-0.7.4, fixes that issue, but then other incompatibilities arise.
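
The copy step, for reference (the source path is just where the jar happened to live on my instance, so adjust as needed):

cp /home/hadoop/lib/jets3t-0.9.0.jar presto-server-0.54/plugin/hive-hadoop1/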

I think replacing the hadoop-apache*.jar with a jar that uses the EMR Hadoop version should work, but there are some custom classes in hadoop-apache*.jar. As there are two versions of the Hive plugin, one for CDH4 and one for Apache Hadoop, I'd like to create another one specifically for the Amazon distribution. But I don't understand why these specific presto-hadoop-* subprojects are needed and why I can't just replace a dependency setting.

Any hints on how to compile the Hive plugin for an arbitrary Hadoop distribution?

Thanks and best regards,
Stefan

David Phillips

Dec 4, 2013, 19:45:07
to presto...@googlegroups.com
On Mon, Dec 2, 2013 at 9:38 AM, <stefan.sc...@gmail.com> wrote:
However, the next issue is that the JetS3t libs are missing.

Our shaded build of Hadoop removes the JetS3t library and its dependencies. We will fix this in a future release. If you want to hack on this yourself, you can find the code here:


You can do a "mvn clean install" on that project, then change the dependency version to use the local snapshot version in the Presto root pom.xml.
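
Roughly, the dependency change in the root pom.xml looks like this (the groupId and artifactId are inferred from the shaded package names and shade output below; use whatever snapshot version your local build produces):

<dependency>
    <groupId>com.facebook.presto.hadoop</groupId>
    <artifactId>hadoop-apache1</artifactId>
    <version>...-SNAPSHOT</version>
</dependency>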
 
I can copy jets3t-0.9.0.jar from the EMR instance to the Presto lib or plugin dirs, but that version is incompatible with the filesystem driver in the Hadoop package. Using the version from before the JetS3t API changes, i.e. jets3t-0.7.4, fixes that issue, but then other incompatibilities arise.

One problem here is that Amazon has changes to their Hadoop and Hive versions that are not open source. You can see a list here: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-hive-additional-features.html#emr-hive-patches

There might be other issues or performance problems with S3 after we fix the missing classes. We can't ship the Amazon EMR Hadoop library with Presto because it's not open source (they would need to release it under the Apache 2 license).

What other incompatibilities did you run into?
 
I think replacing the hadoop-apache*.jar with a jar that uses the EMR Hadoop version should work, but there are some custom classes in hadoop-apache*.jar. As there are two versions of the Hive plugin, one for CDH4 and one for Apache Hadoop, I'd like to create another one specifically for the Amazon distribution. But I don't understand why these specific presto-hadoop-* subprojects are needed and why I can't just replace a dependency setting.

Any hints on how to compile the Hive plugin for an arbitrary Hadoop distribution?

There are several reasons we have custom packagings of Hadoop:

1) The Hadoop Maven artifacts are a mess. They contain way more functionality and dependencies than what Presto needs. Many of these dependencies are old and conflict with libraries that we need to use in the Hive plugin. We use the Maven Shade plugin to re-package the dependencies into a different Java package so that they can be used side-by-side with the versions used in Presto.

2) The way Hadoop handles native libraries for compression codecs is a mess. It requires setting the java.library.path system property when the JVM is started. Clearly, this doesn't work for plugins that are loaded at runtime and makes deployment very fragile. They also don't have pre-built libraries for Mac OS X, which makes development difficult. We solve this by including native libraries for Mac OS X and Linux along with a shim that loads them at runtime. The code for this is in HadoopNative. It's not pretty, but it makes everything "just work".

3) There are API differences between Hadoop versions. For example, newer versions of Hadoop have an API to list an HDFS directory and return all the block locations in one call. Hadoop 1.x does not have this, so the block locations must be fetched individually for each file. The Hive plugin has a DirectoryLister class that encapsulates this and uses the fast version when available.

It would be great if this were simpler and if newer Hadoop client libraries worked with older Hadoop installations, but the reality is that supporting multiple Hadoop versions requires a lot of work.
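
To make the third point concrete, here is a rough sketch of the two listing strategies (not the actual DirectoryLister code; the fast path only compiles against newer Hadoop client libraries):

import java.io.IOException;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class ListingSketch
{
    // Newer Hadoop: one call returns each file together with its block locations.
    static void listFast(FileSystem fs, Path dir) throws IOException
    {
        RemoteIterator<LocatedFileStatus> files = fs.listLocatedStatus(dir);
        while (files.hasNext()) {
            LocatedFileStatus file = files.next();
            BlockLocation[] locations = file.getBlockLocations(); // already populated
        }
    }

    // Hadoop 1.x: list the directory, then fetch block locations file by file.
    static void listSlow(FileSystem fs, Path dir) throws IOException
    {
        for (FileStatus file : fs.listStatus(dir)) {
            BlockLocation[] locations = fs.getFileBlockLocations(file, 0, file.getLen());
        }
    }
}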

stefan.sc...@gmail.com

Dec 10, 2013, 12:20:50
to presto...@googlegroups.com, da...@acz.org
Hi David,

thanks a lot for your reply and your explanation of shading.

I hacked around and included jets3t-0.9.0 in the presto-hadoop-apache1 project. I managed to get Presto to compile against it; however, accessing S3 causes related "NoSuchMethod" errors, indicating that the org.apache.hadoop.fs classes attempt to use the old 0.7.4 jets3t interface. Also, the calling classes are not using the shaded namespace. What I don't understand is where the "old" Hadoop that still uses the 0.7.* interface comes from, since the dependencies in the projects and the libs on the EMR machines all appear to use the new 0.8+ API. So maybe my version of presto-hadoop-apache1 includes a Hadoop version that is incompatible with Amazon EMR, or simply a rather old one; additionally, maybe the shading is incomplete and should cover org.apache.hadoop.fs as well.

Do you have any opinions on what could be wrong? Also noteworthy: this NoSuchMethod error is exactly the same as before, when directly copying the respective jar files into the Presto lib dirs; the only difference is that the jets3t classes were not shaded then.
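
One way to check which constructor signatures actually ended up in the shaded jar is javap (the jar name here is from my local build, so adjust as needed):

javap -classpath hadoop-apache1-*.jar \
    com.facebook.presto.hadoop.shaded.org.jets3t.service.impl.rest.httpclient.RestS3Service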


> There might be other issues or performance problems with S3 after we fix the missing classes. We can't ship the Amazon EMR Hadoop library with Presto because it's not open source (they would need to release it under the Apache 2 license).
>
>
> What other incompatibilities did you run into?
>

> ...


>
> It would be great if this were simpler and if newer Hadoop client libraries worked with older Hadoop installations, but the reality is that supporting multiple Hadoop versions requires a lot of work.


Yes, that's quite a problem. I was, however, under the impression that the Amazon EMR Hadoop version should be sufficiently similar to plain Apache Hadoop 1.0.3 that other third-party software can run on EMR.

I'll keep trying; in the meantime, here is some of my output.


Thanks and cheers!
Stefan

Shaded jars:

[INFO] --- maven-shade-plugin:1.6:shade (default) @ hadoop-apache1 ---
[INFO] Including org.apache.hadoop:hadoop-client:jar:1.2.1 in the shaded jar.
[INFO] Including org.apache.hadoop:hadoop-core:jar:1.2.1 in the shaded jar.
[INFO] Including commons-io:commons-io:jar:2.1 in the shaded jar.
[INFO] Including org.apache.commons:commons-math:jar:2.1 in the shaded jar.
[INFO] Including commons-configuration:commons-configuration:jar:1.6 in the shaded jar.
[INFO] Including commons-collections:commons-collections:jar:3.2.1 in the shaded jar.
[INFO] Including commons-lang:commons-lang:jar:2.4 in the shaded jar.
[INFO] Including commons-digester:commons-digester:jar:1.8 in the shaded jar.
[INFO] Including commons-net:commons-net:jar:1.4.1 in the shaded jar.
[INFO] Including commons-el:commons-el:jar:1.0 in the shaded jar.
[INFO] Including net.java.dev.jets3t:jets3t:jar:0.9.0 in the shaded jar.
[INFO] Including commons-codec:commons-codec:jar:1.4 in the shaded jar.
[INFO] Including com.jamesmurty.utils:java-xmlbuilder:jar:0.4 in the shaded jar.
[INFO] Including org.slf4j:jcl-over-slf4j:jar:1.7.5 in the shaded jar.
[INFO] Including org.slf4j:slf4j-api:jar:1.7.5 in the shaded jar.
[INFO] Including org.slf4j:slf4j-nop:jar:1.7.5 in the shaded jar.

The trace of the NoSuchMethod error:


2013-12-09T12:06:58.666+0000 ERROR Stage-20131209_120657_00002_4qvm6.1-113 com.facebook.presto.execution.SqlStageExecution Error while starting stage 20131209_120657_00002_4qvm6.1
java.lang.RuntimeException: java.io.IOException: java.lang.NoSuchMethodError: com.facebook.presto.hadoop.shaded.org.jets3t.service.impl.rest.httpclient.RestS3Service.<init>(Lcom/facebook/presto/hadoop/shaded/org/jets3t/service/security/AWSCredentials;)V
at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-15.0.jar:na]
at com.facebook.presto.hive.HiveSplitIterable$HiveSplitQueue.computeNext(HiveSplitIterable.java:433) ~[na:na]
at com.facebook.presto.hive.HiveSplitIterable$HiveSplitQueue.computeNext(HiveSplitIterable.java:392) ~[na:na]
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) ~[guava-15.0.jar:na]
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) ~[guava-15.0.jar:na]
at com.facebook.presto.execution.SqlStageExecution.startTasks(SqlStageExecution.java:463) [presto-main-0.55-SNAPSHOT.jar:0.55-SNAPSHOT]
at com.facebook.presto.execution.SqlStageExecution.access$300(SqlStageExecution.java:80) [presto-main-0.55-SNAPSHOT.jar:0.55-SNAPSHOT]
at com.facebook.presto.execution.SqlStageExecution$5.run(SqlStageExecution.java:435) [presto-main-0.55-SNAPSHOT.jar:0.55-SNAPSHOT]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_40]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_40]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_40]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_40]
at java.lang.Thread.run(Thread.java:724) [na:1.7.0_40]
Caused by: java.io.IOException: java.lang.NoSuchMethodError: com.facebook.presto.hadoop.shaded.org.jets3t.service.impl.rest.httpclient.RestS3Service.<init>(Lcom/facebook/presto/hadoop/shaded/org/jets3t/service/security/AWSCredentials;)V
at com.facebook.presto.hive.FileSystemCache$1$1.getFileSystem(FileSystemCache.java:71) ~[na:na]
at com.facebook.presto.hive.ForwardingPath.getFileSystem(ForwardingPath.java:47) ~[na:na]
at com.facebook.presto.hive.FileSystemWrapper$1.getFileSystem(FileSystemWrapper.java:78) ~[na:na]
at com.facebook.presto.hive.HiveSplitIterable.loadPartitionSplits(HiveSplitIterable.java:181) ~[na:na]
at com.facebook.presto.hive.HiveSplitIterable.access$100(HiveSplitIterable.java:73) ~[na:na]
at com.facebook.presto.hive.HiveSplitIterable$2.call(HiveSplitIterable.java:154) ~[na:na]
at com.facebook.presto.hive.HiveSplitIterable$2.call(HiveSplitIterable.java:149) ~[na:na]
... 4 common frames omitted
Caused by: java.lang.NoSuchMethodError: com.facebook.presto.hadoop.shaded.org.jets3t.service.impl.rest.httpclient.RestS3Service.<init>(Lcom/facebook/presto/hadoop/shaded/org/jets3t/service/security/AWSCredentials;)V
at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.initialize(Jets3tNativeFileSystemStore.java:55) ~[na:na]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.7.0_40]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[na:1.7.0_40]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_40]
at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_40]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85) ~[na:na]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62) ~[na:na]
at org.apache.hadoop.fs.s3native.$Proxy151.initialize(Unknown Source) ~[na:na]
at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:234) ~[na:na]
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1446) ~[na:na]
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67) ~[na:na]
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464) ~[na:na]
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263) ~[na:na]
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187) ~[na:na]
at com.facebook.presto.hive.FileSystemCache$2.call(FileSystemCache.java:92) ~[na:na]
at com.facebook.presto.hive.FileSystemCache$2.call(FileSystemCache.java:87) ~[na:na]
at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4724) ~[guava-15.0.jar:na]
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3522) ~[guava-15.0.jar:na]
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2315) ~[guava-15.0.jar:na]
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2278) ~[guava-15.0.jar:na]
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2193) ~[guava-15.0.jar:na]
at com.google.common.cache.LocalCache.get(LocalCache.java:3932) ~[guava-15.0.jar:na]
at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4721) ~[guava-15.0.jar:na]
at com.facebook.presto.hive.FileSystemCache$1$1.getFileSystem(FileSystemCache.java:68) ~[na:na]
... 10 common frames omitted

-Djava.library.path=/home/hadoop/native/Linux-amd64-64

2013-12-09T12:21:51.644+0000 DEBUG query-management-2 com.facebook.presto.execution.SqlQueryManager Remove query 20131209_120650_00000_4qvm6


rooserv...@gmail.com

Jan 6, 2014, 22:33:58
to presto...@googlegroups.com, da...@acz.org, stefan.sc...@gmail.com

Hi Stefan and David,

Any updates on running Presto on S3?

I followed Stefan's method, copying jets3t-0.7.4.jar to:
presto-server-0.54/plugin/hive-hadoop1

and also set the jvm.config.

However, I kept getting the following error:


Query 20140107_031635_00010_2uma6, FAILED, 1 node
http://ec2-54-205-226-69.compute-1.amazonaws.com:8080/v1/query/20140107_031635_00010_2uma6?pretty
Splits: 1 total, 0 done (0.00%)
CPU Time: 0.0s total, 0 rows/s, 0B/s, 0% active
Per Node: 0.0 parallelism, 0 rows/s, 0B/s
Parallelism: 0.0
0:00 [0 rows, 0B] [0 rows/s, 0B/s]

Query 20140107_031635_00010_2uma6 failed: java.io.IOException: java.lang.NoClassDefFoundError: org/apache/commons/httpclient/methods/PutMethod
java.lang.RuntimeException: java.io.IOException: java.lang.NoClassDefFoundError: org/apache/commons/httpclient/methods/PutMethod
at com.google.common.base.Throwables.propagate(Throwables.java:160)
at com.facebook.presto.hive.HiveSplitIterable$HiveSplitQueue.computeNext(HiveSplitIterable.java:433)
at com.facebook.presto.hive.HiveSplitIterable$HiveSplitQueue.computeNext(HiveSplitIterable.java:392)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at com.facebook.presto.execution.SqlStageExecution.startTasks(SqlStageExecution.java:463)
at com.facebook.presto.execution.SqlStageExecution.access$300(SqlStageExecution.java:80)
at com.facebook.presto.execution.SqlStageExecution$5.run(SqlStageExecution.java:435)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.IOException: java.lang.NoClassDefFoundError: org/apache/commons/httpclient/methods/PutMethod
at com.facebook.presto.hive.FileSystemCache$1$1.getFileSystem(FileSystemCache.java:71)
at com.facebook.presto.hive.ForwardingPath.getFileSystem(ForwardingPath.java:47)
at com.facebook.presto.hive.FileSystemWrapper$1.getFileSystem(FileSystemWrapper.java:78)
at com.facebook.presto.hive.HiveSplitIterable.loadPartitionSplits(HiveSplitIterable.java:181)
at com.facebook.presto.hive.HiveSplitIterable.access$100(HiveSplitIterable.java:73)
at com.facebook.presto.hive.HiveSplitIterable$2.call(HiveSplitIterable.java:154)
at com.facebook.presto.hive.HiveSplitIterable$2.call(HiveSplitIterable.java:149)
... 4 more
Caused by: java.lang.NoClassDefFoundError: org/apache/commons/httpclient/methods/PutMethod
at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.initialize(Jets3tNativeFileSystemStore.java:55)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
at org.apache.hadoop.fs.s3native.$Proxy152.initialize(Unknown Source)
at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:234)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1446)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
at com.facebook.presto.hive.FileSystemCache$2.call(FileSystemCache.java:92)
at com.facebook.presto.hive.FileSystemCache$2.call(FileSystemCache.java:87)
at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4724)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3522)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2315)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2278)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2193)
at com.google.common.cache.LocalCache.get(LocalCache.java:3932)
at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4721)
at com.facebook.presto.hive.FileSystemCache$1$1.getFileSystem(FileSystemCache.java:68)
... 10 more
Caused by: java.lang.ClassNotFoundException: org.apache.commons.httpclient.methods.PutMethod
at com.facebook.presto.server.PluginManager$SimpleChildFirstClassLoader.loadClass(PluginManager.java:322)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 34 more

My httpclient version is 4.2.3:
hadoop@ip-10-233-24-207:~$ find . -name httpclient*.jar
./presto-server-0.54/lib/httpclient-4.2.3.jar
./presto-server-0.54/plugin/hive-cdh4/httpclient-4.2.3.jar
./presto-server-0.54/plugin/hive-hadoop1/httpclient-4.2.3.jar
./discovery-server-1.16/lib/httpclient-4.2.3.jar

Do you have any hints or suggestions?

Thanks,
Zhenxiao

David Phillips

Jan 6, 2014, 23:50:39
to presto...@googlegroups.com
I hope to have this fixed in the next release. The code is nearly done.

rooserv...@gmail.com

Jan 7, 2014, 01:43:57
to presto...@googlegroups.com, da...@acz.org

Thanks, David. Look forward to it.

stefan.sc...@gmail.com

Jan 7, 2014, 07:29:36
to presto...@googlegroups.com, da...@acz.org, rooserv...@gmail.com
That's great to hear! I unfortunately had no more success than what I posted above. It's interesting, though, that you get errors about org/apache/commons/httpclient/methods/PutMethod; I don't remember seeing that particular issue. Probably caused by yet another combination of packages :/

Cheers!
Stefan
