Unable to perform operations on buckets using hadoop when using jets3t AWS V4 signature


Rajat Jain

Aug 26, 2015, 1:44:34 PM
to JetS3t Users
I am trying to use AWS V4 signatures for authorization with jets3t 0.9.3. The problem is that nothing works apart from operations like 'ls'.

A sample request I made was:

/usr/lib/hadoop/bin/hadoop dfs -put some_file s3://bucket-name-in-singapore/

The problem is that RestStorageService.java#authorizeHttpRequest makes a call like:

String region = SignatureUtils.awsRegionForRequest(httpMethod);

This call returns null because the hostname of the HTTP URI is bucket-name-in-singapore.s3.amazonaws.com, which carries no region information. The region therefore defaults to "us-east-1" and the request errors out.
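To illustrate the failure mode: a self-contained sketch (not JetS3t's actual SignatureUtils code; the class name and regex here are illustrative) of how a region lookup based on the endpoint hostname yields null for the generic endpoint, forcing the us-east-1 fallback:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegionFromHost {
    // Matches region-specific hostnames like "s3-ap-southeast-1.amazonaws.com"
    // or "bucket.s3-ap-southeast-1.amazonaws.com".
    private static final Pattern REGION_HOST =
        Pattern.compile("s3[.-]([a-z]{2}-[a-z]+-\\d)\\.amazonaws\\.com$");

    public static String regionForHost(String host) {
        Matcher m = REGION_HOST.matcher(host);
        // The generic endpoint "bucket.s3.amazonaws.com" carries no region,
        // so this returns null and the caller falls back to "us-east-1".
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(regionForHost("bucket-name-in-singapore.s3.amazonaws.com")); // null
        System.out.println(regionForHost("bucket.s3-ap-southeast-1.amazonaws.com"));    // ap-southeast-1
    }
}
```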

Any idea what is the best way to fix this?

Thanks,
Rajat

Rajat Jain

Aug 26, 2015, 3:10:33 PM
to JetS3t Users
To put it more simply, an operation like:

/usr/lib/hadoop/bin/hadoop dfs -put some_file s3://bucket-name-in-singapore/dir/

works with AWS v2 signatures but not with AWS v4 when using jets3t.

Any idea why this happens?

James Murty

Aug 26, 2015, 10:45:20 PM
to jets3t...@googlegroups.com
Hi Rajat,

Can you provide more information about exactly what error you are getting back from S3? Also, can you please try with version 0.9.4 which was just released a few days ago?

My best guess is that JetS3t isn't able to determine the correct location for the bucket, and therefore cannot generate an AWS v4 signature. The need for the client to know the location of a bucket before it can generate a valid v4 signature is an ongoing problem that JetS3t works around as best it can by catching related error responses and retrying, but this technique is far from perfect (see [1] and later).

If you know the location of your buckets in advance, you can tell JetS3t's service so it doesn't need to figure this out by itself. Do something like the following immediately after you instantiate the RestS3Service:

    myRestS3Service.getRegionEndpointCache().putRegionForBucketName("bucket-name-in-singapore", "ap-southeast-1");

Hope this helps,
James




--
You received this message because you are subscribed to the Google Groups "JetS3t Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jets3t-users...@googlegroups.com.
To post to this group, send email to jets3t...@googlegroups.com.
Visit this group at http://groups.google.com/group/jets3t-users.
For more options, visit https://groups.google.com/d/optout.

Rajat Jain

Aug 27, 2015, 1:03:24 PM
to jets3t...@googlegroups.com
Hi James,

Thanks for your reply. I upgraded to 0.9.4, but I'm still getting the same error. Below is the error I'm seeing (while running a MapReduce job). Does it look familiar to you?

Thanks,
Rajat

org.apache.http.client.ClientProtocolException
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:909)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:329)
at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:280)
at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestHead(RestStorageService.java:1066)
at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:2283)
at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectDetailsImpl(RestStorageService.java:2212)
at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:2575)
at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1774)
at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:208)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at org.apache.hadoop.fs.s3native.$Proxy8.retrieveMetadata(Unknown Source)
at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:886)
at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:57)
at org.apache.hadoop.fs.Globber.glob(Globber.java:252)
at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1650)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:292)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:264)
at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:217)
at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:75)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:337)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:303)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:501)
at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:632)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:624)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:500)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1350)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1347)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1635)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1347)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1635)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:429)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:98)
at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:77)
Caused by: org.apache.http.ProtocolException: Received redirect response HTTP/1.1 301 Moved Permanently but no location header
at org.apache.http.impl.client.DefaultRedirectStrategy.getLocationURI(DefaultRedirectStrategy.java:139)
at org.apache.http.impl.client.DefaultRedirectStrategy.getRedirect(DefaultRedirectStrategy.java:217)
at org.apache.http.impl.client.DefaultRequestDirector.handleResponse(DefaultRequestDirector.java:1105)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:548)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
... 48 more


James Murty

Sep 2, 2015, 8:40:59 AM
to jets3t...@googlegroups.com
Hi Rajat,

It's a bit hard to see what is going on, but this issue may be caused by the "initial HEAD request" problem with AWS v4 signatures that JetS3t cannot solve. In short, for all request types except HEAD, JetS3t can itself learn the location of a bucket (which it needs in order to sign v4 requests) and eventually succeed despite initial signing errors. For HEAD requests, however, it cannot: HEAD responses carry no body, so the error details identifying the expected region never reach the client.

I describe the problem in more detail here: https://groups.google.com/forum/#!topic/jets3t-users/Y2x-KL3LIgg
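The learn-and-retry workaround described above can be sketched in plain Java, independent of JetS3t (all names here are illustrative, and the S3 endpoint is simulated; real S3 reports the expected region in an error body, which is exactly what a HEAD response lacks):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of "learn the region from the error response, then retry".
// A first request signed for the wrong region fails; the error carries the
// region the service expects, which is cached so the retry is signed correctly.
public class RegionRetrySketch {
    static final Map<String, String> regionCache = new HashMap<>();

    // Simulated S3 endpoint: rejects requests signed for the wrong region and
    // reports the region it expects in the error message.
    static String send(String bucket, String signedRegion) throws Exception {
        String actual = "ap-southeast-1"; // the bucket's real location
        if (!actual.equals(signedRegion)) {
            throw new Exception("expected-region:" + actual);
        }
        return "200 OK";
    }

    static String request(String bucket) throws Exception {
        String region = regionCache.getOrDefault(bucket, "us-east-1");
        try {
            return send(bucket, region);
        } catch (Exception e) {
            // Learn the correct region from the error body and retry once.
            // For a HEAD request there is no body, so this step is impossible.
            String learned = e.getMessage().split(":")[1];
            regionCache.put(bucket, learned);
            return send(bucket, learned);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(request("bucket-name-in-singapore")); // 200 OK after retry
    }
}
```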

If this is indeed the problem, and if you know the location of your buckets in advance, you can inform the S3 service of bucket locations and it may work around the problem.

    // Set location of bucket in JetS3t's bucket-name-to-location cache, before making any HEAD requests
    service.getRegionEndpointCache().putRegionForBucketName(bucketName, regionCode);

Hope this helps,
James
