S3 Signature Version 4 support in newer AWS regions


Shuai Chang

Aug 21, 2016, 9:35:24 PM
to Druid User
We've encountered an issue with the indexing service in newer AWS regions such as ap-northeast-1 and eu-central-1. The index.zip is uploaded to S3 successfully, but historical nodes are not able to read it, failing with the error below. I believe this might be related to Signature Version 4 support.

Caused by: org.jets3t.service.impl.rest.HttpException: 400 Bad Request
        at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:425) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:279) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestHead(RestStorageService.java:1052) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:2264) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectDetailsImpl(RestStorageService.java:2193) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:2574) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1773) ~[jets3t-0.9.4.jar:0.9.4]


I've seen previous posts on similar issues; is there a fix or workaround for this?
https://groups.google.com/forum/#!searchin/druid-user/Ingetst$20local$20data$20to$20s3$20deep$20storage$20failed|sort:relevance/druid-user/E-Hd0nsY2Wg/m9Z1VwEsBQAJ

https://groups.google.com/forum/#!searchin/druid-user/s3$20signature$20version$204|sort:relevance/druid-user/vpAOj9KIoTg/etHponv4BAAJ

https://groups.google.com/forum/#!searchin/druid-user/s3$20signature$20version$204|sort:relevance/druid-user/VYAySNm7PUw/JHTlSOmFAQAJ

Shuai Chang

Aug 21, 2016, 9:41:14 PM
to Druid User
We are using Druid 0.8.3.

Shuai Chang

Aug 22, 2016, 1:49:38 AM
to Druid User
I was able to solve the problem. After raising the log level to DEBUG, I found the following:
2016-08-22 05:08:27,506 DEBUG o.j.s.Jets3tProperties [ZkCoordinator-0] s3service.s3-endpoint=s3.amazonaws.com
2016-08-22 05:08:27,506 DEBUG o.j.s.Jets3tProperties [ZkCoordinator-0] storage-service.request-signature-version=AWS2

Obviously the S3 endpoint is not correct for the region (ap-northeast-1 in my case), and AWS2 is not the correct signature version either.
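(For anyone reproducing this: a minimal sketch of turning that logging on, assuming the stock log4j2.xml in _common; the "Console" appender name comes from the default config and may differ in your setup.)

<Logger name="org.jets3t" level="debug" additivity="false">
    <AppenderRef ref="Console"/>
</Logger>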

Two things are required to fix this:
1. Add a jets3t.properties file at _common/jets3t.properties.
2. In jets3t.properties, add the lines below:
s3service.s3-endpoint=s3.ap-northeast-2.amazonaws.com
storage-service.request-signature-version=AWS4-HMAC-SHA256

Then restart the historical nodes; the load should work fine now.
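For concreteness, the whole change as a shell sketch, assuming a tarball layout where the common config directory is conf/druid/_common (older releases use config/_common; adjust to your install):

cat > conf/druid/_common/jets3t.properties <<'EOF'
s3service.s3-endpoint=s3.ap-northeast-2.amazonaws.com
storage-service.request-signature-version=AWS4-HMAC-SHA256
EOF
# restart the historical nodes so jets3t re-reads the file from the classpath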

Some references that helped me reach this point:
Gian's comment on turning on the debug log: https://groups.google.com/forum/#!topic/druid-user/efSrQt8a3S8
jets3t support for S3 sigv4: https://bitbucket.org/jmurty/jets3t/issues/183/support-for-aws-signature-version-4

Hope this helps people using newer AWS regions like eu-central-1 and ap-northeast-1.

Jonathan Wei

Aug 22, 2016, 6:19:43 PM
to druid...@googlegroups.com
Awesome, thanks for researching this solution!


Shuai Chang

Aug 26, 2016, 12:31:04 AM
to Druid User
Some additional notes:

The above configs will fix the historical nodes not being able to read from S3 in those AWS regions. However, once those configs are set, batch indexing will start to fail with java.io.IOException: Resetting to invalid mark. To fix the entire issue, below is what I have in my _common/jets3t.properties:

s3service.s3-endpoint=s3.eu-central-1.amazonaws.com
storage-service.request-signature-version=AWS4-HMAC-SHA256
uploads.stream-retry-buffer-size=2147483646


uploads.stream-retry-buffer-size has to be bigger than the final segment size before uploading to S3. I'm not entirely sure of the reason, but that's the observation; judging from the stack trace below, SigV4 signing reads the whole upload stream to compute the x-amz-content-sha256 payload hash and then resets it for the actual send, so the mark/reset buffer has to cover the entire segment.


Caused by: java.lang.RuntimeException: Failed to automatically set required header "x-amz-content-sha256" for request with entity org.jets3t.service.impl.rest.httpclient.RepeatableRequestEntity@36e1eb58
        at org.jets3t.service.utils.SignatureUtils.awsV4GetOrCalculatePayloadHash(SignatureUtils.java:259) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.impl.rest.httpclient.RestStorageService.authorizeHttpRequest(RestStorageService.java:778) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:326) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:279) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestPut(RestStorageService.java:1157) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.impl.rest.httpclient.RestStorageService.createObjectImpl(RestStorageService.java:1968) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.impl.rest.httpclient.RestStorageService.putObjectWithRequestEntityImpl(RestStorageService.java:1889) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.impl.rest.httpclient.RestStorageService.putObjectImpl(RestStorageService.java:1881) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.StorageService.putObject(StorageService.java:840) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.S3Service.putObject(S3Service.java:2212) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.S3Service.putObject(S3Service.java:2356) ~[jets3t-0.9.4.jar:0.9.4]
        at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.storeFile(Jets3tNativeFileSystemStore.java:87) ~[hadoop-common-2.3.0.jar:?]
        ... 33 more
Caused by: java.io.IOException: Resetting to invalid mark
        at java.io.BufferedInputStream.reset(BufferedInputStream.java:448) ~[?:1.8.0_102]
        at org.jets3t.service.utils.ServiceUtils.hash(ServiceUtils.java:238) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.utils.ServiceUtils.hashSHA256(ServiceUtils.java:267) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.utils.SignatureUtils.awsV4GetOrCalculatePayloadHash(SignatureUtils.java:251) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.impl.rest.httpclient.RestStorageService.authorizeHttpRequest(RestStorageService.java:778) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:326) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:279) ~[jets3t-0.9.4.jar:0.9.4]
        at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestPut(RestStorageService.java:1157) ~[jets3t-0.9.4.jar:0.9.4]
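The trace bottoms out in BufferedInputStream's mark/reset contract: jets3t marks the stream, reads it end to end to compute the hash, then resets before the actual send, and the mark only survives if the buffer covers everything read past it. A minimal JDK-only sketch of the same failure (illustrative only, not jets3t's actual code):

import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;

public class MarkResetDemo {
    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[1024]; // stands in for the segment bytes
        // Buffer (64 bytes) smaller than the payload, like a too-small
        // uploads.stream-retry-buffer-size.
        BufferedInputStream in = new BufferedInputStream(
                new ByteArrayInputStream(payload), 64);
        in.mark(64);                // mark before hashing the payload
        while (in.read() != -1) {}  // consume it all, e.g. to compute SHA-256
        in.reset();                 // java.io.IOException: Resetting to invalid mark
    }
}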

Иван Дорошенко

Aug 26, 2016, 9:20:06 AM
to Druid User
Can you please clarify how to provide the jets3t.properties config for batch index tasks?

On Friday, August 26, 2016 at 7:31:04 AM UTC+3, Shuai Chang wrote:

Felipe Barros

Feb 7, 2017, 12:38:50 PM
to Druid User
Awesome man!!!

You deserve a beer

Jaspinder Virdee

May 24, 2017, 2:16:51 AM
to Druid User
Where is the _common folder in _common/jets3t.properties?

Gian Merlino

May 24, 2017, 11:28:20 AM
to druid...@googlegroups.com
It's in conf/_common (or conf-quickstart/_common if you're using the quickstart config).

Gian


Gaurav Shah

Nov 16, 2017, 8:08:44 AM
to Druid User
Thank you man!!!


On Sunday, August 21, 2016 at 6:35:24 PM UTC-7, Shuai Chang wrote:

chaitany...@zeotap.com

Mar 13, 2018, 2:49:13 AM
to Druid User
Hi,

This solution unfortunately does not work with version 0.11.0. Is there something else that has to be done?
Thanks,
Chaitanya

Lawrence Huang

May 24, 2018, 7:52:00 PM
to Druid User
I did the following to use S3 deep storage in eu-central-1:

Build Druid 0.11.0 with modifications to use Hadoop 2.8.3, for example: https://github.com/druid-io/druid/compare/0.11.0...hoesler:feature/hadoop2.8

git clone https://github.com/hoesler/druid.git
cd druid
git checkout 47290406a5fa01200545ab0825e7500dafdcfaba
mvn clean package -DskipTests

This creates the following files:

  • distribution/target/druid-0.11.0-bin.tar.gz
  • distribution/target/mysql-metadata-storage-0.11.0.tar.gz


Use the druid-hdfs-storage extension with an S3 storage directory. This should work the same way as S3 deep storage. The relevant part of _common/common.runtime.properties:

druid.extensions.loadList=["druid-s3-extensions", "mysql-metadata-storage", "druid-hdfs-storage"]

#druid.storage.type=s3
#druid.storage.bucket=${S3_BUCKET}
#druid.storage.baseKey=druid/segments
druid.s3.accessKey=${S3_ACCESS_KEY_ID}
druid.s3.secretKey=${S3_SECRET_ACCESS_KEY}

druid.storage.type=hdfs
druid.storage.storageDirectory=s3a://${S3_BUCKET}/druid/segments


Have Hadoop use S3A. The relevant part of _common/core-site.xml:

<property>
  <name>fs.s3a.endpoint</name>
  <value>s3.${AWS_REGION}.amazonaws.com</value>
</property>
<property>
  <name>fs.s3.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>
<property>
  <name>fs.s3n.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>
<property>
  <name>fs.s3a.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>
<property>
  <name>fs.s3a.access.key</name>
  <value>${S3_ACCESS_KEY_ID}</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>${S3_SECRET_ACCESS_KEY}</value>
</property>
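A quick way to sanity-check the S3A wiring independently of Druid, assuming the hadoop CLI is available with the same core-site.xml (the bucket and path are placeholders):

hadoop fs -ls s3a://${S3_BUCKET}/druid/segments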


Thanks to https://github.com/hoesler

