S3 Read Write Limitation Issue (Service: Amazon S3; Status Code: 503; Error Code: 503 Slow Down)


Akash Gupta

Apr 16, 2020, 3:09:04 AM
to Delta Lake Users and Developers
When running on a cluster of larger machines (for example, 17 m5.4xlarge instances with 16 cores and 64 GB memory each), this error is sometimes received when writing to the table:

com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Slow Down (Service: Amazon S3; Status Code: 503; Error Code: 503 Slow Down)

Tathagata Das

Apr 16, 2020, 11:21:10 AM
to Akash Gupta, Delta Lake Users and Developers
This is essentially S3's rate limiting. It happens when you are generating too many files/objects too fast and S3 is not happy with it. How to solve it depends on your workload: either slow down the data write by using a smaller cluster, or tune your workload's parameters to generate larger files (and therefore fewer objects for the same write size).
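One way to act on the "fewer, larger files" suggestion is to size the number of output partitions from the expected output volume before writing. A minimal sketch (the helper name and the 128 MiB target are assumptions, not anything from Delta Lake itself):

```python
import math

def target_partition_count(total_bytes, target_file_bytes=128 * 1024 * 1024):
    """Number of output partitions so that each written file is roughly
    target_file_bytes in size (default ~128 MiB)."""
    return max(1, math.ceil(total_bytes / target_file_bytes))

# With Spark/Delta (sketch only), repartition down before writing so the job
# emits fewer, larger objects instead of many small ones:
#   n = target_partition_count(estimated_output_bytes)
#   df.repartition(n).write.format("delta").mode("append").save(path)

print(target_partition_count(10 * 1024 ** 3))  # 10 GiB -> 80 files of ~128 MiB
```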


Jin Lee

Apr 16, 2020, 11:54:25 AM
to Tathagata Das, Akash Gupta, Delta Lake Users and Developers
AWS recommends some strategies/workarounds related to this:
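The standard AWS guidance for 503 Slow Down responses is to retry with exponential backoff and jitter (AWS SDKs do this internally, but the retry limits are tunable). A minimal sketch of the idea; `SlowDownError` is a hypothetical stand-in for the SDK's throttling exception, not a real class:

```python
import random
import time

class SlowDownError(Exception):
    """Hypothetical stand-in for an S3 throttling error (503 Slow Down)."""

def retry_with_backoff(call, max_attempts=5, base_delay=0.5, max_delay=30.0):
    """Retry `call` on throttling, sleeping an exponentially growing,
    jittered delay between attempts, as AWS recommends."""
    for attempt in range(max_attempts):
        try:
            return call()
        except SlowDownError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(random.uniform(0, delay))  # "full jitter" backoff
```

Other documented mitigations include spreading writes across multiple key prefixes, since S3 scales request rates per prefix.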




--
Jin Lee
Architect - Data, AI, Infrastructure



Sanchit Gautam

Apr 21, 2020, 1:46:03 AM
to Jin Lee, Tathagata Das, Akash Gupta, Delta Lake Users and Developers
Having a micro-batching strategy in your processing is also important, along with the larger file sizes. Also, take into consideration the object-level requests taking place in your system.



--
Thanks & Regards,
Sanchit Gautam
09807344668

Gourav Sengupta

Apr 21, 2020, 3:21:43 AM
to Sanchit Gautam, Jin Lee, Tathagata Das, Akash Gupta, Delta Lake Users and Developers
Hi,

one of the other methods that you could use is enabling the EMRFS consistent view. That will massively speed up the process as well :)
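For reference, EMRFS consistent view is turned on through the `emrfs-site` configuration classification when launching an EMR cluster. A minimal sketch of that classification (tuning properties such as retry counts exist but are omitted here):

```json
[
  {
    "Classification": "emrfs-site",
    "Properties": {
      "fs.s3.consistent": "true"
    }
  }
]
```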

Thanks and Regards,
Gourav Sengupta 
