How significant is the "S3 key names prefix similarities bottleneck" to 's3backer'?


Jeff Byers

May 5, 2017, 5:53:15 PM
to s3backer-devel
How significant is the "S3 key names prefix similarities bottleneck" to 's3backer'?

This issue would apply to 's3backer' since its object names are hex block IDs,
sequentially going from '00000000' on up:

  Request Rate and Performance Considerations - Amazon Simple Storage Service
  latency on S3 operations depends on key names since prefix similarities become a bottleneck
  https://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html
 
I was thinking of making a change locally that would suppress the leading zeros,
which should help the hashing, but would lose the sorted order.
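
To make the sort-order trade-off concrete, here is a minimal Python sketch comparing the two naming styles. The function names and the sample block numbers are hypothetical, chosen only for illustration; this is not code from 's3backer'.

```python
def padded_name(block: int) -> str:
    """Fixed-width, zero-padded hex name (the '00000000' style)."""
    return f"{block:08x}"


def unpadded_name(block: int) -> str:
    """Same block number with the leading zeros suppressed."""
    return f"{block:x}"


# Hypothetical sample of block numbers, listed in numeric order.
blocks = [1, 2, 10, 16, 255, 256]

# S3 lists keys in lexicographic order, which sorted() on strings mimics here.
padded_sorted = sorted(padded_name(b) for b in blocks)
unpadded_sorted = sorted(unpadded_name(b) for b in blocks)

# Padded names list in numeric block order.
print(padded_sorted)
# Unpadded names do not: '10' and '100' sort ahead of '2'.
print(unpadded_sorted)
```

With fixed-width names, a bucket listing walks the blocks in numeric order. Without the padding, shorter and longer names interleave, so any code that relies on listing order would need to parse and re-sort the names.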

Of course, that change would not be backward compatible with any existing
's3backer' virtual disks.

Does this seem worth the effort?

~ Jeff Byers ~ 

Archie Cobbs

May 6, 2017, 8:31:15 PM
to s3backer-devel
Great question. I haven't really thought about this myself (wasn't aware of it).

As you point out, this is easy to avoid.

Might be cleaner to add a standard-length prefix which is computed using some deterministic hash function of the block number.

Then you'd get blocks with names looking like this:

  DD54012685DF-00000000
  A92813B38727-00000001
  F125D1ABE971-00000002
  4C382BB43B4F-00000003

Etc.
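
A minimal sketch of that naming scheme in Python, under stated assumptions: MD5 as the hash and a 12-hex-digit prefix are illustrative choices only (the suggestion above just says "some deterministic hash function"), and the function names are hypothetical. The sample names above are illustrative and are not outputs of this function.

```python
import hashlib


def hashed_block_name(block: int, prefix_len: int = 12) -> str:
    """Build '<hash prefix>-<8-digit hex block number>'.

    Assumption: the prefix is the first prefix_len hex digits of the MD5
    of the zero-padded block name. Any deterministic hash would do.
    """
    base = f"{block:08x}"
    digest = hashlib.md5(base.encode("ascii")).hexdigest().upper()
    return f"{digest[:prefix_len]}-{base}"


def block_from_name(name: str) -> int:
    """Recover the block number from the part after the last '-'."""
    return int(name.rsplit("-", 1)[1], 16)


# Print the names for the first four blocks.
for b in range(4):
    print(hashed_block_name(b))
```

Because the prefix is derived only from the block number, the name can be recomputed for any block without a lookup table, and the block number can still be read back from the suffix. The names no longer list in block order, though.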

-Archie

mwkorver

May 17, 2017, 5:56:58 PM
to s3backer-devel
Only if s3backer users need more than 300 PUT/LIST/DELETE requests per second or more than 800 GET requests per second.
I think when Jeff Barr originally blogged about key names in 2012, it was in reference to request rates on the order of 50/sec for PUT/LIST/DELETE and 100/sec for GET.
S3 has improved over the years. Also, note the part of the quote below about how Amazon S3 will automatically partition over time if growth is gradual.

"Amazon S3 scales to support very high request rates. If your request rate grows steadily, Amazon S3 automatically partitions your buckets as needed to support higher request rates. However, if you expect a rapid increase in the request rate for a bucket to more than 300 PUT/LIST/DELETE requests per second or more than 800 GET requests per second, we recommend that you open a support case to prepare for the workload and avoid any temporary limits on your request rate."

http://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html