add retry logic inside versioned_bucket_lister when fetching versions


Tongjie

May 13, 2013, 4:49:08 PM
to boto...@googlegroups.com
Hi all,

In boto/s3/bucketlistresultset.py, versioned_bucket_lister fetches 999 key versions at a time. Our versioning-enabled bucket has more than 100 million keys (and growing), and our application, which lists all versions in that bucket, often fails with an ssl_error partway through the listing. (We have increased http_socket_timeout in /etc/boto.cfg to a fairly large value, but it still happens.) Failing in the middle is painful because the whole listing process is very long-running. The problem is that the caller of bucket.list_versions has no idea where the listing failed, so retrying bucket.list_versions just starts over from the beginning and does not help. We pushed the retry logic inside versioned_bucket_lister in bucketlistresultset.py so that it retries at a finer granularity (a single page fetch), which makes bucket.list_versions more resilient.
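To illustrate the idea outside of boto, here is a rough, self-contained sketch of per-page retry in a marker-based listing loop. It is not the boto code: `fetch_page`, `list_all`, `flaky_fetch`, and the `(items, next_marker)` page layout are hypothetical stand-ins for a call such as bucket.get_all_versions, and the sketch is written for Python 3 rather than the Python 2 syntax of the patch.

```python
import time


def retry(run, num_retries=5, sleep_time=1.0, backoff=2.0, sleep=time.sleep):
    """Call run() until it succeeds, backing off exponentially between attempts."""
    attempts_left = num_retries
    while True:
        try:
            return run()
        except Exception as e:
            if attempts_left == 0:
                raise RuntimeError("failed after %d retries" % num_retries) from e
            sleep(sleep_time)
            attempts_left -= 1
            sleep_time *= backoff


def list_all(fetch_page, sleep=time.sleep):
    """Yield every item, retrying each page fetch on its own.

    fetch_page(marker) is a hypothetical stand-in for the real page request.
    It returns (items, next_marker); next_marker is None on the last page.
    """
    marker = None
    while True:
        # Only the failed page is retried; the marker remembers how far we got.
        items, marker = retry(lambda: fetch_page(marker), sleep=sleep)
        for item in items:
            yield item
        if marker is None:
            break


# Small demonstration with a fetcher that fails once on the second page.
PAGES = {None: (["a", "b"], "b"), "b": (["c"], None)}
calls = []


def flaky_fetch(marker):
    calls.append(marker)
    if marker == "b" and calls.count("b") == 1:
        raise IOError("simulated ssl error")
    return PAGES[marker]


# The sleep is replaced with a no-op so the demonstration runs instantly.
result = list(list_all(flaky_fetch, sleep=lambda seconds: None))
```

In this sketch the first page is requested once and the second page twice, so a transient error costs one extra page request instead of a restart of the whole listing.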

Would that be something that could be added to the boto library itself?

Thanks,

Tongjie


diff -r boto/s3/bucketlistresultset.py boto-2.6.0/boto/s3/bucketlistresultset.py
> from time import sleep
>
> def retry(run, numRetries=5, sleepTime=1, backoff=2):
>     assert numRetries >= 0, numRetries
>     assert sleepTime > 0, sleepTime
>
>     # count down on a copy so the error message can report the configured total
>     retriesLeft = numRetries
>     while True:
>         try:
>             return run()
>         except Exception, e:
>             if retriesLeft == 0:
>                 raise RuntimeError("fail after %d retries" % numRetries, e)
>             # exponential backoff: wait, then lengthen the next wait
>             sleep(sleepTime)
>             retriesLeft -= 1
>             sleepTime *= backoff
>     assert 0, 'should never reach here'
>
66c83
<         rs = bucket.get_all_versions(prefix=prefix, key_marker=key_marker, .... )
---
>         rs = retry(lambda:bucket.get_all_versions(prefix=prefix, key_marker=key_marker, ... ))
