Preliminary httplib->requests branch


Gregory Taylor

Feb 4, 2012, 2:07:08 AM
to boto...@googlegroups.com
For giggles, I decided to take a stab at getting as much of boto working under requests (http://docs.python-requests.org/) as I could in one night. I was pleasantly surprised at how quickly this progressed, and figured I'd mention the effort on here for others to look at and/or help with.

https://github.com/boto/boto/tree/requests_refactor

Cool things:
* Removed boto's connection pooler. This is built into the requests module, and is 100% automatic. -300 lines.
* Headers and a few other things are greatly simplified.
* It looks like we'd be able to nuke a ton of SSL gunk that was previously needed to play nicely with httplib.
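
To make the pooling and header points above a bit more concrete, here is a minimal sketch using the present-day requests API, which may differ from what the branch used at the time. The `make_session` helper and the User-Agent string are hypothetical and not taken from boto, and the SimpleDB URL is only there for illustration. The request is prepared but never sent, so nothing touches the network.

```python
import requests
from requests.adapters import HTTPAdapter


def make_session(pool_size=10):
    """Build a Session whose adapters keep a pool of reusable connections.

    Hypothetical helper for illustration; not part of boto.
    """
    session = requests.Session()
    adapter = HTTPAdapter(pool_connections=pool_size, pool_maxsize=pool_size)
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    # Headers set on the session are merged into every request made through it.
    session.headers.update({"User-Agent": "boto-sketch/0.0"})
    return session


session = make_session()
request = requests.Request(
    "GET",
    "https://sdb.amazonaws.com/",
    params={"Action": "ListDomains"},
)
# prepare_request() applies the session defaults without sending anything.
prepared = session.prepare_request(request)
print(prepared.method, prepared.url)
print(prepared.headers["User-Agent"])
```

Requests sent through the same session reuse pooled connections per host, which is roughly the job boto's own pooler was doing.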

What works:
* sdb
* sqs
* parts of s3
* Maybe some others? Hard to tell with our test coverage.

What doesn't:
* S3 uploads. Streaming uploads were a bit of custom work. Should be easy to restore via requests, just haven't had time. A lot of the other operations work, though.
* DynamoDB. Ran into a weird header issue with the session tokens. Not sure what's going on here.
* Others? - Didn't let the unit tests run all the way through.
* Proxy support. I think this will be much simpler with requests.
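
On the streaming upload item above, here is a rough sketch of how current versions of requests handle a streamed request body. It is not the code from the branch. The `iter_chunks` helper and the example URL are hypothetical, and the requests are only prepared, never sent. Whether a given service accepts a chunked body rather than a known Content-Length is a separate question that this sketch does not settle.

```python
import io

import requests


def iter_chunks(fileobj, chunk_size=8192):
    """Yield successive chunks from a file-like object (hypothetical helper)."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk


url = "https://bucket.example.com/key"  # placeholder, not a real endpoint

# A generator has no known length, so requests marks the body as chunked.
chunked = requests.Request(
    "PUT", url, data=iter_chunks(io.BytesIO(b"x" * 20000))
).prepare()
print(chunked.headers.get("Transfer-Encoding"))

# A seekable file-like object lets requests work out Content-Length itself,
# and the body is still read from the object rather than copied up front.
sized = requests.Request("PUT", url, data=io.BytesIO(b"x" * 20000)).prepare()
print(sized.headers.get("Content-Length"))
```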

This is merely an experimental branch, and I'll try to keep it up to date with master. If we end up liking how this progresses, perhaps we can consider it seriously. I don't think it'd take a huge amount of work to get all of the tests passing again. It's mostly little stuff. If you use or are intimate with a boto module or two, give this a shot and fix any problems you encounter.

Greg

Michael Schwartz

Feb 21, 2012, 11:45:36 AM
to boto...@googlegroups.com
Hi Greg and Mitch,

Thanks for the work on the Requests branch. I look forward to having boto use this much better-designed replacement for httplib.

One thing I wanted to point out is that gsutil allows the user to optionally dump the HTTP request/response protocol, which is useful for troubleshooting. gsutil -d is also useful when a user wants to understand something about how the protocol works (e.g., when trying to understand how the resumable upload protocol works so they can implement it in a new library or tool).

At present this mechanism is supported via boto's debug variable, which gets passed down to httplib via connection.set_debuglevel(). I'm wondering how this might work when httplib is replaced by Requests. One possibility would be to modify boto to extract the needed info from member data on the Request class (url, headers, etc.). However, I'm not sure that will capture all the relevant details. For example, if a user attempts an upload and the connection times out, will the initial PUT attempt show up? And will the retries show up, or would it all just show up as a single PUT?
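
For reference, here is a small self-contained sketch of what the set_debuglevel() mechanism described above produces. It uses Python 3's http.client (the renamed httplib) against a throwaway local server, so it runs without network access. `OkHandler`, `traced_get`, and the request path are hypothetical names for illustration; none of this is boto or gsutil code.

```python
import contextlib
import http.client
import io
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class OkHandler(BaseHTTPRequestHandler):
    """Tiny local handler that answers every GET with a 2-byte body."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence the server's own access log


def traced_get(path="/"):
    """Issue one GET with the debug level set and return the captured trace."""
    server = HTTPServer(("127.0.0.1", 0), OkHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    trace = io.StringIO()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.set_debuglevel(1)
        # http.client writes its debug trace with print(), so capture stdout.
        with contextlib.redirect_stdout(trace):
            conn.request("GET", path)
            response = conn.getresponse()
            body = response.read()
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return response.status, body, trace.getvalue()


status, body, trace = traced_get("/bucket/key")
print(trace)
```

The trace contains the raw request as sent plus the reply status line and headers, which is the level of detail a replacement mechanism would need to match.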

Although this is a gsutil feature, being able to diagnose an HTTP session seems like a generally useful capability for other boto applications.

We're willing to do some work on the code in the requests branch to support this functionality. But first I wanted to bring it up on the boto-dev list, to make sure folks are aware of this dependency, and to find out if you're planning to support this capability.

Thanks,

Mike