Hi -
First, let me say that is an impressive accomplishment. We attempted an in-place update of boto to support both Python 2.x and 3.x before and we failed. Perhaps it was partly because we were also trying to convert to requests at the same time. I think we bit off more than we could chew. So, congratulations to all involved. Great job!
I know I've been kind of MIA on this discussion. You shouldn't interpret that to mean that it isn't important to me. It's vitally important. I'm just oversubscribed right now. My short-term focus has been mainly on aws-cli and botocore. We really want to get aws-cli to a 1.0 release as soon as we can. While command line tools appeal to a somewhat different audience than boto, I think there is a lot of overlap and I'm really excited that the new, universal CLI tools for AWS are being developed in Python. It's great for the community and great for our customers.
As far as boto is concerned, here are my thoughts.
The current boto project is thriving. The package has been downloaded from PyPI over 4.5 million times and that number is growing by 8-10K a day. Of course, that doesn't mean that thousands of new individuals are downloading it every day. Most of those downloads are automated installs on EC2 instances as part of deployments (I'm speculating here, no data to back that up). But that number is a good indicator of how many people are using and depending on the current boto code base.
So, the prime directive is to do no harm. We want to continue to support and extend the current boto codebase. If we can get Python 3.x support into the current code base without breaking 2.x support, I am 100% in support of that.
Having said all of that, I do have concerns about the current boto codebase independent of whether it supports Python 3.x or not. The current code base is hand-coded. Every change in a service and every new service requires a considerable amount of new code to be written. Sometimes that code is written by different people (thanks for those contributions, BTW) and all of this code has been written over the course of almost seven years. The first version of boto supported three services: S3, EC2, and SQS. And there was no concept of regions because there was only one endpoint for each service. Today we support 23 Amazon services across 8 regions, plus Google Storage. The pace of innovation on the service side, however, has put a strain on boto. It's very difficult to keep up when significant amounts of hand-coding are required just to get basic access to a new service or an update of an existing service.
That's where botocore comes in. The botocore project is a complete rethinking of the low-level layer of boto. Internal to AWS, we have developed tools that transform the canonical definition of a service into an intermediate JSON format that we can distribute. These JSON service descriptions are now being used by many of the AWS SDKs, including Ruby, PHP, and JavaScript. The botocore package is essentially an engine that can interpret these JSON service descriptions and dynamically create a low-level Python interface to the services. The benefits are:
- We can factor out the low-level Python code in such a way that all common services use the same code. So, all Query-style services use the exact same QueryEndpoint class to marshal parameters to the service. All services that use SigV4 for authentication use the exact same SigV4Auth class to sign requests. And all classes with XML responses use the exact same Response class to parse the XML and transform it to native Python data structures. The code base is dramatically smaller and easier to maintain.
- New services and changes in existing services can be incorporated immediately simply by updating the JSON service description. No coding required.
- Since we are developing a new code base, we can design it to be compatible with Python 2.6-3.x from the start. And, we can take advantage of other great Python packages like Kenneth Reitz' requests package rather than rolling our own HTTP layer based on httplib.
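To make the data-driven idea concrete, here is a minimal sketch of how a client could be generated from a JSON description. This is not botocore's actual code or file format: the JSON layout and the names `SERVICE_JSON`, `snake_case`, `make_operation`, and `build_client` are all made up for illustration, and the generated methods only marshal Query-style parameters instead of signing and sending a request.

```python
import json

# Hypothetical, heavily simplified service description. The real JSON
# format consumed by botocore is not shown in this thread.
SERVICE_JSON = """
{
  "service": "sqs",
  "api_version": "2012-11-05",
  "operations": {
    "CreateQueue": {"params": ["QueueName"]},
    "DeleteQueue": {"params": ["QueueUrl"]}
  }
}
"""


def snake_case(name):
    """Convert an operation name such as CreateQueue to create_queue."""
    out = []
    for i, ch in enumerate(name):
        if ch.isupper() and i > 0:
            out.append("_")
        out.append(ch.lower())
    return "".join(out)


def make_operation(op_name, allowed, api_version):
    """Build one method that marshals keyword arguments Query-style."""
    def operation(self, **kwargs):
        unknown = set(kwargs) - set(allowed)
        if unknown:
            raise TypeError("unexpected parameters: %s" % sorted(unknown))
        params = {"Action": op_name, "Version": api_version}
        params.update(kwargs)
        # A real client would sign and send the request at this point.
        return params
    operation.__name__ = snake_case(op_name)
    return operation


def build_client(description_text):
    """Create a client object whose methods come from the description."""
    desc = json.loads(description_text)
    methods = {}
    for op_name, op in desc["operations"].items():
        methods[snake_case(op_name)] = make_operation(
            op_name, op["params"], desc["api_version"])
    cls = type(str(desc["service"].upper() + "Client"), (object,), methods)
    return cls()


sqs = build_client(SERVICE_JSON)
request = sqs.create_queue(QueueName="jobs")
```

In this sketch, supporting a new operation means adding an entry to the JSON document; no Python code changes, which is the property described in the list above.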
The aws-cli package is the first attempt to build something on top of botocore and I have to say it has been going really well. We are very excited about it. The next project will be to build a new boto package, let's call it boto3, on top of the same botocore base. The goal of boto3 would be to supply higher-level interfaces on top of botocore. For example, we may want to have a compatibility layer that tries to emulate the existing boto interfaces for popular services like S3 and EC2. Or, people in the community may want to develop completely new abstraction layers on top of botocore. But one thing boto3 will not be is 100% compatible with boto2. There will be changes, just as there were changes from Python 2.x to Python 3.x.
So, boto2 and boto3 will be separate projects. We will support both but our hope would be that eventually, most people would make the move to boto3. But we will not abandon the boto 2.x code base. It will continue to be supported as long as people are using it.
This is a pretty long note. I thought about providing a tl;dr but I really would like everyone to read the whole thing. And then let me know what you think. The success of boto is almost completely a result of the community that has grown around it. It's your project as much as it is mine. If you have concerns, express them. If you like the plan, let me know.
Mitch
So, my fork (https://github.com/kurin/boto) now passes all unit tests in Python 2.6, 2.7, and 3.3, including the new set of httpretty tests.
The utilities in bin/ still need work, but I've tested the fork, though not extensively, in EC2 (spun up a couple of instances, copied some ssh keys around) and S3 (created buckets and keys; stored some data), and it seemed to work. I'd love to get some actual usage tests.