On 1/6/15 7:19 PM, Christian Hammond wrote:
> Hey everyone,
>
> Gregory Szorc just provided us with a patch to convert one of our modules to support the new Python Wheel format. Thanks for that!
>
> There were some questions along with the patch that I wanted to address to a wider audience. I thought this would be a good time to get a discussion going on the future of Python packaging and how it may impact us.
>
> This is a long e-mail, so I'll try to section things off.
Yikes, I appear to have disturbed a hornets' nest. Sorry about that.
For the record, the plaintext distribution and the requirement of
easy_install are what bother me most. easy_install has poor
(non-existent?) verification, which, combined with the plaintext
distribution, is a gaping security hole. pip has much saner defaults
around trust management. Unfortunately, you can't `pip install
ReviewBoard` today - at least not easily. Because pip can't find binary
packages (only source distributions), it must build
{django-evolution,Djblets,ReviewBoard} from source archives. And this
likely fails due to missing dependencies required to build the media
files. Installing those dependencies is a nightmare.
If `pip install ReviewBoard` "just worked" and installed from pre-built
binary packages, I would be happy. I understand parts of that (mainly
packaging) would be hard.
More comments inline.
>
> == Wheels and Eggs ==
>
> For those who aren't familiar with the world of Python packaging, the Wheel format is a newer binary package standard designed to provide a common way to ship modules and to replace the older egg format (which is what we use). All the new packaging standards for metadata and versioning improvements are designed to work with wheels.
>
> Wheels and eggs differ in a few key ways. Wheels are an installation format, which gets unpacked on install, while eggs can either be unpacked or can remain as a single relocatable file. Content in eggs are importable and can be selectively extracted or read, using a set of APIs in the pkg_resources module. Wheels cannot do this. Eggs ship .pyc files, whereas Wheels generate them at install time (potentially problematic -- see below).
>
> Wheels are also versioned differently, in an interesting way. Unlike eggs, they don't have to be tied to a Python version, so we could, for instance, ship a single Wheel for Review Board 2.0.x, and not have to ship separate ones for Python 2.6 and 2.7. Neat, but in practice, it's not a strong benefit, as we automate all our builds, which need to package and run tests on the various versions anyway.
>
> Wheels are new, though, and not supported everywhere. One needs to have a modern version of pip/setuptools to install them or work with them. This can be a problem on older Linux distros, or companies with servers that are set up a very particular way, where upgrading components is hard/time-consuming.
>
> Now, I'm not at all an expert on Wheels. There are things I will need to be educated on, or things I'm misunderstanding. I'll have some questions later in this e-mail.
>
> Wheels are definitely the way to go moving forward, as the Python world has agreed on it and that's where the tools are moving. It's possible, though, that we may face some challenges in moving to them entirely.
The existence of wheels does not exclude non-wheel distribution. On
PyPI, it is common to upload both a wheel and a classical "sdist" source
archive. If your Python packaging tool knows about wheels and a wheel
for your system is available, it fetches the wheel. Else, it falls back
to installing from the source archive.
I'd like to see the various Review Board packages released as both
source archives and wheels so people with modern tools can take
advantage of them. In other words, supplement eggs with wheels.
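To make the "prefer a wheel, else fall back to sdist" behavior concrete,
here is a rough sketch of my own. It is not how pip implements it: real
tools also check that the wheel's tags are compatible with the running
interpreter. The file name pattern follows PEP 427; the file names and
version numbers in the usage lines are hypothetical.

```python
import re

# PEP 427 file name convention:
# {distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl
WHEEL_RE = re.compile(
    r"^(?P<dist>[^-]+)-(?P<version>[^-]+)"
    r"(-(?P<build>\d[^-]*))?"
    r"-(?P<python>[^-]+)-(?P<abi>[^-]+)-(?P<platform>[^-]+)\.whl$"
)


def parse_wheel_filename(filename):
    """Split a wheel file name into its PEP 427 components."""
    match = WHEEL_RE.match(filename)
    if match is None:
        raise ValueError("not a wheel file name: %r" % filename)
    return match.groupdict()


def pick_distribution(filenames):
    """Simplified selection: take a wheel if one exists, else an sdist.

    Assumption: tag compatibility checks are skipped entirely here.
    """
    wheels = [f for f in filenames if f.endswith(".whl")]
    if wheels:
        return wheels[0]
    sdists = [f for f in filenames if f.endswith((".tar.gz", ".zip"))]
    return sdists[0] if sdists else None


# Hypothetical file names, for illustration only.
available = [
    "ReviewBoard-2.0.12.tar.gz",
    "ReviewBoard-2.0.12-py2-none-any.whl",
]
print(pick_distribution(available))
print(parse_wheel_filename(available[1])["python"])
```

A wheel tagged `py2-none-any` like the one above is pure Python and
would cover both 2.6 and 2.7 with a single file.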
> == Impact on Review Board ==
> 2) Wheels, last I read, never ship .pyc files. This is in order to support different builds/versions of Python in a single Wheel package. However, commercial extensions (such as Power Pack) do not ship source, and need to ship the compiled .pyc files instead. We'd need a solution for this long-term. Short-term, so long as eggs can be installed at all, we're still okay.
If .pyc-only wheels are not supported, you may wish to raise your use
case on the distutils-sig list so that support can be considered.
https://mail.python.org/mailman/listinfo/distutils-sig
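If it helps when describing the use case: a wheel is a zip archive, so
it is easy to check which modules in one ship as bytecode only. This is
just a sketch of mine. It assumes the Python 2 layout where foo.pyc sits
next to foo.py (not the Python 3 __pycache__ layout), and the package
names are made up.

```python
import io
import zipfile


def bytecode_only_modules(wheel_file):
    """Return .pyc entries in the archive that have no .py next to them."""
    with zipfile.ZipFile(wheel_file) as archive:
        names = set(archive.namelist())
    return sorted(
        name for name in names
        # "pkg/mod.pyc"[:-1] == "pkg/mod.py"
        if name.endswith(".pyc") and name[:-1] not in names
    )


# Build a tiny in-memory archive with hypothetical package names.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("open_pkg/__init__.py", "")
    archive.writestr("open_pkg/__init__.pyc", b"")
    archive.writestr("closed_pkg/__init__.pyc", b"")
buffer.seek(0)

print(bytecode_only_modules(buffer))  # prints ['closed_pkg/__init__.pyc']
```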
>
> == Impact on users ==
>
> One concern I do have is legacy support. Review Board is installed on a lot of older machines that don't have any sort of modern setuptools/pip. We can't make everyone upgrade this to get a new version of Review Board, so for a while, we're going to have to ship eggs alongside any wheels. I assume wheels will take precedence if both are available?
Yes, tools will prefer wheels.
> We'll also need to keep shipping eggs/allowing eggs to be built in order to avoid problems for companies that either pull down the eggs separately from easy_install/pip, or those who build eggs for installation themselves. There would need to be a sufficient amount of time allowed for people to update their processes.
That makes sense to me.
> We also know that, while legacy, ez_setup.py is still needed in our builds by some companies' setups. We had an old version pointing to a set of URLs that were no longer around, and had bug reports from customers who couldn't install because of it.
The presence of ez_setup.py encourages legacy and insecure packaging
practices. I think Review Board should be a forcing function for these
companies to upgrade their deployment methodology. I don't think adding
a few lines to the docs to say "if easy_install isn't available do X" is
too high a penalty for those not following best practices.
> == Some questions about Wheels ==
>
> Some other things I want to make sure of, because I can't find a lot of reliable info:
>
> 1) Are Entrypoints supported in Wheel?
Yes.
> 2) Does pkg_resources continue to work as before? We need this for quite a lot.
I believe the data parts of pkg_resources continue to work. What doesn't
work is the multi-version / parallel install foo. My understanding is
Python's take on multi-versioning is "don't do it: use virtualenvs instead."
> 3) Is there a good comparison guide somewhere showing the egg features that do or do not carry over to Wheel?
https://www.python.org/dev/peps/pep-0427/
https://packaging.python.org/en/latest/wheel_egg.html
> 4) Are there any issues with a Wheel package depending on an egg? I assume not, but want to be sure.
I'm not sure. But I do know that pip does *not* support installing from
eggs.
>
> == Package hosting ==
>
> Another topic that came up was the hosting of packages.
>
> We don't use PyPI for a few reasons:
> 2) As mentioned above, not everyone installs using easy_install/pip, and not everyone uses the latest released versions. We know of companies out there that manually wget specific builds from current or older versions from the build directories on downloads.reviewboard.org, and copy them to an internal machine or integrate it with an internal deployment system. I believe you can get older builds on PyPI, but the interface doesn't make it as easy.
Indeed, the PyPI web interface does not make finding old versions easy.
I suspect that's by design.
You can use pip requirements files to pin package versions. Anyone who
cares about determinism and reproducibility should be doing this.
https://pip.pypa.io/en/latest/user_guide.html#requirements-files
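As a rough illustration of what "pinned" means here, this sketch of mine
flags requirement lines that are not exact `==` pins. It is not pip's
parser and only handles the simplest line forms; the versions in the
sample file are hypothetical.

```python
import re

# A line counts as pinned only if it is exactly "name==version".
PINNED_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*==[^\s;]+$")


def unpinned_requirements(text):
    """Return requirement lines that do not pin an exact version."""
    unpinned = []
    for raw_line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        # Skip blank lines and option lines such as "--index-url ...".
        if not line or line.startswith("-"):
            continue
        if not PINNED_RE.match(line):
            unpinned.append(line)
    return unpinned


# Hypothetical requirements file contents.
sample = """\
# deployment requirements
ReviewBoard==2.0.12
Djblets==0.8.14
django-evolution>=0.7
"""

print(unpinned_requirements(sample))  # prints ['django-evolution>=0.7']
```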
> 3) We distribute alphas/betas/RCs of builds, or sometimes custom builds, that we don't want people to install accidentally. It's too easy to cause problems there with PyPI. I think I read that the newer stuff can be made to install stable builds by default and not install some new alpha (might be wrong), but again, a good portion of our user base are using older installs.
>
> By keeping it separate, we can safely give people a command to install an alpha/beta/RC. Or a custom build, without having to advertise it to the world on PyPI.
By default, pip only finds and installs stable versions.
https://packaging.python.org/en/latest/installing.html#installing-prereleases
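To show what "stable" means in practice, here is a simplified sketch of
the kind of check involved. It is my own reduction of the PEP 440
version scheme to its common spellings (aN, bN, rcN, .devN), not pip's
actual implementation, and the version numbers are hypothetical.

```python
import re

# Simplified PEP 440 shape: N(.N)*[{a|b|rc}N][.postN][.devN]
VERSION_RE = re.compile(
    r"^\d+(\.\d+)*"
    r"(?P<pre>(a|b|rc)\d+)?"
    r"(\.post\d+)?"
    r"(?P<dev>\.dev\d+)?$"
)


def is_prerelease(version):
    """True for alpha/beta/RC and development versions."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError("unrecognized version: %r" % version)
    return bool(match.group("pre") or match.group("dev"))


def stable_versions(versions):
    """Keep only the versions an installer would consider by default."""
    return [v for v in versions if not is_prerelease(v)]


# Hypothetical release list.
releases = ["2.0.11", "2.0.12", "2.1b2", "2.1rc1"]
print(stable_versions(releases))  # prints ['2.0.11', '2.0.12']
```

Passing `--pre` to pip is what opts a user in to the other two.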
If maintaining separate alpha/beta/RC distributions is important, I
think the preferred way to do that is to maintain a separate package
index and point users at it. e.g. `pip install --index-url
https://downloads.reviewboard.org/alpha ReviewBoard`
> 4) We sometimes also build private staging releases of several of our components (into a staging location on downloads.reviewboard.org), run test installs and deployments off of those in a sandboxed environment, and make sure all that works before making that release public. We can't do this nearly as easily with PyPI, and would have to special-case stuff.
Run a separate index.