Autogenerating the AUTHORS.txt file

Donald Stufft

Feb 25, 2014, 1:22:03 PM
to pypa-dev
I would like to propose that, instead of manually curating the AUTHORS.txt file
and trying to remember to update it whenever we have a new author, we
automatically generate it at release time from the git log. This would have
the following features:

* Automatically recognize new authors without needing to do any of the busy
work to manage the AUTHORS list.
* Uses the standard git .mailmap to allow collapsing different names by the
same author into one canonical one.
* Some sort of override or template method so that we can add boilerplate
and/or additional names that do not appear in the git log.
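
A minimal sketch of what such a generation step could look like, assuming
Python and the `%aN` placeholder of `git log` (the author name with .mailmap
applied). The function names, the `header` boilerplate, and the
`extra_authors` override are hypothetical illustrations, not the actual
implementation in pip:

```python
import subprocess


def git_author_names(repo_dir="."):
    """Return one author name per commit, with .mailmap applied (%aN)."""
    out = subprocess.check_output(
        ["git", "log", "--format=%aN"], cwd=repo_dir
    )
    return out.decode("utf-8").splitlines()


def build_authors(names, extra_authors=(), header=""):
    """Deduplicate names, merge in extra ones, and sort case-insensitively."""
    # extra_authors is the hypothetical "override" for people who never
    # appear in the git log; header is optional boilerplate text.
    merged = list(names) + list(extra_authors)
    unique = {n.strip() for n in merged if n.strip()}
    ordered = sorted(unique, key=lambda n: (n.casefold(), n))
    return header + "\n".join(ordered) + "\n"
```

In a real checkout this could be wired together as
`build_authors(git_author_names())` at release time; the pure
`build_authors` function can be exercised without a repository.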

I did a quick test of our current git log, and it appears we lose these names:

* Aziz Köksal
* David (d1b)
* dengzhp
* Endoh Takanao
* Gabriel de Perthuis
* Geoffrey Lehée
* hetmankp
* Jakub Stasiak
* John-Scott Atlakson
* Jon Parise
* Masklinn
* Markus Hametner
* Preston Holmes
* Przemek Wrzos
* Thomas Johansson
* Hsiaoming Yang

And we gain these names:

* Ashley Manton
* Baptiste Mispelon
* Ben Darnell
* Daniel Jost
* David
* Dongweiming
* Erik Bray
* Gabriel
* Jim Garrison
* Jorge Niedbalski R
* Matthew Iversen
* MiCHiLU
* Oscar Benjamin
* Ralf Schmitt
* Stefan Scherfke
* Your Name
* Yu Jian
* Zearin
* anatoly techtonik
* andreiko
* awentzonline
* coagulant
* david
* dengzhp
* dholth
* fin
* hetmankp
* john.scot...@gmail.com
* jstasiak
* lepture
* masklinn
* prencher
* ptone
* socketubs
* unknown
* y-p


Immediately I see some of the names that are missing appear to be in the new
names list in a slightly different form, perhaps as a username. These would be
taken care of by the .mailmap file. It also shows a decent number of people
that we are currently missing.
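
The comparison above boils down to two set differences. A minimal sketch,
assuming the current AUTHORS.txt names and the git-log names are already
available as lists (the function name `compare_authors` is hypothetical):

```python
def compare_authors(current, generated):
    """Return (lost, gained) names between the curated and generated lists."""
    current_set, generated_set = set(current), set(generated)
    lost = sorted(current_set - generated_set)    # only in the curated file
    gained = sorted(generated_set - current_set)  # only in the git log
    return lost, gained


# Illustrative input using two of the pairs listed above.
lost, gained = compare_authors(
    ["Jakub Stasiak", "Masklinn"],
    ["jstasiak", "masklinn"],
)
```

Because the comparison is an exact string match, "Masklinn" and "masklinn"
show up as one lost and one gained name until a .mailmap entry unifies them.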

So, Thoughts? Concerns?

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA


Jannis Leidel

Feb 25, 2014, 2:44:40 PM
to Donald Stufft, pypa-dev
If we’re going to automate updating AUTHORS.txt like this we still need to make sure we manually update the .mailmap file with real names before every release. Assuming that’s your intention I don’t see a problem with your proposal as it’s basically about reducing the amount of time we spend on every big release instead of blindly accepting what the git index contains.

Jannis


Jason R. Coombs

Feb 25, 2014, 2:51:52 PM
to Jannis Leidel, Donald Stufft, pypa-dev
I've been thinking of just dropping the CONTRIBUTORS file from setuptools and in the documentation acknowledging all contributors and referencing the repository for details. I don't see a lot of value in maintaining a text file with the project when the repository history is public and readily accessible.

Jannis Leidel

Feb 25, 2014, 3:00:32 PM
to Jason R. Coombs, Donald Stufft, pypa-dev
Yeah, I also thought about that, but a while ago I decided for myself that the contributions from other developers are worth the time I spend updating the AUTHORS files. It’s not only a nice gesture but also gives a clear legal indication of authorship. Only referring to a technical process (git log), which could change over time and often doesn’t even contain the real name of the contributor, is a no-go for me.

Jannis

Donald Stufft

Feb 25, 2014, 5:18:24 PM
to Jannis Leidel, pypa-dev
Well, we only need to update the .mailmap file if the commit author information doesn’t contain a full name (or contains the wrong full name). What I put above is the data with no .mailmap file; all other names in the AUTHORS.txt stay exactly as they are.

Essentially I’m trying to figure out how to make our release and contributing process less error prone. I’ve done a few releases now, and there are a bunch of manual steps that, if we automate them, will be more likely to actually happen each release. This is even more important for contributions because we’re really bad at checking PRs for these more procedural kinds of things like “Is the author in AUTHORS.txt?” and “Is the style nice?” (This is why I’ve recently added flake8 to the repo!). One of my short term goals is to automate at the very least *checking* that things are what they should be, if not automating them altogether.

Another nice thing is the difference in failure modes between the two methods. With the current method, the failure mode is that a name doesn’t show up in AUTHORS.txt, which is basically impossible to detect without going through the git log. With my proposal, the failure mode is that we’ll get a less than pretty line in the AUTHORS.txt (like an email address or a user name), which is easier to detect for a human reading over the AUTHORS.txt.
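
The "less than pretty line" failure mode could also be checked mechanically.
A rough sketch, assuming a simple heuristic of my own choosing (an `@` in the
entry, a known placeholder, or a single all-lowercase token); the function
name and the placeholder set are hypothetical:

```python
def suspicious_entries(names):
    """Flag entries that look like an email, a username, or a placeholder."""
    placeholders = {"unknown", "your name"}  # assumed list, extend as needed
    flagged = []
    for name in names:
        if "@" in name or name.casefold() in placeholders:
            flagged.append(name)
        elif " " not in name and name == name.lower():
            # A single lowercase token is probably a username.
            flagged.append(name)
    return flagged


flagged = suspicious_entries(
    ["Erik Bray", "jstasiak", "Your Name", "unknown", "Zearin"]
)
```

The heuristic is deliberately conservative: a single capitalized token such
as "Zearin" is not flagged, so a human read-through is still useful.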

Donald Stufft

Feb 25, 2014, 7:44:22 PM
to Jannis Leidel, pypa-dev
Here’s a WIP of my proposal: https://github.com/pypa/pip/pull/1594

Donald Stufft

Feb 25, 2014, 10:13:47 PM
to Jannis Leidel, pypa-dev
Ok, I have the .mailmap file in the above PR more or less worked out. There are a few open issues though.

Aziz Köksal was added to the AUTHORS file because they wrote tests/lib/path.py and allowed us to relicense it; however, since there was no direct contribution to pip, they do not appear in the new git-generated AUTHORS. How do we want to handle this? Should we leave them off of AUTHORS (since the file they wrote has its own license header), or should I add a mechanism to add additional names?

There is a commit made by an unknown person named "unknown <hakaton@Anam-mbl.(none)>”. The commit is https://github.com/pypa/pip/commit/8ebc7a316016c4eb01cb77c8df1fe538c7bddf2a . I’m guessing the answer is no but does anyone remember who this may be?

I would like to change the format of the AUTHORS file from ``Author Name`` to ``Author Name <aut...@example.com>``. I have a rough sketch of this done and I think it makes the unknown user better as well as the overall experience of the authors file nicer.

Any thoughts on these 3 things?

Jannis Leidel

Feb 26, 2014, 6:10:02 AM
to Donald Stufft, pypa-dev
Hm, yeah, I think that case is an exception because it really is basically a vendored library, and we don’t mention the authors of the other vendored libraries either. I see three options: a) add a “used libraries” section to the docs mentioning its authors, b) just drop Aziz’ name, or c) move to an official lib like pathlib (as it’s included in Python >= 3.4).

> There is a commit made by an unknown person named "unknown <hakaton@Anam-mbl.(none)>”. The commit is https://github.com/pypa/pip/commit/8ebc7a316016c4eb01cb77c8df1fe538c7bddf2a . I’m guessing the answer is no but does anyone remember who this may be?

This was merged by Carl here: https://github.com/pypa/pip/commit/e08dfa897c241df5a2203180a1d8c6dec35d53af That gives us at least the Github nickname.

> I would like to change the format of the AUTHORS file from ``Author Name`` to ``Author Name <aut...@example.com>``. I have a rough sketch of this done and I think it makes the unknown user better as well as the overall experience of the authors file nicer.

+1 (although how would we handle contributors that have committed with different email addresses? (if we have any))

Donald Stufft

Feb 26, 2014, 12:48:55 PM
to Jannis Leidel, pypa-dev
Sounds good to me.

>
>> There is a commit made by an unknown person named "unknown <hakaton@Anam-mbl.(none)>”. The commit is https://github.com/pypa/pip/commit/8ebc7a316016c4eb01cb77c8df1fe538c7bddf2a . I’m guessing the answer is no but does anyone remember who this may be?
>
> This was merged by Carl here: https://github.com/pypa/pip/commit/e08dfa897c241df5a2203180a1d8c6dec35d53af That gives us at least the Github nickname.

I followed the github nickname to another repo where they have committed recently and got:

Andrei Geacar <andrei...@gmail.com>

>
>> I would like to change the format of the AUTHORS file from ``Author Name`` to ``Author Name <aut...@example.com>``. I have a rough sketch of this done and I think it makes the unknown user better as well as the overall experience of the authors file nicer.
>
> +1 (although how would we handle contributors that have committed with different email addresses? (if we have any))

.mailmap handles this too. It lets you collapse multiple user entries into one canonical entry. If you look at the diff, I’ve done this for myself and for Carl.
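
To illustrate the collapsing, here is a simplified sketch of how .mailmap
entries map commit identities to a canonical one. It handles only the common
entry forms, is not git's real parser, and all names and addresses in it
(Jane Doe, `parse_mailmap`, `canonical`) are hypothetical:

```python
import re

# "Proper Name <proper@email> [Commit Name] [<commit@email>]"
_ENTRY = re.compile(
    r"^(?P<name>[^<]*)<(?P<email>[^>]+)>\s*"
    r"(?:(?P<cname>[^<]*)<(?P<cemail>[^>]+)>)?\s*$"
)


def parse_mailmap(text):
    """Map (commit name or None, commit email) -> (canonical name, email)."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        m = _ENTRY.match(line)
        if m is None:
            continue  # skip forms this sketch does not understand
        name = m.group("name").strip() or None
        email = m.group("email").strip()
        cname = (m.group("cname") or "").strip() or None
        cemail = (m.group("cemail") or email).strip()
        key = (cname.lower() if cname else None, cemail.lower())
        mapping[key] = (name, email)
    return mapping


def canonical(mapping, name, email):
    """Resolve one commit identity, preferring name+email matches."""
    for key in ((name.lower(), email.lower()), (None, email.lower())):
        if key in mapping:
            new_name, new_email = mapping[key]
            return (new_name or name, new_email)
    return (name, email)


mm = parse_mailmap(
    "# hypothetical entries\n"
    "Jane Doe <jane@example.com> jdoe <jdoe@laptop.local>\n"
    "Jane Doe <jane@example.com> <jane@old.example.com>\n"
)
```

With these two entries, commits made as `jdoe <jdoe@laptop.local>` and as
anyone using `jane@old.example.com` both resolve to
`Jane Doe <jane@example.com>`, so they produce a single AUTHORS line.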


Marcus Smith

Feb 27, 2014, 12:40:56 PM
to pypa...@googlegroups.com
On Tue, Feb 25, 2014 at 11:51 AM, Jason R. Coombs <jar...@jaraco.com> wrote:
> I've been thinking of just dropping the CONTRIBUTORS file from setuptools and in the documentation acknowledging all contributors and referencing the repository for details. I don't see a lot of value in maintaining a text file with the project when the repository history is public and readily accessible.


Oh, I didn't see this. Interesting; I tend to agree, although I certainly won't stand in the way of others (i.e. Donald) working to make the authors file process easier.
 