having authors names in .py files

Ondrej Certik

Dec 7, 2007, 6:08:34 AM

to sage-...@googlegroups.com

Hi,

currently, *.py files in SAGE usually contain the names of the people
who wrote them. Karl Fogel's well-known Producing Open Source Software
discourages that:

http://producingoss.com/en/managing-volunteers.html#territoriality

mainly:

People sometimes argue in favor of author or maintainer tags in source
files on the grounds that this gives visible credit to those who have
done the most work there. There are two problems with this argument.
First, the tags inevitably raise the awkward question of how much work
one must do to get one's own name listed there too. Second, they
conflate the issue of credit with that of authority: having done work
in the past does not imply ownership of the area where the work was
done, but it's difficult if not impossible to avoid such an
implication when individual names are listed at the tops of source
files. In any case, credit information can already be obtained from
the version control logs and other out-of-band mechanisms like mailing
list archives, so no information is lost by banning it from the source
files themselves.

Ondrej

Martin Albrecht

Dec 7, 2007, 6:26:33 AM

to sage-...@googlegroups.com

Also, I would add that these names are usually poorly maintained (at least for
Sage) partly due to "the awkward question of how much work one must do to get
one's own name listed there".

Martin

--
name: Martin Albrecht
_pgp: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x8EF0DC99
_www: http://www.informatik.uni-bremen.de/~malb
_jab: martinr...@jabber.ccc.de

David Joyner

Dec 7, 2007, 7:32:05 AM

to sage-...@googlegroups.com

I am certainly happy to share credit with anyone on any file I work on.
IMHO, anyone who does anything non-trivial has the right to put their
name on a xyz.py file, at least if they are happy to cede their copyright to
William Stein. In fact, for licensing issues, I would think it is useful to know
who worked on which files.
If the concern is that it is hogging docstring real estate, then maybe
a new xyz.history file could be created which would have all this info in it?

++++++++++++++++++++++++++++

Ondrej Certik

Dec 7, 2007, 9:50:35 AM

to sage-...@googlegroups.com

On Dec 7, 2007 1:32 PM, David Joyner <wdjo...@gmail.com> wrote:
>
> I am certainly happy to share credit with anyone on any file I work on.
> IMHO, anyone who does anything non-trivial has the right to put their
> name on a xyz.py file, at least if they are happy to cede their copyright to
> William Stein. In fact, for licensing issues, I would think it is useful to know
> who worked on which files.
> If the concern is that it is hogging docstring real estate, then maybe
> a new xyz.history file could be created which would have all this info in it?

I think the idea of the OSS book (and mine too) is to use the
Mercurial revision history to track who wrote each line.
It's a matter of psychology: for example, when I modified calculus.py
by adding a new function and a few lines, I didn't feel I should add
myself at the top of the file. Once another 15 people like me have
modified calculus.py, the author list will be quite misleading.
Mercurial, on the other hand, provides an exact history of who wrote what.

Ondrej

David Joyner

Dec 7, 2007, 9:56:37 AM

to sage-...@googlegroups.com

I'm not an expert on mercurial, but I assumed that mercurial
revision history is available to people using sage.math. What if
Joe Shmoe decides to redistribute SAGE on
www.cool-free-math-software.com? Will his version of SAGE also see
who contributed to what? I personally would like to see this information
in the distribution but maybe no one else does.

William Stein

Dec 7, 2007, 3:50:58 PM

to sage-...@googlegroups.com

I agree with maybe 60% of producingoss.com and disagree with maybe
40% of that book, at least for Sage. The above is an example of some
of the many things in that book that I disagree with (at least for Sage).

> People sometimes argue in favor of author or maintainer tags in source
> files on the grounds that this gives visible credit to those who have
> done the most work there.

I definitely argue that.

> There are two problems with this argument.
> First, the tags inevitably raise the awkward question of how much work
> one must do to get one's own name listed there too.

Just because something raises a question doesn't mean that something
is a mistake to do! The answer in the case of Sage is that if a person
feels they've done enough to explicitly list themselves in an AUTHOR block,
e.g., for a function, for a file, whatever, then they've done enough. Full
stop.

By the way, a great example of a file that makes good use of the AUTHOR
blocks is:

http://www.sagemath.org/hg/sage-main/file/7110a20969c8/sage/rings/bernoulli_mod_p.pyx

Finally, I have to say that in mathematics research at least, author
credit is *everything*. It is by far the most important commodity
there is. To argue for banning explicitly listing credits in places in
code is frankly a very stupid waste of valuable gold. I've seen
*precisely* this sort of thing be enforced with Magma in some cases,
and it seriously alienated certain people, present company included.
If somebody feels strongly enough to put
AUTHOR:
name (date) -- summary of what they did

in a function docstring, then they deserve that right.
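
As a rough illustration of the convention described above, here is a
minimal Python sketch. The function demo_function, the names Jane Doe and
John Roe, and the helper authors_from_docstring are all hypothetical and
not part of Sage; the sketch only assumes that each AUTHOR entry looks
like "-- name (YYYY-MM-DD)", as in the example quoted in this thread.

```python
import re


def demo_function(p):
    """
    Return p unchanged (hypothetical function, for illustration only).

    AUTHOR:
        -- Jane Doe (2007-12-07): initial version
        -- John Roe (2007-12-08): added a docstring
    """
    return p


def authors_from_docstring(obj):
    """Return (name, date) pairs for lines shaped like '-- name (YYYY-MM-DD)'."""
    doc = obj.__doc__ or ""
    # Scan the whole docstring; any line starting with "--" followed by a
    # name and a parenthesised ISO date is treated as an author entry.
    pattern = re.compile(r"^\s*--\s*(.+?)\s*\((\d{4}-\d{2}-\d{2})\)", re.MULTILINE)
    return pattern.findall(doc)


print(authors_from_docstring(demo_function))
```

The point of the sketch is only that credit kept in the docstring travels
with the code and can be read, or collected by a small script, without
access to the revision history.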

> Second, they
> conflate the issue of credit with that of authority: having done work
> in the past does not imply ownership of the area where the work was
> done, but it's difficult if not impossible to avoid such an
> implication when individual names are listed at the tops of source
> files.

What's wrong with some implied ownership!? That's actually
very very important. For example, to take a concrete situation, Robert
Miller and Emily Kirkman spent a huge amount of time during the last
year writing graph theory code. Their names are clearly listed in
AUTHOR blocks at the tops of files. I've done some minor reorganization
of docstrings and code, but definitely don't feel I should be listed -- it's
their part of Sage. Now suppose some talented enthusiastic person, e.g.,
named Jason Grout, comes along and starts submitting patches all over
the place for graph theory. It's clear what should happen -- Robert and
Emily should get notified, get first dibs to referee, etc., until after
a while Jason starts getting so confident he lists his name under
AUTHOR, and he should be consulted too.

Moreover, "implied ownership" really isn't an issue with Sage, beyond the
basic respect it should entail, since the whole culture of the project is
that anybody can work on any part of the system, as long as they
_respectfully_ post patches, get them refereed, etc.

> In any case, credit information can already be obtained from
> the version control logs and other out-of-band mechanisms like mailing
> list archives, so no information is lost by banning it from the source
> files themselves.

There is a significant barrier to entry in getting credit information from
version control logs, and they can be very misleading (e.g., in the case
of moving chunks of code around).

Again I strongly disagree with removing all the AUTHOR: blocks from
the Sage docstrings. I think doing this would
(1) stupidly ignore a huge amount of what makes Sage work,
(2) remove a valuable mechanism for getting a quick sense of
    who the main people are who consider themselves serious contributors
    to a file or function,
(3) raise the barrier to *giving* people credit for their work, and
(4) go completely contrary to the tradition in mathematics, where
    credit is everything -- which is why most mathematical objects are
    named after people (e.g., Bernoulli numbers, Tate curves, etc.).

When I, as some random mathematician, type

    sage: bernoulli_mod_p?

I get

    ...
    AUTHOR:
    -- David Harvey (2006-08-06)

This tells me something very useful immediately - that there's a real
specific person behind this code, maybe somebody I'll see at a
conference soon and thank for their function, ask further questions
about it, etc. If as a random mathematician I had to rely on clicking
around with Mercurial to get info like that, it would never happen, and
I probably wouldn't trust what I see anyway.

-- William

John Cremona

Dec 7, 2007, 5:40:15 PM

to sage-...@googlegroups.com

I agree!

John

--
John Cremona

Ondrej Certik

Dec 7, 2007, 6:46:23 PM

to sage-...@googlegroups.com

I don't think I agree with this, but it's true that I am not doing
mathematics. I think if someone devises something new, some new
algorithm or the like, it's fine to put their name on it, but if it's
just code, I see it as just code, nothing more. Clearly there are
successful projects, like Apache, that use this strategy (see the
link I posted in my first email), so I don't think those people are
stupid. I think both ways can work, so I just wanted to discuss this.
I myself don't list my name in any functions or files I write, neither
in SymPy nor in other projects, mainly because I believe it's the work
of many people and it's not fair to list just some. But anyway, I just
wanted to know what you think about it.

Ondrej

Robert Miller

Dec 8, 2007, 1:39:25 AM

to sage-devel

> > > Again I strongly disagree with removing all the AUTHOR: blocks from
> > > the Sage docstrings.

The following is from the GPL v3:

"""
...
Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders
of that material) supplement the terms of this License with terms:

* a) ...
* b) Requiring preservation of specified reasonable legal notices
or author attributions ...
"""

Disturbing discussions like this seriously make me consider adding
some provision like this to at least the code I have contributed. If
people were to start removing my name from software I have spent hard
time writing, Sage would be down one developer. The example with
graph.py is great, because in fact many code submissions Jason was
making were coming in without names corresponding to the patches,
simply because Jason was using a slightly different revision control
program. If it weren't for his name in the code itself, it might not
be there at all.

Recall: we are not the borg. We all have names. What is the real
objective here? I'd like to help develop the best math software in the
world, and get credit for it. In the kind of job market many of us
face, this is what differentiates different people vying for the same
job. Ownership and credit are very different things. Tell me this- why
are we so worried about owning something that is free, that anyone can
change and distribute, and whose goal is to be available to everyone?

-- Robert L. Miller

boo...@u.washington.edu

Dec 8, 2007, 3:29:49 AM

to sage-...@googlegroups.com

> I think I don't agree with this, but it's true that I am not doing
> mathematics. I think if someone devises something new, some new
> algorithm, or something, it's fine to put his name on it, but if it's
> just a code, I see it just as a code, nothing more. Clearly there are
> successful projects, like apache, that use this strategy (see that
> link I posted in my first email), so I don't think those people are
> stupid. I think both ways can work, so I just wanted to discuss this.
> I myself don't list my name in any functions or files I do, nor in
> SymPy or other projects. Mainly because I believe it's a work of many
> people and it's not fair to list just some. But anyway, I just wanted
> to know what you think about it.

Some of the code I've written for Sage, like the Cython BinaryTree implementation, is in the public domain, because it's totally naive, and I get exactly what I need out of it, so I don't care what people do with it. Other code, like the JavaScript AJAX interface the notebook uses, is the product of many hours of hard work and months of experimentation, and *of course* that file has my name in the copyright block.

It doesn't matter if you're writing "math" or "just code". Something that I don't think William mentioned is that it's really good to know who to go to if something is busted or incomprehensible. When I put my name in the AUTHORS: block of a function or file, I'm saying, "come to me if there's a problem in what I did." IMO, credit is equal parts pride, responsibility and respect.

Martin Albrecht

Dec 8, 2007, 4:34:30 AM

to sage-...@googlegroups.com

> Recall: we are not the borg. We all have names. What is the real
> objective here? I'd like to help develop the best math software in the
> world, and get credit for it. In the kind of job market many of us
> face, this is what differentiates different people vying for the same
> job. Ownership and credit are very different things. Tell me this- why
> are we so worried about owning something that is free, that anyone can
> change and distribute, and whose goal is to be available to everyone?

Hi,

I don't think Ondrej's suggestion was to remove attribution; it was more to
change the technique used to record it, wasn't it? My initial agreement was
based on the impression that the current attribution scheme doesn't always
give credit because people forget to add themselves.

Ondrej Certik

Dec 8, 2007, 8:36:36 AM

to sage-...@googlegroups.com

On Dec 8, 2007 10:34 AM, Martin Albrecht <ma...@informatik.uni-bremen.de> wrote:
>
> > Recall: we are not the borg. We all have names. What is the real
> > objective here? I'd like to help develop the best math software in the
> > world, and get credit for it. In the kind of job market many of us
> > face, this is what differentiates different people vying for the same
> > job. Ownership and credit are very different things. Tell me this- why
> > are we so worried about owning something that is free, that anyone can
> > change and distribute, and whose goal is to be available to everyone?
>
> Hi,
>
> I don't think Ondrej's suggestion was to remove attribution it was more to
> change the technique how to write it down, wasn't it? My initial consent was
> based on the impression that the current attribution scheme doesn't always
> give credit because people forget to add themselves.

Yep, exactly this. It's not about forgetting, though. When making a small
change, people deliberately don't add themselves, because they feel they
didn't do enough (which they didn't, imho). But if a file was originally
written by one or two authors and then completely refactored by 15
different authors, so that the file no longer really resembles the
original one, do the original authors deserve to be at the top of the
file? I think it's a fair question to ask.

There is no doubt credit needs to be given. Also, no one will remove
the authors field from files without the permission of the author; at
least I would consider this to be very impolite.

Ondrej
