Currently, *.py files in SAGE usually contain the names of the people who wrote them.
Karl Fogel's well-known book Producing Open Source Software discourages
that:
http://producingoss.com/en/managing-volunteers.html#territoriality
In particular:
People sometimes argue in favor of author or maintainer tags in source
files on the grounds that this gives visible credit to those who have
done the most work there. There are two problems with this argument.
First, the tags inevitably raise the awkward question of how much work
one must do to get one's own name listed there too. Second, they
conflate the issue of credit with that of authority: having done work
in the past does not imply ownership of the area where the work was
done, but it's difficult if not impossible to avoid such an
implication when individual names are listed at the tops of source
files. In any case, credit information can already be obtained from
the version control logs and other out-of-band mechanisms like mailing
list archives, so no information is lost by banning it from the source
files themselves.
Ondrej
Martin
--
name: Martin Albrecht
_pgp: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x8EF0DC99
_www: http://www.informatik.uni-bremen.de/~malb
_jab: martinr...@jabber.ccc.de
I think the idea of the OSS book (and also mine) is to use the
Mercurial revision history for tracking who wrote each line.
It's a matter of psychology. For example, when I modified calculus.py by
adding a new function and a few lines, I didn't feel I should add
myself at the top of the file. When another 15 people like me
modify calculus.py, the author list will be quite misleading.
Mercurial nevertheless provides an exact history of who wrote what.
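As a rough illustration of reading per-line credit out of Mercurial, here is a minimal sketch that tallies lines per user from the output of `hg annotate -u`. It assumes each output line has the form "<user>: <source line>"; the helper names count_lines_per_author and annotate_file are made up for this example and are not part of Sage or Mercurial. Note that annotate credits whoever last touched a line, so code that was merely moved is attributed to the mover.

```python
import subprocess
from collections import Counter


def count_lines_per_author(annotate_output):
    """Tally lines per user from `hg annotate -u` style output.

    Assumption: each line looks like "<user>: <source line>".
    """
    counts = Counter()
    for line in annotate_output.splitlines():
        user, sep, _ = line.partition(":")
        if sep:  # skip anything that does not match the assumed format
            counts[user.strip()] += 1
    return counts


def annotate_file(path):
    """Run `hg annotate -u` on a tracked file and return its text output."""
    result = subprocess.run(
        ["hg", "annotate", "-u", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Hand-written sample in the assumed format (hypothetical user names).
    sample = (
        "ondrej: def f(x):\n"
        "ondrej:     return x + 1\n"
        "   was: def g(x):\n"
        "   was:     return 2 * x\n"
    )
    for user, n in count_lines_per_author(sample).most_common():
        print(user, n)
```

Inside a Mercurial checkout, count_lines_per_author(annotate_file("calculus.py")) would give the same kind of tally for a real file.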
Ondrej
I'm not an expert on Mercurial, but I assumed that the Mercurial
revision history is available to people using sage.math. What if
Joe Shmoe decides to redistribute SAGE on
www.cool-free-math-software.com? Will users of his version of SAGE also
be able to see who contributed what? I personally would like to see this
information in the distribution, but maybe no one else does.
I agree with maybe 60% of producingoss.com and disagree with maybe
40% of that book, at least for Sage. The above is an example of some
of the many things in that book that I disagree with (at least for Sage).
> People sometimes argue in favor of author or maintainer tags in source
> files on the grounds that this gives visible credit to those who have
> done the most work there.
I definitely argue that.
> There are two problems with this argument.
> First, the tags inevitably raise the awkward question of how much work
> one must do to get one's own name listed there too.
Just because something raises a question doesn't mean that something
is a mistake to do! The answer in the case of Sage is that if a person
feels they've done enough to explicitly list themselves in an AUTHOR block,
e.g., for a function, for a file, whatever, then they've done enough. Full
stop.
By the way, a great example of a file that makes good use of the AUTHOR
blocks is:
http://www.sagemath.org/hg/sage-main/file/7110a20969c8/sage/rings/bernoulli_mod_p.pyx
Finally, I have to say that in mathematics research at least, author
credit is *everything*. It is by far the most important commodity there
is. To argue for banning explicit credit listings in code is frankly a
very stupid waste of valuable gold. I've seen *precisely* this sort of
thing be enforced with Magma in some cases, and it seriously alienated
certain people, present company included. If somebody feels strongly
enough to put
AUTHOR:
name (date) -- summary of what they did
in a function docstring, then they deserve that right.
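To make the convention concrete, here is a minimal sketch of a function whose docstring carries an AUTHOR block, plus a small helper that pulls the block back out. The function double, the author "Jane Doe", the date, and the helper author_lines are all hypothetical and only illustrate the layout; they are not taken from Sage.

```python
import inspect


def double(x):
    """
    Return twice x.

    AUTHOR:
        -- Jane Doe (2007-10-01): initial version
    """
    return 2 * x


def author_lines(obj):
    """Return the lines that follow an AUTHOR header in obj's docstring."""
    doc = inspect.getdoc(obj) or ""
    found = []
    in_block = False
    for line in doc.splitlines():
        stripped = line.strip()
        if stripped.startswith("AUTHOR"):
            in_block = True  # the credit lines start on the next line
            continue
        if in_block:
            if not stripped:
                break  # a blank line ends the block
            found.append(stripped)
    return found
```

Calling author_lines(double) returns the credit lines as a list of strings, which is the same information a user sees when asking for the function's help.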
> Second, they
> conflate the issue of credit with that of authority: having done work
> in the past does not imply ownership of the area where the work was
> done, but it's difficult if not impossible to avoid such an
> implication when individual names are listed at the tops of source
> files.
What's wrong with some implied ownership!? That's actually
very very important. For example, to take a concrete situation, Robert
Miller and Emily Kirkman spent a huge amount of time during the last
year writing graph theory code. Their names are clearly listed in
AUTHOR blocks at the tops of files. I've done some minor reorganization
of docstrings and code, but definitely don't feel I should be listed -- it's
their part of Sage. Now suppose some talented enthusiastic person, e.g.,
named Jason Grout, comes along and starts submitting patches all over
the place for graph theory. It's clear what should happen -- Robert and
Emily should get notified, get first dibs to referee, etc., until after a
while Jason starts getting so confident he lists his name under AUTHOR,
and he should be consulted too.
Moreover, "implied ownership" really isn't an issue with Sage, beyond the
basic respect it should entail, since the whole culture of the project is
that anybody can work on any part of the system, as long as they
_respectfully_ post patches, get them refereed, etc.
> In any case, credit information can already be obtained from
> the version control logs and other out-of-band mechanisms like mailing
> list archives, so no information is lost by banning it from the source
> files themselves.
There is a significant barrier to entry in getting credit information from
version control logs, and they can be very misleading (e.g., in the case
of moving chunks of code around).
Again, I strongly disagree with removing all the AUTHOR: blocks from
the Sage docstrings. I think doing this would
(1) stupidly ignore a huge amount of what makes Sage work,
(2) remove a valuable mechanism for getting a quick sense of
who the main people are who consider themselves serious contributors
to a file or function,
(3) raise the barrier to *giving* people credit for their work, and
(4) go completely contrary to the tradition in mathematics, where
credit is everything -- which is why most mathematical objects are
named after people (e.g., Bernoulli numbers, Tate curves, etc.).
When I, as some random mathematician, type
sage: bernoulli_mod_p?
I get
...
AUTHOR:
-- David Harvey (2006-08-06)
This tells me something very useful immediately - that there's a
real specific person behind this code, maybe somebody I'll see at a
conference soon and thank for their function, ask further questions
about it, etc. etc. If as a random mathematician I had to rely on
clicking around with Mercurial to get info like that it would never
happen, and I probably wouldn't trust what I see anyways.
-- William
I agree!
John
--
John Cremona
I don't think I agree with this, but it's true that I am not doing
mathematics. I think if someone devises something new, some new
algorithm or the like, it's fine to put their name on it, but if it's
just code, I see it as just code, nothing more. Clearly there are
successful projects, like Apache, that use this strategy (see the
link I posted in my first email), so I don't think those people are
stupid. I think both ways can work, so I just wanted to discuss this.
I myself don't list my name in any functions or files I write, neither in
SymPy nor in other projects, mainly because I believe they are the work of
many people and it's not fair to list just some. But anyway, I just wanted
to know what you think about it.
Ondrej
Some of the code I've written for Sage, like the Cython BinaryTree implementation, is in the public domain, because it's totally naive, and I get exactly what I need out of it, so I don't care what people do with it. Other code, like the JavaScript AJAX interface the notebook uses, is the product of many hours of hard work and months of experimentation, and *of course* that file has my name in the copyright block.
It doesn't matter if you're writing "math" or "just code". Something that I don't think William mentioned is that it's really good to know who to go to if something is busted or incomprehensible. When I put my name in the AUTHORS: block of a function or file, I'm saying, "come to me if there's a problem in what I did." IMO, credit is equal parts pride, responsibility and respect.
Hi,
I don't think Ondrej's suggestion was to remove attribution; it was more to
change the way it is recorded, wasn't it? My initial agreement was
based on the impression that the current attribution scheme doesn't always
give credit because people forget to add themselves.
Yep, exactly this. It's not about forgetting. When making a small
change, people deliberately don't add themselves, because they feel they
didn't do enough (which they didn't, imho). But if a file was originally
written by one or two authors, but then completely refactored by 15
different authors, so that later the file doesn't really resemble the
original one, do the original authors deserve to be at the top of the
file? I think it's a fair question to ask.
There is no doubt credit needs to be given. Also, no one will remove
the authors field from files without the permission of the author; at
least I would consider this to be very impolite.
Ondrej