Suggestion about encouraging citations of software


Michael Droettboom

Apr 17, 2015, 10:49:04 AM
to numf...@googlegroups.com, Erik Tollerud, Perry Greenfield, Thomas Robitaille
I thought the Numfocus community may want to know about a recent academic publishing incident involving matplotlib.

TL;DR: We should be encouraging and recommending citations to our software, but explicitly *not requiring* such citations.

I was contacted by the journal publishing arm of a major American scientific society (which will remain nameless to avoid embarrassment on a public mailing list). They believed that, because matplotlib requests a citation in journal articles that use matplotlib in their creation, the journal could not include matplotlib figures in its table of contents, cover pages, or other locations outside the article itself where such a citation would be cumbersome or impossible.  On this basis, they were rejecting publication of a specific article and, even more troubling, considering adding matplotlib to a list of tools that its authors may not use for any purpose.  Obviously, this is completely contrary to the spirit and letter of matplotlib's license, which is intended to promote as wide adoption as possible.  (Some of the confusion also stemmed from the misunderstanding that matplotlib was a provider or repository of content, as opposed to a software tool for creating content. That was easily cleared up, but it's important to remember that our policies and licenses are often read by individuals who are not scientists or software developers.)

While this has been resolved adequately to the best of my understanding, the lesson to share here is that, as various Numfocus projects move to encourage more citation of their software, we should be careful about wording.  IANAL, but I would suggest that any language about citations should clearly state that citations are *recommended and encouraged* (for all the various reasons we already know), but not explicitly *required*.  This is in the spirit of the sort of BSD-style widely-permissive licenses that are the norm in our community.

Cheers,
Michael Droettboom

Brian Granger

Apr 17, 2015, 12:18:00 PM
to numf...@googlegroups.com, Erik Tollerud, Perry Greenfield, Thomas Robitaille
Wow, thanks, I would have never imagined this!


--
Brian E. Granger
Cal Poly State University, San Luis Obispo
@ellisonbg on Twitter and GitHub
bgra...@calpoly.edu and elli...@gmail.com

Jacob Barhak

Apr 17, 2015, 2:56:39 PM
to numf...@googlegroups.com

Hi Michael,

The incompatibility you describe between the scientific publication system and emerging ideas and tools is not new.

Academic publications often follow outdated rules and concepts that hinder scientific progress rather than promote it.

There are too many illnesses in the current scientific publication system that need fixing. This is no longer a question of respecting the publisher; it is a question of testing whether the publisher is working properly.

I regularly get comments on the basis of citations not being "scientific". It is a plague where tradition stops progress. If your citation is correct and well documented, then it can be used in a scientific argument. If a venue cannot publish this argument, then there is always an alternative venue. Contemporary technology has simplified publishing so that almost anyone can do it.

So try to stick to those venues that work well, rather than to those that are "considered respected".

This is where venues that follow public non-blind review policies have an advantage in credibility.

         Jacob


Benjamin Root

Apr 17, 2015, 3:24:23 PM
to numf...@googlegroups.com
Jacob,

I almost 100% agree with you (I have had reviewers skewer my manuscript for publishing a github link to my source code, claiming that I must have been offloading documentation of my procedures to the source code).

Where I disagree is that, in this instance, the publisher was going to put matplotlib on a blacklist (I can't even believe such a thing exists!). If that had happened, and others had their papers rejected because of it, or simply saw free software on a blacklist of technologies, it would give a bad impression of those technologies. As producers of open source software, we need to continue to counter these misguided attempts to cast open-source software as "bad" and untouchable.

I do agree that we should also be pushing for wider use of open access publications, though.

Cheers!
Ben Root

Jacob Barhak

Apr 17, 2015, 4:45:50 PM
to numf...@googlegroups.com
Thanks Ben,

Let me touch upon your black list issue.

For the sake of argument: if I do not like you and put you on my black list, does it affect you in any way if I do not share it with others?

A black list becomes effective only if it is shared. And sharing it is a form of publication. So eventually we end up with competing publication mechanisms. For a black list to reduce your exposure, it has to publish better than you. This is simplified, yet the idea remains that it is hard to stop information from flowing: you can delay it for a while, or try to exhaust your opponent, yet ideas flow.

And since we are in the NumFocus group I will add that even the SciPy organizers learned that the hard way when the communications officer tried to stop discussion on their list a few years ago. Information just kept flowing in other channels.

So if you want your tools/ideas to be used, just out-publish your opposition. You will always have people not agreeing with you. Yet a black list is probably just a scaring mechanism.

If you turn away in fear then the mechanism was effective. It is very effective against those who really should not do something and are trying to game the system. Yet if a black list is ever used against someone who was correct, then publication will nullify those who control the black list.

So yes, we agree, publishing is important and technology today allows us to do so. And open systems are great from that perspective.

            Jacob

Anthony Scopatz

Apr 19, 2015, 7:02:11 PM
to numf...@googlegroups.com
Hello All,

Mike, first off, that sounds awful.  It sounds like they didn't read or understand the license at all.

I'll fully out myself as the SciPy 2014 communication chair whom Jacob refers to.  SciPy 2014 is well and done and I am not associated with SciPy 2015 at all. 

Yes, I requested that a conversation on the publication and review of the SciPy proceedings be stopped, well after it had ceased to be helpful.  Several other SciPy organizers (volunteers themselves) had threatened to leave over that thread.  The quality of the conference, and the existence of the proceedings at all, were threatened.

Everyone has a right to the freedom of expression (which was had). Everyone also has the right to be free from harassment.  No attempt at censorship was ever made; the thread should still be up and archived and no one was banned from any list for any reason.

I think that the thing that everyone needs to understand is that the scientific computing community is a sprawling, international, multicultural organization. If we are to accomplish our technical and social goals (like not having publishers ban our high-quality figures), we have to maintain our attitude of respect and pluralism and our focus on the scientific and technical problems.

Have things always been perfect? No. Was the SciPy 2014 issue the only time this has happened to us? Far from it. However, the good interactions *vastly* outweigh the bad.

Feel free to disagree, but when a conversation loses mutual respect, I am happy to note that it has probably gone too far and request that it end in favor of more constructive discourse.

Be Well
Anthony

Jacob Barhak

Apr 20, 2015, 12:57:38 AM
to numf...@googlegroups.com

Thanks Anthony,

It is about time that this issue got revealed publicly. To put things into perspective, the stopped discussion was about the publication approach of SciPy 2014 after 2013 changes.

This example shows that our internal divides do us damage and make our own argument against other publication paradigms weaker.

The correct action in that specific case we mention would have been to continue discussion and reveal opinions. Then decide and make the decision process public. This is appropriate to the open source community.

One advantage of open source that transfers to scientific publications is that ideas do not go away - they stay until someone finds them useful.

And I believe this is what we strive for,  either by publishing code, publishing a scientific paper, making a presentation,  or even writing to this list.

And do note that SciPy 2011 and even 2012 had the correct publication and review approach to support those ideas and eliminate the negative issues discussed in this thread. If other scientists adopt those publication and review approaches, we will all see benefits.

        Jacob

Nathaniel Smith

Apr 20, 2015, 2:20:53 AM
to numf...@googlegroups.com
Hi Jacob,

On Sun, Apr 19, 2015 at 9:57 PM, Jacob Barhak <jacob....@gmail.com> wrote:
> Thanks Anthony,
>
> It is about time that this issue got revealed publicly. To put things into
> perspective, the stopped discussion was about the publication approach of
> SciPy 2014 after 2013 changes.
>
> This example shows that our internal divides do us damage and make our own
> argument against other publication paradigms weaker.
>
> The correct action in that specific case we mention would have been to
> continue discussion and reveal opinions. Then decide and make the decision
> process public. This is appropriate to the open source community.

As an uninvolved third-party reading this, your comments raise a lot
of red flags for me. I don't know anything about the specific
conversation that Anthony intervened in, or what his intervention
specifically was, so I can't say anything to that. But it is
absolutely the case that -- as a general principle -- having healthy
and respectful conversations online often requires the presence of
formal or informal moderators willing to step in when things get out
of line. This has nothing to do with censorship (as you say, people
can and will continue to say whatever they like in other places); it
has to do with setting an individual community's standard of
discourse. If your "open source" publishing paradigm requires we allow
unchecked harassment etc. as a matter of principle, then please count
me out.

Honestly, though, from where I'm sitting, it kinda sounds like your
motivation here maybe isn't really principle-based, and might really
be that you're still angry about this event that happened last year.
Whether or not you're an injured party in this specific case, and
whether or not Anthony made the right decision (like I said, I have no
idea of the specifics), it's pretty inappropriate to try and derail an
unrelated conversation on a mostly-unrelated organization's mailing
list into a referendum on your pet issue. If you have constructive
suggestions for how the SciPy conference should organize itself then
it would be more productive to take it up with them.

-n

--
Nathaniel J. Smith -- http://vorpus.org

Nathaniel Smith

Apr 20, 2015, 2:36:42 AM
to numf...@googlegroups.com, Erik Tollerud, Perry Greenfield, Thomas Robitaille
On Fri, Apr 17, 2015 at 7:49 AM, Michael Droettboom <mdb...@gmail.com> wrote:
> While this has been resolved adequately to the best of my understanding, the
> lesson to share here is that, as various Numfocus projects move to encourage
> more citation of their software, we should be careful about wording. IANAL,
> but I would suggest that any language about citations should clearly state
> that citations are *recommended and encouraged* (for all the various reasons
> we already know), but not explicitly *required*. This is in the spirit of
> the sort of BSD-style widely-permissive licenses that are the norm in our
> community.

This is an old issue that has caused many clashes between the academic
and free software worlds -- e.g., "citation clauses" in licenses get
explicitly called out in the Debian legal FAQ as making software
non-free (= non-open-source) according to Debian's definition. (And
Debian-legal along with Fedora legal are more or less the de facto
arbiters of what the phrases free and open-source mean.) See 10g:
https://people.debian.org/~bap/dfsg-faq.html
It is very very very definitely a bad thing to mention anything about
citations in your actual license text; licenses are not for saying
random things you would like to happen, they are for saying what you
want to sue people over.

Being wary of such things, the phrasing I've been putting in my
software docs is:

"If you use this software in work that leads to a scientific
publication, and feel that a citation would be appropriate, then here
is a possible citation: ..."

Todd

Apr 20, 2015, 3:01:54 AM
to numf...@googlegroups.com

That is good. Here is another phrasing I thought of:

matplotlib claims no copyright on figures created using matplotlib and puts no requirements or restrictions on their use. When feasible, we would appreciate it if you cite matplotlib when using it in a publication.  However, this is completely voluntary.

Jacob Barhak

Apr 20, 2015, 3:17:43 AM
to numf...@googlegroups.com

So Nathaniel,

Numfocus is affiliated with SciPy somewhat. In fact I joined this list after Scipy 2012.

The discussion here is relevant somewhat since it deals with publication and rejection of ideas. And software consists of ideas coded. And we do want to publish our work.

This thread started from incompatibility of publication paradigms. Specifically software licenses and scientific publication. 

What I am suggesting here is adopting the ideas of public non-blind review, pre- and post-publication, as a community standard.

And yes,  I am pressing this idea quite relentlessly. Let it be the main topic here.
If we unite under this idea,  it will solve many issues for us and influence others.

Again, I believe SciPy 2011 and even 2012 had the right model.

        Jacob

Todd

Apr 20, 2015, 3:36:35 AM
to numf...@googlegroups.com


On Apr 20, 2015 9:17 AM, "Jacob Barhak" <jacob....@gmail.com> wrote:
>
> So Nathaniel,
>
> Numfocus is affiliated with SciPy somewhat. In fact I joined this list after Scipy 2012.
>
> The discussion here is relevant somewhat since it deals with publication and rejection of ideas. And software consists of ideas coded. And we do want to publish our work.
>
> This thread started from incompatibility of publication paradigms. Specifically software licenses and scientific publication. 
>
> What I am suggesting here is adopting the ideas of public non-blind review, pre- and post-publication, as a community standard.
>
> And yes,  I am pressing this idea quite relentlessly. Let it be the main topic here.
> If we unite under this idea,  it will solve many issues for us and influence others.
>
> Again, I believe SciPy 2011 and even 2012 had the right model.
>
>         Jacob

You can't just unilaterally declare that a thread should be on a different topic and expect everyone to go along with it, especially when other people consider the existing topic to be worth discussing. It is very easy to start a new thread on the relevant mailing list, so please do so.

I can't speak for others, but I for one consider the fact that one or more journals are refusing figures created in matplotlib very serious indeed, and your insistence that we should stop discussing it and talk about your pet issue instead is not helping your cause.

Jacob Barhak

Apr 20, 2015, 4:06:48 AM
to numf...@googlegroups.com

So Todd,

Your correction to the license wording is in order. Yet even with this correction, you will find many other incompatibilities with the scientific publication system.

If you want software to be effectively published, you have to change more things than a license. And this requires some level of unity in the community.

Yet a license is a good start and may help.

         Jacob
