[SciPy-User] Central File Exchange for SciPy

O

Oct 24, 2010, 10:43:35 AM
to scipy...@scipy.org

Hi everyone,

I'm a recent convert from MatLab.

One thing I miss is the Central File Exchange.  Are there any plans
to set up a site like this for our community?  It occurs to me this could
dramatically strengthen our user base.  And by a "Central File Exchange"
I mean something far simpler and less formal than "SciKits", where users
can just post their code with some information about how it works.

Just a thought.

Cheers,

O


josef...@gmail.com

Oct 24, 2010, 12:20:37 PM
to SciPy Users List

There are easier ways to publish your code than a scikit. If it's just
a single module, then the cookbook is a good location:
http://www.scipy.org/Cookbook . For anything larger, setting
up a simple python package for pypi is relatively easy, e.g.
numdifftools, which is a translation of a matlab fileexchange program,
coauthored with the file exchange author.

The main thing I'm missing compared to the file exchange is the
comment and starring system, which reduces the time to check out a new
package a lot. Also, compared to matlab developers, python developers
often keep public source control repositories, which makes
finding "recipes" easier, but again finding something specific is a
bit of a random search. A good but incomplete overview is in
http://www.scipy.org/Topical_Software

Maybe we could extend the purpose of ask.scipy.org to package
review/commenting and package "advertising". But I haven't found a
search button on it yet.

Josef


_______________________________________________
SciPy-User mailing list
SciPy...@scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-user

denis

Oct 29, 2010, 12:15:50 PM
to scipy...@scipy.org
> On Sun, Oct 24, 2010 at 10:43 AM, O <mondif...@gmail.com> wrote:
> > I'm a recent convert from MatLab.
> > One thing I miss is the Central File Exchange.  Are there any plans
...

O,
can you describe a bit which parts of Matlab Central you want most ?
As Josef says, Scipy uses mail forums, ask.scipy.org,
stackoverflow ...
to answer questions pretty fast. Do you / do other users want

- package reviews / comments / advertising --
for new users, for experts ? Examples from Matlab ?

- overviews of major areas, along the lines of Wikipedia articles
with links to detailed doc and recipes ?

I believe that Scipy experts get more points for new stuff
and for answering questions than they would for either of these.
Matlab has a different reward system:

"I believe in Art, but my manager believes in money,
and who am I to argue with such a baboon ?"
-- Groucho Marx

cheers
-- denis

Almar Klein

Oct 29, 2010, 4:53:26 PM
to SciPy Users List
I agree with the OP. The Matlab file exchange is a great tool for developers to easily publish small (but also larger) pieces of code, and for people to search for particular code.

As to what features I think such a system should have: It should be a repository capable of storing thousands of entries, which should be indexed, and categorized so that users can easily find code they are looking for. A review system and comments would also be nice.

One 'problem' is that many Python developers who have a great tool publish it as an open source project, on googlecode for example. Maybe such projects could be entered in the database as well, with a reference to the googlecode website for the code itself.

I think it is a great idea, as it would help unite the Python scientific community. One (maybe the only) disadvantage I found when converting from Matlab to the Holy Language, is that the Python world seems a bit fragmented; you have to download Python from www.python.org, then numpy and scipy from scipy.org, etc. A repository of the likes of the Matlab file exchange would put at least all non standard code in a single place, which would be a big advantage.

But who's going to set up and maintain such a big project, and who's going to pay for the server?

Cheers,
  Almar

Alan G Isaac

Oct 29, 2010, 6:07:26 PM
to SciPy Users List
On 10/29/2010 4:53 PM, Almar Klein wrote:
> The Matlab file exchange is a great tool for developers to easily publish small (but also larger) pieces of code

The Cookbook holds small pieces of code:
http://www.scipy.org/Cookbook

fwiw,
Alan Isaac

Joshua Holbrook

Oct 29, 2010, 6:10:28 PM
to SciPy Users List
I'm surprised nobody's mentioned pypi yet: http://pypi.python.org/pypi

--Josh

Almar Klein

Oct 30, 2010, 7:44:45 AM
to SciPy Users List
On 30 October 2010 00:07, Alan G Isaac <alan....@gmail.com> wrote:
> On 10/29/2010 4:53 PM, Almar Klein wrote:
>> The Matlab file exchange is a great tool for developers to easily publish small (but also larger) pieces of code
>
> The Cookbook holds small pieces of code:

Yes, but it wouldn't really work if hundreds (or thousands) of people would submit pieces of code.

  Almar

Gerrit Holl

Oct 30, 2010, 8:02:46 AM
to SciPy Users List

Why not add it to scipy? Or if it doesn't fit put it somewhere and
link it from http://www.scipy.org/Topical_Software

Gerrit.

Gael Varoquaux

Oct 30, 2010, 8:07:48 AM
to SciPy Users List
On Sat, Oct 30, 2010 at 02:02:46PM +0200, Gerrit Holl wrote:
> >> The Cookbook holds small pieces of code:
> >> http://www.scipy.org/Cookbook

> > Yes, but it wouldn't really work if hundreds (or thousands) of people would
> > submit pieces of code.

> Why not add it to scipy?

Because code requires maintenance, releases, and quality assurance. If
thousands of people start pushing code in scipy, they need to help with
all of these things.

The goal of a repo with no guarantees like Matlab Central is to lower the
barrier to sharing code, at the cost of giving up on any guarantees.

> Or if it doesn't fit put it somewhere and
> link it from http://www.scipy.org/Topical_Software

That means building software packages, which is also more work than simply
dumping code on a webpage.

Gaël

Pauli Virtanen

Oct 30, 2010, 8:14:41 AM
to scipy...@scipy.org
Sat, 30 Oct 2010 14:02:46 +0200, Gerrit Holl wrote:
[clip]

>> Yes, but it wouldn't really work if hundreds (or thousands) of people
>> would submit pieces of code.
>
> Why not add it to scipy?

Because Scipy should only receive general-purpose and good-quality code,
and its release cycle is not that fast.

On purpose:

The scope of Scipy is mainly to contain "basic tools for numerical
scientific computation".

On quality:

What you typically have at first is "research quality code" --- it works
for your particular problem, but it might not do everything necessary,
may actually be a poor way to solve the problem, you are not 100% sure it
has no bugs, and you haven't tested it for other problems. Refining it
from this point onwards takes quite a bit of effort.

On speed:

You typically would like to publish your code now and not wait for a year
before it's out.

--
Pauli Virtanen

Matthew Brett

Oct 30, 2010, 8:15:06 AM
to SciPy Users List
Hi,

>> Or if it doesn't fit put it somewhere and
>> link it from http://www.scipy.org/Topical_Software
>
> That means building software packages, which is also more work than simply
> dumping code on a webpage.

It's probably worth pointing out that most Matlab utilities are Matlab
only (no extensions), and the code dumped is usually just an archive
that you unpack somewhere and put on your Matlab path.

That's the rough equivalent of a python package that is pure python,
and for which the install method is copying or linking the <mypackage>
directory into some directory on your python path.

I can imagine something like a 'snippet' distribution format, which is
just a README file, and the <mypackage> directory. Obviously if
someone wanted to be more pypi about the whole thing, that would be
easy too.
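
A minimal sketch of the copy-onto-the-path install described above. Everything named here is hypothetical (the temporary directory, the package name `mypackage`, the `greet` function, and the `install_snippet` helper); it only illustrates that a README plus a package directory is enough for Python to import the code once the parent directory is on the path.

```python
import importlib
import os
import sys
import tempfile


def install_snippet(parent_dir):
    """Make every package under parent_dir importable (no setup.py involved)."""
    if parent_dir not in sys.path:
        sys.path.insert(0, parent_dir)
    # Ensure the import system notices directories created after startup.
    importlib.invalidate_caches()


# Build a throwaway "snippet": a README plus a <mypackage> directory.
snippet_root = tempfile.mkdtemp()
package_dir = os.path.join(snippet_root, "mypackage")
os.mkdir(package_dir)
with open(os.path.join(snippet_root, "README"), "w") as fh:
    fh.write("Unpack anywhere and add this directory to your Python path.\n")
with open(os.path.join(package_dir, "__init__.py"), "w") as fh:
    fh.write("def greet():\n    return 'hello from mypackage'\n")

install_snippet(snippet_root)
mypackage = importlib.import_module("mypackage")
print(mypackage.greet())
```

In practice the same effect is usually obtained by setting the PYTHONPATH environment variable rather than editing `sys.path` in code.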

See y'all,

Matthew

John

Oct 30, 2010, 8:15:28 AM
to SciPy Users List
+1 on pypi, but it needs some features added to it. Creating something
between Matlab file exchange and vim scripts would be ideal.

I've just checked out ask.scipy.org, but this doesn't seem to be what the OP
is after. Furthermore, is there really no search feature here??

I think what pypi needs to fill this void is some features such as:

1) a separation of simple scripts versus packages (so really we need
PyScript... also)

2) a good rating system

3) a good comment system

--john

--
Configuration
``````````````````````````
Plone 2.5.3-final,
CMF-1.6.4,
Zope (Zope 2.9.7-final, python 2.4.4, linux2),
Python 2.6
PIL 1.1.6
Mailman 2.1.9
Postfix 2.4.5
Procmail v3.22 2001/09/10
Basemap: 1.0
Matplotlib: 1.0.0

John

Oct 30, 2010, 8:21:01 AM
to SciPy Users List
Speaking of pypi, does anyone know if there is a way, as a package
distributor, to see where the downloads came from? It would be interesting
information.

-john

josef...@gmail.com

Oct 30, 2010, 8:21:33 AM
to SciPy Users List
On Sat, Oct 30, 2010 at 8:07 AM, Gael Varoquaux
<gael.va...@normalesup.org> wrote:
> On Sat, Oct 30, 2010 at 02:02:46PM +0200, Gerrit Holl wrote:
>> >> The Cookbook holds small pieces of code:
>> >> http://www.scipy.org/Cookbook

there is also the python cookbook (the interface looks closer to
stackoverflow now)
http://code.activestate.com/recipes/tags/numeric/

>
>> > Yes, but it wouldn't really work if hundreds (or thousands) of people would
>> > submit pieces of code.
>
>> Why not add it to scipy?
>
> Because code requires maintenance, releases, and quality assurance. If
> thousands of people start pushing code in scipy, they need to help with
> all of these things.
>
> The goal of a repo with no guarantees like Matlab Central is to lower the
> barrier to sharing code, at the cost of giving up on any guarantees.

The big advantage in my view of the matlab file exchange is the
ability to comment on and rate an existing package, and to fork it if it
looks like it can be improved, with an attribution link and "has inspired"
links. And given that it is all (new code) clearly defined as BSD, it
is safe to do so.

This improves the quality control problem for the user quite a bit.

The problem with pypi and "Topical Software", and as seen in the
question on neural networks, is, for example, that dead and active
projects are indistinguishable without finding the source repository
and checking the updates.

Without user contributed commenting it is a lot of work to maintain a
list, see the (non)speed in cleaning up dead links on the Topical
page.

Josef

Almar Klein

Oct 30, 2010, 8:25:54 AM
to SciPy Users List


On 30 October 2010 00:10, Joshua Holbrook <josh.h...@gmail.com> wrote:
> I'm surprised nobody's mentioned pypi yet: http://pypi.python.org/pypi

You're right. Pypi already has quite a few of the required features.
But still, for some reason I cannot put my finger on, Matlab central feels nicer.


On 30 October 2010 14:15, John <wash...@gmail.com> wrote:

> I think what pypi needs to fill this void is some features such as:
>
> 1) a separation of simple scripts versus packages (so really we need
> PyScript... also)
>
> 2) a good rating system
>
> 3) a good comment system

These features might indeed be improved a bit.

  Almar


josef...@gmail.com

Oct 30, 2010, 9:39:46 AM
to SciPy Users List
On Sat, Oct 30, 2010 at 8:15 AM, Matthew Brett <matthe...@gmail.com> wrote:
> Hi,
>
>>> Or if it doesn't fit put it somewhere and
>>> link it from http://www.scipy.org/Topical_Software
>>
>> That means building software packages, which is also more work than simply
>> dumping code on a webpage.
>
> It's probably worth pointing out that most Matlab utilities are Matlab
> only (no extensions), and the code dumped is usually just an archive
> that you unpack somewhere and put on your Matlab path.
>
> That's the rough equivalent of a python package that is pure python,
> and for which the install method is copying or linking the <mypackage>
> directory into some directory on your python path.
>
> I can imagine something like a 'snippet' distribution format, which is
> just a README file, and the <mypackage> directory.  Obviously if
> someone wanted to be more pypi about the whole thing, that would be
> easy too.

For plain python packages, "paster create" provides a full package
structure after just answering a few questions

>paster create --list-templates
Available templates:
basic_package: A basic setuptools-enabled package
complete: Complete, documentable, testable Python project template
...

Just to be more pypi about it.
(The only explanation a very short google search provides, is how it
can be used for zope templates
http://plone.org/documentation/kb/use-paster)

Josef

william ratcliff

Oct 30, 2010, 11:00:47 AM
to SciPy Users List
If we could automate it, how much do you think the bandwidth/hosting costs would be per month? Would it be restricted to just code (that is just text files, cutoff above a certain size)? No bug tracking and a simple rating system for packages? A section for comments about a given package. The submitter gives it up to 4 tags (for searching) and we start out with a given list of topics and let people add additional ones later? People register for an account (to reduce spam) or do we just use Openid or Openauth? How do we deal with spam? Do we allow people to sort packages by date? Rating? Would people want to use Django? What would we call it? I'd be willing to purchase a domain name and pay for hosting on webfaction to try it out. If it gets too pricey then I may have to ask for help later. (I think we should avoid ads). What would you guys like to call it?

PythonCentral (is that infringing?)?  ScipyExchange?    

If anyone wants to help mock up a prototype in Django, I have some time next week.  I have no design skills ;> 

Finally, licensing--I don't want to start a flame war or anything, but can we agree to make code on the site BSD, or should we allow the submitter to pick an open source license.  If so, do we follow googlecode for the choice of license?

One last question (sorry for so many),  given how many people already have nice projects on github, sourceforge, googlecode, etc., should we provide an option for people to simply link to their repository rather than provide us with a direct copy of the code?  Actually, one model could be that people host their code somewhere else and we merely provide an aggregation service so people can easily see what's out there in the scientific python universe and how the community has rated a given package.   That way, developers can keep their existing codebases without changing their workflow....

William

Pauli Virtanen

Oct 30, 2010, 12:51:58 PM
to scipy...@scipy.org
Sat, 30 Oct 2010 11:00:47 -0400, william ratcliff wrote:
> If we could automate it, how much do you think the bandwidth/hosting
> costs would be per month?

No idea. Probably the traffic wouldn't be too much, at least at first.

> No bug tracking and a simple rating system for packages?

Yes.

> A section for comments about a given package.

Perhaps with a possibility to up/downvote comments?

> The submitter gives it up to 4 tags (for searching) and we
> start out with a given list of topics and let people add additional ones
> later?

Yes. It might be useful to try to follow PyPi style classifiers here, and
extend them as needed.

> People register for an account (to reduce spam) or do we just use
> Openid or Openauth? How do we deal with spam?

Email verification on registration + spam flagging by users +
rel=nofollow in comments?

> Do we allow people to sort packages by date? Rating?

Yes and yes.

> Would people want to use Django?

Django will get the job done, and it's on the easier end of the spectrum
of Python web frameworks. I'd pick it.

> I'd be willing to purchase a domain name and pay for hosting on
> webfaction to try it out. If it gets too pricey then I may have to
> ask for help later. (I think we should avoid ads).

One possibility might be to ask if Enthought would be interested in
sponsoring such a thing, and running it on the scipy.org servers. But
that's for later, when there's actually something working to show.

> What would you guys like to call it?
> PythonCentral (is that infringing?)? ScipyExchange?

Well, it might be worth targeting it at the scientific audience, so the
name choice should reflect that. Also, I'd avoid clone-ish names.

> If anyone wants to help mock up a prototype in Django, I have some time
> next week. I have no design skills ;>

I know some Django.

> Finally, licensing--I don't want to start a flame war or anything, but
> can we agree to make code on the site BSD, or should we allow the
> submitter to pick an open source license. If so, do we follow
> googlecode for the choice of license?

I'd believe allowing the submitter to pick an open-source license for
bigger packages could be useful.

However, for code snippets we might want to enforce BSD.

> One last question (sorry for so many), given how many people already
> have nice projects on github, sourceforge, googlecode, etc., should we
> provide an option for people to simply link to their repository rather
> than provide us with a direct copy of the code? Actually, one model
> could be that people host their code somewhere else and we merely
> provide an aggregation service so people can easily see what's out there
> in the scientific python universe and how the community has rated a
> given package. That way, developers can keep their existing codebases
> without changing their workflow....

Here, it would be best to not forget that we already have the scikits.*
namespace packages, and scikits.appspot.com. How that web app works, is
that people just upload a package named scikits.something on PyPi, and
the portal picks it up from there.

The new system should be a "spiritual successor" to scikits, with more
features etc., and a friendly hosting option for small snippets.

So yes, it should definitely allow externally hosted packages, especially
PyPi. Perhaps it would even be useful to automatically import science
packages (including scikits) from PyPi. The package entry should also be
usable only as an "advertisement" for a package, with the package itself
being hosted elsewhere. (Here, users should be able to flag broken links
etc.)

Another thing that should be considered: the system should enforce that
metadata is entered: package descriptions should be sufficiently
detailed, a suitable number of tags should be entered, etc.
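
A rough sketch of how such metadata enforcement might look. The thresholds and the function name `validate_submission` are assumptions made for illustration; only the "up to 4 tags" limit comes from William's proposal earlier in the thread.

```python
MIN_DESCRIPTION_CHARS = 80   # assumed threshold, not taken from the thread
MIN_TAGS, MAX_TAGS = 1, 4    # "up to 4 tags" was William's suggestion


def validate_submission(description, tags):
    """Return a list of problems; an empty list means the entry is acceptable."""
    problems = []
    if len(description.strip()) < MIN_DESCRIPTION_CHARS:
        problems.append("description too short")
    # Normalise tags so that "FFT" and " fft " count as the same tag.
    unique_tags = {t.strip().lower() for t in tags if t.strip()}
    if not MIN_TAGS <= len(unique_tags) <= MAX_TAGS:
        problems.append("need between %d and %d tags" % (MIN_TAGS, MAX_TAGS))
    return problems


print(validate_submission("FFT helper", []))
```

A submission form could refuse to save an entry until this list comes back empty, and show the problems to the submitter.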

--
Pauli Virtanen

O

Oct 30, 2010, 1:01:57 PM
to scipy...@scipy.org
William,

Bravo! If you decide to follow through on this, I think it could be *huge*. One other question is whether to do it for just scipy or python generally. I think people really need a place to deposit snippets of useful code w/rating system and commentary. If it takes off, I think it would attract lots more people to Python and scipy/numpy.

I agree with all of your suggestions.  Maybe contact python.org to see if you can get a link to it.  As for what to call it, "PythonExchange" is a third option. 

With respect to licensing, I'd allow the submitters to choose, and supply guidelines on the site about how to do this properly in their code (if they want to).

Allowing people to link to projects they have elsewhere is an excellent idea.

O (phaustus)








Almar Klein

Oct 30, 2010, 1:00:15 PM
to SciPy Users List
On 30 October 2010 17:00, william ratcliff <william....@gmail.com> wrote:
> If we could automate it, how much do you think the bandwidth/hosting costs would be per month? Would it be restricted to just code (that is just text files, cutoff above a certain size)? No bug tracking and a simple rating system for packages? A section for comments about a given package. The submitter gives it up to 4 tags (for searching) and we start out with a given list of topics and let people add additional ones later? People register for an account (to reduce spam) or do we just use Openid or Openauth? How do we deal with spam? Do we allow people to sort packages by date? Rating? Would people want to use Django? What would we call it? I'd be willing to purchase a domain name and pay for hosting on webfaction to try it out. If it gets too pricey then I may have to ask for help later. (I think we should avoid ads). What would you guys like to call it?

Woaw, I like your enthusiasm! However, let's first establish whether we should discard Pypi or if we can maybe make it suitable for our needs with a few changes (assuming that the rest of the Python community lets us make these changes).

One maybe-downside is that Pypi is for Python in general. Is this a problem, do we want something purely for science and engineering?



> PythonCentral (is that infringing?)? ScipyExchange?

If we're doing this, I guess it'd be science focused, so I suggest a name with a reference to science or scipy.

 
> If anyone wants to help mock up a prototype in Django, I have some time next week. I have no design skills ;>
>
> Finally, licensing--I don't want to start a flame war or anything, but can we agree to make code on the site BSD, or should we allow the submitter to pick an open source license. If so, do we follow googlecode for the choice of license?

Given that Python is mainly BSD oriented, I would vote for making all code hosted at the site BSD. Maybe larger projects that are only referenced (as you also suggested) could choose their own license.

(I actually own two non-BSD projects because I did not fully understand the value/importance of the BSD license in the Python world. I was recently convinced by a wise man and will convert both my projects to BSD.)
 

> One last question (sorry for so many), given how many people already have nice projects on github, sourceforge, googlecode, etc., should we provide an option for people to simply link to their repository rather than provide us with a direct copy of the code? Actually, one model could be that people host their code somewhere else and we merely provide an aggregation service so people can easily see what's out there in the scientific python universe and how the community has rated a given package. That way, developers can keep their existing codebases without changing their workflow....

I definitely think this is a good idea. The site would then serve the role as the central place to search for scientific Python projects, without the need for people to host their projects at two locations.

  Almar

PS: While writing this, Pauli also sent his response. I'm happy to see that we agree on most topics :)



denis

Oct 30, 2010, 1:32:56 PM
to scipy...@scipy.org
Folks,
that's quite a few good comments in various directions.
It looks to me as though we want a combination of two things:

- a database of numpy/scipy packages, searchable on tags, date ...
- a web front end for user comments on packages.

These exist to some extent, as concrete models:
for the database part,
http://code.google.com/hosting/search?q=label%3Anumpy&projectsearch=Search+projects
looks reasonable, but is for code.google packages only.
For commenting, I like the stackoverflow (== Solace ?) realtime html
subset.
There must be other rating/commenting systems as partial models ?

On rating: book ratings, mostly 5-star, are worthless
unless the reviewer can articulate why.
So I'd leave number ratings out;
people can say "no doc" or "excellent doc" etc. in the text.

On Pypi: imho way too big and old, improvements take years.
Scipy is plenty -- start small.


So what's next, towards a version 0 of "Scipy Review":

- find an expert who's built a db + web interface or two --
perhaps the Ask.scipy.org people ?
- 1-page spec (matter of taste, I like a paper spec first).

cheers
-- denis

Gael Varoquaux

Oct 30, 2010, 1:42:18 PM
to SciPy Users List
On Sat, Oct 30, 2010 at 07:00:15PM +0200, Almar Klein wrote:
> Woaw, I like your enthusiasm! However, let's first establish whether we
> should discard Pypi or if we can maybe make it suitable for our needs with
> a few changes (assuming that the rest of the Python community lets us make
> these changes).

Yes, I am impressed by the positive attitude that this thread is taking.
Congratulations for that, and for offering time and energy.

I think the idea in general is an excellent one. In addition to what has
already been said, here are a few remarks, in random order:

* Don't disregard PyPI for well-maintained packages: we need the
non-scientific Python community, let us not break apart from them.
Besides, scientific users also need packages to read XML, talk over
the network... On the other hand, for simple cookbook-like stuff,
I believe that there might be some value in having a scipy-specific
repository.

* Searching is important, but can be made easily using a Google
custom search. For example, I have been impressed by what Fabian
did for the scikit-learn's website,
http://scikit-learn.sourceforge.net/, on the top right of the
webpage. Try it out, it is cooler than it seems.

* License: let's at least force users to choose an OSI-compatible
license. I would try to push them toward the BSD license, as in my
experience many people choose GPL by default, but I would not
enforce this choice.

* Self hosting, bandwidth... this should not stop anybody from starting
wild ideas. http://docs.scipy.org started as a crazy idea hosted on
my girlfriend's server (and coded by Stefan, Pauli and herself :P).
It migrated to Enthought-hosted servers when it became more
'production-ready'.

By the way, speaking of migration of service, now that ask.scipy.org is
in production, we should add a link to it on the sidebar of the scipy.org
website, the docs frontpage, and the planet. I can do it for the planet,
so if someone does it for the scipy.org wiki, I'll just copy the design.

In general, I think that it is important that all these websites be
linked together.

As Eleftherios would say 'Go, go team'

Gaël

josef...@gmail.com

Oct 30, 2010, 2:18:40 PM
to SciPy Users List

I would strongly recommend (to users) that all shorter code, snippets
and recipes, are BSD by default or made explicit by the user, and that
the license is very easy to see on the web page.
Given that we are writing BSD code and to avoid any conflicts, I
essentially ignore all non-BSD code, for example on the matlab file
exchange.

Josef

Almar Klein

Oct 30, 2010, 3:00:23 PM
to SciPy Users List
> I would strongly recommend (to users) that all shorter code, snippets
> and recipes, are BSD by default or made explicit by the user, and that
> the license is very easy to see on the web page.
> Given that we are writing BSD code and to avoid any conflicts, I
> essentially ignore all non-BSD code, for example on the matlab file
> exchange.

Hear hear! Since most Python code is BSD licensed, a module/package using a non-BSD-compatible license (for example GPL) would be incompatible with, well, almost all Python code. This may sound trivial, but I, for one, did not fully understand this until someone explained it.

I would even go so far as to force a BSD license for all code hosted on the site itself. Referenced code can then still choose a license. At the very least there should be a proper explanation that people should choose the BSD license in most cases, and *why*.

  Almar

william ratcliff

Oct 30, 2010, 4:38:15 PM
to SciPy Users List

How about "scisnippets"?  I'll try to start a prototype tomorrow so we have something tangible.

Filipe Pires Alvarenga Fernandes

Oct 30, 2010, 5:43:32 PM
to SciPy Users List

Well even the "original" file exchange ended up forcing BSD licensing:

http://www.mathworks.com/matlabcentral/FX_transition_faq.html

To me that's the first limitation for considering Pypi similar to a
possible file exchange.
Don't get me wrong, I'm a big fan of Pypi, but it is far more complex
than a file exchange (That's the second limitation).


Fernando Perez

Oct 30, 2010, 6:22:56 PM
to SciPy Users List
On Sat, Oct 30, 2010 at 8:00 AM, william ratcliff
<william....@gmail.com> wrote:
> If we could automate it, how much do you think the bandwidth/hosting costs
> would be per month?   Would it be restricted to just code (that is just text
> files, cutoff above a certain size)?  No bug tracking and a simple rating
> system for packages? A section for comments about a given package.  The
> submitter gives it up to 4 tags (for searching) and we start out with a
> given list of topics and let people add additional ones later?  People register
> for an account (to reduce spam) or do we just use Openid or Openauth?
> How do we deal with spam?  Do we allow people to sort packages by date?
>  Rating?  Would people want to use Django?   What would we call it?  I'd be
> willing to purchase a domain name and pay for hosting on webfaction to try
> it out.  If it gets too pricey then I may have to ask for help later. (I
> think we should avoid ads).    What would you guys like to call it?

Just a few comments from the sidelines... I think it would be really
great if every snippet had an automatic version control history
associated with it. For me, the gist model at github is perfect in
this regard. Consider for example (random gist I found that had numpy
in it):

http://gist.github.com/364369

This very simple page has all the code, a download button, space for
comments, revision history and a 'fork' button. The last two for me
are very, very important: they plant the seed that allows a simple
script to very easily grow into something larger. The author has an
easy way to make improvements and track those (with near-zero setup
overhead), and the 'fork' button makes it easy for others to
contribute.

For multi-file projects, the obvious counterpart is a real
repo (github or whatever).

I know it may feel a little harsh to push a specific version control
system, but to me the idea of binding revision history and forking
support as an integral part of a 'file exchange' is actually
important. I think that we should try not just to replicate matlab's
file exchange website, but rather to do better. And I think that
pervasive version control 'as a way of life' is actually one
ingredient in the right direction.

In any case, there's zero chance that I'll do any actual work on this,
so consider this idle chat from the peanut gallery :) I'll be happy
to use anything those actually putting in the real elbow grease can
come up with.

Regards,

f

Almar Klein

Oct 30, 2010, 6:54:03 PM
to SciPy Users List
On 30 October 2010 22:38, william ratcliff <william....@gmail.com> wrote:

> How about "scisnippets"?  I'll try to start a prototype tomorrow so we have something tangible.

That sounds a bit too narrow. From how I see it, we can distinguish three or four submit-categories (I'm not sure whether the first two should be combined):
  * snippets (few lines of code)
  * modules / scripts (single file)
  * packages (multiple files)
  * referenced project (linking to say googlecode or github)
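
To make the four categories concrete, here is a small sketch of how a site might sort an incoming submission into one of them. The names `SubmissionKind` and `classify`, the 30-line snippet cutoff, and the example URL are all assumptions for illustration, not anything agreed in this thread.

```python
from enum import Enum


class SubmissionKind(Enum):
    SNIPPET = "snippet"        # few lines of code
    MODULE = "module"          # single file
    PACKAGE = "package"        # multiple files
    REFERENCE = "reference"    # link to an externally hosted project


SNIPPET_MAX_LINES = 30  # assumed cutoff between a snippet and a module


def classify(files=None, url=None):
    """files maps filename -> source text; url points at an external project."""
    if url and not files:
        return SubmissionKind.REFERENCE
    if not files:
        raise ValueError("a submission needs either files or a url")
    if len(files) > 1:
        return SubmissionKind.PACKAGE
    (source,) = files.values()
    if len(source.splitlines()) <= SNIPPET_MAX_LINES:
        return SubmissionKind.SNIPPET
    return SubmissionKind.MODULE


print(classify(url="http://example.org/project"))
print(classify(files={"smooth.py": "def smooth(x):\n    return x\n"}))
```

If snippets and single-file modules end up being merged into one category, only the final branch of `classify` would need to change.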

Thanks for taking this on, William. I really think this is a great project and I think this can become something big. Sadly, I'm not in a position to do any actual work, as I have no experience with Zope, and have not much free time to spare with my 5 month old son at home :)

Cheers,
  Almar


Jochen Schröder

Oct 30, 2010, 6:59:34 PM
to SciPy Users List

Let me first say that I love your idea and the enthusiasm you've already
created.

However I really take issue with the above statement and the notion of
forcing a specific OSS licence choice onto users. First your statement
above is factually not correct: GPL is a BSD compatible licence (in the
usual meaning of this phrase), i.e. you can include BSD code in a GPL
project. You can also do otherwise, however then your project
effectively becomes GPL.
Secondly, as for the argument that most Python code is already BSD, one
could just as well argue that most OSS code is GPL, so use GPL.
Furthermore your argument also ignores the fact that if you're using
(ctypes, cython) wrappers around C-code you will probably be bound by
the licence of the C-library so some code might not have a choice.

Finally the biggest problem I have is with the notion that forcing a
specific OSS choice onto developers is ok. If someone chooses a licence,
they have a reason to do so and it is their choice. The funny thing is
that the "free software crowd", often gets accused of this, however I've
found that often the BSD crowd is a lot worse, and often quite hostile
towards GPL licensing. Anyway I don't want to start a licence flamewar.

Now, an option to restrict search to common OSS licences: I'm all for that!

Cheers
Jochen

Matthew Brett

Oct 30, 2010, 7:19:15 PM
to SciPy Users List
Hi,

> However I really take issue with the above statement and the notion of
> forcing a specific OSS licence choice onto users. First your statement
> above is factually not correct: GPL is a BSD compatible licence (in the
> usual meaning of this phrase)

BSD is GPL compatible (you can include BSD code in a GPL project and
still stay GPL) but not the other way round.

You'll see that matlab file exchange does _enforce_ BSD - from the
page cited above:

http://www.mathworks.com/matlabcentral/FX_transition_faq.html

"Can I ask you to consider [my favorite license]?
No. For consistency, the BSD license will be the standard for the File
Exchange. "

Of course it's OK in general to have other licenses, but maybe as an
exception for this site, and thus linked-to rather than hosted.

Best,

Matthew

william ratcliff

Oct 30, 2010, 7:36:13 PM
to SciPy Users List
Let me think about how to implement the auto-repo part. In the meantime, what about something along the lines of:

William

David Cournapeau

Oct 30, 2010, 7:37:01 PM
to SciPy Users List
2010/10/31 Jochen Schröder <cyco...@gmail.com>:

This is for code snippets - if you want to choose a specific license,
then nobody forces you not to use it. We just don't support it through
code snippets.

David

Nathaniel Smith

Oct 30, 2010, 8:43:39 PM
to SciPy Users List
On Sat, Oct 30, 2010 at 3:22 PM, Fernando Perez <fpere...@gmail.com> wrote:
> Just a few comments from the sidelines... I think it would be really
> great if every snippet had an automatic version control history
> associated with it.  For me, the gist model at github is perfect in
> this regard.  Consider for example (random gist I found that had numpy
> in it):
>
> http://gist.github.com/364369
>
> This very simple page has all the code, a download button, space for
> comments, revision history and a 'fork' button.  The last two for me
> are very, very important: they plant the seed that allows a simple
> script to very easily grow into something larger.  The author has an
> easy way to make improvements and track those (with near-zero setup
> overhead), and the 'fork' button makes it easy for others to
> contribute.

gist.github.com is *really* slick, but... I'm guessing it wouldn't be
so easy to reimplement for someone who hasn't just implemented github?
And it seems to me that the sort of people who use git (i.e., people
with a substantial investment of time and mental energy in "real
programming") are already pretty well supported by existing
infrastructure. I'm not going to be working on this either, so this is
also from the peanut gallery, but... if I *were* doing this project,
my focus would be on achieving exactly two things as quickly as
possible:

1) A minimum ceremony way for your average scientific programmer to
get some useful code they wrote online. Maximum five steps (or fewer
would be better!): a) log-in, b) type some text about what the snippet
does, c) check the box saying yeah they understand what BSD means, d)
paste in the code, e) hit submit. Maybe there should be some extra
optional steps for richer metadata or whatever, but srsly, you cannot
make "understand the GPL" or "know what git is" or "fill out this
complicated form to specify tags in our obscure Trove ontology"
prerequisites for scientific programmers to contribute.

2) Solid one-stop-shopping support for scientific code. (If you do
this right, then everyone will use the site, and then it's what
they'll think of when they have something useful to upload!) That
means, a good search function for all the snippets that have been
uploaded. It also means the search function needs to know about
"proper" packages -- searching for "wavelets" should find pywt, etc.
I'm not sure if that's best done by searching pypi directly, or by
having people explicitly enter pointers to scientific software into
the database -- I'd probably do the latter because it's both quicker
to implement and would keep the search results much more focused. And
for real one-stop-shopping, searches should be able to find functions
embedded inside larger packages (so e.g. searching for matrix
exponential should give you a hit on scipy.linalg.expm). I guess this
means, index the documentation for at least numpy and scipy, and maybe
the docs for other packages as they get added?
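
The two priorities above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Entry` record, the `Exchange` class, and its `submit`, `add_pointer`, and `search` methods are hypothetical names that appear nowhere in the thread, and the in-memory list stands in for whatever database a real site would use.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One searchable item: a pasted snippet or a pointer to existing software."""
    title: str
    description: str
    kind: str          # "snippet" or "pointer"
    code: str = ""     # pasted source, empty for pointers
    target: str = ""   # e.g. "pywt" or "scipy.linalg.expm", empty for snippets


class Exchange:
    """Hypothetical in-memory store; a real site would use a database."""

    def __init__(self):
        self.entries = []

    def submit(self, title, description, code, accepts_bsd):
        # Steps b) to e) of the list above: text, licence checkbox, code, submit.
        if not accepts_bsd:
            raise ValueError("the BSD checkbox must be ticked before submitting")
        if not code.strip():
            raise ValueError("no code was pasted")
        entry = Entry(title, description, "snippet", code=code)
        self.entries.append(entry)
        return entry

    def add_pointer(self, title, description, target):
        # Hand-entered pointer to a package or to a function inside one.
        entry = Entry(title, description, "pointer", target=target)
        self.entries.append(entry)
        return entry

    def search(self, query):
        # Naive search: every query word must occur in title, description or target.
        words = query.lower().split()
        hits = []
        for entry in self.entries:
            text = " ".join([entry.title, entry.description, entry.target]).lower()
            if all(word in text for word in words):
                hits.append(entry)
        return hits
```

With pointers entered for `pywt` and `scipy.linalg.expm`, a call such as `Exchange.search("matrix exponential")` would return the pointer entry alongside any matching snippets, which is the one-stop-shopping behaviour described in point 2. Indexing the numpy and scipy documentation automatically is left out of the sketch.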

Obviously there are lots of enhancements one can imagine -- tracking
of multiple versions of the same snippet, discussions, syntax
highlighting, finding related snippets, git support, etc. etc., and
there are lots more ideas in this thread -- but I'd start by lasering
in on those two features, work hard on making the fundamentals as
useful as possible, and then build up from there.

Hope that's useful,
-- Nathaniel

Fernando Perez

Oct 30, 2010, 9:00:14 PM
to SciPy Users List
On Sat, Oct 30, 2010 at 5:43 PM, Nathaniel Smith <n...@pobox.com> wrote:
> gist.github.com is *really* slick, but... I'm guessing it wouldn't be
> so easy to reimplement for someone who hasn't just implemented github?
> And it seems to me that the sort of people who use git (i.e., people
> with a substantial investment of time and mental energy in "real
> programming") are already pretty well supported by existing
> infrastructure.

Well, part of the beauty of gist is that you don't have to set up
*any* version control client-side if you don't want to. gist is
literally copy/paste, finished. They do the VC for you, and you can
later use it if you want to, and people can fork it if they want to,
but you don't have to.

The reason I'd like that is that it would increase the likelihood that
small snippets would actually get improved over time in a reusable
way, rather than in people's personal collections.

But you're absolutely right in that it's probably a lot of work unless
you've already created something like github, and most certainly not
the first priority at all. Just a wish :)

Cheers,

f

Joshua Holbrook

Oct 30, 2010, 9:04:05 PM
to SciPy Users List
Can you use gists directly? Just a thought.

--Josh

Fernando Perez

Oct 30, 2010, 10:01:30 PM
to SciPy Users List
On Sat, Oct 30, 2010 at 6:04 PM, Joshua Holbrook
<josh.h...@gmail.com> wrote:
> Can you use gists directly? Just a thought.
>

Well, sure, but gists are buried in all the code at github. One can
search for numpy/scipy, but at that point we might as well just use
google.

I really like the ideas being discussed, so we have something with a
specific scipy focus and other things suggested (description, a
filename, perhaps the ability to upload a figure, tags, etc). I
simply was saying that in addition to those --not instead of those--
gist-like functionality would be great to have.

But a tool with a scipy focus is more important than something like gist.

Joshua Holbrook

Oct 30, 2010, 10:11:06 PM
to SciPy Users List
What I mean is more like, what if we could make a tool that basically
tracks and organizes gists?

Just an idea. I agree that it takes back seat to the essentials of a
python file exchange kinda thing, so if this would just make things
harder, disregard it.

--Josh

Fernando Perez

Oct 30, 2010, 10:14:54 PM
to SciPy Users List
On Sat, Oct 30, 2010 at 7:11 PM, Joshua Holbrook
<josh.h...@gmail.com> wrote:
> What I mean is more like, what if we could make a tool that basically
> tracks and organizes gists?
>
> Just an idea. I agree that it takes back seat to the essentials of a
> python file exchange kinda thing, so if this would just make things
> harder, disregard it.
>

Ah, I think I misunderstood you, sorry. Yes, using gist as the
'backend' could work... I guess if it's possible to use a cross-site
authentication solution (OpenID or one of those things, I'm not very
familiar with those tools), then one option would be to offer upon
upload a checkbox 'Create gist for your contribution?'. If checked,
the system could automatically create the relevant gist and display a
prominent link to it on the snippet's page, fetch the code for display
and download from gist (getting versioning), etc.

I don't know how hard/practical it would be, but it's certainly an
intriguing idea.

Fernando Perez

Oct 30, 2010, 10:32:09 PM
to SciPy Users List
2010/10/30 Jochen Schröder <cyco...@gmail.com>:

> Secondly, as for the argument that most Python code is already BSD, one
> could just as well argue that most OSS code is GPL, so use GPL.

It's not an argument of majority, it's one of free flow of code across
projects and of reciprocity and fairness.

The GPL has an asymmetric relationship re. BSD code: gpl projects can
incorporate all the bsd code they want, but bsd projects can't
incorporate gpl code (without relicensing, which is often impossible
when there are many copyright holders, and is in any case a major
burden on a project). This asymmetry is at the heart of this
discussion: numpy, scipy, matplotlib, mayavi, ipython and most of the
open source projects around here are BSD-licensed and it means we can
all freely share code across all of them (and we do, very often,
freely copy pieces from one to the other as needed, this is not a
hypothetical statement). In fact, I relicensed ipython from its early
LGPL license (the one that I'm probably happiest with *personally*) to
BSD precisely based on this argument of free flow of code across
projects, made by John Hunter at the time. And I'm glad I did, as
we've been able to copy code at various points in time across projects
without any worries.

When an author takes a piece of BSD code, modifies or builds upon it,
and makes the new work available as GPL (something I've sadly seen
done many times), he's most certainly *not* behaving in a spirit of
reciprocity towards the author of the original BSD code. The BSD
author can no longer benefit from the improvements to his code:
despite the fact that those improvements remain open source, they are
no longer available to him unless he relinquishes his original license
terms and switches to the GPL. I find that practice actually worse
than building proprietary extensions on open source code, because when
this is done typically companies at least are doing some other
business-related stuff that the open source developers are unlikely to
engage in.

> Furthermore your argument also ignores the fact that if you're using
> (ctypes, cython) wrappers around C-code you will probably be bound by
> the licence of the C-library so some code might not have a choice.

In this case obviously there's no choice and no argument either, but I
don't think anyone here is ignoring it, as it's the most basic ground
truth of any licensing discussion.

> Finally the biggest problem I have is with the notion that forcing a
> specific OSS choice onto developers is ok. If someone chooses a licence,
> they have a reason to do so and it is their choice. The funny thing is
> that the "free software crowd", often gets accused of this, however I've
> found that often the BSD crowd is a lot worse, and often quite hostile
> towards GPL licensing. Anyway I don't want to start a licence flamewar.

Nobody is *forcing* anything onto anyone. A community is free to say:
if you want to use our tools, these are our terms. This is a
community that shares code under the terms of the BSD license and sets
up a website for that purpose. The rest of the whole internet is
available to anyone who wishes to publish GPL improvements to Numpy
and Scipy, just not on the Scipy servers :)

My personal opinion is that in the long run, it would be beneficial to
have this 'file exchange' have BSD-only code (or public domain, since
employees of the US Federal government as far as I understand must
publish their codes under public domain terms). The reason is simple:
snippets put there, when good, are prime candidates for integration
into numpy/scipy proper. It would be a shame, and frankly somewhat
absurd, to have a bunch of great codes sitting on the scipy server
that we couldn't integrate into scipy. At least it seems so to me...

Cheers,

f

josef...@gmail.com

Oct 30, 2010, 10:47:49 PM
to SciPy Users List

Or for integration into other Scipy related packages.

That's exactly my opinion on this.

Josef

David Warde-Farley

Oct 31, 2010, 2:13:15 AM
to SciPy Users List

On 2010-10-30, at 10:14 PM, Fernando Perez wrote:

> Ah, I think I misunderstood you, sorry. Yes, using gist as the
> 'backend' could work... I guess if it's possible to use a cross-site
> authentication solution (OpenID or one of those things, I'm not very
> familiar with those tools), then one option would be to offer upon
> upload a checkbox 'Create gist for your contribution?'. If checked,
> the system could automatically create the relevant gist and display a
> prominent link to it on the snippet's page, fetch the code for display
> and download from gist (getting versioning), etc.
>
> I don't know how hard/practical it would be, but it's certainly an
> intriguing idea.


I read this and had a slightly different idea: a mechanism for collecting gists and code snippets from sites like github. Basically a gist-indexing service. You submit a link to your gist, maybe some tags, and we fetch data about it, maybe have a fairly rapid moderation queue to make sure people aren't submitting spam.

Ideally, we'd be able to use GitHub gists, or whatever the equivalent mechanism is for several other sites (bitbucket, gitorious, pastebin.com). GitHub provides a gists API which makes machine-readable info about the gist available via JSON, so even just a URL and tags would be enough and you could fetch the rest via API.
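
A minimal sketch of that indexing step, in Python. Several things here are assumptions rather than anything stated in the thread: the endpoint is that of the present-day GitHub REST API (`https://api.github.com/gists/<id>`), the payload fields read are `id`, `description`, `files`, `html_url` and `updated_at`, and the helper names `gist_id_from_url`, `api_url` and `index_record` are hypothetical. No network call is made; `index_record` takes JSON that has already been fetched and decoded.

```python
import re

# Assumed URL shapes: "http://gist.github.com/364369" or
# "https://gist.github.com/<user>/<id>"; the id is the last path component.
_GIST_URL = re.compile(r"^https?://gist\.github\.com/(?:[\w-]+/)?([0-9a-fA-F]+)/?$")


def gist_id_from_url(url):
    """Return the gist id from a submitted link, or None if it is not a gist URL."""
    match = _GIST_URL.match(url.strip())
    return match.group(1) if match else None


def api_url(gist_id):
    # Endpoint of the present-day GitHub REST API (an assumption for this sketch).
    return "https://api.github.com/gists/" + gist_id


def index_record(payload, tags):
    """Build a hypothetical index record from the decoded JSON of one gist."""
    return {
        "id": payload["id"],
        "description": payload.get("description") or "",
        "files": sorted(payload.get("files", {})),  # file names only
        "link": payload.get("html_url", ""),
        "updated_at": payload.get("updated_at", ""),
        "tags": sorted({t.strip().lower() for t in tags if t.strip()}),
        "approved": False,  # stays False until a moderator clears it
    }
```

The `approved` flag is one simple way to model the moderation queue mentioned above; other hosting sites would each need their own URL pattern and payload mapping.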

Above all, I really like the gist.github.com model of forkable version-controlled snippets (under the hood). It would be nice if there was a software package that supported this that we could deploy on our own, but barring that, I think organizing topical gists from sites that *do* offer this model is a close second.

David

Fernando Perez

Oct 31, 2010, 2:24:20 AM
to SciPy Users List
On Sat, Oct 30, 2010 at 11:13 PM, David Warde-Farley
<ward...@iro.umontreal.ca> wrote:
> I read this and had a slightly different idea: a mechanism for collecting gists and code snippets from sites like github. Basically a gist-indexing service. You submit a link to your gist, maybe some tags, and we fetch data about it, maybe have a fairly rapid moderation queue to make sure people aren't submitting spam.

That sounds quite doable and a nice balance of features and
implementation effort. People could still just paste in their code,
as we want to make it easy for anyone who doesn't use
github/bitbucket/etc to still be able to contribute with nothing more
than a copy/paste. But if they do have already an account with such a
service, that would be the preferred mode of operation.

Cheers,

f

David Warde-Farley

Oct 31, 2010, 2:35:56 AM
to SciPy Users List
On 2010-10-30, at 6:59 PM, Jochen Schröder wrote:

> Finally the biggest problem I have is with the notion that forcing a
> specific OSS choice onto developers is ok.

Nobody's forcing them to release their code under any particular license -- merely suggesting that if you want to make your code available *on our servers* it's got to be liberally licensed. This isn't really any different from sites like Wikipedia, where you must consent to your contributions being used under the terms of the GFDL. If you don't like Wikipedia's terms you are free to make your content available elsewhere on the web, and the same goes for the hypothetical code-sharing site.

> If someone chooses a licence, they have a reason to do so and it is their choice.

Actually you'd be surprised how often developers who release GPL code are unaware of the implications or license specifics, and just chose it because it's popular.

> The funny thing is that the "free software crowd", often gets accused of this,
> however I've found that often the BSD crowd is a lot worse, and often quite
> hostile towards GPL licensing.

That shouldn't really be a surprise. As Fernando pointed out more eloquently than I can, the asymmetric nature of the two licenses means that developers and users of BSD-licensed code have nothing to gain and everything to lose in this situation. While there are no technical or legal grounds for objection, there are good reasons to be annoyed by it.

David

william ratcliff

Oct 31, 2010, 3:13:50 AM
to SciPy Users List
Would a first version without the gist feature be useful?  We could try to add it in later once we see if the service is actually being used.   One question about gist though---what is the overall benefit?  If the original poster can edit the code is that sufficient?   

Also, how does the use of gist affect comments?  For example, suppose someone posts some code and people make comments about the original code and those are incorporated--aren't the comments then out of sync with the code?   Are the comments then synced with a given revision of the code?

Is the fork option useful for our context, or is a simple option to download the current instance of the code sufficient? 

William

David Warde-Farley

Oct 31, 2010, 3:23:56 AM
to SciPy Users List
On 2010-10-31, at 3:13 AM, william ratcliff wrote:

> Would a first version without the gist feature be useful? We could try to add it in later once we see if the service is actually being used. One question about gist though---what is the overall benefit? If the original poster can edit the code is that sufficient?

Basically it outsources the version-tracking burden, in addition to making it possible to use version control tools locally if you want to.

> Also, how does the use of gist affect comments? For example, suppose someone posts some code and people make comments about the original code and those are incorporated--aren't the comments then out of sync with the code? Are the comments then synced with a given revision of the code?


The Github API definitely provides either a SHA-1 hash or a last-modified date, and so recording that when a comment is made would make it easy to flag a comment as being "about a previous version". We'd have to do this bookkeeping for locally stored snippets, too, probably just with a timestamp.
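
A rough Python sketch of that bookkeeping for locally stored snippets. It swaps in a SHA-1 digest of the snippet text where the paragraph above suggests a timestamp, so that the same comparison works for both cases; the function names `revision_id`, `make_comment` and `is_outdated` are hypothetical.

```python
import hashlib


def revision_id(code):
    """SHA-1 hex digest of the snippet text, used as a revision marker."""
    return hashlib.sha1(code.encode("utf-8")).hexdigest()


def make_comment(author, text, current_code):
    # Record which revision of the snippet the comment was written against.
    return {"author": author, "text": text, "revision": revision_id(current_code)}


def is_outdated(comment, current_code):
    """True when the snippet text has changed since the comment was made."""
    return comment["revision"] != revision_id(current_code)
```

A page could then render comments for which `is_outdated` returns True under a "previous version" heading instead of hiding them.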

David

Gael Varoquaux

Oct 31, 2010, 4:46:45 AM
to SciPy Users List
On Sun, Oct 31, 2010 at 12:54:03AM +0200, Almar Klein wrote:
> That sounds a bit too narrow. From how I see it, we can distinguish three
> or four submit-categories (I'm not sure whether the first two should be
> combined):
>   * snippets (few lines of code)
>   * modules / scripts (single file)
>   * packages (multiple files)
>   * referenced project (linking to say googlecode or github)

IMHO, packages should go to PyPI. We need to work together with the wider
Python community to help gain momentum.

Gael Varoquaux

Oct 31, 2010, 4:50:04 AM
to SciPy Users List
On Sun, Oct 31, 2010 at 02:35:56AM -0400, David Warde-Farley wrote:
> > If someone chooses a licence, they have a reason to do so and it is their choice.

> Actually you'd be surprised how often developers who release GPL code are unaware of the implications or license specifics, and just chose it because it's popular.

Yes, I have witnessed this a lot.

Gaël

David Cournapeau

Oct 31, 2010, 4:50:46 AM
to SciPy Users List
On Sun, Oct 31, 2010 at 5:46 PM, Gael Varoquaux
<gael.va...@normalesup.org> wrote:

>
> IMHO, packages should go to PyPI. We need to work together with the wider
> Python community to help gain momentum.

Regardless of anyone's opinion about PyPI, I agree we should not try
to conflate code snippets /cookbook hosting and "real" projects
hosting.

David

Gael Varoquaux

Oct 31, 2010, 4:57:00 AM
to SciPy Users List
On Sun, Oct 31, 2010 at 03:13:50AM -0400, william ratcliff wrote:
> Would a first version without the gist feature be useful?

IMHO: Yes!

I think the gist idea is excellent. Gist is awesome. I should start using
it in my technical blogposts. However, I side with Fernando that the core
scipy-specific functionality is more important than the version-control
sweetness of gist.

Gaël, from the (crowded) peanut gallery

william ratcliff

Oct 31, 2010, 6:05:26 AM
to SciPy Users List
I don't think anyone's considering forking from PyPI :>  However, I think that there is some value in having a central repository for code snippets that are relevant to scientific programmers.  Especially something that is easy for people to contribute to.

josef...@gmail.com

Oct 31, 2010, 7:28:47 AM
to SciPy Users List

One advantage of the matlab fileexchange is its permanence. The
problem I face quite often with topical link collections is that the
original author moves to other things, changes jobs, internet hosting
services and so on, and the source behind the links disappears.

So, even when a gist or similar option for source controlled recipes
is available, I would prefer if there is a local copy of the original
code available. This way it would still be available 5 years later.

Also, peanut gallery, but I think I will be a user and post my recipes
there instead of "spamming" the scipy mailing list with it.

Josef

>
> Gaël, from the (crowded) peanut gallery

Almar Klein

Oct 31, 2010, 9:27:13 AM
to SciPy Users List
On 31 October 2010 09:46, Gael Varoquaux <gael.va...@normalesup.org> wrote:
On Sun, Oct 31, 2010 at 12:54:03AM +0200, Almar Klein wrote:
>    That sounds a bit too narrow. From how I see it, we can distinguish three
>    or four submit-categories (I'm not sure whether the first two should be
>    combined):
>      * snippets (few lines of code)
>      * modules / scripts (single file)
>      * packages (multiple files)
>      * referenced project (linking to say googlecode or github)

IMHO, packages should go to PyPI. We need to work together with the wider
Python community to help gain momentum.

Fair enough, so packages are included by reference only.

  Almar

Bruce Southey

Oct 31, 2010, 10:01:14 AM
to SciPy Users List

I support Fernando's view of the licensing because the whole point of
this 'file exchange' is about sharing. That is, to make life a little
easier for those who are at some mental block (for whatever reason).
Personal views are really irrelevant in a community where people
are freely helping other people without restriction by providing code
(either complete or fixing errors). For that reason, public domain
works best for 'trivial' code and BSD for more complex code since both
allow simple inclusion into numpy/scipy. Yet for really complex code
this is probably not the place and there are other options like
scikits for that.

Bruce

Keith Goodman

Oct 31, 2010, 3:36:46 PM
to SciPy Users List
On Sat, Oct 30, 2010 at 11:13 PM, David Warde-Farley
<ward...@iro.umontreal.ca> wrote:

> Ideally, we'd be able to use GitHub gists, or whatever the equivalent mechanism is for several other sites (bitbucket, gitorious, pastebin.com). GitHub provides a gists API which makes machine-readable info about the gist available via JSON, so even just a URL and tags would be enough and you could fetch the rest via API.

From where I sit, the gallery above the peanut gallery, I think it is
disheartening that we all use github instead of a FOSS version like
gitorious. If the linux kernel did the same (continued to use
proprietary version control) we wouldn't even have git. I wonder
what our choice to use github (along with what seems like a million
other developers) means for the prospects of a gist-like feature in
gitorious (which we could install on our own servers).

Gael Varoquaux

Oct 31, 2010, 3:50:48 PM
to SciPy Users List
On Sun, Oct 31, 2010 at 12:36:46PM -0700, Keith Goodman wrote:
> From where I sit, the gallery above the peanut gallery, I think it is
> disheartening that we all use github instead of a FOSS version like
> gitorious. If the linux kernel did the same (continued to use
> proprietary version control) we wouldn't even have git. I wonder
> what our choice to use github (along with what seems like a million
> other developers) means for the prospects of a gist-like feature in
> gitorious (which we could install on our own servers).

While I have (very) mixed feelings about git, github is really awesome. The
git usability is, in my eyes, horrendous. It is compensated by a
fantastic internal design, and great power. However, I wouldn't be using
git if there was not github: its usability is fantastic and unlike git
there is almost no learning curve[*].

I guess there is a lesson to be learned here: free software tends to get
the design right first and worry about normal users only very late down the
pipeline. We have seen this story repeat over and over. So many people
use Apple computers, even in the free software community.

Gael

[*] I must confess that I hadn't looked at gitorious for a while, and it
seems that it is making a lot of progress in terms of being usable. It's
probably doing better than most code sharing platforms. There is value in
competition!

Jonathan Guyer

Nov 1, 2010, 12:05:03 PM
to SciPy Users List

On Oct 30, 2010, at 10:32 PM, Fernando Perez wrote:

> (or public domain, since
> employees of the US Federal government as far as I understand must
> publish their codes under public domain terms).

Correct.

John Hunter

Nov 1, 2010, 2:05:52 PM
to SciPy Users List
On Sat, Oct 30, 2010 at 9:32 PM, Fernando Perez <fpere...@gmail.com> wrote:

> My personal opinion is that in the long run, it would be beneficial to
> have this 'file exchange' have BSD-only code (or public domain, since
> employees of the US Federal government as far as I understand must
> publish their codes under public domain terms).

The flip side of this is that there are many environments in which the
distinction between GPL and BSD is irrelevant, eg for code we deploy
internally at work and do not distribute. Suppose someone writes some
really nifty code that depends on pygsl. I would rather have access
to it on the file exchange than not. If the code submission dialog
has a choice of licenses with BSD as the default, and selection of
non-BSD takes them to an explanation of why we prefer BSD and an "are
you sure" dialog, then including this code is beneficial in my view.
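
The submission flow described above could look roughly like this in Python. It is a sketch only: `resolve_license`, its parameters and the error message are hypothetical, and a real form would show the explanation page rather than raise an exception.

```python
DEFAULT_LICENSE = "BSD"


def resolve_license(choice=None, confirmed=False):
    """Return the licence to record, or raise if a non-BSD choice is unconfirmed."""
    if choice is None or not choice.strip():
        # Nothing selected: fall back to the default.
        return DEFAULT_LICENSE
    choice = choice.strip()
    if choice.upper() == DEFAULT_LICENSE:
        return DEFAULT_LICENSE
    if not confirmed:
        # Stand-in for the "why we prefer BSD" page and the "are you sure" dialog.
        raise ValueError("non-BSD licence needs an explicit confirmation")
    return choice
```

Calling `resolve_license()` with no arguments yields the BSD default, while a call such as `resolve_license("GPL")` is rejected until it is repeated with `confirmed=True`.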

> The reason is simple:
> snippets put there, when good, are prime candidates for integration
> into numpy/scipy proper. It would be a shame, and frankly somewhat
> absurd, to have a bunch of great codes sitting on the scipy server
> that we couldn't integrate into scipy. At least it seems so to me...

I'm not sure I agree here. Many snippets may be more like elaborate
examples. Something designed to get you started that you can perturb
off of. For some of the stuff it may be farm league for scipy/numpy
inclusion, but there is plenty of room for useful scripts that don't
belong in scipy proper. So I would err on the side of inclusion and
very low barriers to entry.

JDH

josef...@gmail.com

Nov 1, 2010, 2:31:27 PM
to SciPy Users List

Same with code that cannot be BSD either by infection from a part, or
because it is a translation of non-BSD code from another language.
Also, some code on the matlab fileexchange that is labeled BSD might
not be so because it is based on (derived from, translated from, or
inspired by) non-BSD compatible code.

There are also license mixtures like
http://ab-initio.mit.edu/wiki/index.php/NLopt which is only LGPL
because a small part is LGPL:
"Free/open-source software under the GNU LGPL (and looser licenses for
some portions of NLopt)",

Josef

Matthew Brett

Nov 1, 2010, 3:30:04 PM
to SciPy Users List
Hi,

>> My personal opinion is that in the long run, it would be beneficial to
>> have this 'file exchange' have BSD-only code (or public domain, since
>> employees of the US Federal government as far as I understand must
>> publish their codes under public domain terms).
>
> The flip side of this is that there are many environments in which the
> distinction between GPL and BSD is irrelevant, eg for code we deploy
> internally at work and do not distribute.  Suppose someone writes some
> really nifty code that depends on pygsl.  I would rather have access
> to it on the file exchange than not.  If the code submission dialog
> has a choice of licenses with BSD as the default, and selection of
> non-BSD takes them to an explanation of why we prefer BSD and an "are
> you sure" dialog, then including this code is beneficial in my view.

The risk is that people will tend to pick up code snippets from the
file exchange and paste them into their own code. It will be very
easy for them to accidentally pick up GPL code and accidentally
relicense, leading to a viral licensing mess.

If we do go down that route, can I suggest that the pages for the GPL
code snippets have nice red flashing graphics either side saying
'warning - please be aware that including any part of this code in
your code means that all your code has to be GPL'.

See you,

Matthew

Joshua Holbrook

Nov 1, 2010, 3:37:44 PM
to SciPy Users List
Keep in mind, one may always start by insisting on BSD licensing and
see how it goes, then add GPL options later.

--Josh

Robert Kern

Nov 1, 2010, 3:52:31 PM
to SciPy Users List
On Mon, Nov 1, 2010 at 14:30, Matthew Brett <matthe...@gmail.com> wrote:
> Hi,
>
>>> My personal opinion is that in the long run, it would be beneficial to
>>> have this 'file exchange' have BSD-only code (or public domain, since
>>> employees of the US Federal government as far as I understand must
>>> publish their codes under public domain terms).
>>
>> The flip side of this is that there are many environments in which the
>> distinction between GPL and BSD is irrelevant, eg for code we deploy
>> internally at work and do not distribute.  Suppose someone writes some
>> really nifty code that depends on pygsl.  I would rather have access
>> to it on the file exchange than not.  If the code submission dialog
>> has a choice of licenses with BSD as the default, and selection of
>> non-BSD takes them to an explanation of why we prefer BSD and an "are
>> you sure" dialog, then including this code is beneficial in my view.
>
> The risk is that people will tend to pick up code snippets from the
> file exchange and paste them into their own code.   It will be very
> easy for them to accidentally pick up GPL code and accidentally
> relicense, leading to a viral licensing mess.

What viral licensing mess? Accidentally releasing GPLed code as part
of your code does *not* retroactively make the rest of your code GPLed
without your consent. It just means that you distributed the GPLed
code without the proper permission. The remedy for this infringement
is simply to stop distributing the GPLed code. You lose some time and
create some hassle while you fix your code to work without the GPLed
code, but there is absolutely nothing irrevocable about it.

tl;dr You cannot "accidentally relicense" your code. No such thing.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

Matthew Brett

Nov 1, 2010, 3:57:01 PM11/1/10
to SciPy Users List
Hi,

> What viral licensing mess? Accidentally releasing GPLed code as part
> of your code does *not* retroactively make the rest of your code GPLed
> without your consent.

Puzzled to explain, but by 'mess' I mean that, if you want to make
code that does not violate the license terms, you will have to go back
and rip out the GPL parts, and if they've been in there for a while, it
can be a mess.

By 'accidentally relicense' I mean copy GPL code, make some small
changes, and then enter a BSD license without realizing that you've
just radically changed the licensing terms.

I'm not quite sure what misunderstanding you are trying to correct.

See you,

Matthew

Robert Kern

Nov 1, 2010, 4:14:11 PM11/1/10
to SciPy Users List
On Mon, Nov 1, 2010 at 14:57, Matthew Brett <matthe...@gmail.com> wrote:
> Hi,
>
>> What viral licensing mess? Accidentally releasing GPLed code as part
>> of your code does *not* retroactively make the rest of your code GPLed
>> without your consent.
>
> Puzzled to explain, but by 'mess' I mean that, if you want to make
> code that does not violate the license terms, you will have to go back
> and rip out the GPL parts, and if they've been in there for a while, it
> can be a mess.
>
> By 'accidentally relicense' I mean copy GPL code, make some small
> changes, and then enter a BSD license without realizing that you've
> just radically changed the licensing terms.
>
> I'm not quite sure what misunderstanding you are trying to correct.

It seemed like you were saying that one's own code would be
accidentally relicensed to GPL if you included GPLed code. You
ellipsized some critical nouns. :-) And it seemed to me that only this
drastic interpretation would warrant dramatic red flashing warning
signs.

In any event, I would not use "relicensing" to describe accidentally
labeling GPLed code as BSD. Only the copyright holders are able to
relicense. Anyone else going through the same motions just commits an
incorrect statement of fact. One that is usually trivial to discover
since most people, in my experience, do keep a note of where they got
a function from when they do copy-paste a snippet. If something goes
wrong, you want to know who to blame and where to get updates from.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

Christopher Barker

Nov 1, 2010, 4:27:11 PM11/1/10
to SciPy Users List
On 10/30/10 10:00 AM, Almar Klein wrote:
> Woaw, I like your enthusiasm! However, let's first establish whether we
> should discard Pypi or if we can maybe make it suitable for our needs
> with a few changes (assuming that the rest of the Python community lets
> us make these changes).
>
> One maybe-downside is that Pypi is for Python in general. Is this a
> problem, do we want something purely for science and engineering?

I think there is a need for:

1) something focused on scientific/numerical computing

2) something suitable for tiny contributions -- just a page or two of
code -- I don't think we want hundreds of such tiny packages on PyPi.

> Given that Python is mainly BSD oriented, I would vote for making all
> code hosted at the site BSD.

It would be nice to have a public domain option, particularly for
smallish contributions.

> Actually, one model could be that people host their code somewhere
> else and we merely provide an aggregation service so people can
> easily see what's out there in the scientific python universe

I think this is good, but hosting small projects directly is critical.
One of the goals here (my interpretation of following the discussion) is
to make it really easy to throw stuff up.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris....@noaa.gov

Jochen Schröder

Nov 1, 2010, 6:58:20 PM11/1/10
to SciPy Users List
On 02/11/10 06:30, Matthew Brett wrote:
> Hi,
>
>>> My personal opinion is that in the long run, it would be beneficial to
>>> have this 'file exchange' have BSD-only code (or public domain, since
>>> employees of the US Federal government as far as I understand must
>>> publish their codes under public domain terms).
>>
>> The flip side of this is that there are many environments in which the
>> distinction between GPL and BSD is irrelevant, eg for code we deploy
>> internally at work and do not distribute. Suppose someone writes some
>> really nifty code that depends on pygsl. I would rather have access
>> to it on the file exchange than not. If the code submission dialogs
>> has a choice of licenses with BSD as the default, and selection of
>> non-BSD takes them to an explanation of why we prefer BSD and an "are
>> you sure" dialog, then including this code is beneficial in my view.
>
> The risk is that people will tend to pick up code snippets from the
> file exchange and paste them into their own code. It will be very
> easy for them to accidentally pick up GPL code and accidentally
> relicense, leading to a viral licensing mess.
>
> If we do go down that route, can I suggest that the pages for the GPL
> code snippets have nice red flashing graphics either side saying
> 'warning - please be aware that including any part of this code in
> your code means that all your code has to be GPL'.
>
Even if the snippet is licensed BSD you cannot simply copy and paste a
code snippet. You have to include the license and copyright notice of
the original author. So if people simply copy and paste code snippets
without paying attention to the licensing it will end up being a mess
anyway, because they are possibly violating licenses. I don't think
restricting the file exchange to BSD only will make that any different.
Also, with respect to your argument, if people copied some part of the
snippet from somewhere else (possibly a GPL project), and post it as a
snippet under BSD you will end up in the same mess.

I don't want to come across as advertising GPL here, I just don't like
the concept of restricting the file exchange to one license only. People
already gave some examples where the license choice might be determined
not by the author of the snippet (e.g. linking to (L)GPL C code,
including some GPL code...). However these snippets can still be useful
for a lot of people, although they might not be suitable for inclusion
into scipy/numpy.

I also disagree with the idea that restricting everything to BSD will
make licensing simple miraculously. It is not, and people need to be
educated that looking at and following licensing terms is important.

Cheers
Jochen

Christopher Barker

Nov 1, 2010, 7:07:28 PM11/1/10
to SciPy Users List
On 11/1/10 3:58 PM, Jochen Schröder wrote:
> Even if the snippet is licensed BSD you cannot simply copy and paste a
> code snippet. You have to include the license and copyright notice of
> the original author.

Exactly, which is why I think "snippets" are best put in the public
domain. yes, I know that public domain is not a license, and is even a
bit murky legally, but for small little chunks of code:

"I'm putting this out there without claiming copyright -- do with it
what you will"

really is appropriate.

It's more or less what we all do when we post a little code snippet on
this list in response to a question.

-Chris

ps: IANAL, blah, blah.

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris....@noaa.gov

Alan G Isaac

Nov 1, 2010, 7:10:27 PM11/1/10
to SciPy Users List
On 11/1/2010 7:07 PM, Christopher Barker wrote:
> I think "snippets" are best put in the public
> domain.

http://creativecommons.org/about/cc0

fwiw,
Alan Isaac

william ratcliff

Nov 1, 2010, 7:12:24 PM11/1/10
to SciPy Users List
Doesn't cc have an attribution clause?

Matthew Brett

Nov 1, 2010, 7:20:54 PM11/1/10
to SciPy Users List
Hi,

> Even if the snippet is licensed BSD you cannot simply copy and paste a
> code snippet. You have to include the license and copyright notice of
> the original author. So if people simply copy and paste code snippets
> without paying attention to the licensing it will end up being a mess
> anyway, because they are possibly violating licenses.

My point is, that the more accessible the interface, the more likely
it is that people will indeed copy and paste without taking note of
the license. You can easily imagine the situation, you're working on
some problem, you come across the code, it's short, you paste it as a
function into your code to get something going. A while later, you
find you've done some adaptations, you've written some supporting
functions, and, using the flexible and intuitive new interface, you
upload your snippet for other people to use. By that time, you've
forgotten that the original was GPL. Someone else sees your
function, perhaps notes that it is now (incorrectly) BSD, picks it up,
puts it into a larger code-base, and so on and so on.

Now, if the original code is BSD (and so is all the other code), you
are breaking the terms of the original license by not including the
original copyright notice, but you can easily fix that by - including
the copyright notices. If the original code is GPL, you'll have a
hell of a time trying to work out what code that you and other people
wrote was in fact based on the original code, and you'd likely give up
and change your license to GPL.

Best,

josef...@gmail.com

Nov 1, 2010, 7:23:15 PM11/1/10
to SciPy Users List
On Mon, Nov 1, 2010 at 7:12 PM, william ratcliff
<william....@gmail.com> wrote:
> Doesn't cc have an attribution clause?

http://wiki.creativecommons.org/CC0_FAQ#Does_CC0_require_others_who_use_my_work_to_give_me_attribution.3F

Josef

Christopher Barker

Nov 1, 2010, 7:26:54 PM11/1/10
to SciPy Users List
On 11/1/10 4:12 PM, william ratcliff wrote:
> Doesn't cc have an attribution clause?

There are a few CC licenses, some of which do. But Alan was suggesting the
CC0, which does not -- it is essentially a legalese way to come as
close as you can to putting your work in the public domain.

And I think a good choice for a default for a snippets site.

-Chris


--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris....@noaa.gov

Robert Kern

Nov 1, 2010, 7:37:47 PM11/1/10
to SciPy Users List
On Mon, Nov 1, 2010 at 18:07, Christopher Barker <Chris....@noaa.gov> wrote:
> On 11/1/10 3:58 PM, Jochen Schröder wrote:
>> Even if the snippet is licensed BSD you cannot simply copy and paste a
>> code snippet. You have to include the license and copyright notice of
>> the original author.
>
> Exactly, which is why I think "snippets" are best put in the public
> domain. yes, I know that public domain is not a license, and is even a
> bit murky legally, but for small little chunks of code:
>
> "I'm putting this out there without claiming copyright -- do with it
> what you will"
>
> really is appropriate.

Unfortunately, only the "do with it what you will" has any legal
effect in most jurisdictions. The public domain isn't all that murky.
In many jurisdictions, it's very clear that you simply cannot do it.

Using Creative Commons' CC0 license, you can get most of the way
there, but that license is much longer than the BSD license. But that
doesn't necessarily matter much for this use case given the right
interface.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

Robert Kern

Nov 1, 2010, 7:47:40 PM11/1/10
to SciPy Users List

I think that restricting the license options on the site would only
give you a false sense of security. The number of screwups is likely
to be small in any case. And I would suggest that many of those
screwups would come from moving over GPLed code from other sources
rather than from other files on the site. I suspect people are more
interested in adding new stuff to the site rather than tweaking other
bits already there. I also think that when it does happen, the
consequences are not nearly as bad as you are making them out to be.
It's just not that hard to disentangle code of the size we are talking
about.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

william ratcliff

Nov 1, 2010, 8:29:25 PM11/1/10
to SciPy Users List
I think for now, let's try going with BSD or CC0 and allowing people to link to other code if they so desire (but not put it on the site).   Another question:

For usability, I really like how the stack overflow answers with the most votes appear at the top of the page.  However, over time, the previous top answer may become less relevant.  Should we have "aging" for scores?   Also, a friend suggested requiring doctests for the code (with the eventual long term goal of being able to run the doctests on a vm somewhere like EC2, but that would be a much longer term goal with the latest versions of scipy, numpy, and matplotlib).    Does it sound reasonable to require doctests or is that too high of a burden?
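
One way to picture the "aging" idea is an exponential half-life applied to the raw vote count. This is only a sketch: the function name `aged_score` and the 180-day half-life are assumptions made up for illustration, not something proposed in the thread.

```python
def aged_score(votes, age_days, half_life_days=180.0):
    """Return a vote count discounted by age (hypothetical scoring rule).

    The score halves every `half_life_days`. Both the rule and the default
    half-life are illustrative assumptions.
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    return votes * 0.5 ** (age_days / half_life_days)


# An old, popular snippet versus a newer one with fewer votes.
old = aged_score(votes=40, age_days=720)  # four half-lives have passed
new = aged_score(votes=5, age_days=0)     # brand new, no discount
print(old, new)
```

With a rule like this, a fresh snippet with a handful of votes can outrank an older one that collected many more votes long ago; how fast that happens depends entirely on the chosen half-life.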

William 

Vincent Davis

Nov 1, 2010, 8:31:45 PM11/1/10
to SciPy Users List
Would be nice if I for example could contribute a snippet and others could easily extend or make improvements without my involvement. I am not sure how to link the improvement to the original and track it. Maybe something where a kind of pull/update could be submitted. The issue I am thinking of is avoiding code that is not being maintained and becomes out of date (it was just a snippet after all), and also trying to avoid duplicate snippets with only small differences in functionality. I think the solution should function without the involvement of the original contributor.
Maybe it is possible to adapt something like stackexchange voting of question answers to voting for the best snippet to perform some function. For example maybe you are looking for a snippet that imports data from fasta to an array. A search for fasta to array may return several results and they could be ranked/sorted by votes. This is different from a simple per snippet rating as this would allow a ranking/sorting.
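
The search-then-rank idea could look roughly like the sketch below. The `Snippet` class, the `search` function, and the sample catalog are all hypothetical; matching is a naive substring test, chosen only to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    title: str
    description: str
    votes: int


def search(snippets, query):
    """Return snippets whose text contains every query word, most-voted first."""
    words = query.lower().split()
    hits = [
        s for s in snippets
        if all(w in (s.title + " " + s.description).lower() for w in words)
    ]
    return sorted(hits, key=lambda s: s.votes, reverse=True)


# Made-up catalog entries for illustration only.
catalog = [
    Snippet("fasta reader", "load fasta sequences into an array", 3),
    Snippet("fast fasta to array", "vectorised fasta parser returning an array", 11),
    Snippet("csv loader", "read csv into an array", 7),
]

results = search(catalog, "fasta array")
print([s.title for s in results])
```

The votes only order the snippets that match the query, which is the difference from a plain per-snippet rating: the csv loader has more votes than one of the fasta readers but never shows up for this search.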

I would prefer the discussion stay on the topic of central file exchange and not on licensing, but maybe this is more important to the planning than I realize.

I hope this post is comprehensible, I am currently experiencing many distractions and am hitting send without rereading.

Vincent
--
Thanks
Vincent Davis

John Hunter

Nov 1, 2010, 8:34:56 PM11/1/10
to SciPy Users List
On Mon, Nov 1, 2010 at 7:29 PM, william ratcliff
<william....@gmail.com> wrote:
> I think for now, let's try going with BSD or CC0 and allowing people to link
> to other code if they so desire (but not put it on the site).   Another
> question:
> For usability, I really like how the stack overflow answers with the most
> votes appear at the top of the page.  However, over time, the previous top
> answer may become less relevant.  Should we have "aging" for scores?   Also,
> a friend suggested requiring doctests for the code (with the eventual long
> term goal of being able to run the doctests on a vm somewhere like EC2, but
> that would be a much longer term goal with the latest versions of scipy,
> numpy, and matplotlib).    Does it sound reasonable to require doctests or
> is that too high of a burden?

Way too high. I may have some useful code lying around I would
upload if it were easy. I will certainly not do it if I have to
retroactively go add tests. I think we need the minimum barrier to
entry. Let the description, reviews, rankings and the end user decide
if the code is suitable for a purpose. With something like this the
key to use will be to get people to actually upload something. Once
you have critical mass, and if you have a problem with too many low
quality submissions, consider tightening the standards then.
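
For readers unfamiliar with doctests, the requirement being debated amounts to roughly the following. The `rescale` function is a made-up example; `doctest.testmod()` is the standard-library call that runs the examples embedded in docstrings.

```python
import doctest


def rescale(values):
    """Linearly rescale a sequence of numbers to the range [0, 1].

    Assumes the values are not all equal (otherwise this divides by zero).

    >>> rescale([0, 5, 10])
    [0.0, 0.5, 1.0]
    """
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    # Runs every example found in this module's docstrings.
    doctest.testmod()
```

Writing one such example is cheap for new code, but adding them after the fact to old scripts is the extra work objected to here.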

JDH

Matthew Brett

Nov 1, 2010, 9:33:31 PM11/1/10
to SciPy Users List
Hi,

> I think that restricting the license options on the site would only
> give you a false sense of security. The number of screwups is likely
> to be small in any case.

Matlab switched from pick-your-own to BSD for file-exchange and put
some effort into doing that. My guess is that they ran into these
problems, but there might be another explanation.

Best,

Matthew

Robert Kern

Nov 1, 2010, 9:46:56 PM11/1/10
to SciPy Users List
On Mon, Nov 1, 2010 at 20:33, Matthew Brett <matthe...@gmail.com> wrote:
> Hi,
>
>> I think that restricting the license options on the site would only
>> give you a false sense of security. The number of screwups is likely
>> to be small in any case.
>
> Matlab switched from pick-your-own to BSD for file-exchange and put
> some effort into doing that.  My guess is that they ran into these
> problems, but there might be another explanation.

There is a FAQ:

http://www.mathworks.com/matlabcentral/FX_transition_faq.html

"""
Why is only one license being considered?
When everyone uses the same license, it is a simple matter to re-use
and re-license code. If more than one license is used, re-releasing
the code under a different license raises potential conflicts in the
terms of use.
"""

It seems more like they just wanted simplicity and consistency. Those
are perfectly good reasons but quite distinct from the scenarios you
are contemplating.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

Matthew Brett

Nov 1, 2010, 10:00:01 PM11/1/10
to SciPy Users List
Hi,

> There is a FAQ:

Yes, that was the FAQ I was quoting from earlier.

> """
> Why is only one license being considered?
> When everyone uses the same license, it is a simple matter to re-use
> and re-license code. If more than one license is used, re-releasing
> the code under a different license raises potential conflicts in the
> terms of use.
> """
>
> It seems more like they just wanted simplicity and consistency. Those
> are perfectly good reasons but quite distinct from the scenarios you
> are contemplating.

I'm probably too jet-lagged to find the distinction very clear - but -
regardless of whether I was in fact talking about simplicity and
consistency, it seems wise to take note of what the Mathworks did, on
the basis that we like to learn from relevant experience where
possible.

Best,

Matthew

Robert Kern

Nov 1, 2010, 10:49:20 PM11/1/10
to SciPy Users List
On Mon, Nov 1, 2010 at 21:00, Matthew Brett <matthe...@gmail.com> wrote:
> Hi,
>
>> There is a FAQ:
>
> Yes, that was the FAQ I was quoting from earlier.
>
>> """
>> Why is only one license being considered?
>> When everyone uses the same license, it is a simple matter to re-use
>> and re-license code. If more than one license is used, re-releasing
>> the code under a different license raises potential conflicts in the
>> terms of use.
>> """
>>
>> It seems more like they just wanted simplicity and consistency. Those
>> are perfectly good reasons but quite distinct from the scenarios you
>> are contemplating.
>
> I'm probably too jet-lagged to find the distinction very clear - but -
> regardless of whether I was in fact talking about simplicity and
> consistency,

You were arguing that terrible things would happen if someone
accidentally relabeled GPL code, and specifically that the viral
aspects of the GPL would damage the utility of the site. That's
different from saying that having only one license would make things
easier.

> it seems wise to take note of what the Mathworks did, on
> the basis that we  like to learn from relevant experience where
> possible.

Countervailing that is the Python Cookbook, which used to default to
the Python license and moved to an explicit, enumerated set of
licenses accompanied by a strong recommendation for the MIT license.

I think this approach is the best one. Excluding GPL snippets doesn't
buy us much more simplicity in practice and tends to exclude
contributions from the mostly-GPL Sage community and others.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

Robert Kern

Nov 1, 2010, 10:51:14 PM11/1/10
to SciPy Users List
On Mon, Nov 1, 2010 at 19:31, Vincent Davis <vin...@vincentdavis.net> wrote:

> I would prefer the discussion stay on the topic of central file exchange and
> not on licensing, but maybe this is more important to the planning than I
> realize.

The license flamewar is the eigenstate of all threads concerning open
source software distribution.

josef...@gmail.com

Nov 1, 2010, 11:21:29 PM11/1/10
to SciPy Users List
On Mon, Nov 1, 2010 at 10:51 PM, Robert Kern <rober...@gmail.com> wrote:
> On Mon, Nov 1, 2010 at 19:31, Vincent Davis <vin...@vincentdavis.net> wrote:
>
>> I would prefer the discussion stay on the topic of central file exchange and
>> not on licensing, but maybe this is more important to the planning than I
>> realize.
>
> The license flamewar is the eigenstate of all threads concerning open
> source software distribution.

Which is still much better than specifying no license. As I discovered
again, license statements are very scarce when you look at
econometrics or signal processing code published on the web (outside
of the matlab fileexchange !).

Josef

william ratcliff

Nov 2, 2010, 5:23:41 PM11/2/10
to SciPy Users List
I've been thinking about this a bit more, so hopefully two last questions before starting the actual prototype:

1)  There are a number of open-id packages which integrate with Django.  Has anyone used one and if so are there any preferences?  I assume that we don't need to support twitter, openauth, facebook, etc.

2)  I took a look at gist and it's really interesting.  If we run git on the server, then we can also make a repository on the server.   The interesting challenge is how to pick out a particular revision of a given repository.  For example, suppose somebody posts an svm code snippet and we have a series of edits and forks like:

A->B->C->D->E
        \
         F->G->H->I
                \
                  J->K

Is there an easy way to get to say K, or H directly through command line?  I imagine that every post can be treated then as a git repository (as with gist).  Comments will be attached to a given step (say for example J) and in our django database, we will store the repository and the identifier (for example J) to display that currently has the most votes for relevance.   
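
On the command-line question: every git commit has its own identifier, so `git checkout <rev>` checks out a given revision and `git show <rev>:<path>` prints one file as of that revision, with no need to walk through the intermediate edits. The sketch below models the diagram as a parent map and builds that command without running it. The branch points (F from C, J from H) are my reading of the ASCII diagram, and the single-letter names and `snippet.py` are placeholders; in a real repository the revision would be a commit hash, branch, or tag.

```python
# Parent of each revision, following one reading of the diagram above.
PARENTS = {
    "A": None, "B": "A", "C": "B", "D": "C", "E": "D",
    "F": "C", "G": "F", "H": "G", "I": "H",
    "J": "H", "K": "J",
}


def history(rev):
    """Walk from `rev` back to the root, newest first."""
    chain = []
    while rev is not None:
        chain.append(rev)
        rev = PARENTS[rev]
    return chain


def git_show_args(rev, path):
    """Build the argument list for `git show <rev>:<path>` (not executed here)."""
    return ["git", "show", f"{rev}:{path}"]


print(history("K"))
print(git_show_args("K", "snippet.py"))
```

Storing just the repository and the revision identifier in the Django database, as suggested above, is therefore enough to retrieve the exact file that was voted on.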

So, the workflow would be that someone would submit a code snippet (A). Anyone who wants to can edit the code snippet, which will create a new view (B) with its own comments. If someone thinks they want to work on something that's related, but a bit further afield, they can fork (F). Because the code can change each time, comments will follow the particular instance of the code. The score on forking or editing will decrease by some fraction from the original score. That way, it will still pop up in searches, but if it's deemed more relevant, it will be the entry point that people will see first. There will be an interface so people can either go forward or backward along the commits, or can explore the branches. I don't want to deal with displaying the whole structure initially. I also don't want to deal with merging.
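
A minimal sketch of that workflow, assuming a hypothetical `Revision` record: comments live on one revision only, and an edit or fork starts with a fraction of its parent's score. The `keep` fractions are arbitrary placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Revision:
    name: str
    score: float
    parent: Optional["Revision"] = None
    comments: List[str] = field(default_factory=list)

    def derive(self, name, keep=0.5):
        """Create an edit or fork that starts with a fraction of this score.

        Comments are not copied: they stay with the revision they were
        written about.
        """
        return Revision(name=name, score=self.score * keep, parent=self)


a = Revision("A", score=8.0)
a.comments.append("works for me")

b = a.derive("B")             # an edit of A
f = a.derive("F", keep=0.25)  # a fork that strays further from A

print(b.score, f.score, b.comments)
```

Because each derived revision keeps a link to its parent, stepping backward along the commits is a matter of following `parent`; stepping forward or exploring branches would need a reverse lookup that this sketch leaves out.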

Since we're primarily interested in code snippets, then only one file will be initially supported.  

Does this seem too complicated?

Thanks,
William

Bastian Weber

Nov 28, 2010, 4:10:31 PM11/28/10
to SciPy Users List
william ratcliff wrote:
> I've been thinking about this a bit more, so hopefully two last
> questions before starting the actual prototype:
>
> ...

When the thread about a Central File Exchange counterpart for SciPy started
a month ago I was really excited. Unfortunately, after a few days of a
highly active discussion the community dropped the topic abruptly and
without any result.

Have I missed something? Is there a prototype already running? Or did
the community quietly come to the consensus that such a project would
not be worth the effort?

From my point of view such a project would be extremely useful. I
definitely would donate some money for development and maintenance if
that would be the bottleneck. And I have the feeling, that many other
users would do the same, given the fact that such a platform could save
many hours of reinventing the wheel.

Currently I have at least three little "projects" lying around
somewhere on my disk. All of them are far too small/immature for pypi
and friends. On the other hand, I think they are more than a mere
"recipe". I'm almost sure that somebody has to solve quite similar
problems and that the code, if not working out of the box, would be a
good starting point.

The only thing is: where to put it, such that it could be found?


> Since we're primarily interested in code snippets, then only one
> file will be initially supported.

I believe this is quite reasonable to get started. However I think it
might be good to support more files at some point. Then someone could
easily provide a module and additionally some example scripts.


IMHO a sophisticated score system, git-integration, different views and
other complex features might be useful if the platform once has a
certain user base. For a first (public) prototype, however, they might
be ballast.

What to me seems more important, is a high resolution tagging system. I
could even imagine different types of tags. E.g. problem specific (like
structural mechanics, molecular biology, control theory, ...) and tags
regarding the math involved (linear algebra, symbolic computation,
optimization, ...). In fact, the examples I gave are too rough to meet
my idea of "high resolution" but better ones just do not occur to me
right now.
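
To make the two-kinds-of-tags idea concrete, here is a small sketch in which each snippet carries a set of problem-domain tags and a set of math tags, and a lookup can filter on either or both. The tag values come from the examples above; the snippet names and the `find` function are invented for illustration.

```python
# Hypothetical catalog: each entry has "domain" tags and "math" tags.
SNIPPETS = [
    {"name": "beam_modes", "domain": {"structural mechanics"}, "math": {"linear algebra"}},
    {"name": "pid_tuning", "domain": {"control theory"}, "math": {"optimization"}},
    {"name": "truss_sizing", "domain": {"structural mechanics"}, "math": {"optimization"}},
]


def find(snippets, domain=None, math=None):
    """Return names of snippets carrying the requested tag of each kind."""
    names = []
    for s in snippets:
        if domain is not None and domain not in s["domain"]:
            continue
        if math is not None and math not in s["math"]:
            continue
        names.append(s["name"])
    return names


print(find(SNIPPETS, domain="structural mechanics"))
print(find(SNIPPETS, domain="structural mechanics", math="optimization"))
```

Keeping the two tag types separate lets someone narrow first by field and then by technique, which a single flat tag list makes harder.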

To conclude: I really look forward to this platform.


Regards,
Bastian.

Matthew Brett

Nov 28, 2010, 6:59:28 PM11/28/10
to SciPy Users List
Hi,

On Sun, Nov 28, 2010 at 1:10 PM, Bastian Weber
<bastia...@gmx-topmail.de> wrote:
> william ratcliff wrote:
>> I've been thinking about this a bit more, so hopefully two last
>> questions before starting the actual prototype:
>>
>> ...
>
> When the thread about a Central File Exchange counterpart for SciPy started
> a month ago I was really excited. Unfortunately, after a few days of a
> highly active discussion the community dropped the topic abruptly and
> without any result.

My guess is that the licensing discussion contributed to that - it got
a bit tense and wasn't very enjoyable.

Also, no-one replied to William's last post...

William - did you get anywhere with this? Is there any help we can
offer? Is it still good to reply to your last post, to help get the
discussion going again?

Thanks for waking this one,

Matthew

David

Nov 28, 2010, 7:52:35 PM11/28/10
to scipy...@scipy.org
On 11/29/2010 06:10 AM, Bastian Weber wrote:
> william ratcliff wrote:
>> I've been thinking about this a bit more, so hopefully two last
>> questions before starting the actual prototype:
>>
>> ...
>
> When the thread about a Central File Exchange counterpart for SciPy started
> a month ago I was really excited. Unfortunately, after a few days of a
> highly active discussion the community dropped the topic abruptly and
> without any result.
>
> Have I missed something? Is there a prototype already running? Or did
> the community quietly come to the consensus that such a project would
> not be worth the effort?

More likely, people who really want this should start working on this,
that's the only way it is going to happen.

I am sure someone could get something working fast using GAE if motivated,

cheers,

David

william ratcliff

Nov 28, 2010, 8:47:31 PM11/28/10
to SciPy Users List

Andrew Wilson and I have started working on this.  But I've been a bit distracted by conferences...:(

Christopher Barker

Nov 29, 2010, 12:22:49 PM11/29/10
to SciPy Users List
On 11/28/10 3:59 PM, Matthew Brett wrote:
> My guess is that the licensing discussion contributed to that - it got
> a bit tense and wasn't very enjoyable.

you'd think people enjoy the licensing debates -- there sure are a lot
of them!

Anyway, I _think_ we sort-of converged on using the Creative Commons CC0
(more-or-less public domain):

http://creativecommons.org/choose/zero/

Though I think it's open as to whether to optionally allow contributors
to choose another license.

I say whoever builds it can decide how they want to do it.

On 11/28/10 5:47 PM, william ratcliff wrote:
> Andrew Wilson and I have started working on this.

Wonderful news!

-Chris


--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris....@noaa.gov

Matthew Brett

Nov 29, 2010, 5:15:05 PM11/29/10
to SciPy Users List
Hi,

On Mon, Nov 29, 2010 at 9:22 AM, Christopher Barker
<Chris....@noaa.gov> wrote:
> On 11/28/10 3:59 PM, Matthew Brett wrote:
>> My guess is that the licensing discussion contributed to that - it got
>> a bit tense and wasn't very enjoyable.
>
> you'd think people enjoy the licensing debates -- there sure are a lot
> of them!

It's a mystery of human behavior that we are sometimes drawn to
pointless dispute :) [1]

> Anyway, I _think_ we sort-of converged on using the Creative Commons CC0
> (more-or-less public domain):
>
> http://creativecommons.org/choose/zero/
>
> Though I think it's open as to whether to optionally allow contributors
> to choose another license.

I think we went off track round about the time I half-jokingly
suggested red bars either side of a GPL snippet. So, at the risk of
exciting further heat, and in the interests of peace and good will,
would this be a reasonable summary?:

Default is some very permissive thing such as CC0 or BSD or MIT
Other options allowed, list as for Python cookbook
The person who builds the thing can choose freely whether to put a
subtle or unsubtle or no warning on GPL snippets.
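
As a rough sketch, that summary maps onto a submission form like this. The license list is a placeholder (the actual Python Cookbook list is not reproduced here), BSD is picked as the default only because it is one of the permissive options named, and `resolve_license` is a hypothetical helper.

```python
DEFAULT_LICENSE = "BSD"                 # assumed default; CC0 or MIT would fit too
ALLOWED = ("CC0", "BSD", "MIT", "GPL")  # placeholder list, not the Cookbook's
COPYLEFT = {"GPL"}


def resolve_license(choice=None, warn_on_copyleft=True):
    """Return (license_name, show_warning) for a submitted snippet.

    `warn_on_copyleft` stands in for the site builder's decision on whether
    GPL snippets get any warning at all.
    """
    name = DEFAULT_LICENSE if choice is None else choice
    if name not in ALLOWED:
        raise ValueError(f"unsupported license: {name}")
    return name, warn_on_copyleft and name in COPYLEFT


print(resolve_license())
print(resolve_license("GPL"))
print(resolve_license("GPL", warn_on_copyleft=False))
```

Leaving the warning as a switch keeps the third point open, so whoever builds the site can turn it off without touching the license list.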

See you,

Matthew

[1] http://blog.stackoverflow.com/2010/09/fork-it/

william ratcliff

Nov 29, 2010, 5:26:52 PM11/29/10
to SciPy Users List
I'll put it as BSD.  That way, it will be consistent with the scipy license.  We will have a git repository with the code for the site, which will also be BSD.  Anyone who wants to fork it can....Once we throw up a simple demo, then we'll ask for feedback :>  Version 1 will essentially be a clone of the django snippets web site.   We can customize it later....


William