[SciPy-User] Central File Exchange for Scipy


Jason Grout

Apr 18, 2011, 11:40:01 PM
to SciPy Users List

I've been funded over the summer by an NSF grant to build a library of
Sage "interacts" (basically small snippets of Sage/Python code).
Fernando Perez strongly encouraged me at a recent Sage Days to adopt a
version-controlled snippet model (à la Gist). The other night I threw
together a very rough proof-of-concept, just-barely-working initial
start of something along these lines (I hope I put enough disclaimers in
there!) My code is here:

https://github.com/jasongrout/snippets

It requires Flask and Mercurial to be installed. Yes, the irony of using
a Mercurial backend in a project on github is not lost on me. Though I
prefer git, the python API to use and create mercurial repositories was
too nice to turn down in this prototyping stage. I tried to encapsulate
the VCS commands in a wrapper class so that the backend could be easily
switched to git if need be.
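
A minimal sketch of that wrapper idea, assuming a hypothetical `SnippetBackend` interface (the class and method names are illustrative and are not taken from the snippets repository). The in-memory backend only stands in for Mercurial or git so that the example runs on its own:

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class SnippetBackend(ABC):
    """Hypothetical interface that the web code would talk to."""

    @abstractmethod
    def commit(self, snippet_id: str, content: str, message: str) -> int:
        """Store a new revision and return its revision number."""

    @abstractmethod
    def read(self, snippet_id: str, revision: int = -1) -> str:
        """Return the content at a revision (-1 means the latest)."""

    @abstractmethod
    def log(self, snippet_id: str) -> List[str]:
        """Return commit messages, oldest first."""


class InMemoryBackend(SnippetBackend):
    """Stand-in backend; an hg or git backend would implement the same methods."""

    def __init__(self) -> None:
        # snippet id -> list of (content, message), oldest first
        self._history: Dict[str, List[Tuple[str, str]]] = {}

    def commit(self, snippet_id: str, content: str, message: str) -> int:
        revisions = self._history.setdefault(snippet_id, [])
        revisions.append((content, message))
        return len(revisions) - 1

    def read(self, snippet_id: str, revision: int = -1) -> str:
        return self._history[snippet_id][revision][0]

    def log(self, snippet_id: str) -> List[str]:
        return [message for _, message in self._history[snippet_id]]


# The rest of the application only sees the abstract interface.
backend: SnippetBackend = InMemoryBackend()
backend.commit("plot-demo", "print('v1')", "initial version")
backend.commit("plot-demo", "print('v2')", "tweak output")
```

Switching the VCS would then mean writing another subclass and changing the one line that constructs `backend`.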

This reminded me of the Central File Exchange thread from last November,
and in particular, several people saying that they were working on a
version-controlled snippet database [1]. William or Andrew, have you
posted your work anywhere? I think we have very similar goals.

I won't be able to work on this heavily for about a month (until after
the semester), but I will be hitting it pretty hard during the summer.
The hope is to have a good production version by July.

If anyone else is working on a related project, please let me know, as
we can probably collaborate. If anyone wants to fork the github repo
above and work on it, feel free!

To prevent license discussions from eating up too much energy/time, I
have decided that the site that I set up will have all snippets be
(modified) BSD licensed. That may change, of course, before it's
actually implemented, but it's not up for debate now. To encourage
collaboration with you folks, I'm willing to put a BSD license on the
codebase as well if possible, though the Mercurial docs page seems to
indicate that I'm forced to use GPLv2 (+?) if I use their API, which I
am at this point [2].

Thanks,

Jason

[1] http://mail.scipy.org/pipermail/scipy-user/2010-November/027690.html

[2] http://mercurial.selenic.com/wiki/MercurialApi

--
Jason Grout

_______________________________________________
SciPy-User mailing list
SciPy...@scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-user

Joshua Holbrook

Apr 19, 2011, 12:54:22 AM
to SciPy Users List
On Mon, Apr 18, 2011 at 7:40 PM, Jason Grout
<jason...@creativetrax.com> wrote:
>
> I've been funded over the summer by an NSF grant to build a library of
> Sage "interacts" (basically small snippets of Sage/Python code).
> Fernando Perez strongly encouraged me at a recent Sage Days to adopt a
> version controlled snippet model (ala Gist).  The other night I threw
> together a very rough proof-of-concept, just-barely-working initial
> start of something along these lines (I hope I put enough disclaimers in
> there!)  My code is here:
>
> https://github.com/jasongrout/snippets

This sounds awesome! *watched*

> It requires Flask and Mercurial to be installed. Yes, the irony of using
> a Mercurial backend in a project on github is not lost on me.  Though I
> prefer git, the python API to use and create mercurial repositories was
> too nice to turn down in this prototyping stage.  I tried to encapsulate
> the VCS commands in a wrapper class so that the backend could be easily
> switched to git if need be.

I don't think there's anything wrong with using mercurial here. If you
want to support git pulls (though a lot of pythonistas prefer hg it
seems the scientific community's leaning towards git), you could
always make a system to convert repos to git on-the-fly, maybe using
something like http://offbytwo.com/git-hg/ . In fact, I think I would
do something like that as long as it wasn't too slow.

Or, y'know, rewrite the VCS commands. :) Either way.

> This reminded me of the Central File Exchange thread from last November,
> and in particular, several people saying that they were working on a
> version-controlled snippet database [1].  William or Andrew, have you
> posted your work anywhere?  I think we have very similar goals.

Yeah, it sounds like you're doing something really similar, though
with a Sage focus. How well do these Sage interacts things work with
more vanilla python? I for one don't really use Sage, so if it was
ridiculously different I don't know how much utility the SagExchange
would be for me.

> To prevent license discussions from eating up too much energy/time, I
> have decided that the site that I set up will have all snippets be
> (modified) BSD licensed.

Word. I approve. I hope good things come of this. I'm mostly working
with javascript these days, but if I have time I might take a look or
two.

--Josh

Jason Grout

Apr 19, 2011, 1:14:37 AM
to scipy...@scipy.org
On 4/18/11 11:54 PM, Joshua Holbrook wrote:

>
> Yeah, it sounds like you're doing something really similar, though
> with a Sage focus. How well do these Sage interacts things work with
> more vanilla python? I for one don't really use Sage, so if it was
> ridiculously different I don't know how much utility the SagExchange
> would be for me.

The site would be simply a snippet site, geared towards python.
Additionally, for each snippet, it would be able to send the snippet to
a server to execute the snippet and display the results, like the
htmlnotebook project in the IPython trunk. Except we'd probably also
use another project we're currently working on, a "single-cell Sage
Notebook server": https://github.com/jasongrout/simple-python-db-compute
(see http://wiki.sagemath.org/DrakeSageGroup for rough notes of our
progress and todo list). The "interacts" part is just some client-side
javascript that makes sliders, textboxes, etc.

In a sense, the Sage interact site would be like a combination of Gist,
Wolfram Demonstrations, Pastebin, Central File Exchange, etc. For your
purposes, you could ignore the "interact" part and just think of it as a
version-controlled snippet site that also lets you execute the snippets
on a remote server.
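
A rough sketch of the "send the snippet to a server and display the results" step, assuming a hypothetical `run_snippet` helper that runs the code in a fresh Python process. This is not how the single-cell server works; it only illustrates capturing a snippet's output, and a timeout on its own is not a sandbox:

```python
import subprocess
import sys


def run_snippet(code: str, timeout: float = 5.0) -> dict:
    """Run `code` in a separate interpreter and capture what it prints.

    Hypothetical helper for illustration only: untrusted code would need
    real isolation, which this does not provide.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "stderr": "timed out"}
    return {
        "ok": proc.returncode == 0,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


result = run_snippet("print(sum(range(10)))")
failed = run_snippet("raise ValueError('boom')")
```

The returned dictionary is the kind of payload a web view could hand back to the browser for display.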

>
>> To prevent license discussions from eating up too much energy/time, I
>> have decided that the site that I set up will have all snippets be
>> (modified) BSD licensed.
>
> Word. I approve. I hope good things come of this. I'm mostly working
> with javascript these days, but if I have time I might take a look or
> two.

We could definitely use some Javascript expertise, especially when we
start working on integrating the single-cell server mentioned above with
the snippet site.

Thanks,

Jason

Gael Varoquaux

Apr 19, 2011, 1:22:52 AM
to SciPy Users List
On Mon, Apr 18, 2011 at 10:40:01PM -0500, Jason Grout wrote:
> I've been funded over the summer by an NSF grant to build a library of
> Sage "interacts" (basically small snippets of Sage/Python code).
> Fernando Perez strongly encouraged me at a recent Sage Days to adopt a
> version controlled snippet model (ala Gist). The other night I threw
> together a very rough proof-of-concept, just-barely-working initial
> start of something along these lines (I hope I put enough disclaimers in
> there!) My code is here:

> https://github.com/jasongrout/snippets

This is great. Congratulations for doing this. And congratulations for
doing it in such a way that it appeals to the scipy community in addition
to the sage community!

Gael

Joshua Holbrook

Apr 19, 2011, 2:30:48 AM
to SciPy Users List
On Mon, Apr 18, 2011 at 9:14 PM, Jason Grout
<jason...@creativetrax.com> wrote:

> The site would be simply a snippet site, geared towards python.
> Additionally, for each snippet, it would be able to send the snippet to
> a server to execute the snippet and display the results, like the
> htmlnotebook project in the IPython trunk.  Except we'd probably also
> use another project we're currently working on, a "single-cell Sage
> Notebook server": https://github.com/jasongrout/simple-python-db-compute
> (see http://wiki.sagemath.org/DrakeSageGroup for rough notes of our
> progress and todo list).  The "interacts" part just is some client-side
> javascript that makes sliders, textboxes, etc.

This sounds pretty sweet. What do you mean by "single-cell?" Looking
forward to seeing how it goes! Either way, I'm watching the compute
server project now too.

>
> We could definitely use some Javascript expertise, especially when we
> start working on integrating the single-cell server mentioned above with
> the snippet site.
>

I'd hardly call myself a javascript expert XD but feel free to hit me
up if/when you're ready. You might find it easiest to get ahold of me
in the #scipy irc channel (irc://irc.freenode.net/#scipy) as
"jesusabdullah."

Good luck!

--Josh

Pauli Virtanen

Apr 19, 2011, 4:48:17 AM
to scipy...@scipy.org
Mon, 18 Apr 2011 22:40:01 -0500, Jason Grout wrote:
[clip]

> I won't be able to work on this heavily for about a month (until after
> the semester), but I will be hitting it pretty hard during the summer.
> The hope is to have a good production version by July.
>
> If anyone else is working on a related project, please let me know, as
> we can probably collaborate. If anyone wants to fork the github repo
> above and work on it, feel free!

I have something here:

https://github.com/pv/scipyshare

Version controlled, yes, but on the file system, and not only for
snippets but also for slightly larger projects + reference links to
projects on PyPi and elsewhere.

There's no great UI yet, but the basic functionality is there. The
community functionality is not fully there yet, though.

Pauli

Pauli Virtanen

Apr 19, 2011, 4:51:25 AM
to scipy...@scipy.org
Tue, 19 Apr 2011 08:48:17 +0000, Pauli Virtanen wrote:
[clip]

> There's no great UI yet, but the basic functionality is there.

Should have said that there's essentially *no* real UI yet, it's
basically just dumping data and forms to HTML; but real UI should be easy
to add once the rest of the things work.

william ratcliff

Apr 19, 2011, 9:11:40 AM
to SciPy Users List
Andrew Wilson and I were also playing with this--one thing I was looking into before getting waylaid by work was using S3 as a backing store for git/hg.


William

Alan G Isaac

Apr 19, 2011, 9:35:16 AM
to SciPy Users List
On 4/18/2011 11:40 PM, Jason Grout wrote:
> though the Mercurial docs page seems to
> indicate that I'm forced to use GPLv2 (+?) if I use their API


That's only if you use their *internal* API,
which is discouraged.

fwiw,
Alan Isaac

Stephen Waterbury

Apr 19, 2011, 10:55:22 AM
to SciPy Users List
I've been reading the traits modules has_traits.py and protocols.py
and trying to figure out what I'd use to create Interface classes
at runtime, but my brain is beginning to melt so thought I'd ask for
some advice. I have code that creates zope.interface-style Interface
classes at runtime by instantiating InterfaceClass, but PyProtocols,
which is used by traits, has a significantly different set of apis
and the code is somewhat more difficult to read than zope.interface
code, so it's not obvious how to create traits-style Interface classes
at runtime ... any hints would be appreciated!

Thanks,
Steve

Robert Kern

Apr 19, 2011, 11:07:24 AM
to SciPy Users List
On Tue, Apr 19, 2011 at 09:55, Stephen Waterbury
<wate...@pangalactic.us> wrote:
> I've been reading the traits modules has_traits.py and protocols.py
> and trying to figure out what I'd use to create Interface classes
> at runtime, but my brain is beginning to melt so thought I'd ask for
> some advice.

I can answer your question over on enthought-dev:

https://mail.enthought.com/mailman/listinfo/enthought-dev

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

Jason Grout

Apr 19, 2011, 12:27:52 PM
to scipy...@scipy.org
On 4/19/11 8:35 AM, Alan G Isaac wrote:
> On 4/18/2011 11:40 PM, Jason Grout wrote:
>> though the Mercurial docs page seems to
>> indicate that I'm forced to use GPLv2 (+?) if I use their API
>
>
> That's only if you use their *internal* API,
> which is discouraged.


Good point. I started just using the external API (i.e., the commands
module), but that required that I always work through the repository on
disk (locking it and then writing files to disk, then committing from
the disk contents), parse the output of log, etc. It's *much* more
convenient to be able to commit things from memory, be able to directly
traverse the commit tree, etc. So yes, it's a matter of convenience
now, but I also suspect that it is a matter of speed when in production.
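
To illustrate the "parse the output of log" step, here is a sketch that drives the `hg` command-line client rather than any Python-level Mercurial module. The helper names and the tab-separated template are assumptions made for this example, and the sample text is made up:

```python
import subprocess
from typing import Dict, List

# Tab-separated fields, one changeset per line. Mercurial itself expands
# the \t and \n escapes, hence the raw string.
LOG_TEMPLATE = r"{node|short}\t{author|user}\t{desc|firstline}\n"


def hg_log(repo_path: str) -> str:
    """Return raw log text from the `hg` client (which must be installed)."""
    completed = subprocess.run(
        ["hg", "log", "-R", repo_path, "--template", LOG_TEMPLATE],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


def parse_log(text: str) -> List[Dict[str, str]]:
    """Turn the templated log text into a list of dictionaries."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        node, author, summary = line.split("\t", 2)
        entries.append({"node": node, "author": author, "summary": summary})
    return entries


# Made-up sample in the same shape as the template output, newest first.
sample = (
    "1f0dee641bb7\tjason\ttweak output\n"
    "9a2c1b0e4d55\tjason\tinitial version\n"
)
parsed = parse_log(sample)
```

Every read goes through a subprocess and the on-disk repository here, which is the overhead described above.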

Jason

Jason Grout

Apr 19, 2011, 1:48:10 PM
to scipy...@scipy.org
On 4/19/11 3:48 AM, Pauli Virtanen wrote:
> Mon, 18 Apr 2011 22:40:01 -0500, Jason Grout wrote:
> [clip]
>> I won't be able to work on this heavily for about a month (until after
>> the semester), but I will be hitting it pretty hard during the summer.
>> The hope is to have a good production version by July.
>>
>> If anyone else is working on a related project, please let me know, as
>> we can probably collaborate. If anyone wants to fork the github repo
>> above and work on it, feel free!
>
> I have something here:
>
> https://github.com/pv/scipyshare
>
> Version controlled, yes, but on the file system, and not only for
> snippets but also for slightly larger projects + reference links to
> projects on PyPi and elsewhere.

Nice! I've never done anything with Django, so I added instructions to
the README for people like me and submitted a pull request.

I'll be looking at this more closely over the next month. We certainly
have the same types of goals, but I don't have experience with Django,
so that's a bit of a hurdle.

Jason

josef...@gmail.com

Apr 19, 2011, 2:03:36 PM
to SciPy Users List
On Tue, Apr 19, 2011 at 12:27 PM, Jason Grout
<jason...@creativetrax.com> wrote:
> On 4/19/11 8:35 AM, Alan G Isaac wrote:
>> On 4/18/2011 11:40 PM, Jason Grout wrote:
>>> though the Mercurial docs page seems to
>>> indicate that I'm forced to use GPLv2 (+?) if I use their API
>>
>>
>> That's only if you use their *internal* API,
>> which is discouraged.
>
>
> Good point.  I started just using the external API (i.e., the commands
> module), but that required that I always work through the repository on
> disk (locking it and then writing files to disk, then committing from
> the disk contents), parse the output of log, etc.  It's *much* more
> convenient to be able to commit things from memory, be able to directly
> traverse the commit tree, etc.  So yes, it's a matter of convenience
> now, but I also suspect that it is a matter of speed when in production.

there is a GSoC proposal on this issue

http://mercurial.selenic.com/wiki/SummerOfCode/2011/IdanKamara

So maybe this will improve in future.

Josef

Jason Grout

Apr 19, 2011, 2:25:05 PM
to scipy...@scipy.org
On 4/19/11 12:48 PM, Jason Grout wrote:
> but I don't have experience with Django,
> so that's a bit of a hurdle.

Do you have any resources that you recommend for learning Django other
than the standard Django documentation [1]? I see the Django Book [2],
for example, and it appears pretty well written, but is for Django 1.0,
so I don't know how much still applies.

Also, it would be very helpful in browsing the codebase if there was a
short description of each app in the DESIGN.rst and a short blurb about
how it fit in with the rest of the website. I was a bit lost on parts
of the app. I didn't figure out how an actual revision of a file was
stored or retrieved, for example. Is it stored using a DVCS, or is each
version stored separately?

Thanks,

Jason

[1] http://docs.djangoproject.com/en/1.3/

[2] http://www.djangobook.com/

Pauli Virtanen

Apr 19, 2011, 2:35:00 PM
to scipy...@scipy.org
On Tue, 19 Apr 2011 12:48:10 -0500, Jason Grout wrote:
[clip]
> Nice! I've never done anything with Django, so I added instructions to
> the README for people like me and submitted a pull request.

Thanks, applied.

> I'll be looking at this more closely over the next month. We certainly
> have the same types of goals, but I don't have experience with Django,
> so that's a bit of a hurdle.

The basic urls -> view (<-> model) -> template CRUD flow is pretty
straightforward in Django, and I suppose that's mostly all that is needed
for this particular project.

The project layout maybe looks a bit framework-y, but it's basically just
a Python package with all the urls.py hooking it into the rest of Django.
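
The shape of that urls -> view -> model -> template flow can be sketched in plain Python. This is deliberately not Django code: the dictionary, `string.Template`, and the `dispatch` function are stand-ins chosen so the example runs without any framework, and all names are hypothetical:

```python
import re
from string import Template

# "Model": in-memory data standing in for a database table.
ENTRIES = {1: {"title": "FFT recipe", "body": "Use numpy.fft.fft."}}

# "Template": string.Template standing in for a real template language.
DETAIL_TEMPLATE = Template("<h1>$title</h1><p>$body</p>")


def entry_detail(entry_id: str) -> str:
    """View: look up the model data and render it through the template."""
    entry = ENTRIES[int(entry_id)]
    return DETAIL_TEMPLATE.substitute(entry)


# "urls": map URL patterns to view functions.
URLPATTERNS = [(re.compile(r"^/entry/(\d+)/$"), entry_detail)]


def dispatch(path: str) -> str:
    """Find the first pattern matching `path` and call its view."""
    for pattern, view in URLPATTERNS:
        match = pattern.match(path)
        if match:
            return view(*match.groups())
    return "404 Not Found"


page = dispatch("/entry/1/")
missing = dispatch("/nope/")
```

In Django the same three pieces live in `urls.py`, a views module, and a template file, with the ORM in place of the dictionary.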

Pauli

Pauli Virtanen

Apr 19, 2011, 3:00:24 PM
to scipy...@scipy.org
On Tue, 19 Apr 2011 13:25:05 -0500, Jason Grout wrote:
> On 4/19/11 12:48 PM, Jason Grout wrote:
>> but I don't have experience with Django, so that's a bit of a hurdle.
>
> Do you have any resources that you recommend for learning Django other
> than the standard Django documentation [1]? I see the Django Book [2],
> for example, and it appears pretty well written, but is for Django 1.0,
> so I don't know how much still applies.

Django is pretty slow-moving, and most of the things in the book probably
still apply.

Personally, I'd start like this:

- Go through the Django tutorial:

http://docs.djangoproject.com/en/1.3/intro/tutorial01/

(The admin interface is, in my opinion, not so important; it
makes things simpler not to think too deeply about it.)

- Browse through interesting sections of the main docs;
e.g. database query, template syntax docs:

http://docs.djangoproject.com/en/1.3/topics/db/queries/

http://docs.djangoproject.com/en/1.3/topics/templates/

> Also, it would be very helpful in browsing the codebase if there was a
> short description of each app in the DESIGN.rst and a short blurb about
> how it fit in with the rest of the website. I was a bit lost on parts
> of the app.

Things were a bit in motion, so yes, it's not terribly well documented at
the moment.

Anyway, the basic subpackages of `scipyshare` are at the moment like so:

catalog -- main application/snippet catalog data
community -- user-assigned tags, ratings, etc.
filestorage -- managing storing sets of files on the FS
front -- just some dummy front page
importing -- dealing with importing from PyPi
user -- user login/logout, profile, etc.

The aim would be to keep these parts knowing only little about each
other.

The deploy/ directory basically only contains the configuration files.

> I didn't figure out how an actual revision of a file was
> stored or retrieved, for example. Is it stored using a DVCS, or is each
> version stored separately?

Stored separately on the filesystem.

For each revision, you get a catalog.models.Revision object in a
catalog.models.Entry. The set of files associated with each Revision is
managed by filestorage.models.FileSet.

Since a DVCS would not be exposed to the outside of the application, I
decided against using one internally. A better storage system can be
swapped in later on, if such a thing turns out to be necessary (which I
doubt a bit -- for example the MoinMoin wiki works fine with a filesystem
based wiki page storage).
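
As an illustration of storing each revision separately on the filesystem, here is a small sketch. The `FileSystemStore` class and its directory layout are assumptions made for this example; they are not the actual `catalog.models.Revision` or `filestorage.models.FileSet` code:

```python
import tempfile
from pathlib import Path
from typing import Dict


class FileSystemStore:
    """Hypothetical layout: <root>/<entry>/<revision number>/<file name>."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)

    def add_revision(self, entry: str, files: Dict[str, str]) -> int:
        """Write a complete copy of `files` as the next revision."""
        entry_dir = self.root / entry
        entry_dir.mkdir(parents=True, exist_ok=True)
        # Revisions are numbered 1, 2, 3, ... in creation order.
        revision = len([p for p in entry_dir.iterdir() if p.is_dir()]) + 1
        rev_dir = entry_dir / str(revision)
        rev_dir.mkdir()
        for name, content in files.items():
            (rev_dir / name).write_text(content)
        return revision

    def read(self, entry: str, revision: int, name: str) -> str:
        """Read one file from one stored revision."""
        return (self.root / entry / str(revision) / name).read_text()


with tempfile.TemporaryDirectory() as tmp:
    store = FileSystemStore(Path(tmp))
    first = store.add_revision("fft-recipe", {"recipe.py": "print('v1')\n"})
    second = store.add_revision("fft-recipe", {"recipe.py": "print('v2')\n"})
    old_text = store.read("fft-recipe", first, "recipe.py")
    new_text = store.read("fft-recipe", second, "recipe.py")
```

Each revision is a full copy, so nothing is shared between versions; that trades disk space for simplicity.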

Pauli

Bruce Southey

Apr 19, 2011, 3:01:54 PM
to scipy...@scipy.org
On 04/19/2011 02:00 PM, Pauli Virtanen wrote:
> On Tue, 19 Apr 2011 13:25:05 -0500, Jason Grout wrote:
>> On 4/19/11 12:48 PM, Jason Grout wrote:
>>> but I don't have experience with Django, so that's a bit of a hurdle.
>> Do you have any resources that you recommend for learning Django other
>> than the standard Django documentation [1]? I see the Django Book [2],
>> for example, and it appears pretty well written, but is for Django 1.0,
>> so I don't know how much still applies.
> Django is pretty slow-moving, and most of the things in the book probably
> still apply.
>
> Personally, I'd start like this:
>
> - Go through the Django tutorial:
>
> http://docs.djangoproject.com/en/1.3/intro/tutorial01/
>
> (The admin interface is in my opinion not so important,
> makes things simpler to not think too deeply about it.)
>
> - Browse through interesting sections of the main docs;
> e.g. database query, template syntax docs:
>
> http://docs.djangoproject.com/en/1.3/topics/db/queries/
>
> http://docs.djangoproject.com/en/1.3/topics/templates/
>
I agree especially about the tutorial being a very good place to start
and go back to.

You just have to address the Django changes when the code from 0.9x
gives errors in your installed version. While a little stressful, I did
not find these that hard to fix.

Version 2 of the book (that I have not read) is at:
http://www.djangobook.com/en/2.0/


Bruce

PS Your site, so you choose the license :-)

Joshua Holbrook

Apr 19, 2011, 3:14:16 PM
to SciPy Users List
> Django django django

FWIW, I'm a fan of the little frameworks. ;) Flask looks nice!

--Josh

william ratcliff

Apr 19, 2011, 3:22:38 PM
to SciPy Users List
Flask is nice, but why reinvent the wheel?

Jason Grout

Apr 19, 2011, 3:25:25 PM
to scipy...@scipy.org
On 4/19/11 2:14 PM, Joshua Holbrook wrote:
>> Django django django
>
> FWIW, I'm a fan of the little frameworks. ;) Flask looks nice!

In this exploratory phase of the project, it's probably not a bad idea
to have two implementations to compare. On the other hand,
there's something to be said against spreading people too thin when there
are only a handful of people interested in working on it now.

I haven't abandoned the start of the flask site I posted to github. I
would like to understand the significant work that has gone into the
django site, though.

Thanks,

Jason

Joshua Holbrook

Apr 19, 2011, 3:26:31 PM
to SciPy Users List
> Flask is nice, but why reinvent the wheel?

I guess I wasn't very clear in my thinking: Jason's current prototype
uses flask, so I was suggesting not changing *that* to django just
because django is more popular or something. That said, if forking
PV's prototype is the way to go, then sure, rewriting it to use flask
seems a bit silly.

Jason Grout

Apr 19, 2011, 3:27:59 PM
to scipy...@scipy.org
On 4/19/11 2:22 PM, william ratcliff wrote:
> Flask is nice, but why reinvent the wheel?

Django is more like a semi-truck than a wheel. Flask is like the wheel.
If you want a go-cart, it's probably easier to build it up from the
wheels rather than to strip down a semi.

Jason

Charles R Harris

Apr 19, 2011, 3:36:25 PM
to SciPy Users List
On Tue, Apr 19, 2011 at 1:27 PM, Jason Grout <jason...@creativetrax.com> wrote:
> On 4/19/11 2:22 PM, william ratcliff wrote:
>> Flask is nice, but why reinvent the wheel?
>
> Django is more like a semi-truck than a wheel.  Flask is like the wheel.
> If you want a go-cart, it's probably easier to build it up from the
> wheels rather than to strip down a semi.


Now that the license is settled, it's time for a long discussion about tools ;)

Chuck

Jason Grout

Apr 19, 2011, 3:38:43 PM
to scipy...@scipy.org
On 4/19/11 2:36 PM, Charles R Harris wrote:

> Now that the license is settled, it's time for a long discussion about
> tools ;)

Thanks for your wisdom :)

josef...@gmail.com

Apr 19, 2011, 3:40:09 PM
to SciPy Users List

It's a developer-only thread (in contrast to the last one).
Only those that are actually working on this are allowed to comment.

Josef

>
> Chuck

Pauli Virtanen

Apr 19, 2011, 3:53:48 PM
to scipy...@scipy.org
On Tue, 19 Apr 2011 14:27:59 -0500, Jason Grout wrote:
> On 4/19/11 2:22 PM, william ratcliff wrote:
>> Flask is nice, but why reinvent the wheel?
>
> Django is more like a semi-truck than a wheel. Flask is like the wheel.
> If you want a go-cart, it's probably easier to build it up from the
> wheels rather than to strip down a semi.

Yes, if you choose to use a monolithic framework, then you should play by
its rules rather than against them.

Some reasons I picked Django, though:

- Not a one-man project

- Good backward compatibility

- Batteries included, and mostly don't get in your way (e.g. if you want
to go ORMing, manually hooking up SqlAlchemy is a bit of a pain)

- It's not really more complicated than the microframeworks
in the end --- even for small applications.

- Good documentation and all in one place

- I already know it (and found I preferred it over Turbogears, Pylons,
and web.py)

- Not having to decide which templating language and other components
to use saves your time :)

***

The choice of the web framework is not really important here, though.
Basic CRUD and AJAX you can do using *any* of them without too much
trouble, and I believe that is all you need for a simple application like
this.

My 2cc of oil (to the flames),
Pauli

Jason Grout

Apr 19, 2011, 4:03:24 PM
to scipy...@scipy.org
On 4/19/11 2:53 PM, Pauli Virtanen wrote:
> On Tue, 19 Apr 2011 14:27:59 -0500, Jason Grout wrote:
>> On 4/19/11 2:22 PM, william ratcliff wrote:
>>> Flask is nice, but why reinvent the wheel?
>>
>> Django is more like a semi-truck than a wheel. Flask is like the wheel.
>> If you want a go-cart, it's probably easier to build it up from the
>> wheels rather than to strip down a semi.


Sorry, I didn't mean that as a flame, but rather as an analogy for
situations where a smaller framework might be justified. In our
situation, I don't think we only need a go-cart, so it might very well
make sense to pick Django.


>
> Yes, if you choose to use a monolithic framework, then you should play by
> its rules rather than against them.
>
> Some reasons I picked Django, though:
>
> - Not a one-man project
>
> - Good backward compatibility
>
> - Batteries included, and mostly don't get in your way (e.g. if you want
> to go ORMing, manually hooking up SqlAlchemy is a bit of a pain)
>
> - It's not really more complicated than the microframeworks
> in the end --- even for small applications.
>
> - Good documentation and all in one place
>
> - I already know it (and found I preferred it over Turbogears, Pylons,
> and web.py)
>
> - Not having to decide which templating language and other components
> to use saves your time :)


Those all sound like great reasons.


>
> ***
>
> The choice of the web framework is not really important here, though.
> Basic CRUD and AJAX you can do using *any* of them without too much
> trouble, and I believe that is all you need for a simple application like
> this.

+1 about not debating the pros/cons of each framework endlessly. I'd
rather spend the time learning Django :).

Jason

Jason Grout

Apr 19, 2011, 4:27:40 PM
to scipy...@scipy.org
On 4/19/11 2:00 PM, Pauli Virtanen wrote:
> Since a DVCS would not be exposed to the outside of the application,

I think it would be really nice if the DVCS was exposed to the extent
that someone could git/hg clone their snippet/package repository and
move it to github/bitbucket once it became a bigger project. On the
other hand, I can imagine some people wanting to work locally with their
package repository and then push to the server. I don't think we need
to do a gitorious/github/bitbucket rewrite, but having a way to get your
repository off the server or push to the server would make it much
easier to transition projects that grow larger.

I notice you have three categories of entries according to the
DESIGN.rst. Here are some design questions/comments about each:

1. Hosted software: code submitted by people directly to this site.
Edited by the original submitters solely.

- It would make sense to allow multiple people to commit to a package,
if it's easy. If it's not, then I suppose we encourage projects that
need more group-aware tools to use github or bitbucket or something and
instead just link to their repository. Having a very simple "forking"
feature would also be nice.

2. Pointers to externally hosted packages, hosted on PyPi, github,
bitbucket, someone's homepage, etc. Editable by anyone.

- Editable by anyone worries me---these are like the grown-up versions
of (1), so it makes sense that the original submitter is the one to
update these.

3. Short code snippets and code examples. Public domain and editable by
anyone.

- Sounds good, though I'm ambivalent about whether these should be
wiki-style or gist-style (e.g., editable by anyone, or forkable by
anyone). These are the sorts of things I envisioned having an "execute"
button on that would send the code to a server, execute it, and post
back the results.

Finally: which pieces do you envision having a version history exposed
to the user? Do you envision a "forking" or "branching" action for any
of these?

Jason

Pauli Virtanen

Apr 19, 2011, 5:25:13 PM
to scipy...@scipy.org
On Tue, 19 Apr 2011 15:27:40 -0500, Jason Grout wrote:
> On 4/19/11 2:00 PM, Pauli Virtanen wrote:
>> Since a DVCS would not be exposed to the outside of the application,
>
> I think it would be really nice if the DVCS was exposed to the extent
> that someone could git/hg clone their snippet/package repository and
> move it to github/bitbucket once it became a bigger project. On the
> other hand, I can imagine some people wanting to work locally with their
> package repository and then push to the server. I don't think we need
> to do a gitorious/github/bitbucket rewrite, but having a way to get your
> repository off the server or push to the server would make it much
> easier to transition projects that grow larger.

I think we have slightly different ideas of what we are trying to make.
Maybe it would be useful to try to clear up the aim before going into the
nitty-gritty :)

My idea was to get something like Wikipedia or ohloh.net centered on
scientific Python software. So the focus would be mainly on "released"
software, rather than being a platform for software development, such as
SourceForge, Gitorious, or Gist.

Allowing file and cookbook page hosting would then be just a side
feature, just for convenience for the people who are too lazy to learn
how to use bitbucket and other tools properly. In this view, use of DVCS
is out of scope. Joint development would be diverted to the "traditional"
channels when possible --- direct communication with the original
authors, or whatever community site they use to share their work.

Whether this is a useful target is, of course, up for debate.


Pros:

+ simple to implement

+ puts more weight to "finished" and (hopefully) more useful works

Cons:

- does not especially encourage collaboration


***

> I notice you have three categories of entries according to the
> DESIGN.rst. Here are some design questions/comments about each:
>
> 1. Hosted software-Code submitted by people directly to this site.
> Edited by the original submitters solely.
>
> - It would make sense to allow multiple people to commit to a package,
> if it's easy. If it's not, then I suppose we encourage projects that
> need more group-aware tools to use github or bitbucket or something and
> instead just link to their repository.
>
> Having a very simple "forking" feature would also be nice.

My view would be that we only want more or less "finished" works to
appear there. If so, the catalog does not really need features related to
software development --- except maybe a link pointing to where the
development is ongoing.

> 2. Pointers to externally hosted packages, hosted on PyPi, github,
> bitbucket, someone's homepage, etc. Editable by anyone.
>
> - Editable by anyone worries me---these are like the grown-up versions
> of (1), so it makes sense that the original submitter is the one to
> update these.

Being editable by anyone makes sense if you think of this part as a
Wikipedia/ohloh.net of scientific Python software.

The main issue here is that it would be useful for other people to be
able to also add links to projects you aren't an author of, improve
incomplete or terse descriptions, etc. Note that these type-2 projects
do not have any hosted code.

The criticism on enabling vandalism and spam is a valid one. Dealing with
it requires an adequate reversion & change monitoring UI, and enough
manpower. As a middle way, the changes could of course be moderated
manually, so that they appear only after being approved by a group of editors
(and/or the original submitter).

But in any case, retaining the ability for anyone (registered) to submit
corrections to these catalog entries seems quite useful to me.
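
The moderated-edit idea above can be sketched roughly as follows. This is only an illustration under assumptions: the class name `CatalogEntry` and its methods are made up here and are not taken from any existing code for the site.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """A type-2 entry: a pointer to an externally hosted project."""
    name: str
    description: str
    history: list = field(default_factory=list)  # earlier descriptions
    pending: list = field(default_factory=list)  # (user, text) awaiting review

    def submit(self, user, new_description):
        # Any registered user may propose a change; nothing is published yet.
        self.pending.append((user, new_description))

    def approve(self, index=0):
        # An editor publishes a pending change; the old text is kept.
        _user, text = self.pending.pop(index)
        self.history.append(self.description)
        self.description = text

    def revert(self):
        # Undo the most recent published change (e.g. vandalism).
        if self.history:
            self.description = self.history.pop()
```

Keeping the old text on every approval is what makes reverting cheap; how many editors must approve a change is left open here.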

> 3. Short code snippets and code examples. Public domain and editable by
> anyone.
>
> - Sounds good, though I'm ambivalent about whether these should be
> wiki-style or gist-style (e.g., editable by anyone, or forkable by
> anyone).

The aim with these "snippets" was mainly to replace what's currently in
scipy.org/Cookbook, rather than to build a branded competitor to Gist or
similar services. Wiki-style seems like a useful choice here to me (and
my calling them "snippets" was a bad choice of words --- they're not
really the same).

The aim that I had in mind would be more to get an organized collection
of useful recipes with explanations of how things work, rather than a
collection of miscellaneous code snippets "owned" by different people.
Wikipedia vs. Knol...

> These are the sorts of things I envisioned having an "execute"
> button on that would send the code to a server, execute it, and post
> back the results.

Needs some serious sandboxing on the server side. Sage seems to manage to
do this, though.
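
As a rough illustration of the "execute" step only, a minimal sketch could run each snippet in a separate interpreter process with a time limit. The function name `run_snippet` and the 5-second default are assumptions for illustration. This is not a security sandbox: a real service would still need OS-level confinement on top of it.

```python
import subprocess
import sys


def run_snippet(code, timeout=5.0):
    """Run code in a child interpreter and capture its output.

    This only separates the process and bounds its run time; it does NOT
    restrict file or network access, so it is not a security sandbox.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "stderr": "timed out"}
    return {
        "ok": proc.returncode == 0,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }
```

For example, `run_snippet("print(1 + 1)")` returns a dict whose `stdout` holds the printed text, which the web front end could post back to the page.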

> Finally: which pieces do you envision having a version history exposed
> to the user? Do you envision a "forking" or "branching" action for any
> of these?

The version history exposure would be essentially what you get in your
average wiki. You don't normally look at it, unless you want to revert
something.

Forking and branching are out of scope, since the focus is on released
"products" rather than their development.

Pauli

william ratcliff

Apr 19, 2011, 5:28:55 PM
to SciPy Users List

Check out apparmor..

Pauli Virtanen

Apr 19, 2011, 5:38:41 PM
to scipy...@scipy.org
On Tue, 19 Apr 2011 17:28:55 -0400, william ratcliff wrote:
> Check out apparmor..

Can you give a link? Google is fixated on the Linux security framework,
which you probably don't mean here.

Jason Grout

Apr 19, 2011, 5:50:12 PM
to scipy...@scipy.org
On 4/19/11 4:28 PM, william ratcliff wrote:
> Check out apparmor..

As well as packaginator:

https://github.com/cartwheelweb/packaginator

http://packaginator.readthedocs.org/en/latest/?redir

Jason

Robert Kern

Apr 19, 2011, 5:49:58 PM
to SciPy Users List
On Tue, Apr 19, 2011 at 16:38, Pauli Virtanen <p...@iki.fi> wrote:
> On Tue, 19 Apr 2011 17:28:55 -0400, william ratcliff wrote:
>> Check out apparmor..
>
> Can you give a link? Google is fixated on the Linux security framework,
> which you probably don't mean here.

I suspect he was only responding to the one comment on sandboxing, not
the whole proposal.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

Pauli Virtanen

Apr 19, 2011, 6:02:31 PM
to scipy...@scipy.org
On Tue, 19 Apr 2011 16:50:12 -0500, Jason Grout wrote:
> On 4/19/11 4:28 PM, william ratcliff wrote:
>> Check out apparmor..
>
> As well as packaginator:
>
> https://github.com/cartwheelweb/packaginator
>
> http://packaginator.readthedocs.org/en/latest/?redir

Yes, I'm aware of djangopackages.com

It does not, however, do all that is desired --- no hosting, no cookbook,
no reviews. Those could probably be glued onto it as separate
Django apps.

Pauli

Pauli Virtanen

Apr 19, 2011, 6:04:35 PM
to scipy...@scipy.org
On Tue, 19 Apr 2011 16:50:12 -0500, Jason Grout wrote:
> On 4/19/11 4:28 PM, william ratcliff wrote:
>> Check out apparmor..
>
> As well as packaginator:
>
> https://github.com/cartwheelweb/packaginator

And djangosnippets:

https://github.com/coleifer/djangosnippets.org/

Jason Grout

Apr 19, 2011, 7:08:40 PM
to scipy...@scipy.org


Ah, you're right---I see now we may have different aims. My original
idea was for a project somewhere in between pastebin/gist and github in
functionality, but also having the tagging and comments features that
you're proposing. Add to that a way to execute snippets, like
http://python.codepad.org/ or the Sage notebook. With Sage interacts,
executing the snippets of code would allow a user to play with sliders
and buttons to interact with Sage/Python without having to log in to a
web notebook.

So I had envisioned much more of an active development site for small
chunks of code, rather than a repository of pointers to packages
developed on more heavyweight development sites.


>> 1. Hosted software-Code submitted by people directly to this site.
>> Edited by the original submitters solely.
>>
>> - It would make sense to allow multiple people to commit to a package,
>> if it's easy. If it's not, then I suppose we encourage projects that
>> need more group-aware tools to use github or bitbucket or something and
>> instead just link to their repository.
>>
>> Having a very simple "forking" feature would also be nice.
>
> My view would be that we only want more or less "finished" works to
> appear there. If so, the catalog does not really need features related to
> software development --- except maybe a link pointing to where the
> development is ongoing.

So when you say "Hosted software", are you thinking of a PyPi type of
site, where the release tarball might be hosted on the site, rather than
the development repository?


>
>> 2. Pointers to externally hosted packages, hosted on PyPi, github,
>> bitbucket, someone's homepage, etc. Editable by anyone.
>>
>> - Editable by anyone worries me---these are like the grown-up versions
>> of (1), so it makes sense that the original submitter is the one to
>> update these.
>
> Being editable by anyone makes sense if you think of this part as a
> Wikipedia/ohloh.net of scientific Python software.
>
> The main issue here is that it would be useful for other people to be
> able to also add links to projects you aren't an author of, improve
> incomplete or terse descriptions, etc. Note that these type-2 projects
> do not have any hosted code.
>
> The criticism on enabling vandalism and spam is a valid one. Dealing with
> it requires an adequate reversion & change monitoring UI, and enough
> manpower. As a middle way, the changes could of course be moderated
> manually, so that they appear only after approved by a group of editors
> (and/or the original submitter).
>
> But in any case, retaining the ability for anyone (registered) to submit
> corrections to these catalog entries seems quite useful to me.


I see. Your approach makes sense, especially with moderation and/or
easily reverted edits.


>
>> 3. Short code snippets and code examples. Public domain and editable by
>> anyone.
>>
>> - Sounds good, though I'm ambivalent about whether these should be
>> wiki-style or gist-style (e.g., editable by anyone, or forkable by
>> anyone).
>
> The aim with these "snippets" was mainly to replace what's currently in
> scipy.org/Cookbook, rather than to build a branded competitor to Gist or
> similar services. Wiki-style seems like a useful choice here to me (and
> my calling them "snippets" was a bad choice of words --- they're not
> really the same).
>
> The aim that I had in mind would be more to get an organized collection
> of useful recipes with explanations how things work, rather than a
> collection of miscellaneous code snippets "owned" by different people.
> Wikipedia vs. Knol...


Your explanation clears up a lot of confusion. Thanks. As I said, I'm
not sure which style is best for the use-cases I have in mind
(development/hosting of small educational snippets of code). It may be
that the best approach for me is to point people to gist to do their
development if they want to have forking, DVCS, etc., but then they
should copy their code for the snippet website (or maybe even better,
just give the gist and we can retrieve the code via the gist API).
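
A minimal sketch of that retrieval step, assuming the JSON layout returned by GitHub's present-day REST endpoint `GET https://api.github.com/gists/{id}` (a `files` object keyed by filename, each entry carrying a `content` field). The helper names are hypothetical, and the API available at the time of writing may differ.

```python
import json
import urllib.request


def extract_files(gist_json):
    """Map filename -> source text from a decoded gist response."""
    files = gist_json.get("files", {})
    return {name: info.get("content", "") for name, info in files.items()}


def fetch_gist(gist_id):
    """Download one gist and return its files (needs network access)."""
    url = "https://api.github.com/gists/" + gist_id
    with urllib.request.urlopen(url) as response:
        return extract_files(json.load(response))
```

Splitting parsing from fetching means the snippet site could cache the extracted files and would not depend on the remote service at display time.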

Thanks,

Jason

Stefan Schwarzburg

Apr 20, 2011, 2:30:50 AM
to SciPy Users List
Hi


Did you have a look at http://aciresnippets.wordpress.com/? Although it does not do exactly what you want, it is very close.
It's not a website for example, but a desktop program (acire) that interacts with a version controlled snippets repository.

Maybe you could use the underlying DVCS controlled python-snippets (http://aciresnippets.wordpress.com/contribute/) and just add a new category (Sage, scipy, ...) and build a website that interacts with these snippets. It might remove some of your work and additionally combine the work of a similar project with yours?


 



--
Institut für Astronomie und Astrophysik
Eberhard Karls Universität Tübingen
Sand 1   -  D-72076 Tübingen
Tel.: 07071/29-78605
-----------------------------------------------------------------------

Pauli Virtanen

Apr 20, 2011, 8:45:50 AM
to scipy...@scipy.org
Tue, 19 Apr 2011 18:08:40 -0500, Jason Grout wrote:
[clip]

> Ah, you're right---I see now we may have different aims. My original
> idea was for a project somewhere in between pastebin/gist and github in
> functionality, but also having the tagging and comments features that
> you're proposing. Add to that a way to execute snippets, like
> http://python.codepad.org/ or the Sage notebook. With Sage interacts,
> executing the snippets of code would allow a user to play with sliders
> and buttons to interact with Sage/Python without having to log in to a
> web notebook.
>
> So I had envisioned much more of an active development site for small
> chunks of code, rather than a repository of pointers to packages
> developed on more heavyweight development sites.

Yes, I can see the value in having a more community-like collection of
snippets.

One thing to note is that there's no need to have a single backend or
even the main parts of the UI shared by the snippets and the
"catalog" (although this might make some things simpler). The different
components of the site can be loosely coupled.

It should be possible to write a tagging/per-entry-comments/etc platform
that can be used both for the catalog and for the snippets. (Django has
support for 'generic' foreign keys, and rendering can be done via custom
template tags.) So even if the end result is that the snippets need a
different storage and UI approach, there would still be a non-negligible
amount of code that can be shared (and perhaps some could be lifted from
djangopackages.com), because a major part of the required "community"
features are quite similar.
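
To make the shared-tagging idea concrete, here is a small sketch of the pattern behind a "generic" foreign key: each tag row stores both the kind of object and its id, so one table can serve the catalog and the snippets alike. It uses plain `sqlite3` rather than Django, and the table and column names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE package (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE snippet (id INTEGER PRIMARY KEY, title TEXT);
    -- One tag table for both kinds of object: (content_type, object_id)
    -- together play the role of a generic foreign key.
    CREATE TABLE tag (
        content_type TEXT NOT NULL,
        object_id    INTEGER NOT NULL,
        label        TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO package VALUES (1, 'scipy')")
conn.execute("INSERT INTO snippet VALUES (1, 'Fit a line with polyfit')")
conn.executemany(
    "INSERT INTO tag VALUES (?, ?, ?)",
    [("package", 1, "optimization"), ("snippet", 1, "fitting"),
     ("snippet", 1, "numpy")],
)


def tags_for(content_type, object_id):
    """Return the sorted tag labels attached to one object of any kind."""
    rows = conn.execute(
        "SELECT label FROM tag WHERE content_type = ? AND object_id = ? "
        "ORDER BY label",
        (content_type, object_id),
    )
    return [label for (label,) in rows]
```

The same (content_type, object_id) pair would work for per-entry comments, which is why much of the "community" code could be shared even if storage for snippets and packages differs.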

Technical differences, I think, are not a reason to make two sites rather
than one, as long as the "community" and "indexing" aspects are shared. I
can imagine a multi-pronged approach with real python packages and
snippets in different sections of the same site.

Having the Sage community involved here would definitely be a big synergy
advantage for this type of a site.

[clip]


> So when you say "Hosted software", are you thinking of a PyPi type of
> site, where the release tarball might be hosted on the site, rather than
> the development repository?

Precisely so. The aim would be to make it less hassle to use than PyPi
for a relative Python newbie. (Although uploading packages to PyPi is not
hugely hassle-ful at the moment, as it is possible to do it using only
the web interface.) So there would be a bit of an overlap with PyPi; one
could however add some recommendations etc. to push people to use PyPi,
if they are willing to jump through some extra hoops.

***

At the moment, one thing seems clear:

- Pointers to externally hosted projects (& semi-automatic
import from PyPi)

But the following are not so clear:

- Hosted projects -- how much to overlap with PyPi?

- Snippets -- the Wiki or the Knol? Or both? How much overlap with
hosted projects?


Pauli

Andy Wilson

Apr 20, 2011, 2:11:33 PM
to SciPy Users List

Sorry for showing up late to the party...


fwiw, I did some experimenting with flask+git. It is really rough but if you are interested you can find it here:  https://github.com/wilsaj/gitsnippets


I think using either git or hg for snippets is a good idea because it saves some trouble by giving you some nice things for free, namely: history of edits and the ability to rollback changes (in case of vandalism or errors). If you keep things linear and don't bother with forking or branching/merging then it isn't very different from writing to the filesystem, but revision history is automatically recorded. Repos are also self-contained and easy to manage.
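
The "keep things linear" model can be illustrated without git or hg at all. The sketch below is a hypothetical in-memory stand-in (the class `LinearHistory` is invented for this example); a real backend would delegate these operations to the VCS.

```python
class LinearHistory:
    """Keep every saved version of one snippet in a single linear chain."""

    def __init__(self):
        self._revisions = []  # list of (message, text), oldest first

    def commit(self, text, message):
        self._revisions.append((message, text))
        return len(self._revisions) - 1  # revision number

    def head(self):
        return self._revisions[-1][1]

    def log(self):
        return [message for message, _text in self._revisions]

    def rollback(self, revision):
        # Roll back by committing the old text again, so the history
        # stays linear and the bad edit remains visible in the log.
        _message, text = self._revisions[revision]
        return self.commit(text, "revert to r%d" % revision)
```

Because a rollback is just another commit, there is never a branch to merge, which is the property that keeps this close to plain file writes.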

There are good reasons to NOT expose snippets via DVCS. In order for people to be able to push back a cloned repo, you have to deal with authorization and be ready to handle branches. That's a lot of complexity and there are already systems that are very good at hosting DVCS, so diminishing returns...



The 'coming soon' parts of the github gist API look pretty enticing, but no dice so far: http://develop.github.com/p/gist.html


-andy

Jason Grout

Apr 21, 2011, 2:18:35 AM
to scipy...@scipy.org
On 4/20/11 1:11 PM, Andy Wilson wrote:
>
> Sorry for showing up late to the party...
>
>
> fwiw, I did some experimenting with flask+git. It is really rough but if
> you are interested you can find it here:
> https://github.com/wilsaj/gitsnippets
>
>
> I think using either git or hg for snippets is a good idea because it
> saves some trouble by giving you some nice things for free, namely:
> history of edits and the ability to rollback changes (in case of
> vandalism or errors). If you keep things linear and don't bother with
> forking or branching/merging then it isn't very different from writing
> to the filesystem, but revision history is automatically recorded. Repos
> are also self-contained and easy to manage.
>
> There are good reasons to NOT expose snippets via DVCS. In order for
> people to be able to push back a cloned repo, you have to deal
> with authorization and be ready to handle branches. That's a lot of
> complexity and there are already systems that are very good at hosting
> DVCS, so diminishing returns...

In playing with gist, it appears that it only shows the master branch in
the web UI. Any other branches are impossible to create using the web
UI (only the master HEAD can be edited), and if they are pushed to the
underlying git repository, they are ignored.

So that's one way of dealing with the branching problem with a DVCS.

Jason

Jason Grout

Apr 21, 2011, 4:07:29 AM
to scipy...@scipy.org

+1. Those goals are definitely shared between our two needs.


>
> Having the Sage community involved here would definitely be a big synergy
> advantage for this type of a site.

+1; and the same can be said from the Sage side for the scipy and wider
scientific python community.


>
> [clip]
>> So when you say "Hosted software", are you thinking of a PyPi type of
>> site, where the release tarball might be hosted on the site, rather than
>> the development repository?
>
> Precisely so. The aim would be to make it less hassle to use than PyPi
> for a relative Python newbie. (Although uploading packages to PyPi is not
> hugely hassle-ful at the moment, as it is possible to do it using only
> the web interface.) So there would be a bit of an overlap with PyPi; one
> could however add some recommendations etc. to push people to use PyPi,
> if they are willing to jump through some extra hoops.

What extra hoops?

1. Create an account on PyPi (make up username/password, click the email
verification link)

2. Log in and go to their web form:
http://pypi.python.org/pypi?%3Aaction=submit_form

Fill out info and submit? (I didn't actually do this, but I'm assuming
it's easy).

Okay, there were some concerns about the process:

1. It seemed like the connection was *not* over SSL, which always
concerns me when we have passwords going back and forth

2. It wouldn't let me use the web submission form when I logged in using
OpenID.

However, these seem like minor technical issues that could be solved.

>
> ***
>
> At the moment, one thing seems clear:
>
> - Pointers to externally hosted projects (& semi-automatic
> import from PyPi)

I just read up more on PyPi, and in particular, read up on their recent
discussion which led to the disabling of the rating system [1]. I'm not
convinced that this point is clear. How is our pointing to packages on
PyPi and elsewhere improving on PyPi? Are there a number of other
packages out there that are not cataloged on PyPi and which should not
be cataloged there?

I could see us adding value by having a better tagging system that was
customized more for scientific software. On the other hand, maybe we
could just improve the PyPi entries for such software so that a keyword
search would pull up the packages.


>
> But the following are not so clear:
>
> - Hosted projects -- how much to overlap with PyPi?

If it's easy to host on PyPi, it seems like we should point people over
there. We have far fewer users (and infrastructure maintainers) than
PyPi, and PyPi itself already has the authoritative blessing of Python.

>
> - Snippets -- the Wiki or the Knol? Or both? How much overlap with
> hosted projects?

The python snippet repository is:
http://code.activestate.com/recipes/langs/python/

Advantages:

1. It already has a tagging and comment system

2. It is "blessed" in that
http://wiki.python.org/moin/PublishingPythonModules points to it as the
only active snippet repository

3. It is hosted and maintained by someone else (so no hassle for us)

4. It allows revisions and "forking" a snippet, so has some community
development advantages. These don't seem as easy as gist, though.

5. You can select a license for a snippet: Apache, BSD, GPL 2/3, LGPL,
MIT, PSF, and others are options

6. It already has thousands of recipes.

Disadvantages

1. It is hosted by someone else, and that someone else seems to have
commercial interests (for example, the sign-in form asks for your
company, and you constantly have ads on the site).

2. There is no execution of snippets (needed for the Sage interact
database we're building, so I'll have to do something more than it anyway)


So: thoughts on the scope of this new project, and how it differentiates
enough from the existing sites to be useful enough to build and maintain?

Thanks,

Jason


[1] http://mail.python.org/pipermail/catalog-sig/2011-April/003542.html
(and many, many replies to the thread)

Pauli Virtanen

Apr 21, 2011, 8:37:01 AM
to scipy...@scipy.org
Thu, 21 Apr 2011 03:07:29 -0500, Jason Grout wrote:
[clip]
>>> So when you say "Hosted software", are you thinking of a PyPi type of
>>> site, where the release tarball might be hosted on the site, rather
>>> than the development repository?
>>
>> Precisely so. The aim would be to make it less hassle to use than PyPi
>> for a relative Python newbie. (Although uploading packages to PyPi is
>> not hugely hassle-ful at the moment, as it is possible to do it using
>> only the web interface.) So there would be a bit of an overlap with
>> PyPi; one could however add some recommendations etc. to push people to
>> use PyPi, if they are willing to jump through some extra hoops.
>
> What extra hoops?

Mainly, for small pieces of code, you might not even want to create a
named Python package. So some form of code hosting seems to be useful ---
whether it is called "snippets" (multi-file) or "hosted projects". (The
wiki-style Cookbook content is then the third category of items that
could be useful to have.)


Re: PyPi usability

You cannot just upload a .py file onto PyPi. PyPi checks that the
uploaded file (i) is a tarball, zip, or egg, and (ii) is named in a
specific way. In addition, (iii) you need to go to a different site.

So there are user experience issues with the web upload. Sure, the upload
is manageable once you practice a bit, and many of the issues are
probably fixable.

[clip]


>> At the moment, one thing seems clear:
>>
>> - Pointers to externally hosted projects (& semi-automatic
>> import from PyPi)
>
> I just read up more on PyPi, and in particular, read up on their recent
> discussion which led to the disabling of the rating system [1]. I'm not
> convinced that this point is clear. How is our pointing to packages on
> PyPi and elsewhere improving on PyPi? Are there a number of other
> packages out there that are not cataloged on PyPi and which should not
> be cataloged there?

By the "clear" thing, I mean pointing to packages on PyPi (= almost all
externally hosted projects), augmented with community tags etc. Pointing
to packages outside PyPi is not essential --- but would be easy to add.

One argument for allowing "external" links is that packages can be added
to PyPi only by their authors. There is currently a small
number of relevant packages usable from Python that are not on PyPi
(although they should be); for example PyTrilinos. But I guess it should be
possible to browbeat their authors into adding a PyPi entry.

> I could see us adding value by having a better tagging system that was
> customized more for scientific software. On the other hand, maybe we
> could just improve the PyPi entries for such software so that a keyword
> search would pull up the packages.

It seems clear to me that this feature would be useful. Also, combining
the PyPi data with smaller code snippets would create a one-stop-shop.

As you can surmise from the discussion you linked to, there is resistance
to adding new community-oriented features to PyPi itself, as some people
feel that such features are out-of-scope for it. Doing it externally also
makes sense from the usability and branding point of view --- a site
called "Python in science" with filtered package selection can be more
convincing and convenient to navigate than browsing "Topic :: Scientific/
Engineering" on PyPi.

The discussion on rating systems there is a useful read --- it's why I
left out any star-based rating systems so far. Just adding "I use this"
popularity measure probably works around most issues. (The PyPi download
data is not a very reliable measure, as many of the bigger packages host
their files externally.)
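
An "I use this" measure could be as simple as counting distinct users per package. The sketch below is one possible shape for it; the class `UsageCounter` and its methods are hypothetical names chosen for this example.

```python
from collections import defaultdict


class UsageCounter:
    """Count distinct users who clicked "I use this" for each package."""

    def __init__(self):
        self._users = defaultdict(set)  # package name -> set of user names

    def mark(self, package, user):
        # A set makes repeated clicks by the same user count only once.
        self._users[package].add(user)

    def count(self, package):
        return len(self._users.get(package, ()))

    def ranking(self):
        # Most-used first; ties broken alphabetically for stable output.
        return sorted(self._users, key=lambda p: (-len(self._users[p]), p))
```

Unlike a star rating, this carries no negative signal, which sidesteps most of the objections raised in the PyPi discussion.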

***

On improving the entries on PyPi -- the keywords etc. there are editable
only by the original submitter, and this will probably not change, so I
don't think that will be a possible way to go.

The PyPi package classifiers as they are now are assigned solely by the
package authors, more or less at random and from a limited selection, and
are not very reliable. There are several packages in the "Topic ::
Scientific/Engineering" category that don't actually have much focus on
either science or engineering. So, some filtering of PyPi entries would
already be useful.
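
Such filtering might look like the sketch below, which keeps only packages that both carry the scientific classifier and have been vetted by the site's editors. The input format (a list of dicts with "name" and "classifiers" keys) and the function name are assumptions for illustration, not PyPi's actual data model.

```python
SCIENCE_CLASSIFIER = "Topic :: Scientific/Engineering"


def filter_entries(entries, approved):
    """Keep entries that claim a scientific topic AND passed manual review.

    `entries` is a list of dicts with "name" and "classifiers" keys;
    `approved` is a set of names vetted by the site's editors.
    """
    selected = []
    for entry in entries:
        claims_science = any(
            c.startswith(SCIENCE_CLASSIFIER) for c in entry["classifiers"]
        )
        if claims_science and entry["name"] in approved:
            selected.append(entry["name"])
    return selected
```

The classifier check narrows the candidate list automatically, while the `approved` set is where community judgment corrects for loosely assigned classifiers.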

> > But the following are not so clear:
> >
> > - Hosted projects -- how much to overlap with PyPi?
>
> If it's easy to host on PyPi, it seems like we should point people over
> there. We have far fewer users (and infrastructure maintainers) than
> PyPi, and PyPi itself already has the authoritative blessing of Python.

For small contributions, you might not want to use PyPi. Aside from that,
the hosted projects are not really required, provided PyPi is easy enough
to use (which is not true for the setup.py way, but may be true for the
web interface).

> > - Snippets -- the Wiki or the Knol? Or both? How much overlap with
> > hosted projects?
>
> The python snippet repository is:
> http://code.activestate.com/recipes/langs/python/

[clip]

The activestate snippet library seems not to be very actively used for
Scipy et al. at the moment -- there are only ~15 recipes tagged with
"scipy", "numpy", "scientific", "sage", or "science".

Also:

- Since it's not a focused site, relevant code snippets are mixed
with non-relevant ones, including ones written in languages other
than Python.

- As tags are specified by the users, and free-form, stuff will
be lost in the midst of non-relevant content.

- The tagging feature could perhaps be improved --- it appears they
can only be assigned by the author of the snippet.

- There are some usability problems: e.g. clicking the "Tags" link on
the top takes you away from Python-specific content.

- The search feature is not especially good: it's just Google's site:
search, so it does not explicitly know about tags or metadata.

Other than that, it seems to do a reasonable job.

[clip]


> So: thoughts on the scope of this new project, and how it differentiates
> enough from the existing sites to be useful enough to build and
> maintain?

Already focusing on "scientific" content is a differentiation big enough,
IMHO. It's mostly a social question of creating a hub for exchanging this
type of content; and also a question of branding. Technically, sure,
there is not so much new under the sun. The first point would just be
to make the implementation slick and useful enough to attract people to a
single place. The second point would be to provide a one-stop-shop for
whatever you need related to Python in science --- which would have the
extra benefit of showcasing that it is doing well, and is a credible tool
for many purposes.

At least based on earlier discussions on this list, it seems that at
least the people who chimed in would prefer such a central hub over what
is currently available.

The current situation is, if I want to share something science-related
written in Python, it is not obvious where I should put it so that there
would be some audience. For small contributions, in generic snippet sites
your stuff gets easily lost in the middle of non-relevant content ---
also, I'm not convinced many people (e.g. those on this mailing list)
follow those. The scipy.org/Cookbook is also not very usable, as it's a
generic wiki. For larger contributions, PyPi works (although it's mildly
clumsy to use), but it does not offer much visibility. If I name my
package as scikits.* it goes to scikits.appspot.com, but I guess that's
not very widely used either.

So that's the motivation. The snippet/hosted-projects part alone would
address a part of what is missing, but I think one might as well go and
make a one-stop-shop out of it.

Best,
Pauli

josef...@gmail.com

Apr 21, 2011, 9:46:05 AM
to SciPy Users List
On Thu, Apr 21, 2011 at 8:37 AM, Pauli Virtanen <p...@iki.fi> wrote:
> Thu, 21 Apr 2011 03:07:29 -0500, Jason Grout wrote:
> [clip]

> As you can surmise from the discussion you linked to, there is resistance
> in adding new community-oriented features to PyPi itself, as some people
> feel that such features are out-of-scope for it. Doing it externally also
> makes sense from the usability and branding point of view --- a site
> called "Python in science" with filtered package selection can be more
> convincing and convenient to navigate than browsing "Topic :: Scientific/
> Engineering" on PyPi.
>
> The discussion on rating systems there is a useful read --- it's why I
> left out any star-based rating systems so far. Just adding "I use this"
> popularity measure probably works around most issues. (The PyPi download
> data is not a very reliable measure, as many of the bigger packages host
> their files externally.)

Sorry to add a comment here.

The thread on PyPi rating system was dominated by a few developers of
big or famous packages. Ratings for Django or numpy or scipy might not
be useful, but ratings (with required comments) are very useful for
the huge number of smaller packages and will be for "snippets".

matlab fileexchange is a good example for this.

Josef

Robert Kern

Apr 21, 2011, 10:31:31 AM
to SciPy Users List
On Thu, Apr 21, 2011 at 08:46, <josef...@gmail.com> wrote:
> On Thu, Apr 21, 2011 at 8:37 AM, Pauli Virtanen <p...@iki.fi> wrote:
>> Thu, 21 Apr 2011 03:07:29 -0500, Jason Grout wrote:
>> [clip]
>
>> As you can surmise from the discussion you linked to, there is resistance
>> in adding new community-oriented features to PyPi itself, as some people
>> feel that such features are out-of-scope for it. Doing it externally also
>> makes sense from the usability and branding point of view --- a site
>> called "Python in science" with filtered package selection can be more
>> convincing and convenient to navigate than browsing "Topic :: Scientific/
>> Engineering" on PyPi.
>>
>> The discussion on rating systems there is a useful read --- it's why I
>> left out any star-based rating systems so far. Just adding "I use this"
>> popularity measure probably works around most issues. (The PyPi download
>> data is not a very reliable measure, as many of the bigger packages host
>> their files externally.)
>
> Sorry to add a comment here.
>
> The thread on PyPi rating system was dominated by a few developers of
> big or famous packages. Ratings for Django or numpy or scipy might not
> be useful, but ratings (with required comments) are very useful for
> the huge number of smaller packages and will be for "snippets".
>
> matlab fileexchange is a good example for this.

Actually, there were a number of us who objected to the ratings as
*users* of PyPI. I find them abhorrent and worse than useless. An "I
use this" button works well enough, I think. Comments can be helpful
if done properly (and doing it properly is a *lot* more involved than
most people think). Provide comparison grids like those on Django
Packages to help people compare features of different packages. Make
it trivial to see the code, and most people will be able to come to
their own, much better judgements. Ratings and comments are the best
you can do for catalogs of items that you need to pay for to examine
(e.g. Amazon), but for catalogs of open source software, you can do a
lot better.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

william ratcliff

Apr 21, 2011, 10:37:20 AM
to SciPy Users List
For code snippets, I think comments are useful--ala stack overflow. 


William

josef...@gmail.com

Apr 22, 2011, 5:22:43 PM
to SciPy Users List

Sturla Molden

Apr 22, 2011, 5:40:55 PM
to scipy...@scipy.org, Core developer mailing list of the Cython compiler

This is indeed something I miss for scientific Python as well.

I also miss a similar file exchange for Cython (not limited to
scientific computing).


Here is another site for comparison (yes I know about the SciPy cookbook):

http://code.activestate.com/recipes/

Sturla
