|Scientific Python Packaging||Andy Terrel||10/19/12 9:10 AM|
It looks like our Doodle poll came in with several options, but
Anthony and I like Tuesday at 11 AM CDT (GMT -5).
My preference for the call would be to use G+ hangouts, so if you are
interested, friend me now.
The topic is somewhat fluid but here are the high points.
1) What do we want to be able to package?
-- MPI-dependent libraries (which means compiler-dependent).
-- Portable setup on various machines
-- Benchmarks and tests to show things are actually working.
-- A large set of complete libraries
2) My focus is on pyHPC tools, but what do we need from the community?
-- A better disttools
-- Funding for products
-- User stories for getting vendor buy-in
3) Anything else?
If you are coming please reply to this thread and introduce yourself
so we can skip that in the call.
|Re: Scientific Python Packaging||Andy Terrel||10/19/12 9:15 AM|
To start off the introductions: my name is Andy Terrel. I primarily work
on CFD codes and run Python on TACC's supercomputers.
|Re: Scientific Python Packaging||David||10/19/12 9:45 AM|
I'd like to join the discussion as well. My name is David Cournapeau,
and I have worked on NumPy/SciPy for a couple of years now. One of the
things I have been working on is build/packaging issues around
NumPy/SciPy, and I am currently working on Bento, a sane packaging
solution for Python (https://github.com/cournape/Bento).
|Re: Scientific Python Packaging||ilan||10/19/12 9:57 AM|
I'd also like to join in. Tuesday (Oct 23rd) at 11 AM CDT sounds good.
I used to maintain EPD for many years and am aware of many of the
pain points in Python packaging.
|Re: Scientific Python Packaging||Aron Ahmadia||10/19/12 10:43 AM|
Thanks for hosting, I will join as well and volunteer for taking minutes unless anybody has any objections.
I was formerly a staff scientist at the Supercomputing Laboratory at KAUST, where I was responsible for the Python stack on our BlueGene/P supercomputer. I would also like to invite Samuel John, the maintainer of a good chunk of the Python stack on OS X, to the call. I don't have an email address for him; can somebody with more Twitter experience than me ping him @samueljohn_de?
Should I set up a Calendar invite so there's no time zone confusion?
|Re: Scientific Python Packaging||Andy Terrel||10/19/12 10:48 AM|
On Fri, Oct 19, 2012 at 12:43 PM, Aron Ahmadia <ar...@ahmadia.net> wrote:
I'll do that.
|Re: Scientific Python Packaging||Yung-Yu Chen||10/19/12 4:21 PM|
My name is Yung-Yu Chen. Thank you very much for hosting this discussion. I hope I can join it and learn more about packaging from you. I am currently working in the semiconductor industry on electromagnetic waves, lithography, and some other stuff. My contribution to open source projects is still small, except for my CESE code written over the past few years (http://solvcon.net/).
As I mentioned in the previous thread, my interest is mainly in deploying scientific packages. But I would also like to know how I can start contributing to Python packaging.
|Re: Scientific Python Packaging||Travis||10/19/12 8:41 PM|
I will be traveling, but Ilan will phone in from Continuum. He is traveling this weekend and so may not be able to respond.
|Re: Scientific Python Packaging||Chris Kees||10/20/12 10:50 AM|
Hello. My name is Chris Kees, and I'm a research hydraulic engineer at the Coastal and Hydraulics Laboratory (US Army Corps of Engineers). I'm one of the developers of Proteus, which is a Python toolkit for computational methods and simulation, mainly focused on solving PDEs describing coastal and hydraulic processes.
We develop for platform independence but also specifically for DoD HPCMP machines. I have some experience running on the TACC machines that Andy mentioned and many other Linux and Mac OS X platforms. Proteus builds its own stack out of necessity (many actively developed dependencies and non-Python libraries). The way we do it is not maintainable, so I have been one of the instigators of the hashdist tools development.
I will add that I'm really looking forward to hearing about improvements in the packaging tools and hope that we can all work together on a community solution to the stack problem. In particular, I hope we can do something for the Windows platform, which the large majority of US government employees are locked into and which I know next to nothing about.
|Re: Scientific Python Packaging||Aron Ahmadia||10/21/12 1:31 PM|
Just a heads-up, we have 11 confirmed attendees already on the G+ Hangout, and 7 more tentatives. I have not yet hosted a G+ Hangout of this size, but I will be on a high-speed network at Oxford University, so hopefully there will not be any problems with me acting as technical host. If you have not done a G+ Hangout before and would like to test your connection, I will try to host a "pre-hangout" in the 30 minutes prior for those who have time to call in and decide on their Hangout Hat. I am going to suggest that Andy act as moderator, and I will play Sergeant at Arms since I am hosting the call.
Andy, can you set the draft agenda? Now would be a good time for other people on this thread to suggest anything they would like discussed specifically.
If you would like to do some homework for the call, you could dig through the numpy and scipy mailing list archives discussing packaging, read the recent thread on numfocus (https://groups.google.com/forum/#!topic/numfocus/4dF5z0g-Bj8) discussing this, and also see the community blog Dag put together last year: (http://fixingscientificsoftwaredistribution.blogspot.co.uk)
|Re: Scientific Python Packaging||Matthew Turk||10/21/12 1:56 PM|
My name is Matthew Turk, and I'm an astronomer working on an analysis
and visualization package (yt; http://yt-project.org/) for the
output of simulation codes. My work has focused on enabling
comparison of astrophysical results independent of the simulation
platform that generates them; this includes things like IO, data
structures, units, and so on. Additionally, on top of this level of
base support for different codes, we have endeavored to build not just
an astrophysical analysis toolkit, but a generic volumetric analysis
toolkit that can be applied to astronomy.
yt has been deployed on platforms ranging from netbooks, laptops,
local clusters, etc to supercomputer facilities in the US-based NSF
and DOE systems as well as European supercomputer centers. In the
past, yt has primarily been distributed to these systems using an
install script; this builds a (virtualenv-based) isolated environment
of libraries that cover the majority of dependencies. We only
reluctantly took this route, as it imposes maintenance overhead on
our part, but for the most part we have found it to increase the size
of our community and the ease of installation. However, I'd like to
move away from this in the long run, and to that end we've begun
efforts to provide PPAs, MacPorts, and easier pip-installability.
We've also worked with consultants at several supercomputer centers to
build yt modules, which has been quite successful for us.
As noted by many others in this thread, it seems to me that the best
way to address this in the long term is to build a coherent set of
packages and to standardize on that. For me, this is not just about
enabling people to use yt, but opening up to them the ability to use
yt in conjunction with other awesome Python packages out there, and
reducing the overhead of this process. One issue I have been
particularly interested in is creating a statically linked Python
library that includes all dependencies -- such as matplotlib (which
was particularly difficult as it is C++), NumPy, h5py, etc. In the
past we have done this for Crays running Compute Node Linux, but the
overhead of modifying the libraries linked in, upgrading, etc. was
simply too high. We explored using the CMake-ified Python build
system provided by Kitware (which in principle should make statically
linking Python libraries much easier) but were unable to make this work.
Looking forward to the hangout,
|Re: Scientific Python Packaging||Pat Marion||10/21/12 8:37 PM|
Along the lines of what Matt described, I recently created an example project that compiles a hello world program with Python and NumPy built in. It uses static linking to embed C extension modules, and it uses the Python freeze tool to embed all other .py files. At runtime, the program can use NumPy and the Python standard library without reading any files from disk whatsoever. I posted the example project on GitHub, and there is additional information in the README.
I'm a contributor to the ParaView CoProcessor, a library that provides in-situ filtering and visualization that can be scripted with Python, and I've used this combination of static linking and Python freeze to completely eliminate the startup cost that is normally associated with using Python at scale on supercomputers. I think that supercomputing, and other domains like mobile computing, can benefit from the availability of scientific Python packages that do not require dynamic library loading, or any other filesystem I/O at runtime.
p.s. Matt, I have used the CMake-ified Python build system to cross-compile and statically link on Cray and BlueGene, but it doesn't produce an install tree capable of driving NumPy builds via distutils. I think that an improved version of the CMake build system for Python could be a useful thing to have available.
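To make the built-in versus frozen distinction above concrete, here is a minimal sketch (not taken from the example project) of how one might ask a Python 3 interpreter where a module would be loaded from. The helper name `module_origin` is hypothetical; it relies only on the standard `importlib.util.find_spec` call. In an ordinary interpreter most modules report "file", whereas in a statically linked, frozen build like the one described one would expect "built-in" or "frozen" instead.

```python
import importlib.util


def module_origin(name):
    """Classify where the import system would load a top-level module from.

    Hypothetical helper for illustration only. Returns "built-in" for
    modules compiled into the interpreter, "frozen" for modules embedded
    as frozen bytecode, "file" for anything else the import system can
    find (typically on the filesystem), and "missing" if nothing is found.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "missing"
    if spec.origin in ("built-in", "frozen"):
        return spec.origin
    return "file"


if __name__ == "__main__":
    # Which label os and json get depends on the Python build in use.
    for name in ("sys", "os", "json"):
        print(name, module_origin(name))
```

Running a check like this over the modules an application imports is one way to spot anything that would still touch the filesystem at startup.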
|Re: Scientific Python Packaging||Anthony Scopatz||10/21/12 8:59 PM|
Call me Anthony Scopatz. I am a postdoc at the FLASH Center at the University of Chicago. The only two constants in my career have been Python and Linux. I am also the NumFOCUS treasurer. So it goes.
|Re: Scientific Python Packaging||Dag Sverre Seljebotn||10/22/12 2:02 AM|
I'm Dag Sverre Seljebotn, a PhD student in astrophysics (cosmology) at
the University of Oslo. I got involved in scientific Python by being a
developer on Cython. If all goes well with the funding, I'll take two
months off my PhD soonish to work with Ondrej Certik and Chris Kees on
hashdist and related tools.
My own use cases are in the "mid-HPC" camp -- we run on tens rather than
hundreds of nodes, but since the computing time we use is on central
clusters targeted to large HPC users this means I'm basically on the
"HPC stack" with its problems:
a) Must recompile for different MPI versions, Fortran compilers etc.
b) More importantly I believe, it puts me in the "Debian packages
don't work for me" group.
I agree with Yung-Yu Chen, I'm very much in the "better tools are
needed" camp and care very little about the "standardized set of
packages". There are a lot of "sets of packages" around already in the
plethora of scientific Python distributions, yet several new such
distributions are created from scratch each year, which I think is an
indication that current packaging practices don't work as well as they should.
|Re: Scientific Python Packaging||Thomas Kluyver||10/22/12 3:17 AM|
I hope to be able to join, but unfortunately my home internet connection is terribly slow, and I can't really take a conference call in the open plan office. So don't wait for me if I'm not there.
My background: I'm a biology PhD student, and an IPython developer. Recently, I've been pushing the idea that we should make a more unified face for the SciPy ecosystem, rather than being separate parts that we rely on distributions to integrate. This has come up several times before, but it's never got very far. As part of this, I'd love to see a better solution for distributing and installing packages.
|Re: Scientific Python Packaging||Wolfgang Kerzendorf||10/22/12 5:54 AM|
My name is Wolfgang Kerzendorf and I'm an astrophysicist working at the University of Toronto. I'm one of the developers of astropy (www.astropy.org) - a new collaboration that is building a data reduction and analysis tool for astronomy. We're interested to see what the community is up to.
|Re: Scientific Python Packaging||Andy Terrel||10/22/12 6:17 AM|
On Sun, Oct 21, 2012 at 3:31 PM, Aron Ahmadia <ar...@ahmadia.net> wrote:
I don't think G+ cares about the host too much, as I've been dropped
from hangouts and the other participants saw no interruptions.
G+ hangouts can only host 19 people. We should find a more traditional
conference line as a backup. Another solution is Skype, which would
be nicer for our European chatters, but then we need to manage all the
Skype users there. I have used FreeConferenceCall.com with some
success in the past, but then it's an international call (for Europeans).
Since we have so many people, we might hold a couple of calls if
some aren't able to get on.
See the first message in this thread.
A few other things to look at:
|Re: Scientific Python Packaging||Aron Ahmadia||10/22/12 6:33 AM|
I just looked it up and it's a hard limit of 10 users. Perhaps we can
split into the 10 who responded first in this thread as "panelists" of
the hangout, then everybody else can join in as "On Air" observers
(and send comments/questions ahead of time via email or live via
Google Chat to myself or Andy). I don't want to change communication
strategies at the last minute. Here's the list of 10 "panelists" I
If we do the G+ Hangout as "On Air", then it's recorded automatically
and an unlimited number of observers can join, so I suggest we do it
|Re: Scientific Python Packaging||Yaroslav Halchenko||10/22/12 7:00 AM|
NB Bloody google seems to keep swallowing my emails to the group so re-posting via web UI
To present myself to new members of the list and to precondition them
toward my possible future comments: I am a postdoctoral researcher with
a PhD in computer science now working in the neuroimaging field. I am
also a Debian developer.
Having been confronted with a bloom of open-source solutions in
neuroimaging and the associated difficulties with their deployment,
Michael Hanke and I started the NeuroDebian project
(http://neuro.debian.net). It aims to make the life of neuroscientists easier
through the integration of neuroscience FOSS within the largest complete
software distribution -- Debian. With that, as leading developers of
PyMVPA (http://www.pymvpa.org), we also "natively" got ourselves (and
our users) a resolution for the deployment of Python-based FOSS. As
many of you might be aware already, nowadays installation and
maintenance of any software "integrated" in Debian (and thus its
derivatives) is very easy pretty much regardless of its deployment
If you are interested to know more about our "approach" and benefits of
going on the train instead of running after it, see our recent paper,
"Open is not enough. Let's take the next step: an integrated,
community-driven computing platform for neuroscience".
Yaroslav O. Halchenko
Postdoctoral Fellow, Department of Psychological and Brain Sciences
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419
|Re: Scientific Python Packaging||Anthony Scopatz||10/22/12 7:19 AM|
While I understand the limitations of the technology, I think that this is
less than ideal. Therefore I suggest that we have a back channel chat
where everyone who isn't a "panelist" can type and contribute. Then,
we deputize one of the panelists to watch the chat and read off relevant
comments. I would be happy to do this if no one else wants to.
BTW, does anyone know if the on-air participants can type in the normal
hangout's chat box, or will this have to be a separate link, google doc, etc.
|Re: Scientific Python Packaging||Aron Ahmadia||10/22/12 7:25 AM|
Agreed, thanks for volunteering to determine what chat/collaboration
software we're going to use, hosting (or letting me know what I need
to do to enable it from the Hangout I'm hosting tomorrow), and
ensuring that non-panelists get their say in :)
|Re: Scientific Python Packaging||Anthony Scopatz||10/22/12 9:11 AM|
So what I think we will do is the following: If any viewer can use the hangout chat
box, we should just do that. If not, then we should invite people to the google doc
of the agenda and use the chat box there. If that fails we can simply use something
like http://zippychat.com/
|Re: Scientific Python Packaging||Travis||10/22/12 9:14 AM|
Ilan could also host a GoToMeeting if that would help. I believe we can have 25 people on the call.
|Re: Scientific Python Packaging||Dag Sverre Seljebotn||10/22/12 9:27 AM|
Do you mean the solution at gotomeeting.com? From the FAQ: "Currently
Linux operating systems are not supported by GoToMeeting, either to host
or join a meeting". That pretty much rules it out I'd say.
|Re: Scientific Python Packaging||Andy Terrel||10/22/12 9:33 AM|
G+ works well for 10 participants, but I would rather have a meeting
where all the voices can be heard. GoToMeeting can't handle Linux, so
let's use Skype.
Add me to your contacts: andy_terrel
Then at 11 tomorrow I will start a session and connect everyone who
sends me a Skype ID.
|Re: Scientific Python Packaging||Samuel John||10/22/12 9:54 AM|
Thanks for the invite Aron and Andy!
My name is Samuel John (twitter @samueljohn_de, adn: @samueljohn). I am (still) a PhD student in computer vision and volunteer in Homebrew to maintain and improve Python/NumPy/SciPy and related software on Macs.
Perhaps I can add two cents about compilers, linking, distutils and site-packages on Macs and where to be careful.
Looking forward to seeing you on Skype then (if it can handle the number)!
|Re: Scientific Python Packaging||Yung-Yu Chen||10/22/12 2:28 PM|
My Skype account: yungyuc
I think hosting any meeting with more than 10 participants can be challenging; it can be even more fun if it's remote. Having a chat room as a secondary channel would be nice, and Skype automatically provides it. But as to the voice quality, I've had pretty bad experiences with 5+ participants (3 years ago). Hopefully they have improved it.
BTW the limit of a Skype conference call is 25: https://support.skype.com/en/faq/FA194/can-i-make-conference-calls-with-skype-for-linux.
I look forward to talking/chatting with you tonight!
|Re: Scientific Python Packaging||moorepants||10/22/12 2:57 PM|
FYI: EVO doesn't have limits on the number of participants in a meeting: https://evo.caltech.edu/evoGate/ (or it is just really high; I can't find the limit).
|Re: Scientific Python Packaging||Anthony Scopatz||10/22/12 3:00 PM|
EVO has never really worked for me, though Skype has. I guess you have had better luck.
|Re: Scientific Python Packaging||Yaroslav Halchenko||10/22/12 3:57 PM|
If someone is eager to set it up, there is a FOSS solution for VoIP conferences: http://mumble.sourceforge.net
The client comes with built-in chat (at least IIRC on Linux) and I definitely saw more than a dozen participants once (during the Debian squeeze release). If only someone on a thick channel could set it up, it might be a good solution. There might be publicly available servers (e.g. http://www.mumble-servers.com/public-mumble-server) but I am not sure how reliable they would be and/or whether there would be anyone lurking there to interrupt. Just in case, I set up a room on that public server and I am there ATM (login yoh, room "Python distribution").
|Re: Scientific Python Packaging||Peter Wang||10/22/12 6:34 PM|
Just to clarify: there is a landline call-in number. You won't be able to see any screen sharing on Linux, but I don't know how much of that people planned on doing. If you want robust voice conferencing for 25 people, I think GoToMeeting is a pretty good bet. There are free conference solutions, but in my quest for solutions last year, most of them turned out to be pretty bad.
Skype does work pretty well (although I have never tried it with 25 people). One of the nice things about something like GoToMeeting is that even if the moderator disconnects, it doesn't drop everyone. That is a limitation of the Skype approach, afaik.
My $.02. I hope you guys have a productive discussion tomorrow!
|Re: Scientific Python Packaging||Andy Terrel||10/23/12 8:44 AM|
Okay, the call is in fifteen minutes. IM me on Skype if you haven't
received a call from me by 11 AM CDT (GMT -5), Chicago time.