The time for the Scipy'09 conference is rapidly approaching, and we
would like to both announce the plan for tutorials and solicit
feedback from everyone on topics of interest.
Broadly speaking, the plan is something along the lines of what we
had last year: one continuous 2-day tutorial aimed at introductory
users, starting from the very basics, and in parallel a set of
'advanced' tutorials, consisting of a series of 2-hour sessions on
specific topics.
We will request that the presenters for the advanced tutorials keep
the 'tutorial' word very much in mind, so that the sessions really
contain hands-on learning work and not simply a 2-hour long slide
presentation. We will thus require that all the tutorials will be
based on tools that the attendees can install at least 2 weeks in
advance on all platforms (no "I released it last night" software).
With that in mind, we'd like feedback from all of you on possible
topics for the advanced tutorials. We have space for 8 slots total,
and here are, in no particular order, some possible topics. At this
point there are no guarantees yet that we can get presentations for
these, but we'd like to establish a first list of preferred topics to
try and secure the presentations as soon as possible.
This is simply a list of candidate topics that various people have
informally suggested so far:
- Mayavi/TVTK
- Advanced topics in matplotlib
- Statistics with Scipy
- The TimeSeries scikit
- Designing scientific interfaces with Traits
- Advanced numpy
- Sparse Linear Algebra with Scipy
- Structured and record arrays in numpy
- Cython
- Sage - general tutorial
- Sage - specific topics, suggestions welcome
- Using GPUs with PyCUDA
- Testing strategies for scientific codes
- Parallel processing and mpi4py
- Graph theory with Networkx
- Design patterns for efficient iterator-based scientific codes.
- Symbolic computing with sympy
We'd like to hear any ideas on other possible topics of interest,
and we'll then run a Doodle poll with the final list of candidates to
gather quantitative feedback.
Many thanks,
f
I hope the above isn't considered too off-topic here: I'm asking for
possible Sage tutorials and obviously this group would be the most
likely to provide presenters :)
Feel free to provide feedback to me off-list to keep the list traffic
on topic to sage-dev.
Cheers,
f
In order to proceed with contacting speakers, we'd now like to get
some feedback from you. This Doodle poll should take no more than a
couple of minutes to fill out (no password or registration required):
http://doodle.com/hb5bea6fivm3b5bk
So please let us know which topics you are most interested in, and
we'll do our best to accommodate everyone. Keep in mind that speaker
availability and balancing out the topics means that the actual
tutorials offered probably won't be exactly the list of top 8 voted
topics, but the feedback will certainly help us steer the decision
process.
Thanks for your time,
Dave Peterson and Fernando Perez
On Mon, Jun 1, 2009 at 10:22 PM, Fernando Perez<fpere...@gmail.com> wrote:
Does "you" mean "people attending scipy09", or does it mean "sage
developers, whether or not you are attending scipy09"?
Thanks,
Jason
Any feedback is welcome, though it seems to make more sense if you're
coming to the conference, since the poll is about picking topics for
the tutorials sessions. But ideas, feedback, suggestions are in
general welcome either via the poll or via list or private email.
Since Sage is listed as a candidate topic, votes on a Sage tutorial
with comments on specific ideas that someone would be willing to
present would be very welcome. In the end, the actual choices of
topics will be determined by what we can get speakers to present, so
that kind of information is obviously very relevant.
Cheers,
f
On Mon, Jun 1, 2009 at 10:20 PM, Fernando Perez<fpere...@gmail.com> wrote:
> The time for the Scipy'09 conference is rapidly approaching, and we
> would like to both announce the plan for tutorials and solicit
> feedback from everyone on topics of interest.
Rather than rehash much here, where it's not easy to paste a table,
I've posted a note with the poll results here:
http://fdoperez.blogspot.com/2009/06/scipy-advanced-tutorials-results.html
The short and plain-text-friendly version is the final topic ranking:
1 Advanced topics in matplotlib use
2 Advanced numpy
3 Designing scientific interfaces with Traits
4 Mayavi/TVTK
5 Cython
6 Symbolic computing with sympy
7 Statistics with Scipy
8 Using GPUs with PyCUDA
9 Testing strategies for scientific codes
10 Parallel computing in Python and mpi4py
11 Sparse Linear Algebra with Scipy
12 Structured and record arrays in numpy
13 Design patterns for efficient iterator-based scientific codes
14 Sage
15 The TimeSeries scikit
16 Hermes: high order Finite Element Methods
17 Graph theory with NetworkX
We're currently contacting speakers, and we'll let you know once a
final list is made with confirmed speakers.
Cheers,
f
I have to add that not only is Sage very low on the above list, Sage
got the *most* "no" votes from the 30 people who actually voted (tying
only with Networkx), according to the table here:
http://fdoperez.blogspot.com/2009/06/scipy-advanced-tutorials-results.html
I don't know if I should interpret this as:
(1) Sage doesn't at all provide what is needed by "the scipy community", or
(2) The scipy community has a strong opinion that in fact sage is
worse than useless.
It might also be relevant that Sage, Hermes, and Networkx (in the
bottom 4) are all GPL'd, but the top 7 packages by interest in the
list above are all non-GPL (BSD or MIT licensed). It may just be
that whoever voted are mostly people who believe they can't use GPL'd
code.
Anyway, I find Fernando's justification for the ranking "the ranking
roughly follows the generality of the tools" to be an unsatisfactory
explanation or summary of the data. Rather, perhaps the ranking
roughly follows the restrictiveness of the *license*.
William
That's very consistent with the license remark I made, since the scipy
community generally tends to avoid GPL'd projects for reasons that make
a lot of sense for them.
-- William
I'm disappointed with the lack of interest in Sage as well.
It matches my experience though. I believe I'm one of the more
Sage-enthusiastic NumPy mailing list participants (because of Cython +
Sage days last year); however, much as I'd like to, I find I have to
give up Sage for my day-to-day work after 10 minutes each time I try,
and always end up back in IPython.
I do have lots of ideas for improving the state in various areas, and it
shouldn't take all that much work either -- but have been very reluctant
to talk about it because I should have time to actually do something
about it -- ideas are cheap, show me your code, and all that (and
Cython's really been taking the time I have to offer).
The fact that the Cython/NumPy support doesn't even work in the notebook
(*that* I might just have to do something about myself soon though), or
that symbolic expressions can't be evaluated on numeric arrays, using
e.g. numexpr (unless added recently?) says a lot about the situation.
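For context, the kind of symbolic-on-numeric evaluation described above is what sympy's lambdify already provides: it compiles a symbolic expression into a function that acts elementwise on NumPy arrays. This is just a sketch of the desired behaviour using sympy directly, not Sage's API:

```python
import numpy as np
import sympy as sp

# Evaluate a symbolic expression on a numeric array, the behaviour
# wished for above; this uses sympy's lambdify, not Sage's own API.
x = sp.symbols('x')
expr = sp.sin(x) * sp.exp(-x)
f = sp.lambdify(x, expr, 'numpy')      # compile to a numpy-vectorized function
values = f(np.linspace(0.0, 1.0, 5))   # acts elementwise on the array
print(values.shape)                    # (5,)
```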
If you're coming to SciPy 09 (I see you're in the committee) then a
Sage+numerics BOF would be very interesting.
> It might also be relevant that Sage, Hermes, and Networkx (in the
> bottom 4) are all GPL'd, but the top 7 packages by interest in the
> list above are all non-GPL (BSD or MIT licensed). It may just be
> that whoever voted are mostly people who believe they can't use GPL'd
> code.
>
> Anyway, I find Fernando's justification for the ranking "the ranking
> roughly follows the generality of the tools" to be an unsatisfactory
> explanation or summary of the data. Rather, perhaps the ranking
> roughly follows the restrictiveness of the *license*.
--
Dag Sverre
On Wed, Jul 1, 2009 at 3:57 AM, William Stein<wst...@gmail.com> wrote:
> I have to add that not only is Sage very low on the above list, Sage
> got the *most* "no" votes from the 30 people who actually voted (tying
> only with Networkx), according to the table here:
>
> http://fdoperez.blogspot.com/2009/06/scipy-advanced-tutorials-results.html
>
> I don't know if I should interpret this as:
>
> (1) Sage doesn't at all provide what is needed by "the scipy community", or
>
> (2) The scipy community has a strong opinion that in fact sage is
> worse than useless.
I have to say that *personally* I was both surprised and disappointed
at the low ranking Sage got, since I was hoping to have a Sage
tutorial again this year. In every talk I give on scientific
computing with Python I always make a point of highlighting Sage; I
use it myself and I think it's a key asset of a larger ecosystem of
open source tools for scientific computing based on Python. But all
I'm doing here is reporting the numbers as they came: the only change
I made to the raw Doodle data was to remove the voters' names and
transpose the table for readability (Doodle returns each topic as a
column, which is annoying to read).
> It might also be relevant that Sage, Hermes, and Networkx (in the
> bottom 4) are all GPL'd, but the top 7 packages by interest in the
> list above are all non-GPL (BSD or MIT licensed). It may just be
> that whoever voted are mostly people who believe they can't use GPL'd
> code.
>
> Anyway, I find Fernando's justification for the ranking "the ranking
> roughly follows the generality of the tools" to be an unsatisfactory
> explanation or summary of the data. Rather, perhaps the ranking
> roughly follows the restrictiveness of the *license*.
That was just a hand-wavy argument, since ultimately I can't know why
people voted the way they did. I should note that Networkx is LGPL,
not GPL; see https://networkx.lanl.gov/trac/browser/networkx/trunk/doc/GNU_LGPL.txt
and https://networkx.lanl.gov/trac/browser/networkx/trunk/networkx/release.py,
which specifically contains
"""Release data for NetworkX."""

# Copyright (C) 2004-2008 by
#    Aric Hagberg <hag...@lanl.gov>
#    Dan Schult <dsc...@colgate.edu>
#    Pieter Swart <sw...@lanl.gov>
# Distributed under the terms of the GNU Lesser General Public License
# http://www.gnu.org/copyleft/lesser.html
So the bottommost project (which as I mentioned, I also love and
wanted to get a tutorial on) is not GPL.
I should definitely have qualified my 'generality' comment by pointing
out that Sage was the outlier in this (weak) pattern: while finite
elements, time series and graph theory are specialized topics, Sage is
a super-broad-spectrum tool, so it definitely doesn't fit that trend.
That point was clear to me and did surprise me very much; I just
failed to mention it last night (I wrote that blog post rather in a
hurry, tired, and was mostly annoyed at blogspot's mangling of the
table html).
I sort of doubt that most people would make their decisions on what
tools to learn based on licenses, or at least I hope that's the case.
But perhaps I'm wrong on that and just naive...
In any case, all I can do is report back the results as they came. In
this case, I'm just the messenger :)
Cheers,
f
To be precise: amongst open source tools. I do use licenses as a
criterion: if choosing between a proprietary and an open source tool I
lean towards the open one whenever feasible. But otherwise I'm happy
to use, and contribute to when I can, anything that's open source in
any license. But that's just me :)
Cheers,
f
Perhaps I'm missing the point, but I'm taking this as a message to
focus, in Sage, more on the algebraic/symbolic side of mathematics
(e.g., Magma, Maple, Mathematica) rather than the numerical side, at
least for the time being. I don't have a problem with that
personally, since that is what I do best, and where most of my
personal interests are.
My impression is that Enthought is overall the leader in the
effort to create and distribute scientific computing tools using
Python. The founders of the company have a clear passion and love
for this, and seem from the outside at least to have simultaneously
done well for their clients and for their developer and user base,
while walking the tightrope of commercial versus open source. Part of
that balance has been, for the most part, drawing a line and *not*
having GPL or LGPL code in the core of their codebase. I do not in any
way think that is "morally wrong" (I obviously prefer it to the
situation with my Microsoft neighbors). However, since Sage is a GPL'd
project, this has the natural corollary that almost no two-way
technical interaction is possible between the two projects. As a
result, the Sage project and
the Enthought/Python stack tend to compete for users rather than share
them, since they really are two different platforms (at least at some
layers, especially the GUI/graphics layers and distribution system).
I think it's roughly reasonable to call the top 7 most popular topics
in your tutorial list basically "the Enthought scientific computing
stack". The bottom four are (L)GPL'd; one is Sage and another is in
Sage.
The best conclusion I can draw from all this is that for now at least
I'm going to focus on symbolic/algebraic computation, and let
Enthought continue to do a great job building the Python numerical
stack. If at some point users in the numerical Python community
really want what Sage has to offer, maybe they will do the extra work
to make Sage work for them. If not, they still have a great
Sage-compatible platform on which to build their work. No matter
what happens users win.
Perhaps "numerical Python people" are the right people to make Sage
very numerically Python friendly. The vast majority of Sage developers
are not "numerical Python people", and so maybe we have no clue what
should be done or how to make Sage what you guys want. I know very
well what number theory researcher mathematicians need out of Sage,
and I can't imagine that say Dag knows what number theory research
mathematicians need, nor should he, and even if I explained it in
detail, I wouldn't expect him to do the work of implementing it.
The remaining people -- like Brian Granger, Ondrej Certik, etc. --
are clearly already doing what numerical folks want wrt Sage, which is
to remove almost everything in Sage that is of interest to 95% of Sage
users/developers (groups, rings, fields, matrices, 2d and 3d plotting,
etc.), and making a distribution (SPD) that satisfies precisely their
needs.
I think I'm not uncomfortable with any of the above, unless of course
I'm totally wrong, in which case I would like to know why.
-- William
Personally, I find the combination of sage's notebook interface + numpy
useful, and I hope that eventually sage and numpy will work better
together. I am keeping an eye on SPD, and may eventually switch to it
instead of sage.
I recognize that there are limited developer resources, and it is better
to restrict the target feature set and do a few things very well than it
is to do a wider range of things poorly. In the longer term, though,
I would welcome improved integration of sage and numpy/scipy. I don't
currently have the skills nor the time to write code, but I will be
happy to act as a tester.
--
Kevin Horton
This thread is shaping up to be a pretty good todo list of specific
doable steps one could take to improve the numpy/scipy user experience
in Sage. Keep 'em coming! Next time some undergrad with strong
numerical interests wants to work for me on Sage for a quarter, I
could easily give them exactly this list, tell them that doing
everything on it might "save the world", and see what happens.
William
I think you are all drawing too many conclusions from the poll -- it's
just 30 people, so that's nothing and I definitely do not think that
this is the whole scipy community (if it were 200 people, I would take
it seriously). I can see that William is disappointed, and I am half
--- sympy made it through, but finite elements (which is the project I
am actually paid for) didn't and actually ended up just before
NetworkX -- but I need to tease Aric here that even FEM ended above
him. :)
Besides, I think the poll was for a tutorial for this particular
conference --- I forgot how I voted for Sage, but I remember that I
picked those tutorials from which I might actually learn something I
need for my work, e.g. Traits or Cython.
As to Sage + numerics --- I want that to work and I am actually trying
to get something done in this area. For example, over the last couple
of weeks I have already spent *dozens* of hours (yes, it's a lot, but I
think it's just that I am not particularly good at these things)
getting various packages to compile in SPD, and any Sage user can
immediately benefit from that, since you just install the spkg inside
Sage.
Besides that, we want our finite element software to be working well
with Sage too. Here is my latest progress:
http://code.google.com/p/femhub/
I made 4 releases in 5 days. For Sage users, I suggest just using
Sage. SPD is a base for projects like femhub, where people need to
create a Sage-like project, but with our own packages. Sage itself
takes too long to compile and, as William pointed out, we don't need
95% of the packages there --- also I *need* the thing to build on our
clusters --- and so far Sage can't do it, so I decided to concentrate
on packages that I really need for my work and make sure at least
those compile.
Also, we want our own branding, e.g. call it femhub, so I will soon do
some changes to the notebook spkg.
In the long run, I expect that Sage will grow and it will include our
improvements and packages and distribute everything at once (it will
be couple hundreds MB bigger though). And it will contain both
symbolic and algebraic/number theoretic stuff and numerics stuff like
finite elements + related visualisation tools like mayavi...
And I also expect that people will continue branding Sage for their
needs; e.g. in our case we'll just call it femhub and that will be our
own thing that we do. And since under the hood it's just Sage and it's
compatible with Sage, Sage users can just continue using Sage (+ our
packages) and not care.
As to the license, that is not an obstacle for me (I am talking about
the license for the whole thing), but it is for Enthought, and I will
definitely talk with them about what can be done so that they can use
the same model as Sage. Honestly, I think that if the Sage scripts can be
made BSD, or rewritten from scratch to be BSD (and I have looked at
that and I think it's definitely doable), but keeping it *compatible*
with Sage (that's the crucial part), then Enthought and other people
can just take BSD spkg packages and release what they want with it, or
sell it or do whatever they want with it, and Sage itself (that
contains lots of GPL packages and also lots of new code that is also
GPL) will just continue being GPL.
Honestly I think the build infrastructure should be BSD, but I know
that some people are against it. Rewriting it from scratch is doable
but very low on my priority list, because I also use GPL packages (and
actually most of my colleagues in our group are against BSD too,
ironically:)), so femhub has to be GPL anyway. But getting Enthought
involved in this is imho a good thing, so I will definitely talk with
them about it at scipy.
Ondrej
I think something important is missing from the picture:
NumPy/SciPy isn't exactly a majority player either! In large parts of
science and engineering the big M's (mostly MATLAB), Fortran and to some
extent C++ are the only tools people have even heard of. (In my department
few have even heard about Python.)
Looking ahead, it might be that Mathematica, not any form of Python, is
what supersedes MATLAB (according to one source of opinion -- I don't
know much about this myself).
Now SciPy, EPD, SPD etc. are great for people who know programming, and who
want a better mix of software engineering and numerics/science packages.
But, I don't see them ever becoming the simple, unified mathematical
package which engineers could learn as their first tool in college. (And
where 1/10 is by default something decent, yet numerics easily
available...)
I see in Sage (proper, not SPD!) the hope of something I really, really
want, and which I think SciPy/Enthought/SPD isn't even trying to do.
Obviously, the SciPy conference people are the selection of people who
want what the SciPy stack does, though.
The prime audience of a hypothetical numerics-boosted Sage are all of
those who are likely unaware of the existence of Python in the first
place, and those obviously haven't voted here (many of them don't even
have the software skills to attend SciPy 09).
All I can do though is ask you not to close the door for numerics if and
when somebody steps up to lead the charge.
Dag Sverre
At this point we may have drawn more than 30 conclusions :-)
> As to Sage + numerics --- I want that to work and I am actually trying
> to get something done in this area. For example last couple weeks I
> already spent *dozens* of hours (yes, it's a lot, but I think it's
> just that I am not particularly good at these things) gettting various
> packages to compile in SPD, and any Sage users can immediately benefit
> from that, since you just install that spkg inside Sage.
Let me know if any of those should be included in the sage optional
package repo. It's easy for me to add spkg's there if they work well.
This is worth further discussion. The spkg "format" is very
straightforward/simple. The actual build scripts have had many little
changes by a lot of people, so making them BSD'd is probably
impossible, even if I wanted to. For example, at least one person who
worked with the build scripts a lot is totally MIA.
The design is I think pretty straightforward, and I like the idea of
doing a rewrite that is BSD'd, and which works the same as the current
scripts, with exactly the same spkg files. We could do this in
Python, which might make the whole thing simpler and easier to
maintain, and more portable. (Every standard Linux/OS X install has
some python installed systemwide, and we could just make sure the
scripts will work with that.)
Anyway, +1 to there being a BSD'd build system. Most code in Sage
is GPL'd because either (1) it is derived from code GPL'd a decade
ago, or (2) we'll get ripped off by the Ma's. The build system
doesn't fall into either category.
-- William
I think you're absolutely 100% right. I received other email offlist
from people pointing out exactly the same point. Many thanks for
the above clarification. I indeed did completely miss the point.
OK, any volunteers to lead the charge? :-)
-- William
Exactly, that's what I thought too. Also +1 about the python build
system (but bash/sh is ok too).
On Wed, Jul 1, 2009 at 4:22 PM, William Stein<wst...@gmail.com> wrote:
>
> On Thu, Jul 2, 2009 at 12:15 AM, Dag Sverre
> Seljebotn<da...@student.matnat.uio.no> wrote:
[...]
>> I think something important is missing from the picture:
>>
>> NumPy/SciPy isn't exactly a majority player either! In large parts of
>> science and engineering the big M's (mostly MATLAB), Fortran and to some
>> extent C++ are the only tools people have even heard of. (In my department
>> few have even heard about Python.)
>>
>> Looking ahead, it might be that Mathematica is what is likely to supersede
>> MATLAB, not any form of Python (according to one source of opinion -- I
>> don't know much about this myself).
>>
>> Now SciPy, EPD, SPD etc. is great for people who know programming, and who
>> want a better mix of software engineering and numerics/science packages.
>> But, I don't see them ever becoming the simple, unified mathematical
>> package which engineers could learn as their first tool in college. (And
>> where 1/10 is by default something decent, yet numerics easily
>> available...)
>>
>> I see in Sage (proper, not SPD!) the hope of something I really, really
>> want, and which I think SciPy/Enthought/SPD isn't even trying to do.
In fact, we are trying to do exactly this with femhub, for finite
element calculations: to be the thing that engineers who have never
heard of Python (or never used it) can easily use.
SPD is just the first step that can get us started to customize Sage.
>> Obviously, the SciPy conference people are the selection of people who
>> wants what the SciPy stack does though.
>>
>> The prime audience of a hypothetical numerics-boosted Sage are all of
>> those who are likely unaware of the existance of Python in the first
>> place, and those obviously haven't voted here (many of them don't even
>> have the software skills to attend SciPy 09).
>>
>> All I can do though is ask you not to close the door for numerics if and
>> when somebody steps up to lead the charge.
>>
>> Dag Sverre
>>
>
> I think you're absolutely 100% right. I received other email offlist
> from people pointing out exactly the same point. Many thanks for
> the above clarification. I indeed did completely miss the point.
>
> OK, any volunteers to lead the charge? :-)
You of course. :)
Ondrej
:-) OK everybody, let's charge!!!!!!!!!!!!!!
http://www.americanrhetoric.com/MovieSpeeches/specialengagements/moviespeechbraveheart.html
-- William
I'm joining this conversation late, so I am glad to see the
conclusions reached so far (not to give up on numerics!).
If I may highlight a distinction (maybe obvious to some) between SAGE
and NumPy-based experiments:
Sage provides a "language" for eloquently expressing
algebraic/symbolic problems. On the other hand, NumPy is mainly a
library (that provides a data structure with accompanying operations).
This means that users of that library expect to run their code
unmodified on any Python platform where it is available (Sage
included). Whether this expectation is reasonable or not is up for
debate, but I certainly found it surprising that I had to modify my
code in order to compute things in Sage.
On a more practical level, it frightens me that Maxima spawns so
easily without my even knowing, simply by referring to a certain
variable or by using the wrong "exp". That's the kind of thing that
kills numerics performance!
I'm not convinced that commercial packages have crossed this barrier
successfully either. I've seen talks where people switch between
Maple and MATLAB to do different tasks, which tells me that this is a
more general problem that is far from solved.
Stéfan
Either that, or you click on the "python" switch at the top of the
notebook or type "sage -ipython", or from within Sage you type
"preparser(False)".
> On a more practical level, it frightens me that Maxima spawns so
> easily without my even knowing, simply by refering to a certain
> variable or by using the wrong "exp".
FYI, that is no longer the case. In Sage-4.0, we replaced Maxima by
the C++ library Ginac (http://www.ginac.de/) for all basic symbolic
manipulation.
> That's the kind of thing that kills numerics performance!
There is often a tension between numerics performance and correct
answers. The following is in MATLAB:
>> format rat;
>> a = [-101, 208, 105; 76, -187, 76]
>> rref(a)
ans =
1 0 -2567/223
0 1 -3839/755
The same echelon form in Sage:
a = matrix(QQ, 2, [-101, 208, 105, 76, -187, 76])
a.echelon_form()
[ 1 0 -35443/3079]
[ 0 1 -15656/3079]
Try the same computation on larger matrices, and one sees that
matlab is way faster than Sage. But of course the answers are
nonsense... to anybody not doing numerics. To a numerical person they
mean something, because matlab is really just doing everything with
floats, and "format rat" just makes them print as rational
approximations to those floats.
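The effect of "format rat" can be mimicked with Python's standard library, which makes the approximation explicit (just an illustration of the idea, not what MATLAB does internally):

```python
from fractions import Fraction

# "format rat" prints the nearest "nice" rational to each float; Python's
# fractions module does the same kind of approximation explicitly.
x = 0.1 + 0.2                               # really 0.30000000000000004
print(Fraction(x))                          # the exact (ugly) binary fraction
print(Fraction(x).limit_denominator(1000))  # 3/10 -- tidy but approximate
```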
So indeed, mixing numerics with mathematics is a very difficult
problem, and nobody really seems to have solved it to everybody's
satisfaction.
-- William
I think people need both approaches, but why can't you just pass an
option to echelon_form() to use fast floating point numbers (besides
nobody having implemented it yet)? Then we can have both.
Ondrej
Because it is pretty easy to do:
A.change_ring(RR).echelon_form()
which also allows things like
A.change_ring(RealField(200)).echelon_form()
for extended precision.
Is this not sufficient?
Jason
If it's as fast as numpy, then I think it's sufficient.
Ondrej
Numpy does not do rref because it has limited utility for approximate
numeric matrices. See this thread:
http://www.mail-archive.com/numpy-di...@scipy.org/msg13880.html
If you want to have Sage apply the generic algorithm (*not* using
partial pivoting!) to a numpy matrix, you can do
A.change_ring(RDF).echelon_form() (this actually uses numpy arrays
behind the scenes). As pointed out in the thread noted above, this may
just end up being nonsense (as it is with Matlab in the above example!).
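The fragility the thread describes is easy to demonstrate with numpy alone (a sketch; the exact tolerance matrix_rank uses is version-dependent):

```python
import numpy as np

# Why rref is of limited use for approximate matrices: the rank, and hence
# the echelon form, is discontinuous under tiny perturbations.
a = np.array([[1.0, 2.0],
              [2.0, 4.0]])       # exactly rank 1
b = a.copy()
b[1, 1] += 1e-9                  # a numerically negligible change
print(np.linalg.matrix_rank(a))  # 1
print(np.linalg.matrix_rank(b))  # 2 -- the echelon form changes completely
```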
I think the point here is that Matlab obscures the truth, and it can't
do any better. In Sage, if it looks like your matrices contain
fractions, then they really do have exact fractions. If your matrix
actually contains approximate floating point numbers, then Sage doesn't
lie to you and pretend it has nice exact-looking fractions. This makes
for some interesting class discussions in linear algebra, especially
when you have lots of engineering students :).
Jason
Ok, I didn't realize this example is nonsense for floating point
numbers, as explained in the numpy thread. I agree that matlab can't
do better with this.
Ondrej
I think you've hit the nail on the head here. These were my first
thoughts (other than disappointment) when I saw how low Sage got
ranked in the list. I think Sage has a huge amount to offer by
bringing the power of the SciPy stack together with the ease of use
of the notebook (interact, visualization, etc.). Of course there's
still a lot of work to do here...
Currently the way to do numerics in Sage is to import scipy and numpy
(because they really have created a good stack), and turn off
preparsing (because those type issues get really annoying). At this
point, it may become unclear why one is using Sage instead of the
SciPy stack itself. The fact that most Sage examples and discussions
revolve around "esoteric" things like Rings and Categories and
Modular Forms just promotes this view.
Dag, if you put together a list of why Sage isn't good enough for
you, that would be very interesting. I wonder if many things on that
list would be desirable for non-numerical use as well.
- Robert
It's dangerous to ask this of me -- it's roughly in order of priority, so
feel free to stop when you've got what you wanted :-)
(Any of this might have been changed recently, I haven't upgraded for a
while to be honest).
The minimum to make me start using Sage:
1) I must be able to use NumPy together with the preparser (it's just
too much hassle to turn it on and off, and it kind of defeats the
purpose). That is, with the preparser on, I should be able to run most
NumPy-using code without changes. (I don't think this is difficult to
achieve, but I certainly didn't look at it in detail.)
One example is:
sage: np.arange(4)
array([0, 1, 2, 3], dtype=object)
This is not what I want, and I can never remember to pass in
dtype=np.int64. I don't think it makes sense either -- passing in
np.arange(int(4)) gives the desired behaviour, and a Python int and a
Sage Integer are equally far from an np.int64 anyway.
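The same object-dtype fallback can be reproduced in plain numpy with any non-native numeric type (here fractions.Fraction stands in for a Sage Integer, which may not behave identically):

```python
import numpy as np
from fractions import Fraction  # stand-in for a non-native type like a Sage Integer

# Elements numpy does not recognize as native numbers trigger the slow
# dtype=object path -- the same thing the preparser's Sage Integers cause.
slow = np.array([Fraction(i) for i in range(4)])
print(slow.dtype)                         # object
fast = np.array([int(x) for x in slow])   # the int(...) workaround
print(fast.dtype)                         # a native integer dtype, e.g. int64
```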
I think that's all, actually -- this would stop Sage from getting in my
way. But this is on my strongly wanted-list too:
2) sin, sqrt etc. should understand, act on, and return NumPy arrays
(probably by calling corresponding functions in numpy)
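Point 2 asks for exactly how numpy's own ufuncs already behave; the wish is that Sage's symbolic sin/sqrt delegate to them when handed an array:

```python
import numpy as np

# numpy ufuncs act elementwise on arrays; the wish above is that Sage's
# symbolic sin and sqrt do the same when given a NumPy array.
a = np.array([0.0, np.pi / 2, np.pi])
print(np.sin(a))                           # elementwise sine
print(np.sqrt(np.array([1.0, 4.0, 9.0])))  # [1. 2. 3.]
```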
3) It would be nice to have better plotting support for NumPy arrays, so
that I don't have to use pylab directly, but I haven't given any thought
to what I want here.
4) Not sure if this can be done in a reasonable way, but I'd like to
not have to use Python ints/floats at all; it's just nicer to know that
I have a Sage element. So ideally
sage: a=np.arange(5, dtype=np.uint8)
sage: a[2]/a[4]
1/2
5) #4571 (cimport numpy in notebook), obviously.
6) One of the things drawing me towards using Sage is the "attach"
feature. However, it is too limited for my use (and so frustrates me, to
the point that it is better not to use it at all). What I want is a
combination of "attach" and a pure Python "import"; that is, an "import"
that automatically calls "reload" on change. (The recent work on
pyximport may come into play here, if the sources are in Cython.)
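A minimal sketch of such an auto-reloading import, keyed on the source
file's mtime (attach_import is a hypothetical name; a real version would
hook into the import machinery rather than being called explicitly):

```python
import importlib
import os
import sys

_mtimes = {}

def attach_import(name):
    # Import the module on first use; on later calls, reload it if its
    # source file has been modified since we last saw it.
    mod = sys.modules.get(name) or importlib.import_module(name)
    src = getattr(mod, "__file__", None)
    if src and os.path.exists(src):
        mtime = os.path.getmtime(src)
        if _mtimes.get(name, mtime) < mtime:
            mod = importlib.reload(mod)
        _mtimes[name] = mtime
    return mod
```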
7) Make NumPy dtypes for all Sage rings. I think that NumPy is quite
extensible in this respect and that it shouldn't be too difficult to
have arrays over Sage rings.
sage: a = np.arange(10, dtype=ZZ.numpy()) # or dtype=ZZ if possible
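Short of a real ring dtype, dtype=object already gives a taste of this
today. A sketch using Python's Fraction as a stand-in for a Sage ring
element (a native dtype would of course store the elements far more
efficiently):

```python
import numpy as np
from fractions import Fraction

# An "array over an exact ring", approximated with dtype=object.
a = np.array([Fraction(1, 3), Fraction(1, 6)], dtype=object)
total = a.sum()    # arithmetic stays exact
doubled = a * 2    # elementwise ops work too
```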
8) For syntax consistency etc., a simple function to create SIMD arrays.
sage: x = array(RDF, 20, 20, 20) # 20x20x20 array
Whether a wrapper or a raw NumPy array is returned, it is important to
have elementwise operators and to allow slices. Whether a slice should
be a view, a copy, or sometimes a view and sometimes a copy (like in
NumPy :-( ) seems debatable here, though.
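For reference, NumPy's rule here is deterministic even if it feels
surprising: basic slices are always views, while fancy (integer-array)
indexing always copies:

```python
import numpy as np

a = np.arange(10)
s = a[2:5]          # basic slice: always a view...
s[0] = 99           # ...so writing through it mutates a
fancy = a[[5, 6]]   # integer-array indexing: always a copy...
fancy[0] = -1       # ...so a is untouched here
```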
9) Long-term, linear algebra in a data processing context Done Right
(According To Myself). I.e.:
sage: parent(img)
Array of shape (100, 100, 3) over uint8
sage: parent(weights)
Array of shape (100, 100, 3) over Real Double Field
Now, componentwise multiply the two the proper way; treating img as a
vector and the reweighting as an operator matrix:
sage: op = diagonal_matrix(weights); parent(op)
Diagonal MatrixSpace of (100, 100, 3) by (100, 100, 3) matrices
over Real Double Field
Access as a six-dimensional array; the first three indices give the row,
the next three the column:
sage: op[1,2,3,4,5,6]
0.5
sage: result = op * vector(img)
sage: result
Vector space of dimension (100, 100, 3) over uint8
(Here, * is the linear algebra *, while result contains the elementwise
product of weights and img).
--
Dag Sverre
Sorry, this would always be 0. op[1,2,3,1,2,3] could be 0.5 though.
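For what it's worth, the equivalence between the diagonal-operator
product and the elementwise product is easy to check in today's NumPy on
a small array (shapes shrunk from 100x100x3 for this sketch):

```python
import numpy as np

img = np.arange(12, dtype=np.float64).reshape(2, 2, 3)
weights = np.full((2, 2, 3), 0.5)

elementwise = weights * img                          # the cheap way
op = np.diag(weights.ravel())                        # explicit diagonal operator
via_matrix = (op @ img.ravel()).reshape(img.shape)   # op * vector(img)

same = np.allclose(elementwise, via_matrix)
```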
I asked on the numpy list a while ago about why numpy was not calling
the conventional .__complex__() to automatically convert sage complex
numbers to python complex numbers. The answer I received indicated that
it would be very difficult for numpy to use the standard python
convention of calling .__complex__() to get a complex representation of
a number. That indicated to me that the problem was probably hard-coded
in numpy!
See
http://thread.gmane.org/gmane.comp.python.numeric.general/25251/focus=25273
"The reason is that PyArray_ISCOMPLEX is used in various places, and
this is a hard-coded macro. There is no way to extend numpy's complex
behavior to support user added types. I wish there was."
However, I haven't looked at it beyond that. So, pay no attention to
the complainers and whiners; just do it!
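The Python-level half of the protocol is straightforward; whether
np.array(..., dtype=complex) honors it depends on the NumPy version, per
the thread above. A sketch with a hypothetical stand-in class:

```python
class SageLikeComplex:
    # Hypothetical stand-in for a Sage CDF element.
    def __init__(self, re, im):
        self.re, self.im = re, im

    def __complex__(self):
        # The standard conversion hook that complex() (and, once fixed,
        # numpy's complex dtypes) can call.
        return complex(self.re, self.im)

z = complex(SageLikeComplex(2, 3))
```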
> 2) sin, sqrt etc. should understand, act on, and return NumPy arrays
> (probably by calling corresponding functions in numpy)
>
> 3) It would be nice with better plotting support for NumPy arrays, so
> that I don't have to use pylab directly, but haven't given any thought
> to what I want here.
>
> 4) Not sure if this can be done in a reasonable way, but: I'd like to
> not have to use Python ints/floats at all, it's just nicer to know that
> I have a Sage element. So ideally
>
> sage: a=np.arange(5, dtype=np.uint8)
> sage: a[2]/a[4]
> 1/2
>
> 5) #4571 (cimport numpy in notebook), obviously.
>
> 6) One of the things drawing me towards using Sage is the "attach"
> feature. However, it is too limited for my use (and so makes me
> frustrated, so that it is better not to use it at all). Something being
> a combination for an "attach" and a pure Python "import" is what I want;
> that is, "import" with automatically calling "reload" on change. (The
> recent work in pyximport may come into play here; if the sources are in
> Cython.)
>
> 7) Make NumPy dtypes for all Sage rings. I think that NumPy is quite
> extensible in this respect and that it shouldn't be too difficult to
> have arrays over Sage rings.
>
> sage: a = np.arange(10, dtype=ZZ.numpy()) # or dtype=ZZ if possible
If this is possible, doesn't it take care of item (1) and (4)?
Thanks,
Jason
That's what I'm thinking too -- even if a change isn't accepted
upstream, Sage could patch its own version of NumPy.
>> 2) sin, sqrt etc. should understand, act on, and return NumPy arrays
>> (probably by calling corresponding functions in numpy)
>>
>> 3) It would be nice with better plotting support for NumPy arrays, so
>> that I don't have to use pylab directly, but haven't given any thought
>> to what I want here.
>>
>> 4) Not sure if this can be done in a reasonable way, but: I'd like to
>> not have to use Python ints/floats at all, it's just nicer to know that
>> I have a Sage element. So ideally
>>
>> sage: a=np.arange(5, dtype=np.uint8)
>> sage: a[2]/a[4]
>> 1/2
>>
>> 5) #4571 (cimport numpy in notebook), obviously.
>>
>> 6) One of the things drawing me towards using Sage is the "attach"
>> feature. However, it is too limited for my use (and so makes me
>> frustrated, so that it is better not to use it at all). Something being
>> a combination for an "attach" and a pure Python "import" is what I want;
>> that is, "import" with automatically calling "reload" on change. (The
>> recent work in pyximport may come into play here; if the sources are in
>> Cython.)
>>
>> 7) Make NumPy dtypes for all Sage rings. I think that NumPy is quite
>> extensible in this respect and that it shouldn't be too difficult to
>> have arrays over Sage rings.
>>
>> sage: a = np.arange(10, dtype=ZZ.numpy()) # or dtype=ZZ if possible
>
> If this is possible, doesn't it take care of item (1) and (4)?
Not necessarily (4), no, although they may be linked somewhat in the
API; I didn't check that.
Which dtype is used is at least conceptually somewhat independent of
what Python object is used to represent its elements. E.g. both arrays
with uint8 and int64 would convert to/from a Python int on a Python access.
Regarding (1), what I meant was to have ZZ elements passed to arange
result in an array of int64 (on a 64-bit system), as if the preparser
weren't in effect.
I suppose if ZZ is made a possible NumPy dtype it may make sense to have
that be the default -- it is just surprising because then NumPy would
have "different default types" depending on whether the preparser is on
or not.
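Concretely, in plain NumPy the storage dtype and the Python-level
element type are already decoupled in this sense:

```python
import numpy as np

a8 = np.array([7], dtype=np.uint8)
a64 = np.array([7], dtype=np.int64)

# Different storage dtypes, but .item() hands back the same Python int.
x, y = a8[0].item(), a64[0].item()
```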
--
Dag Sverre
Here's an email I recently got from a Sage user that probably
illustrates this point nicely, namely that Sage mainly appeals to those
who use, say, MATLAB/Scilab (not numpy). It is posted here with his
permission.
from Jordan Alexander to wst...@gmail.com
date Fri, Jul 3, 2009 at 6:15 PM
subject Bravo Sage!
Dear William Stein:
Writing to congratulate all of you for creating Sage!
I am a PhD student of astrophysics (theory and observations of radio
recombination lines) and recently started using web-based Sage to
explore a mathematical aspect of my topic. I have gotten more done
on this topic using Sage than I thought possible...
As a previous user of matlab and maple and a current user of scilab, I
find Sage a great evolutionary step forward!
Much respect,
Jordan
I just want to make a quick comment that -- except for "object" --
there are no dtypes that aren't homogeneous data types. ZZ would be
totally different from all other (non-object) datatypes in numpy,
since the size of an element depends on the element. There was a
challenge at the Sage Days at Enthought to add mpz_t as a dtype to
numpy, but it didn't end up going anywhere. I personally don't
think it is likely that ZZ will ever be a numpy dtype....
-- William
The fix below didn't take long at all to find. Here they're talking
about using a user-defined type as a complex. That would be nice, but
just being able to cast a user-defined type to a complex is the most
important part. I believe this has been resolved in Python 2.6 (see below).
>>>> 2) sin, sqrt etc. should understand, act on, and return NumPy
>>>> arrays
>>>> (probably by calling corresponding functions in numpy)
Not even the builtin math.sin does this. Too bad numpy arrays don't
have a sin method, or it would already work. It shouldn't be too hard
to support (generically), though for speed reasons we avoid importing
numpy unless we want to use it.
Hopefully http://trac.sagemath.org/sage_trac/ticket/6497 isn't too
hackish.
>>>> 3) It would be nice with better plotting support for NumPy
>>>> arrays, so
>>>> that I don't have to use pylab directly, but haven't given any
>>>> thought
>>>> to what I want here.
Sure.
>>>> 4) Not sure if this can be done in a reasonable way, but: I'd
>>>> like to
>>>> not have to use Python ints/floats at all, it's just nicer to
>>>> know that
>>>> I have a Sage element. So ideally
>>>>
>>>> sage: a=np.arange(5, dtype=np.uint8)
>>>> sage: a[2]/a[4]
>>>> 1/2
This really doesn't make sense -- a[2] and a[4] are np.uint8 scalars.
We'd really have to hack numpy to get Integers back. However, this
would be like changing range.
>>>>
>>>> 5) #4571 (cimport numpy in notebook), obviously.
+1. Thanks for fixing numpy.pxd.
What may be possible, though I'm not sure how easy, is a dtype that
is internally an int64 (or float64, etc.) with all the overflow
issues, but whose getitem creates an Integer/RDF. It looks like user-
defined types are quite powerful, but still have some rough edges.
See http://trac.sagemath.org/sage_trac/ticket/5081 .
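A rough sketch of that idea is possible in pure Python with an ndarray
subclass. Here Fraction stands in for a Sage Integer/Rational, and this
is only an illustration of the interface, not an efficient
implementation:

```python
import numpy as np
from fractions import Fraction

class ExactView(np.ndarray):
    # Store native int64 (with all its overflow issues), but wrap any
    # scalar read in an exact type on the way out.
    def __getitem__(self, idx):
        out = super().__getitem__(idx)
        if isinstance(out, np.generic):   # a single element was read
            return Fraction(int(out))
        return out                        # slices remain views

a = np.arange(5, dtype=np.int64).view(ExactView)
q = a[2] / a[4]   # exact division, the behavior item (4) asks for
```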
----------------------------------------------------------------------
| Sage Version 4.0.2, Release Date: 2009-06-18 |
| Type notebook() for the GUI, and license() for information. |
----------------------------------------------------------------------
sage: from scipy import stats
sage: stats.uniform(0,15).ppf([0.5,0.7])
array([ 7.5, 10.5])
sage: numpy.array([1, 2, 3.0])
array([ 1., 2., 3.])
sage: numpy.array([1, 2, 3.0]).dtype
dtype('float64')
sage: numpy.array([3.000000000000000000000000000000000000000000000])
array([3.000000000000000000000000000000000000000000000], dtype=object)
sage: numpy.array([1, 10, 100]).dtype
dtype('int64')
sage: numpy.arange(5)
array([0, 1, 2, 3, 4], dtype=int64)
Still doesn't fix the complex case (which is a different issue):
sage: import numpy
sage: numpy.array([RDF(2)], dtype=complex)
array([ 2.+0.j])
sage: numpy.array([CDF(2)], dtype=complex)
------------------------------------------------------------
Traceback (most recent call last):
File "<ipython console>", line 1, in <module>
TypeError: a float is required
Better news, this may be fixed with PyComplex_AsCComplex -
http://bugs.python.org/issue1675423 (see also
numpy/core/src/arraytypes.inc.src:174)
- Robert
> See http://trac.sagemath.org/sage_trac/ticket/5081 .
>
> sage: numpy.array([1, 10, 100]).dtype
> dtype('int64')
Following up on this, I've also posted
http://trac.sagemath.org/sage_trac/ticket/6506 . This brings up an
interface question, and I
thought I'd bring it up here. With the latest patch, we have the
following behavior:
Integer -> long (if it fits), then int64 (if it fits), and object
otherwise
Real/Complex -> float64 (if it's less than 57 bits, which is the
cutoff for RealNumber(repr(float(x)))), and object for greater precision
Rational -> same as Integer for integral values, float64 otherwise
I figured this is mostly for numerical use (whenever I import numpy,
that's the mode I'm in), and the real advantage of numpy is when
using native types. If I wanted exact linear algebra I'd be using
Sage types anyway, not numpy. Does this seem reasonable?
- Robert
Sounds reasonable to me.
I must say I disagree rather strongly with your view of NumPy, though.
The real advantage of NumPy (for certain purposes) is its semantics,
which are not focused around doing linear algebra at all. (I've written
about this at length earlier, so I won't repeat myself.)
The link between NumPy semantics and fixed-size types is mostly
historical, and also a consequence of number-crunching apps using
whatever CPU power they can get, but there's no reason it has to be
that way.
Being able to very easily experiment with higher-precision floats, to
see whether there were numerical and/or overflow problems, without
changing the formulation of the calculation has a certain appeal...
(I wouldn't really use it myself, though, mainly because when passing
data into Fortran libs it must be of native type anyway. And I see that
there are some technical reasons why Sage types can't trivially be made
NumPy dtypes. So I'm not asking you to do anything differently, just
disagreeing with how you see NumPy.)
Dag Sverre
BTW, thanks a lot for looking at these issues, it looks promising! Come
autumn I'll make a serious attempt at using Sage proper and report back
(or even provide a patch) if I still give it up.
Dag Sverre
That is a good point. I should have said "whenever I use NumPy/
SciPy." It's probably still fair to say that the "average" (whatever
that means) NumPy user is more concentrated on the numerics side of
things than the "average" Sage user, but you're right that the power
of ndarray is about its broadcasting, slicing, etc.
> BTW, thanks a lot for looking at these issues, it looks promising!
> Come
> autumn I'll make a serious attempt at using Sage proper and report
> back
> (or even provide a patch) if I still give it up.
Good to hear.
- Robert