The bottom line (for me) is at the end of the message :).
Also, you'll notice I'm starting a new thread. We probably should have
done that when the discussion moved to bigger things than Geogebra.
I'll try to summarize main points here; apologies if I misrepresent
someone's comments or contributions. For reference, the original thread
is at
http://groups.google.com/group/sage-devel/browse_thread/thread/7a1420263df5d5e3
First point: a standard API for communication with Sage.
Summary: Ted proposed a standard API to talk with a Sage session, taking
the Mathematica API as an example. He has done a lot of work
researching alternatives and even constructing an optional spkg and
example using JSON (see http://trac.sagemath.org/sage_trac/ticket/1510)
Several other people in the project have expressed the view that the
Mathematica API is too heavyweight for Sage, and there appears to be
some confusion about what introducing a standard API would mean for the
project.
My thoughts: I'd like to continue the discussion about an API. I've
extended the notebook (slightly) and so I've had some exposure to the
notebook "API" for communicating with Sage. In my experience, that
communication falls into two parts: the communication between the
notebook and Twisted and the communication between Twisted and the sage
session. The communication between Twisted and the notebook consists of
passing text blocks (the sage commands in a cell) and other fixed-field
information, like the time it took to run a cell, whether to create a
new cell, etc. The API there is very rigid and very notebook-centric
(it evolved along with the notebook). The communication between
Twisted and the sage session is entirely a pexpect stdin/stdout type of
communication. Nothing fancy or anything.
I now focus on the JSON patch above, since that is the relevant code
that is currently on the table. Ted proposed using JSON and gave an
example in the ticket mentioned above a while ago. Personally, I'm
excited about a flexible way to communicate with Sage, especially if it
is standard and lightweight. Robert Bradshaw posed a technical question
about handling Cython classes and (to me and apparently Robert) the
question didn't seem to be answered fully. How would you pass a Cython
object back to the client (I think that's the question, but I don't know
very much about pickling, etc.)? I have been waiting for the discussion
to reach the point that I could contribute something useful, but
apparently it already has: what is needed is cheering, so "Yeah! We're
on the right track; let's keep going!" :). For the record, my lack of
participation has nothing to do with a bias against Java; I'm sorry Ted
got the impression that because some core developers hate Java that all
of us do. My guess is that most of us haven't chimed in because we
don't have an opinion or don't really care what language is used for
client-side projects.
On the technical side, I contributed a patch or two that modified the
display hook in python to display typeset expressions in the notebook,
which wasn't a big contribution, but it did help me understand that it
is very easy in python to commandeer the communication back to the user.
I think developing something so that Sage returned a JSON expression
instead of, for example, typeset output, would be fairly easy, but the
contents of the JSON expression may not be very useful. It would be
trivial to return, for example, a JSON expression representing the
string that would normally have been returned. It would be harder, but
doable after a lot of work, to do something like what the current
pretty_print_default code does: query each object for a
"show(type=JSON)" method and call that. Now I'll be looking at Ted's
patch to see how he did things.
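To make the display-hook idea concrete, here is a minimal sketch in plain Python. It is not Ted's patch and not existing Sage code: the `_json_` method name, the `Point` class, and the payload layout are all assumptions made up for illustration. Objects that offer the hypothetical hook get a structured payload; everything else falls back to the string that would normally have been printed.

```python
import json
import sys


def to_json_payload(value):
    """Build a JSON string describing `value` for a client.

    `_json_` is a hypothetical per-object hook (not an existing Sage
    method); objects without it fall back to their plain repr string.
    """
    hook = getattr(value, "_json_", None)
    if callable(hook):
        payload = {"type": type(value).__name__, "data": hook()}
    else:
        payload = {"type": type(value).__name__, "repr": repr(value)}
    return json.dumps(payload, sort_keys=True)


def json_displayhook(value):
    """Replacement for sys.displayhook that prints JSON instead of repr."""
    if value is None:
        return  # match the default hook: None produces no output
    print(to_json_payload(value))


class Point:
    """Toy class standing in for an object that knows its JSON form."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def _json_(self):
        return {"x": self.x, "y": self.y}


if __name__ == "__main__":
    # Installing the hook only changes what an interactive session echoes.
    sys.displayhook = json_displayhook
    json_displayhook(2 + 2)
    json_displayhook(Point(1, 2))
```

The fallback branch is the "trivial" case mentioned above; the work would be in giving real objects a useful structured form.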
Second point: How sage-edu fits into the larger sage community
Summary: The project leaders see sage-edu as a development area for
creating and refining spkgs to submit to the core sage community, as
well as creating the necessary patches needed to enable Sage to be a
more valuable tool in education circles.
My thoughts: I agree with William and Michael that we ought to keep the
community coherent and focused on Sage; we're too small to fracture and
survive. However, I don't see any obstruction to people releasing spkgs
of their own which have their own applets, source code included or not,
which provide functionality on the client side.
Third point: Contributions versus discussion
Summary: um, this seems to be the point of inflammation. I won't
summarize this, but I'll add my thoughts.
My thoughts: I agree that patches are what count in the end, but we're
at the beginning here. William's idea seems to be that the people that
need to carry out the discussion in the beginning need to be grouped and
organized to do the work effectively. The evolution process need not
concern everyone. I agree.
And now a more personal note to Ted; sorry for replying to
this part of the discussion, but it probably ought to be publicly stated.
Ted, I really appreciate the contributions you've made. Just yesterday
I pointed several graduate students and my postdoc mentor to your
newbies book. Apparently it's also been a "best-seller" to people
trying to learn Sage. I also appreciate your work on JSON and want to
continue that discussion and development; I was waiting for the
technical objections to be resolved since people were talking about
things that I had no experience in. I don't have any experience
developing applets, but I appreciate what you've done in that area. At
one point I proposed some changes to the notebook organization to make
it easier to have applets have embedded cells in the notebook, but I
haven't contributed patches, so I'm not surprised that the interest
wasn't noted :).
As for the java front end, on the one hand I haven't been that
interested since the notebook fills my needs and I don't have (a lot of)
experience developing in java. However, one reason I think it's a good
thing to have multiple front ends to Sage is the same reason that
mabshoff thinks we ought to support multiple platforms: better code
quality (and in this case, communication code especially) and better
versatility. I think there are definitely things that you could do in a
dedicated front end that would not be very feasible in a web-based front
end. I started researching using the Enthought platform for a front
end, so I'm excited to see the results of SD8 (and wish I could have
taken off the time to be there!). Personally I would be more
comfortable contributing to an Enthought or py-QT front end because I'm
not experienced in Java, but I think you ought to go for it if you think
it is the way to go.
Personally, I'm realizing that there are far more things I'd like to
contribute patches for than I have time for. I'm trying to figure out
what would help Sage get ahead the best now. I agree that having some
nice way to communicate with a Sage session would open up lots of
possibilities for others to do more work, so I'm willing to help with that.
Whew. That was a long response and it's taken enough of my time. Is
there anyone (left :) who wants to work on a communication API or wants
to work on documenting the notebook with me? If not, then I'll probably
just continue learning about the notebook and document it as I have
time, as well as continue work on interactive widgets.
Over on sage-edu, we ought to pretty quickly get a focus so that we
don't become too fragmented to do any good.
Jason
...
>
> First point: a standard API for communication with Sage.
>
> Summary: Ted proposed a standard API to talk with a Sage session, taking
> the Mathematica API as an example. He has done a lot of work
> researching alternatives and even constructing an optional spkg and
> example using JSON (see http://trac.sagemath.org/sage_trac/ticket/1510)
> Several other people in the project have expressed the view that the
> Mathematica API is too heavyweight for Sage, and there appears to be
> some confusion about what introducing a standard API would mean for the
> project.
>
> My thoughts: I'd like to continue the discussion about an API. I've
> extended the notebook (slightly) and so I've had some exposure to the
> notebook "API" for communicating with Sage. In my experience, that
> communication falls into two parts: the communication between the
> notebook and Twisted and the communication between Twisted and the sage
> session. The communication between Twisted and the notebook consists of
> passing text blocks (the sage commands in a cell) and other fixed-field
> information, like the time it took to run a cell, whether to create a
> new cell, etc. The API there is very rigid and very notebook-centric
> (it was evolved as the notebook evolved). The communication between
> Twisted and the sage session is entirely a pexpect stdin/stdout type of
> communication. Nothing fancy or anything.
>
...
>
>
> Whew. That was a long response and it's taken enough of my time. Is
> there anyone (left :) who wants to work on a communication API or wants
> to work on documenting the notebook with me? If not, then I'll probably
> just continue learning about the notebook and document it as I have
> time, as well as continue work on interactive widgets.
Can you recommend an "API for dummies"-type reference?
I might be interested in working on this.
>
> Over on sage-edu, we ought to pretty quickly get a focus so that we
> don't become too fragmented to do any good.
Maybe this should be another thread but possible topics:
(1) writing education materials which are integrated with SAGE, such as
http://sage.math.washington.edu/home/wdj/teaching/granville-calculus/
(2) API related stuff?
(3) package for Geogebra? others?
>
> Jason
I started a simple API at the AMS meeting in January, and I think the
time is ripe to have a discussion about this. I'll start a new thread
summarizing my ideas (probably on sage-edu) when I have a bit more
time (probably later today).
- Robert
...
> >
> > Over on sage-edu, we ought to pretty quickly get a focus so that we
> > don't become too fragmented to do any good.
>
> Maybe this should be another thread but possible topics:
> (1) writing education materials which are integrated with SAGE, such as
> http://sage.math.washington.edu/home/wdj/teaching/granville-calculus/
Regarding educational material, I'm currently using SAGE in an
undergraduate course on Digital Signal Processing (an area where
Matlab is by far the tool of choice). I will gladly compile and
translate into English (since we're working in Spanish) something like
"DSP with SAGE". I'm truly sorry that this will have to wait until May
though (... I guess we're all crazy busy).
Best,
--
Hector
On Fri, Feb 22, 2008 at 5:57 PM, alex clemesha <clem...@gmail.com> wrote:
> In Knoboo we *decouple* the idea of a kernel, it could be another
> Python (Sage) process, with communication through Pexpect
>
> ... but it also could be another Python (Sage) process running a very
> minimal XML-RPC server, and all communication occurs through
> *** HTTP instead of Pexpect ***.
I personally am not too familiar with web development, so it's always
great to hear from someone who is (which is exactly why this
discussion was started). Regarding XML-RPC vs Pexpect:
- how slow is one compared to the other? I expect xml-rpc to be
slower, but not so slow as to render it unusable.
- I understand xml-rpc working for inter-communication, i.e. SAGE ->
outside world, but I don't see how it would work for
intra-communication, SAGE -> maxima. Maxima would have to be already
running in the background, right? If that is the case, then every sage
session would have to spawn singular, maxima, maple, etc. sessions at
start-up. I don't like that. Is there something I'm not getting here?
didier
I'm actually pretty curious about how pexpect and XMLRPC, both
run locally, compare speedwise. I've done some simple benchmarks
below. The short answer is that pexpect is between several hundred
to several thousand times faster than XMLRPC, depending on the
platform.
Here's a good pexpect benchmark to do in sage-2.10.2:
sage: gp('2+2')
4
sage: timeit("gp.eval('2+2')")
625 loops, best of 3: 136 µs per loop
This benchmarks adding 2 and 2 over the pexpect interface
to pari and getting back the result. It takes 136 microseconds
to do that on my OS X 10.5 intel 2.6Ghz computer. On sage.math
it's about 500 times faster, only 303 nanoseconds:
sage: timeit(gp.eval('2+2'))
625 loops, best of 3: 303 ns per loop
Pexpect may have a reputation for being "dog slow", but on small
transactions it's actually surprisingly fast. Seriously. It's only bad
when the input is large.
Now let's try xmlrpc (with Yi's help). Copy in the setup from
http://docs.python.org/lib/simple-xmlrpc-servers.html
then time adding two integers (we put an r after them below
so that they are *Python* integers, which avoids preparsing):
On my 2.6Ghz Intel OS X computer:
sage: timeit('s.add(2r,3r)')
25 loops, best of 3: 43.7 ms per loop
On sage.math:
sage: timeit('s.add(2r,3r)')
625 loops, best of 3: 1.38 ms per loop
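For readers without the linked page at hand, a comparable setup can be sketched in present-day Python 3. This is not the exact code that was timed above: the original used the Python 2 modules, while this sketch uses `xmlrpc.server` and `xmlrpc.client`, and the background thread, the free-port choice, and the `time_add` helper are assumptions added for illustration. Timings will differ from the numbers quoted.

```python
import threading
import timeit
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(x, y):
    return x + y


def start_server():
    """Start an XML-RPC server on a free localhost port in a daemon thread."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def time_add(proxy, number=50):
    """Return the mean number of seconds per proxy.add(2, 3) round trip."""
    return timeit.timeit(lambda: proxy.add(2, 3), number=number) / number


server = start_server()
s = ServerProxy("http://%s:%d" % server.server_address)
result = s.add(2, 3)
per_call = time_add(s)
print("s.add(2, 3) =", result)
print("mean time per call: %.3f ms" % (per_call * 1e3))
server.shutdown()
server.server_close()
```

Every call here is a full HTTP request over the loopback interface, which is the overhead the benchmark is measuring.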
So let's compare:
On OS X 2.6Ghz machine pexpect is 321 times faster
than XMLRPC:
sage: 43.7 / (136 * 10^(-3))
321.323529411765
On sage.math pexpect is 4554 times faster than XMLRPC:
sage: 1.38 / (303 * 10^(-6))
4554.45544554455
Obviously there may be tricks for speeding up XMLRPC (and
for speeding up pexpect), and I would really love for an expert
in XMLRPC to retry the above benchmarks but with their tricks.
However, XMLRPC is using the TCP/IP networking stack, and
probably will have to use encryption if we're to do this seriously,
whereas pexpect is just all happening in RAM (via unix named
pipes), so it's maybe not surprising that pexpect would
have a big advantage speedwise.
>
>
> >
> > - I understand xml-rpc working for inter-communication, ie SAGE ->
> > outside world, but I don't see how it would work for
> > intra-communication, SAGE -> maxima. Maxima would have to be already
> > running in the background, right? If that is the case, then every sage
> > session would have to spawn singular, maxima, maple, etc sessions at
> > start-up. I don't like that. Is there something I'm not getting here?
>
> Sage intra-communication would stay exactly the same. The way that
> works (pseudo-tty) won't be changing any time soon (ever?).
It may change for MS Windows; I don't know. Windows doesn't support
pseudo-ttys so well (?), so we may have to come up with a new approach
there, since now we're serious about fully porting Sage to Windows.
> It's fundamental to how Sage encapsulates all its external programs.
>
> Behind the scenes, when you use a function that requires,
> for example, singular or maxima to be used, then that new process
> would start only when you call that function, not at startup,
> which is already a working, built in part of Sage.
Yep.
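The start-on-first-use behaviour described in the quoted text can be illustrated with a small sketch. It is not Sage's interface code: it uses ordinary pipes from the standard `subprocess` module instead of a pseudo-tty, and the `LazyInterface` class and the tiny child program are hypothetical stand-ins for something like the maxima or singular interface.

```python
import subprocess
import sys

# Child program: read one expression per line, print its value, flush.
CHILD_LOOP = (
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    print(eval(line))\n"
    "    sys.stdout.flush()\n"
)


class LazyInterface:
    """Line-oriented interface whose child process starts on first use."""

    def __init__(self, argv):
        self._argv = argv
        self._proc = None  # nothing is spawned at construction time

    def is_running(self):
        return self._proc is not None and self._proc.poll() is None

    def eval(self, expression):
        if not self.is_running():
            # First call (or child died): spawn the process now.
            self._proc = subprocess.Popen(
                self._argv,
                stdin=subprocess.PIPE,
                stdout=subprocess.PIPE,
                text=True,
                bufsize=1,  # line buffered on our side
            )
        self._proc.stdin.write(expression + "\n")
        self._proc.stdin.flush()
        return self._proc.stdout.readline().strip()

    def quit(self):
        if self.is_running():
            self._proc.stdin.close()  # child sees EOF and exits
            self._proc.wait()
            self._proc.stdout.close()
        self._proc = None


calc = LazyInterface([sys.executable, "-c", CHILD_LOOP])
started_before = calc.is_running()  # no child process yet
answer = calc.eval("2+2")           # this call spawns the child
started_after = calc.is_running()
calc.quit()
print(started_before, answer, started_after)
```

A session that never calls `eval` never pays for the child process, which is the point being made about start-up cost.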
> The method of using some light-weight RPC server,
> (could be Python's built in XML-RPC server, could be a DSage instance, etc)
> to act as the running namespace for a notebook is almost identical
> to how the Sage notebook uses its separate processes for each notebook,
> it's just that communication to the *outside world* is done with
> standard RPC methods instead of with a pseudo-tty (Pexpect). This is the
> key point.
Very clear explanation. And yes, I think XML-RPC is a good
external API for Python to talk to other programs. And it's already
done -- we don't have to write anything -- it's been done for years.
And it's completely "industry standard".
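As a rough illustration of an RPC server acting as the running namespace for a notebook, here is a sketch using only Python's standard library. It is not Knoboo's or Sage's code: the `Kernel` class, its `execute` method, and the in-process server thread are assumptions for the example, and a real kernel would live in its own process and would need authentication, since this runs arbitrary code.

```python
import contextlib
import io
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


class Kernel:
    """Holds one persistent namespace and runs code cells in it."""

    def __init__(self):
        self.namespace = {}

    def execute(self, code):
        """Run `code` and return whatever it printed to stdout."""
        buffer = io.StringIO()
        # redirect_stdout is process-wide; acceptable in this sketch
        # because the only client is blocked while the cell runs.
        with contextlib.redirect_stdout(buffer):
            exec(code, self.namespace)
        return buffer.getvalue()


def serve_kernel():
    """Expose a Kernel over XML-RPC on a free localhost port."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_instance(Kernel())
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


server = serve_kernel()
client = ServerProxy("http://%s:%d" % server.server_address)
client.execute("x = 20 + 22")        # state persists between cells
output = client.execute("print(x)")  # a later cell sees x
server.shutdown()
server.server_close()
print(repr(output))
```

Any program that can speak XML-RPC over HTTP could play the role of `client` here, whatever language it is written in.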
> By using standard RPC methods,
> (which by the way is the bread and butter of most web services, regardless
> of programming language)
> to communicate with the running Sage process,
> you can then support some API which others can use too.
>
Yep. So basically Sage (because of Python) already has *the*
standard API. Very cool.
-- William
Agreed. The benchmarks are just supposed to clarify which
is the right tool for the job in each case.
I noticed I made one mistake in the benchmarks. On sage.math I did
sage: timeit(gp.eval('2+2'))
625 loops, best of 3: 303 ns per loop
but should have done
sage: timeit("gp.eval('2+2')") # note the extra quotes
625 loops, best of 3: 147 µs per loop
so the time on OS X and Linux is very similar. It's still over 300
times faster to use pexpect than XMLRPC for this benchmark.
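The quoting mistake is easy to reproduce with the standard-library `timeit` module. In this sketch `slow_call` is a made-up stand-in for `gp.eval('2+2')`; the assumption is that Sage's `timeit` behaved like the standard one in taking source text, so passing the call's return value meant timing the text "4" instead of the call.

```python
import timeit


def slow_call():
    """Stand-in for gp.eval('2+2'): does real work, returns a string."""
    return str(sum(range(10000)) * 0 + 4)


# Correct: timeit receives source text and runs the call on every loop.
timed_call = timeit.timeit("slow_call()", globals=globals(), number=1000)

# Mistake: the call runs once, up front; timeit then times the source
# text "4", i.e. evaluating an integer literal, not the call.
timed_literal = timeit.timeit(slow_call(), number=1000)

print("call per loop:    %.3g s" % (timed_call / 1000))
print("literal per loop: %.3g s" % (timed_literal / 1000))
```

The second figure comes out far smaller because almost no work is being timed, which matches the implausible nanosecond result in the first benchmark.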
By the way, on sage.math using pexpect one can do
10,000 pexpect transactions (so more than 625)
in just over a second:
sage: time for _ in xrange(10^4): a = gp.eval('2+2')
CPU times: user 0.99 s, sys: 0.46 s, total: 1.45 s
Wall time: 1.49
As a consistency check note that
sage: 1.49/10^4 # thing above
0.000149000000000000
sage: 147.0*10^(-6) # timeit
0.000147000000000000
> The above is not meant to be precise in any way,
> just to kind of express the issue we are dealing with here.
> If both methods allow a cell's code to be transfered in
> sub-second time, let's aim for functionality, not speed.
That is very sensible. I'm definitely excited to use XMLRPC
somehow when it is the best solution. It should be fun.
-- William
More benchmarks:
pyxmlrpc (http://sourceforge.net/projects/py-xmlrpc/) is a C
implementation that is roughly 2 times faster than xmlrpclib on small
inputs. Here are some numbers on linux running sage 2.10.1:
pyxmlrpc:
sage: %timeit ("c.execute('add', [2r, 3r])")
1000000 loops, best of 3: 65.8 ns per loop
sage: timeit c.execute('add', [2r, 3r])
1000 loops, best of 3: 1.46 ms per loop
xmlrpclib:
sage: %timeit ("s.add([2r,3r])")
1000000 loops, best of 3: 158 ns per loop
sage: timeit s.add([2r,3r])
100 loops, best of 3: 1.94 ms per loop
overall:
sage: time for _ in range(10^3): gp.eval('2+3')
CPU times: user 0.34 s, sys: 0.10 s, total: 0.45 s
Wall time: 0.82
sage: time for _ in range(10^3): c.execute('add', [2r, 3r])
CPU times: user 0.08 s, sys: 0.02 s, total: 0.10 s
Wall time: 1.66
sage: time for _ in range(10^3): s.add([2r,3r])
CPU times: user 0.99 s, sys: 0.38 s, total: 1.37 s
Wall time: 2.51
Note: The pyxmlrpc is currently unmaintained (last release was in
2004) and needs to be patched against 2.5 (there's a fix at the bottom
of http://sourceforge.net/tracker/index.php?func=detail&aid=1734819&group_id=23992&atid=380301).
didier
Thanks. What are you timing this on? On sage.math the
same benchmark is consistently much faster:
sage: time for _ in range(10^3): gp.eval('2+3')
CPU times: user 0.09 s, sys: 0.05 s, total: 0.14 s
Wall time: 0.15
I just tried using a for loop instead of timeit to
time using xmlrpc (exactly as in
http://docs.python.org/lib/simple-xmlrpc-servers.html)
and the timing on sage.math is:
sage: time for _ in range(10^3): s.add(2r,3r)
CPU times: user 0.66 s, sys: 0.35 s, total: 1.01 s
Wall time: 1.41
This is roughly a factor of 10 difference in speed (1.41/0.15 is
about 9.4, in favor of pexpect). This is *not* what I claimed
in my initial benchmark, which was wrong -- my apologies
again -- benchmarking libraries one has never used before
is always tricky. Still 10 times faster is a lot faster.
In your benchmark pexpect is only twice as fast
as c-xml and only 3 times faster than python xml.
Summary: pexpect is anywhere between 2 and
10 times faster than xmlrpc depending on machine and
xml implementation.
Thanks Didier!
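The ratios behind that summary can be checked directly from the wall times posted in this thread. The sketch below only redoes the arithmetic; the dictionary layout and the `slowdown` helper are made up for the example.

```python
# Wall times in seconds for 10^3 calls, copied from the posts above.
wall = {
    "didier": {"pexpect": 0.82, "pyxmlrpc": 1.66, "xmlrpclib": 2.51},
    "sage.math": {"pexpect": 0.15, "xmlrpclib": 1.41},
}


def slowdown(times):
    """Return each method's wall time relative to pexpect on that machine."""
    base = times["pexpect"]
    return {
        name: round(t / base, 1)
        for name, t in times.items()
        if name != "pexpect"
    }


ratios = {machine: slowdown(times) for machine, times in wall.items()}
print(ratios)
```

This gives about 2.0 and 3.1 on Didier's machine and about 9.4 on sage.math, consistent with the "between 2 and 10 times" summary.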
> sage: time for _ in range(10^3): c.execute('add', [2r, 3r])
> CPU times: user 0.08 s, sys: 0.02 s, total: 0.10 s
> Wall time: 1.66
>
> sage: time for _ in range(10^3): s.add([2r,3r])
> CPU times: user 0.99 s, sys: 0.38 s, total: 1.37 s
> Wall time: 2.51
>
> Note: The pyxmlrpc is currently unmaintained (last release was in
> 2004) and needs to be patched against 2.5 (there's a fix at the bottom
> of http://sourceforge.net/tracker/index.php?func=detail&aid=1734819&group_id=23992&atid=380301).
>
>
>
> didier
--
William Stein
Associate Professor of Mathematics
University of Washington
http://wstein.org