I finally learned javascript and AJAX, so that I can help with the
notebook. I also studied its sources.
First things I like:
* I like the user interface, it's usable, especially the attention to
little details, like borders around the cells, tab completion, tab
indentation and things like that.
Things I don't like:
* the javascript is really hackish overall, but two things really
caught my attention:
a) the keyboard handling is horrific, why not use some standard
library for that, one that works across all browsers?
b) it uses some custom format for transferring data (which has bugs,
like http://groups.google.com/group/sage-devel/browse_thread/thread/5ecd104b0aa85439),
why not use JSON?
* it doesn't run on the google appengine (William mentioned in the
past, that he doesn't see any benefit to do that, or that it would be
slow)
Well, talk is cheap, so here is the code (a sample Firefox screenshot
is also attached in case it doesn't work in your browser):
http://pythonnb.appspot.com/
it uses jQuery all over, it uses a keyboard plugin for jQuery, it uses
JSON and it runs on the google appengine (and anywhere else too, it's
just a standard django app). I tested in Firefox and IE8. The keyboard
works, there are just some subtle bugs on IE8, see here:
http://github.com/certik/notebook/blob/375a2026ee7ea721904d05068724b3a7663d018e/todo
but none of it seems major to me, the keyboard seems to be working
just fine (or is IE8 not the most problematic? I'll try to test in
other browsers like Opera and Safari too). Here is the index.html with
all the javascript that I wrote:
http://github.com/certik/notebook/blob/375a2026ee7ea721904d05068724b3a7663d018e/templates/index.html
It handles most of the keyboard interaction. It doesn't have TAB
completion and inspection yet.
Well, let me say that I really like to run things on the appengine,
rather than to constantly maintain our own servers. I see no reason
why the notebook cannot run on the appengine, only the AJAX would talk
to our own server with Sage to actually evaluate the cells (and for
many people, I think appengine itself could actually be enough). I
still have to think about the best way to transfer data to the
database with worksheets, though.
I wanted to ask --- which parts of the Sage notebook are BSD licensed?
I used a bit of the CSS styles and maybe one javascript function,
everything else was written by me. If possible, I'd like to use the
BSD license for the notebook (if I find time to work on it further),
so that ipython can use it by default.
Also, a question to all: do you like the In [3] and Out[3] lines? I
don't have an opinion on it yet myself, so I implemented them, to see
how it looks. Also, please let me know if it works in your
browser.
Ondrej
Tom Boothby wrote all that in early 2006, and there wasn't anything
good then. I don't think jQuery even existed then.
> b) it uses some custom format for transferring data (which has bugs,
> like http://groups.google.com/group/sage-devel/browse_thread/thread/5ecd104b0aa85439),
> why not to use JSON?
That would be a good idea.
> * it doesn't run on the google appengine (William mentioned in the
> past, that he doesn't see any benefit to do that, or that it would be
> slow)
Just because I don't see a benefit to something, doesn't mean there
aren't tons of benefits.
> Well, talk is cheap, so here is the code (a sample Firefox screenshot
> is also attached in case it didn't work in your browser):
>
> http://pythonnb.appspot.com/
>
> it uses jQuery all over,
Cool!
> it uses a keyboard plugin for jQuery, it uses
> JSON and it runs on the google appengine (and anywhere else too, it's
> just a standard django app). I tested in Firefox and IE8. The keyboard
> works, there are just some subtle bugs on IE8, see here:
>
> http://github.com/certik/notebook/blob/375a2026ee7ea721904d05068724b3a7663d018e/todo
>
> but none of it seems major to me, the keyboard seems to be working
> just fine (or is IE8 not the most problematic? I'll try to test in
> other browsers like Opera and Safari too). Here is the index.html with
> all the javascript that I wrote:
>
> http://github.com/certik/notebook/blob/375a2026ee7ea721904d05068724b3a7663d018e/templates/index.html
>
> It handles most of the keyboard interaction. It doesn't have TAB
> completion and inspection yet.
How are you doing the auto input cell resizing?
> Well, let me say that I really like to run things on the appengine,
> rather than to constantly maintain our own servers. I see no reason
> why the notebook cannot run on the appengine, only the AJAX would talk
> to our own server with Sage to actually evaluate the cells (and for
> many people, I think appengine itself could actually be enough). I
> have to think though what the best way to transfer data to the
> database with worksheets is though.
>
> I wanted to ask --- which parts of the Sage notebook are BSD licensed?
> I used a bit of the CSS styles and maybe one javascript function,
> everything else was written by me. If possible, I'd like to use the
> BSD license for the notebook (if I find time to work on it further),
> so that ipython can use it by default.
Make precise what you used and we'll get it BSD licensed for you. We
have to see who wrote the particular code you're using.
> Also, question to all, do you like the In [3] and Out[3] lines?
> I don't have an opinion on it yet myself, so I implemented them, to see
> how it looks like. Also, please let me know if it works in your
> browser.
>
> Ondrej
-- William
I'll reply from a purely codenode point of view. You sent this
email to both lists, but I'm only qualified to describe the details
of codenode's current architecture.
> a) the keyboard handling is horrific, why not to use some standard
> library for that, that works across all browsers
There is an *excellent* jQuery library for this called "js-hotkeys"
(http://code.google.com/p/js-hotkeys), which is surely the one you are
mentioning; it just did not exist when both notebooks really got going.
That said, it would be extremely beneficial to delegate the key-handling
to that library.
> b) it uses some custom format for transferring data (which has bugs,
> like http://groups.google.com/group/sage-devel/browse_thread/thread/5ecd104b0aa85439),
> why not to use JSON?
codenode only sends data encoded in JSON. This is very important because
it totally decouples data from presentation. This is in fact one reason why the
switch to Django went very smoothly.
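To illustrate what "decoupling data from presentation" can look like, here is a minimal sketch. It is not codenode's actual wire format: the field names `cell_id`, `source` and `output` and all three function names are assumptions made up for this example.

```javascript
// Hypothetical message shapes; the field names are assumptions,
// not taken from codenode or the Sage notebook.
function encodeEvalRequest(cellId, source) {
    // The request carries only data: which cell, and what code to run.
    return JSON.stringify({ cell_id: cellId, source: source });
}

function decodeEvalResponse(body) {
    // The response is parsed into a plain object, with no markup in it.
    var msg = JSON.parse(body);
    return { cellId: msg.cell_id, output: msg.output };
}

// Presentation is a separate step, so the same data could be rendered
// by a different template without touching the transport.
function renderAsText(result) {
    return "Out[" + result.cellId + "]: " + result.output;
}
```

Because the payload is plain JSON, swapping the rendering (say, dropping the Out[n] label) only changes `renderAsText`, not what goes over the wire.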
> * it doesn't run on the google appengine (William mentioned in the
> past, that he doesn't see any benefit to do that, or that it would be
> slow)
The codenode backend (as you know) does run on app-engine, and
I feel that this is the most important part because this is where all the
arbitrary code execution (the big security risk) happens. codenode
is now mostly Django so it does seem feasible to make everything work on
app-engine, but this would take a little work.
> I wanted to ask --- which parts of the Sage notebook are BSD licensed?
> I used a bit of the CSS styles and maybe one javascript function,
> everything else was written by me. If possible, I'd like to use the
> BSD license for the notebook (if I find time to work on it further),
> so that ipython can use it by default.
We are actually going to be completely switching the codenode license to BSD,
(as nothing we depend on is GPL) and we hope to allow more people
to utilize what codenode has to offer.
Dorian and I have talked about this, and we feel that it is best. The
scipy/numpy/sympy/matplotlib
communities are ones that we know can benefit from a really good notebook,
and we hope that all our efforts combined can make it so.
We have not made the official switch yet, but we will be officially switching
to the BSD license in the next couple weeks.
thanks,
Alex
very nice work!
On Mon, Jul 20, 2009 at 9:02 PM, Ondrej Certik<ond...@certik.cz> wrote:
> a) the keyboard handling is horrific, why not to use some standard
> library for that, that works across all browsers
> b) it uses some custom format for transferring data (which has bugs,
> like http://groups.google.com/group/sage-devel/browse_thread/thread/5ecd104b0aa85439),
> why not to use JSON?
another option of course would be to use pyjamas:
http://code.google.com/p/pyjamas/
It has a lot of features and also the option to run it standalone,
without a browser, as a
desktop app.
Kilian
I take the text, count the number of "\n", handle line wrapping, calculate
the number of lines *occupied* in the textbox and set the number of
rows of the textbox. It just works in Firefox; there is a little
glitch in IE8, where I have to put the backspace and enter into the
text before the calculation (because the text is updated only after the
keyboard handler runs). But I don't need to put the text into some div
first, measure its height and set the height.
Seems like a similar glitch is in Opera.
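The row calculation described above can be sketched roughly like this. It is a simplified illustration, not the code in index.html: the function name `countRows` is hypothetical, and it assumes a monospace font with a hard wrap every `cols` characters, whereas real browsers wrap at word boundaries.

```javascript
// Estimate how many rows a textarea needs to show `text` without a
// scrollbar. Assumes each logical line wraps every `cols` characters.
function countRows(text, cols) {
    var lines = text.split("\n");
    var rows = 0;
    for (var i = 0; i < lines.length; i++) {
        // An empty line still occupies one row on screen.
        rows += Math.max(1, Math.ceil(lines[i].length / cols));
    }
    return rows;
}

// In a page this would be used along the lines of:
//     textarea.rows = countRows(textarea.value, textarea.cols);
```

The IE8 glitch mentioned above would show up here as `textarea.value` not yet containing the key that was just pressed when the handler runs.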
As to which functions I used, I used this one:
function get_selection_range(input) {
    /*
    Return the start and end positions of the currently selected text
    in the input text area (a DOM object).
    INPUT:
        input -- a DOM object (a textarea)
    OUTPUT:
        an array of two nonnegative integers
    */
    // If the attribute input.selectionStart is present, use that:
    if (input.selectionStart || input.selectionStart == 0) {
        return Array(input.selectionStart, input.selectionEnd);
    } else {
        var start, end;
        var range = document.selection.createRange();
        var tmprange = range.duplicate();
        tmprange.moveToElementText(input);
        tmprange.setEndPoint("endToStart", range);
        start = tmprange.text.length;
        tmprange = range.duplicate();
        tmprange.moveToElementText(input);
        tmprange.setEndPoint("endToEnd", range);
        end = tmprange.text.length;
        return Array(start, end);
    }
}
(I rewrote it a bit, and I may have broken it on IE8, but I'll fix it. :)
Besides that, I used the following styles (again, I modified them a
bit, but left the Sage notebook borders, because I like them). I fixed
the padding, so that (at least on firefox) if you focus a cell, only
the border changes, but the text doesn't move (in Sage notebook, the
text moves by 1 pixel, and I find it annoying).
textarea.cell_input {
    color:#000000;
    background-color: white;
    border: 1px solid #a8a8a8;
    font-family: monospace;
    font-size:12pt;
    overflow:hidden;
    padding-left:6px;
    padding-top:4px;
    padding-bottom:4px;
    margin-bottom:0px;
    margin-top:0px;
    line-height:1.2em;
    float: left;
}
textarea.cell_input_active {
    background-color: white;
    border: 2px solid #8888FE;
    font-family: monospace;
    font-size:12pt;
    overflow:hidden;
    padding-left:5px;
    padding-top:3px;
    padding-bottom:4px;
    margin-bottom:0px;
    margin-top:0px;
    line-height:1.2em;
    float: left;
}
Besides that, I wrote everything from scratch.
Ondrej
Yes. In fact, one reason I wrote it is so that you can use it in
codenode if you like it --- I really like the Sage style with borders
around cells etc.
>
>
>> a) the keyboard handling is horrific, why not to use some standard
>> library for that, that works across all browsers
> There is an *excellent* jQuery library for this called "js-hotkeys"
> http://code.google.com/p/js-hotkeys, which is surely the one you are mentioning
> that just did not exist when both notebooks began to really get going.
> That said, it would be extremely beneficial to delegate the key-handling
> to that library.
Yes, that's exactly what I use. It seems to be working just fine
everywhere and the interface is really nice and super easy, you just
attach a function for every key combination --- no need to have one
ugly handler for everything.
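The idea of "one function per key combination" can be shown without any library. To be clear, the sketch below is NOT the js-hotkeys API; `KeyMap`, `bind`, `comboOf` and `dispatch` are hypothetical names, and the event objects are assumed to carry a `key` name plus the usual modifier flags.

```javascript
// A tiny dispatch table: one handler per key combination.
function KeyMap() {
    this.handlers = {};
}

// Register a handler under a name such as "shift+enter".
KeyMap.prototype.bind = function (combo, handler) {
    this.handlers[combo.toLowerCase()] = handler;
};

// Build the combination name from an event-like object.
KeyMap.prototype.comboOf = function (ev) {
    var parts = [];
    if (ev.ctrlKey) parts.push("ctrl");
    if (ev.altKey) parts.push("alt");
    if (ev.shiftKey) parts.push("shift");
    parts.push(ev.key.toLowerCase());
    return parts.join("+");
};

// Returns true if a handler ran, so the caller knows whether to
// suppress the browser's default action for that key.
KeyMap.prototype.dispatch = function (ev) {
    var handler = this.handlers[this.comboOf(ev)];
    if (!handler) return false;
    handler(ev);
    return true;
};
```

Compared with a single large keydown handler full of if/else branches, each binding here can be added, removed or tested on its own.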
>
>
>> b) it uses some custom format for transferring data (which has bugs,
>> like http://groups.google.com/group/sage-devel/browse_thread/thread/5ecd104b0aa85439),
>> why not to use JSON?
> codenode only sends data encoded in JSON. This is very important because
> it totally decouples data from presentation. This is in fact one reason why the
> switch to Django went very smoothly.
Yes, that's the way to go.
>
>
>
>
>> * it doesn't run on the google appengine (William mentioned in the
>> past, that he doesn't see any benefit to do that, or that it would be
>> slow)
> The codenode backend (as you know) does run on app-engine, and
> I feel that this is the most important part because this is where all the
> arbitrary code execution (the big security risk) happens. codenode
> is now mostly Django so it does seem feasible to make everything work on
> app-engine, but this would take a little work.
In fact, the backend can only run on the appengine if it's pure python
(like sympy), but not if it's some heavy C++ stuff, like our FEM
solvers. But the frontend can run in there, that's my idea.
>
>
>
>
>> I wanted to ask --- which parts of the Sage notebook are BSD licensed?
>> I used a bit of the CSS styles and maybe one javascript function,
>> everything else was written by me. If possible, I'd like to use the
>> BSD license for the notebook (if I find time to work on it further),
>> so that ipython can use it by default.
>
> We are actually going to be completely switching the codenode license to BSD,
> (as nothing we depend on is GPL) and we hope to allow more people
> to utilize what codenode has to offer.
>
> Dorian and I have talked about this, and we feel that it is best. The
> scipy/numpy/sympy/matplotlib
> communities are ones that we know can benefit from a really good notebook,
> and we hope that all our efforts combined can make it so.
>
> We have not made the official switch yet, but we will be officially switching
> to the BSD license in the next couple weeks.
Ah, that is very nice! Indeed, there should be some default notebook
for python stuff, I view it like a part of the common platform, that
everyone needs.
How hard would it be to (maybe optionally) give codenode the Sage-like
look & feel?
Ondrej
That should be easy. It's written in such a way that people can
trivially modify the html template and very easily change things like
this.
>
>> Also, please let me know if it works in your
>> browser.
>
> In a brief test, it works in Safari and Firefox on my intel mac.
Thanks!
On Tue, Jul 21, 2009 at 12:18 AM, killian koepsell<koep...@gmail.com> wrote:
>
> Hi Ondrej,
>
> very nice work!
>
> On Mon, Jul 20, 2009 at 9:02 PM, Ondrej Certik<ond...@certik.cz> wrote:
>> a) the keyboard handling is horrific, why not to use some standard
>> library for that, that works across all browsers
>> b) it uses some custom format for transferring data (which has bugs,
>> like http://groups.google.com/group/sage-devel/browse_thread/thread/5ecd104b0aa85439),
>> why not to use JSON?
>
> another option of course would be to use pyjamas:
> http://code.google.com/p/pyjamas/
> It has a lot of features and also the option to run it standalone,
> without a browser, as a
> desktop app.
Yes, that's the next thing that I want to learn: I'll try to rewrite
what I wrote so far in pyjamas, so that we can have everything
in Python.
I wanted to get my hands dirty first, to learn how javascript works,
because even though theoretically you don't have to touch it with
pyjamas, in practice I am sure I will need to debug why something
doesn't work in some particular browser.
Ondrej
You did amazing work. I didn't realize it was already in 2005. It
must have been terrible to make sure it works everywhere.
Fortunately, today there seem to be good javascript libraries for
everything and they seem to work pretty well almost everywhere.
>
>> a) the keyboard handling is horrific, why not to use some standard
>> library for that, that works across all browsers
>
> As noted, I did this before jQuery existed -- I searched hard and long before deciding to write my own keyboard handler, and every "clean" approach I took failed in a browser or two -- the "horrific" result works in every platform I've tried, so long as one stays away from the alt key in Safari, IIRC.
>
> Now that Opera is no longer an obstruction, there's only one reason not to use a standard library: it's been written, and it works. Rewrite it! I love seeing my javascript get rewritten!
I am thinking of using pyjamas, if it works for this, *that* would be
awesome. Having everything in Python.
>
>> b) it uses some custom format for transferring data (which has bugs,
>> like http://groups.google.com/group/sage-devel/browse_thread/thread/5ecd104b0aa85439),
>> why not to use JSON?
>
> Again... it worked after we wrote it. It became too much work to replace, so we kept cobbling more on.
Right. By no means was it meant to criticize your work. :) I was just
saying that we can do better today with all those nice js libraries.
>
>> * it doesn't run on the google appengine (William mentioned in the
>> past, that he doesn't see any benefit to do that, or that it would be
>> slow)
>>
>> Well, talk is cheap, so here is the code (a sample Firefox screenshot
>> is also attached in case it didn't work in your browser):
>>
>> http://pythonnb.appspot.com/
>
> Might have to take back what I said earlier... Shift-enter causes an extra newline to be placed in the cell below the current one in Opera 9.
This newline is a bit hackish still --- basically the textarea
really sucks, it doesn't have a function for getting a cursor position
and it cannot resize automatically. Everything has to be written
indirectly.
>
>>
>> it uses jQuery all over, it uses a keyboard plugin for jQuery, it uses
>> JSON and it runs on the google appengine (and anywhere else too, it's
>> just a standard django app). I tested in Firefox and IE8. The keyboard
>> works, there are just some subtle bugs on IE8, see here:
>>
>> http://github.com/certik/notebook/blob/375a2026ee7ea721904d05068724b3a7663d018e/todo
>>
>> but none of it seems major to me, the keyboard seems to be working
>> just fine (or is IE8 not the most problematic? I'll try to test in
>> other browsers like Opera and Safari too). Here is the index.html with
>> all the javascript that I wrote:
>>
>> http://github.com/certik/notebook/blob/375a2026ee7ea721904d05068724b3a7663d018e/templates/index.html
>>
>> It handles most of the keyboard interaction. It doesn't have TAB
>> completion and inspection yet.
>
> Initial reaction: NICE!!! But... I only see about 20% of the functionality we really need, and the last 10% typically takes as long as the first 90%.
That's right.
>
> Criticism: when one presses the up arrow accidentally at the top of a cell, it is obnoxious for the cursor to jump to the top of the next cell up.
Yes, in fact this is the first thing in my TODO file:
http://github.com/certik/notebook/blob/375a2026ee7ea721904d05068724b3a7663d018e/todo
>
> Suggestion: the introspection interface, as written, is utter shit. It's literally the first thing that I got to work, and it's never been reworked. I've been wanting to move the introspect "window" to a floating div that can be torn out of the window -- but I have little skill when it comes to using the new-fangled javascript libraries, so I haven't done this. At the very least, I think it should appear on the right-hand side of the window so one can both read the documentation, and the text at the top of their long cell.
Yes, I think this is the second thing that I want to write. Maybe
after pyjamas --- now, when I understand how to debug all those AJAX
requests, I am eager to look into that.
I really tried to avoid javascript and all this AJAX thing, but I must
say it's really exciting! :)
I think I will try to write GUI for our FEM stuff in the browser.
Browser is the best thing.
>
>>
>> Well, let me say that I really like to run things on the appengine,
>> rather than to constantly maintain our own servers. I see no reason
>> why the notebook cannot run on the appengine, only the AJAX would talk
>> to our own server with Sage to actually evaluate the cells (and for
>> many people, I think appengine itself could actually be enough). I
>> have to think though what the best way to transfer data to the
>> database with worksheets is though.
>>
>> I wanted to ask --- which parts of the Sage notebook are BSD licensed?
>> I used a bit of the CSS styles and maybe one javascript function,
>> everything else was written by me. If possible, I'd like to use the
>> BSD license for the notebook (if I find time to work on it further),
>> so that ipython can use it by default.
>
> Every single line I have written for the notebook is BSD licensed. However, William, Alex Clemesha, Jason Grout, and Robert Bradshaw have all contributed javascript code, so I'd like to hear from them before making a blanket statement about the file. I believe that Dorian Raymer and Mike Hansen may have contributed, too. Am I missing anybody? Robert Miller?
>
>> Also, question to all, do you like the In [3] and Out[3] lines? I
>> don't have an opinion on it yet myself, so I implemented them, to see
>> how it looks like. Also, please let me know if it works in your
>> browser.
>
> NO! I think they're terrible. The more space a cell can occupy, the better. I dislike how much border & space the current Sage notebook has.
That was another thing --- I really want the notebook to be
configurable, so that it's easy to rebrand it (e.g. change Sage to
something else), easy to change look & feel, like the thing above.
Ideally just by changing the cell html prototype and CSS styles.
Ondrej
I think it would be great to have the notebook (linked to Sage) run in
Google Apps.
Ondrej Certik wrote:
> [snip]
>
> Also, question to all, do you like the In [3] and Out[3] lines? I
> don't have an opinion on it yet myself, so I implemented them, to see
> how it looks like. Also, please let me know if it works in your
> browser.
>
I am used to the In [3] and Out [3] display from Mathematica and I liked
it. The sage notebook has similar tags internally, but for some reason
they are not displayed in the notebook. The good thing is that if you do
Evaluate all, the In [1] etc. are numbered consecutively and you can use
this to refer to bits of notebook code in a published notebook or a pdf
version of your notebook. This comes in very handy if you use a notebook
to derive equations used in e.g. Fortran and you would like to point to
the right place in the notebook in your Fortran code.
Stan
Very nice! The log shows you've been committing to it for only one day!
That's amazing.
It seems to work on Firefox 3.5.1 on Ubuntu 9.04 32-bit.
Jason
> Hi,
>
> I finally learned javascript and AJAX, so that I can help with the
> notebook. I also studied its sources.
>
> First things I like:
>
> * I like the user interface, it's usable, especially the attention to
> little details, like borders around the cells, tab completion, tab
> indentation and things like that.
>
> Things I don't like:
>
> * the javascript is really hackish overall, but two things really
> caught my attention:
> a) the keyboard handling is horrific, why not to use some standard
> library for that, that works across all browsers
> b) it uses some custom format for transferring data (which has bugs,
> like http://groups.google.com/group/sage-devel/browse_thread/thread/5ecd104b0aa85439),
> why not to use JSON?
> * it doesn't run on the google appengine (William mentioned in the
> past, that he doesn't see any benefit to do that, or that it would be
> slow)
Very cool! AJAX, and javascript libraries, and browsers have improved
a lot since the notebook was first written--I think a lot of this can
be cleaned up now.
> Well, talk is cheap, so here is the code (a sample Firefox screenshot
> is also attached in case it didn't work in your browser):
>
> http://pythonnb.appspot.com/
>
> it uses jQuery all over, it uses a keyboard plugin for jQuery, it uses
> JSON and it runs on the google appengine (and anywhere else too, it's
> just a standard django app). I tested in Firefox and IE8. The keyboard
> works, there are just some subtle bugs on IE8, see here:
>
> http://github.com/certik/notebook/blob/375a2026ee7ea721904d05068724b3a7663d018e/todo
>
> but none of it seems major to me, the keyboard seems to be working
> just fine (or is IE8 not the most problematic? I'll try to test in
> other browsers like Opera and Safari too). Here is the index.html with
> all the javascript that I wrote:
>
> http://github.com/certik/notebook/blob/375a2026ee7ea721904d05068724b3a7663d018e/templates/index.html
>
> It handles most of the keyboard interaction. It doesn't have TAB
> completion and inspection yet.
>
> Well, let me say that I really like to run things on the appengine,
> rather than to constantly maintain our own servers. I see no reason
> why the notebook cannot run on the appengine, only the AJAX would talk
> to our own server with Sage to actually evaluate the cells (and for
> many people, I think appengine itself could actually be enough). I
> have to think though what the best way to transfer data to the
> database with worksheets is though.
+1, though for Sage we rely heavily on compiled code. I wonder how
much introduced latency there would be if the backend were served on
a university computer, and the front end in appengine.
> I wanted to ask --- which parts of the Sage notebook are BSD licensed?
> I used a bit of the CSS styles and maybe one javascript function,
> everything else was written by me. If possible, I'd like to use the
> BSD license for the notebook (if I find time to work on it further),
> so that ipython can use it by default.
I release everything I've contributed under sage/server/* under BSD.
Here's a complete list. It looks longer than it is, and I bet most of
these people only contributed once. It'll be cleaner when it's
separated into its own spkg.
$ hg log sage/server/*/*.py* | grep "user:" | sort | uniq
user: "Justin C. Walker <jus...@mac.com>"
user: 'Martin Albrecht <ma...@informatik.uni-bremen.de>'
user: Alex Clemesha <clem...@gmail.com>
user: Alexandru Ghitza <agh...@alum.mit.edu>
user: Bobby Moretti <mor...@u.washington.edu>
user: Carl Witty <cwi...@newtonlabs.com>
user: Christian Wuthrich <christian...@gmail.com>
user: Dan Drake <dr...@kaist.edu>
user: Dan Drake <dr...@mathsci.kaist.ac.kr>
user: Dorian Raymer <deld...@gmail.com>
user: Harald Schilly <harald....@gmail.com>
user: Igor Tolkov <ito...@gmail.com>
user: J. H. Palmieri <palm...@math.washington.edu>
user: Jason Grout <gr...@rayunion.org>
user: Jason Grout <jason...@creativetrax.com>
user: John H. Palmieri <palm...@math.washington.edu>
user: Karl-Dieter Crisman <kcri...@gmail.com>
user: Marshall Hampton <hamp...@gmail.com>
user: Martin Albrecht <ma...@informatik.uni-bremen.de>
user: Mike Hansen <mha...@gmail.com>
user: Mitesh Patel <qed...@gmail.com>
user: Nick Alexander <ncale...@gmail.com>
user: Paul Dehaye <pauloli...@gmail.com>
user: Paul Zimmermann <zimm...@loria.fr>
user: Rob Beezer <bee...@ups.edu>
user: Robert Bradshaw <robe...@math.washington.edu>
user: Robert L. Miller <r...@rlmiller.org>
user: Robert Miller <rlmil...@gmail.com>
user: Timothy Clemans <timothy...@gmail.com>
user: Tom Boothby <boo...@u.washington.edu>
user: Wilfried Huss <hu...@finanz.math.tugraz.at>
user: William Stein <wst...@gmail.com>
user: William Stein <wst...@ucsd.edu>
user: Yi Qiang <yqi...@gmail.com>
user: agc@kubuntu
user: boo...@eight.math.washington.edu
user: boothby@localhost
user: boo...@localhost.localdomain
user: boo...@u.washington.edu
user: mabshoff@localhost
user: mabs...@sage.math.washington.edu
user: root@sage
user: sa...@ubuntu-server.localdomain
user: w...@bsd.local
user: w...@keyah.local
user: was@localhost
user: w...@localhost.localdomain
user: was@ubuntu
user: wst...@gmail.com
> Also, question to all, do you like the In [3] and Out[3] lines?
No, but maybe that's just me.
> I don't have an opinion on it yet myself, so I implemented them, to
> see
> how it looks like. Also, please let me know if it works in your
> browser.
Works great for me.
- Robert
> I release everything I've contributed under sage/server/* under BSD.
I also release everything I've contributed up to this point under
sage/server/* under BSD.
Jason
--
Jason Grout
http://code.google.com/appengine/docs/python/gettingstarted/
There are some sample projects at
http://code.google.com/p/google-app-engine-samples/
Actually, I was motivated not to rewrite the notebook, but to adapt a
Python web server for controlling and monitoring the process of building
and testing Sage.
For example, we might use a local web dashboard to run doctests and
quickly get a list of the failures. Machines on a build farm could
occupy individual tabs. Maybe individual Sage developers could send
automated build reports (and logs, as necessary) to sagemath.org or a cloud.
How soon into the build process can we bring Sage to life, that is,
start running at least a minimal server?
Awesome! I'll wait until you do it and then I'll just use your
templates. Just use jinja, it works great and django can use it too.
Ondrej
I added this to the TODO.
> new cell the numbering gets out of order which looks messy. What is
> the value in having them numbered?
The cells have to have some numbers, but they can be internal of
course. It helped me as a developer to see which cell is what,
especially when merging them. It seems most people don't like the
In/Out labels, so I will make them off by default and implement an
option to turn them on.
>
> By the way, if you print you don't see the results.
Yes, I need to catch stdout and send it to the browser. I added it to
the TODO list.
Many thanks for the feedback.
Ondrej
>> Well, let me say that I really like to run things on the appengine,
>> rather than to constantly maintain our own servers. I see no reason
>> why the notebook cannot run on the appengine, only the AJAX would talk
>> to our own server with Sage to actually evaluate the cells (and for
>> many people, I think appengine itself could actually be enough). I
>> have to think though what the best way to transfer data to the
>> database with worksheets is though.
>
> +1, though for Sage we rely heavily on compiled code. I wonder how
> much introduced latency there would be if the backend were served on
> a university computer, and the front end in appengine.
I think none, it would be as fast as it is now (i.e. the browser
communicating directly with the engine).
I would like to decouple Sage as the *engine* from the rest. The
engine should handle evaluating cells and storing and retrieving the
state (I guess). Then it can be used in services like Google Wave that
Harald is experimenting with etc.
The AJAX in the browser should be talking directly to the engine (i.e.
just like it is now). Where the rest of it is running, that doesn't
really matter imho and it should be possible to run it on the
appengine.
Ondrej
What I meant is that the latency between typing 1+1 into the cell and
getting the output cell saying 2 should not change at all, because the
javascript in the browser sends a POST request to the Sage engine
(i.e. a web app with the url interface, just like it is now) and the
engine returns the result directly to the browser.
What changes is the database storage: either the javascript in
the browser, once it receives the output of the cells, also sends it to
the appengine (or wherever the database is running), or the engine
sends it itself. I don't know yet which approach is better. So there
are some issues involved, like what happens if one of those connections
fails. But as long as both connections are up and running, the user
would not notice anything at all.
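A rough sketch of the first variant (the browser talks to both the engine and the storage) might look like the following. Nothing here is existing code: `evaluateCell`, `postToEngine`, `saveToStore` and the `unsaved` flag are all hypothetical, and the two callbacks stand in for the two AJAX connections so the flow can be followed without a network.

```javascript
// postToEngine(payload, cb) and saveToStore(payload, cb) are assumed
// callbacks wrapping the two connections; cb is called as cb(err, result).
function evaluateCell(cell, postToEngine, saveToStore, done) {
    var request = { cell_id: cell.id, source: cell.source };
    postToEngine(request, function (err, result) {
        if (err) return done(err);
        // The user sees the output as soon as the engine answers...
        done(null, result.output);
        // ...and the worksheet is stored afterwards, so the storage
        // round trip does not add to the latency the user perceives.
        var record = { cell_id: cell.id, source: cell.source, output: result.output };
        saveToStore(record, function (saveErr) {
            // If the storage connection fails, remember it for a later retry.
            if (saveErr) cell.unsaved = true;
        });
    });
}
```

The other variant, where the engine itself pushes results to the storage, would remove `saveToStore` from the browser entirely, at the cost of the engine needing to know where the database lives.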
> latency, i.e., whatever there is between appengine and the "sage
> engine". That said, the internet is pretty fast these days :-). And
> the scalability of a decoupled approach like we're talking about is a
> big plus, if it works.
Right, it has to be tried to see if it works. But I think it's worth trying.
>
> By the way, if you haven't already, I personally think you should
> start a mailing list, web page, trac, etc. for a separate notebook
> project, since you're already writing code. There's already some
> confusion about where we are supposed to have this discussion -- and a
> funny mix of sage-devel and codenode doesn't seem right.
Well, I hope the codenode guys could pick this up and they would be the
notebook. Unfortunately I probably can't spend much time on this
until September. But I wanted to get this going to see which approach
to take.
I wrote the above in about 2 days (roughly), but it's only the first
90%, e.g. the cells sort of work, but the remaining 10%, like tab
completion, worksheets, saving, loading, publishing, users, fixing it
so that it works 100% in all browsers... That would take a lot more
time, and I can't do it yet. But I hope it encourages all of you to
learn some AJAX too by September, so that we can work on this
together. :)
There is one more thing I want to try -- pyjamas, as pointed out
above. I already played with it yesterday, and what I saw so far is
*impressive*. So my next step will be to rewrite what I did into
pyjamas (e.g. just pure python both on the server and in the browser).
If that works and I think it could, well, that would be the way to go,
since I could debug all those functions like for calculating cursor
positions etc. in Python.
Ondrej
As for me, definitely use CSS for everything and remove all tables if
there are any. CSS is easy for people to customize.
Ondrej
I agree with everything you wrote.
Only one suggestion -- could you take my simple frontend for the cells
and incorporate it in codenode? I mean how things *look* like, so that
it looks like the Sage notebook. The default codenode look & feel
doesn't work well in my browser, since I can't figure out where to
click to find the cell, the cursor changes in some weird way and
generally it's confusing to me etc. So that's a major problem, but the
fix is really easy, just change the bit of the javascript + CSS styles
and it will look like Sage. There could be some option to choose
between the two designs if you prefer the current codenode style.
Ondrej
That's exactly correct.
Another possibility is to change 5) into 5'):
5') the Sage engine talks to the appengine database server directly.
The advantage of 5') over 5) is that the Sage engine should be running
on some fast network anyways (thus the communication Sage engine <->
app engine server will be fast), but the user's laptop can be on some
crappy connection.
>
> I think there could be some weird security issues/tricks involved with the
> javascript in the browser directly doing AJAX calls to the "compute engine"
> above, but there are hacks to get around that.
Right.
> There's also twice the
> communications overhead between the user's javascript and remote machines
> than in the current Sage notebook model where everything goes through the
> notebook server. E.g., if the output of a Sage command (in step 4 and 5
> above) is large, e.g., a 10MB image, then that image is going to go all
> over the place, both uploaded and downloaded, which will be incredibly
> expensive.
I agree, I think we should use 5'). E.g. if the database engine and
the Sage engine are running on the same machine, that's the current
design, but if they are decoupled, but connected using fast internet,
it could work.
The appengine database backend has to have some notion of the engine
anyways, so it might as well retrieve the results from it.
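To put rough numbers on the difference between 5) and 5'), here is a small back-of-the-envelope sketch. The function name user_link_traffic is made up, and it only counts the cell output crossing the user's own connection, ignoring the (small) input code and protocol overhead:

```python
def user_link_traffic(output_bytes, engine_stores_result):
    """Bytes crossing the user's (possibly slow) connection for one cell.

    engine_stores_result=False models 5): the browser downloads the output
    and then uploads it again to the database.
    engine_stores_result=True models 5'): the engine sends the output to the
    database itself, so the browser only downloads it.
    """
    download = output_bytes
    upload = 0 if engine_stores_result else output_bytes
    return download + upload


image = 10 * 1024 * 1024  # the 10MB image from the discussion
via_browser = user_link_traffic(image, engine_stores_result=False)
via_engine = user_link_traffic(image, engine_stores_result=True)
print(via_browser // image, via_engine // image)
```

Under these assumptions 5) moves the image over the user's link twice and 5') only once; the second copy in 5') travels between the engine and the database instead.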
I agree that it might be too complex/tricky/error prone. I simply don't know.
>
>>
>> What changes is the database storage, e.g. either the javascript in
>> the browser, once it receives the output of the cells also sends it to
>> the appengine (or whenever the database is running), or the engine
>> sends it itself, I don't know yet which approach is better. So there
>> are some issues involved, like if one of those connections fail etc.
>> But as long as both connections are up and running, the user would not
>> recognize anything at all.
>
> This is an interesting design. It hadn't occurred to me before. It would be
> interesting to see whether it is any good or not (I can't tell).
Me neither.
>
> I can tell you one thing, which is that when I start working on the notebook
> again seriously this September, my first goal will be to create a powerful
> system for simulating the load of n people all using the notebook at once in
> a potentially heterogenous way (say from several different computers,
> etc.). This testing code will be hopefully generic enough to work with
> codenode, sagenb, etc. I think having actual benchmark testing code will
> in the long run be a better litmus test for designs than us just thinking
> about them in the abstract.
>
> I could pronounce the design you suggest above as "bad" for several reasons,
> but what if I'm wrong and in fact the design above, with some tweaks and
> insights that would result from testing, turns out to be amazingly good?
Exactly. I don't know myself and I am not sure about exact technical
details of my design, e.g. 5) vs 5') etc. But my motivation is that I
really want it to be able to run on the appengine completely if
needed, because there are tons of situations, where I just want to
show off some simple thing, be it sympy, or just some simple algorithm
in python and I really *don't* want to maintain my own server for
that.
At the same time however, I really would like to just create a simple
engine with web API (be it Sage, or anything else), and I would like
to maintain just this engine and if it dies, the frontend (running
somewhere else) would just use a different engine, or whatever.
So I would like to have that, but if it's possible to get everything
right and robust and fast, I simply don't know.
> I strongly encourage you to test pyjamas with the above. I think that's the
> best possible next step.
I will report later on this. It seems to work, but I can already see a
big issue -- it seems a bit slow (e.g. the generated javascript in the
browser). But it's too early to tell, once I implement the same thing,
we can then compare which approach is the best in the long run.
Ondrej
(1) Can pyjamas cleanly make use of arbitrary javascript libraries?
(2) Is there a list of nontrivial examples where pyjamas is actually
used to implement javascript apps?
-- William
Yes. But you don't have to write javascript and if you do things
correctly, the same file executes on your desktop using python, thus
you can doctest the whole thing.
I already implemented the textbox resizing, here is the code:
---------------
import pyjd # this is dummy in pyjs.
from pyjamas.ui.RootPanel import RootPanel
from pyjamas.ui.Button import Button
from pyjamas.ui.HTML import HTML
from pyjamas.ui.Label import Label
from pyjamas import Window
from pyjamas.ui.TextArea import TextArea
from pyjamas.ui import KeyboardListener

def greet(fred):
    print "greet button"
    Window.alert("Hello, AJAX!")

class InputArea(TextArea):

    def __init__(self, echo):
        TextArea.__init__(self)
        self.echo = echo
        self.addKeyboardListener(self)
        self.addClickListener(self)
        self.set_rows(1)
        self.setCharacterWidth(80)

    def onClick(self, sender):
        print "on_click"

    def rows(self):
        return self.getVisibleLines()

    def set_rows(self, rows):
        if rows in [0, 1]:
            # this is a bug in pyjamas, we need to use 2 rows
            rows = 2
        # the number of rows seems to be off by 1, another bug in pyjamas
        self.setVisibleLines(rows-1)

    def cols(self):
        return self.getCharacterWidth()

    def occupied_rows(self):
        text = self.getText()
        lines = text.split("\n")
        return len(lines)

    def cursor_coordinates(self):
        """
        Returns the cursor coordinates as a tuple (x, y).
        Example:
        >>> self.cursor_coordinates()
        (2, 3)
        """
        text = self.getText()
        lines = text.split("\n")
        pos = self.getCursorPos()
        i = 0
        cursor_row = -1
        cursor_col = -1
        #print "--------" + "start"
        for row, line in enumerate(lines):
            i += len(line) + 1 # we need to include "\n"
            # print len(line), i, pos, line
            if pos < i:
                cursor_row = row
                cursor_col = pos - i + len(line) + 1
                break
        #print "--------"
        return (cursor_col, cursor_row)

    def insert_at_cursor(self, inserted_text):
        pos = self.getCursorPos()
        text = self.getText()
        text = text[:pos] + inserted_text + text[pos:]
        self.setText(text)

    def onKeyUp(self, sender, keyCode, modifiers):
        #print "on_key_up"
        x, y = self.cursor_coordinates()
        rows = self.occupied_rows()
        s = "row/col: (%s, %s), cursor pos: %d, %d, real_rows: %d" % \
                (self.rows(), self.cols(), x, y, rows)
        self.set_rows(rows)
        self.echo.setHTML("Info:" + s)

    def onKeyDown(self, sender, key_code, modifiers):
        if key_code == KeyboardListener.KEY_TAB:
            self.insert_at_cursor(" ")
            print "TAB"

    #def onKeyDownPreview(self, key, modifier):
    #    print "preview"

    def onKeyPress(self, sender, keyCode, modifiers):
        #print "on_key_press"
        pass

if __name__ == '__main__':
    pyjd.setup("../templates/Hello.html")
    b = Button("Click me", greet, StyleName='teststyle')
    h = HTML("<b>Hello World</b> (html)", StyleName='teststyle')
    l = Label("Hello World (label)", StyleName='teststyle')
    echo = HTML()
    t = InputArea(echo)
    RootPanel().add(b)
    RootPanel().add(h)
    RootPanel().add(l)
    RootPanel().add(t)
    RootPanel().add(echo)
    pyjd.run()
---------------
And it mostly works, up to some bugs in pyjamas (like that the textbox
can't be set to just 1 row, only 2 rows or more), that I will try to
solve with pyjamas developers, hopefully they are simple to fix.
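As a rough illustration of the "you can doctest the whole thing" point, the cursor arithmetic can be pulled out into a plain function that needs no browser and no pyjamas at all. This standalone version (taking text and pos as arguments instead of reading them from the widget) is my own restatement for the sketch, not code from the repository:

```python
import doctest


def cursor_coordinates(text, pos):
    r"""Return (column, row) of the character offset pos inside text.

    >>> cursor_coordinates("ab\ncd", 0)
    (0, 0)
    >>> cursor_coordinates("ab\ncd", 2)
    (2, 0)
    >>> cursor_coordinates("ab\ncd", 4)
    (1, 1)
    """
    end_of_line = 0
    for row, line in enumerate(text.split("\n")):
        end_of_line += len(line) + 1  # +1 accounts for the "\n"
        if pos < end_of_line:
            return (pos - end_of_line + len(line) + 1, row)
    return (-1, -1)  # pos lies past the end of the text


# Prints nothing when all examples in the docstring pass.
doctest.run_docstring_examples(
    cursor_coordinates, {"cursor_coordinates": cursor_coordinates}
)
```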
If you look at the cursor_coordinates() function, this is a real PITA
to debug in javascript -- I mean, you essentially need tests for the
javascript etc. If this could be avoided, that'd be a huge win. The
generated javascript for the function above is:
cls_definition.cursor_coordinates =
pyjs__bind_method(cls_instance, 'cursor_coordinates', function() {
    if (this.__is_instance__ === true) {
        var self = this;
    } else {
        var self = arguments[0];
    }
    var text = self.getText();
    var lines = text.split(String('\x0A'));
    var pos = self.getCursorPos();
    var i = 0;
    var cursor_row = -1;
    var cursor_col = -1;
    var __temp_row = pyjslib.enumerate(lines).__iter__();
    try {
        while (true) {
            var temp_row = __temp_row.next();
            var row = temp_row.__getitem__(0);
            var line = temp_row.__getitem__(1);
            i += ( pyjslib.len(line) + 1 ) ;
            if (pyjslib.bool((pyjslib.cmp(pos, i) == -1))) {
                cursor_row = row;
                cursor_col = ( ( ( pos - i ) + pyjslib.len(line) ) + 1 ) ;
                break;
            }
        }
    } catch (e) {
        if (e.__name__ != 'StopIteration') {
            throw e;
        }
    }
    return new pyjslib.Tuple([cursor_col, cursor_row]);
}
So that doesn't look bad. But it's definitely slower than it could be
if it used a javascript for loop directly. But as I said, let's wait
until I implement the whole thing and let's see. Also, pyjamas allows
embedding javascript code, so we may write the critical code in
javascript itself.
Ondrej
I think it can:
http://groups.google.com/group/pyjamas-dev/browse_thread/thread/639dffd00d6b7c7/
but I am still learning it.
>
> (2) Is there a list of nontrivial examples where pyjamas is actually
> used to implement javascript apps?
the examples directory is full of interesting examples --- well, it
depends what you mean by nontrivial. Let me just implement it and then
we'll see.
Ondrej
I don't know about how the web server is implemented. I know it did not
work on my Solaris box, but that is another matter.
But actually including Apache might be a sensible choice. A lot of
people know how to administer Apache. It offers a lot of flexibility.
You can for example only serve pages to particular IP addresses.
Worth a thought anyway.
dave
here is an early preview of the pyjamas version:
http://2.latest.pythonnb.appspot.com/
So far my experience is:
* it doesn't work in IE8 (that's a showstopper)
* it's fast enough
* implementing the cursor positions and resizing was a piece of cake
(I was very impressed)
* learning the whole framework took me some time, one has to read sources a lot
* if it doesn't work, it's a bit difficult to debug (because it's not
just javascript, I need to figure out where I made the mistake in the
python code), I basically use git a lot and always do a small change
and test, small change and test. If it fails, I break my changes in
half and test, etc.
Essentially, pyjamas provides complete DOM access (just like jQuery),
but in Python, and then builds its own widgets on top of it.
I am now learning how to do AJAX with it. So far only the cursor
movement and cells work (the focus is not yet shown by a blue line,
I'll do that later). Try this:
def f(x):
<hit TAB couple times>
and then hit <backspace>, you will see that it deletes 4 spaces, but
in a clever way, e.g. if you are at a position 7, it goes to 4 first
and then to 0. This is how my vim is set up for python editing and I
like it a lot.
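For reference, the backspace rule described above can be sketched as a pure function on the text and the cursor offset. This is only an illustration of the rule (the name smart_backspace and the TAB_WIDTH constant are made up here), not the code that runs in the preview:

```python
TAB_WIDTH = 4


def smart_backspace(text, pos):
    """Delete backwards from offset pos; return (new_text, new_pos)."""
    if pos == 0:
        return text, 0
    line_start = text.rfind("\n", 0, pos) + 1  # 0 when on the first line
    before_cursor = text[line_start:pos]
    if before_cursor and before_cursor.strip(" ") == "":
        # Only indentation before the cursor: go back to the previous tab stop.
        column = len(before_cursor)
        new_column = ((column - 1) // TAB_WIDTH) * TAB_WIDTH
        new_pos = line_start + new_column
    else:
        new_pos = pos - 1  # ordinary backspace: remove one character
    return text[:new_pos] + text[pos:], new_pos


# Cursor at column 7 of an indented line: first to column 4, then to 0.
text = "def f(x):\n" + " " * 7
text, pos = smart_backspace(text, len(text))
after_first = text
text, pos = smart_backspace(text, pos)
after_second = text
```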
Let me know if it works in your browser so far. I only tested firefox
3.5, that works fine. IE8 doesn't load the javascript, e.g. you will
see no textbox. Also, in Firefox I get frequent error messages
(printed in the actual HTML):
"
JavaScript Error: Permission denied to get property
HTMLDivElement.parentNode at line number 9254. Please inform
webmaster.
"
It's a bug in pyjamas.
Ondrej
Also cell joining is not yet implemented. But it works in the Chrome
browser, so at least something.
Ondrej
I implemented the AJAX thing as well, here is a working example (sort of):
http://3.latest.pythonnb.appspot.com/media_files/output/index.html
Besides what I wrote above, there is some problem with CSS styles,
which shows up when you evaluate some cell and see the output, the
"insert new cell blue thin line" is misplaced.
I implemented it in the "pyjamas" branch in my repository (link is on
the webpage), so there are some leftovers (like the jQuery.js library,
which is *not* needed anymore, etc.). The django backend didn't change
at all and now the whole notebook doesn't contain a single line of
javascript (everything is pure python). I think that itself is pretty
impressive. Here is the file, that gets translated:
http://github.com/certik/notebook/blob/001b4ddf444b480822adf9216419afa1adaf4818/media/index.py
In terms of lines of code, it's about the same as my previous version
in javascript:
http://github.com/certik/notebook/blob/ca4e6a90a3f0c10c78c8c99d4d55055ba5019c28/templates/index.html
But there are still some things missing (e.g. joining and deleting the
cells). The python code should be refactored first though, there
should be a class Cell, that should have references to its children,
like the input/output cells etc. and this class should know how to turn
on/off the output cell etc. Currently I am using the DOM directly in a
bit hackish way, this should be polished. Essentially I was fighting
pyjamas to access the elements in the DOM, for example jQuery's
analogs of insert before and after are not available (resp. it's
tricky).
Nevertheless, overall, I like the pyjamas approach and the above
things can be fixed. Also now there is a nice possibility to mock up
the controls and run regular python unittests on the whole thing.
That's a big plus.
The remaining big problem is the IE8 support and the errors that
sometimes pop up in firefox. I reported it here:
http://groups.google.com/group/pyjamas-dev/browse_thread/thread/f170c3709c7f12ed
and I was told I am pretty much on my own with IE8. So that's very
disappointing of course, but maybe the fix is easy. If it is not, then
that's a big problem.
Ondrej
Many thanks for testing it. That's bad news. At the very top of
the pyjamas website they wrote:
"
Also, the AJAX library takes care of all the browser interoperability
issues on your behalf, leaving you free to focus on application
development instead of learning all the "usual" browser
incompatibilities.
"
But the reality is that it works, unless you run Firefox and unless
you run linux. :(
We'll see if it can be fixed. If not, we'll have to abandon this
approach and just use jQuery.
Ondrej
I mean as long as.
O.
>
> > My primary problem is that the Sage subprocess is blocking forever on
> > the other side of the pipe when its not computing... Therefore, I
> > can't have a Sage sub-process that I'm using in the notebook that is
> > also able to communicate with other processes as I can't
> > asynchronously receive data (or get timing interrupts). I've gotten
> > around this in the past by using threads as it was the only choice I
> > had.
>
> Thanks for the clarification. Since I don't really understand the
> problem, without further clarification I don't think it will get fixed
> in the near future.
Basically, the problem is that the Sage sub-process loses control when
it's done servicing a request from the server. Instead of entering an
event loop, it blocks on the pipe.
The alternative would be to return after any request to an event
loop. Clearly, the primary requester would be the notebook server,
Once this change was made, you'd have a full infrastructure with which
to build much more flexible applications yet still have the notebook
interface. This would also facilitate building distributed
computation engines, data collectors etc.
Furthermore, as things evolved, truly dynamic AJAX could be built
because the underlying Sage process could be asynchronously receiving
data, talking with other Sage processes, periodically polling other
servers (e.g. yahoo finance)
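A rough sketch of the "return to an event loop after every request" idea, using Python's standard asyncio module as a stand-in (it is not twisted and not the notebook's actual code; the names serve_requests and poll_external are hypothetical, and eval on a string is only a toy replacement for a real Sage evaluation):

```python
import asyncio


async def serve_requests(requests, log):
    """Handle notebook requests, returning to the event loop between them."""
    while True:
        code = await requests.get()  # yields to the loop instead of blocking
        if code is None:  # sentinel: shut down
            break
        log.append(("result", eval(code)))


async def poll_external(log, ticks, interval=0.01):
    """Stand-in for periodically polling another server or receiving data."""
    for n in range(ticks):
        await asyncio.sleep(interval)
        log.append(("tick", n))


async def main():
    log = []
    requests = asyncio.Queue()
    poller = asyncio.create_task(poll_external(log, ticks=3))
    server = asyncio.create_task(serve_requests(requests, log))
    await requests.put("1 + 1")
    await poller  # the worker is idle here, yet the ticks are still handled
    await requests.put("2 * 21")
    await requests.put(None)
    await server
    return log


event_log = asyncio.run(main())
print(event_log)
```

The point of the sketch is only that the process keeps servicing timer events while it waits for the next request, which a blocking read on a pipe does not allow.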
The following sage-devel thread is also about using XML-RPC with Sage:
http://groups.google.com/group/sage-devel/browse_thread/thread/202f9b2323d2771b/71fd656651eceb89
--
Regards
Minh Van Nguyen
Oops. I should have noticed and clicked on "Newer >", where I might
have read about several examples. I apologize.
>
> On Jul 22, 9:23 pm, William Stein <wst...@gmail.com> wrote:
>> On Wed, Jul 22, 2009 at 2:19 PM, ghtdak<gl...@tarbox.org> wrote:
>>> My primary problem is that the Sage subprocess is blocking
>>> forever on
>>> the other side of the pipe when its not computing... Therefore, I
>>> can't have a Sage sub-process that I'm using in the notebook that is
>>> also able to communicate with other processes as I can't
>>> asynchronously receive data (or get timing interrupts). I've gotten
>>> around this in the past by using threads as it was the only choice I
>>> had.
It sounds like you're trying to use the notebook as a monitor for
long-running processes, which it wasn't designed for, but could be done.
>
> Are the following relevant, realistic examples? I wish to...
>
> * Start, monitor, stop, and/or steer a long-running computation from
> a browser. The computation runs in a main loop that periodically
> checks for incoming messages upon which to act and sends out new
> messages as necessary.
Sounds like dsage (or what dsage should become).
> * Filter data automatically through a sequence of independent
> worksheet processes.
Again, sounds a lot like what dsage should have been.
- Robert
It might be a bit off topic, but personally I think an actual multi-
threaded app, where some threads may be blocked (and that's not a
problem because the other threads can continue on) is sometimes
easier to reason about than having to do everything asynchronously.
The asynchronous model works well when processing each event is
relatively quick or has a natural callback, but otherwise it often
feels like having to manually enforce multitasking so as to not block
the entire reactor. Multithreading will have to be introduced at one
level or another to scale the notebook to more than a single
processor anyways.
- Robert
Huh? Why? I don't see any need for multithreading to solve the
above problem, or rather I don't understand what problem you're
talking about. The notebook already scales to more than a single
processor.
I also now know precisely what Glenn Tarbox's original problem is,
since I've recently also experimented with using the Interactive
Broker's API from the notebook. It's an interesting nontrivial
problem. I hope to provide some demo code for Glenn once I work this
out...
-- William
>
> On Wed, Aug 19, 2009 at 1:19 AM, Robert
> Bradshaw<robe...@math.washington.edu> wrote:
>> Multithreading will have to be introduced at one
>> level or another to scale the notebook to more than a single
>> processor anyways.
>>
>> - Robert
>
> Huh? Why? I don't see any need for multithreading to solve the
> above problem, or rather I don't understand what problem you're
> talking about. The notebook already scales to more than a single
> processor.
I am talking about the case where there are enough users that the notebook
process itself becomes the bottleneck. It all depends on how lightweight
the shuffling data between the underlying processes and the browser is,
and how many concurrent users one wants to support for a single notebook.
In the asynchronous model there is only one thread handling all of the
connections. (Also, anything long-running, e.g. tarring up all of a user's
worksheets for download, needs to spawn a separate thread/process.)
Of course if the whole setup is running on a single machine, it may be
that the computational processes are always the bottleneck.
> I also now know precisely what Glenn Tarbox's original problem is,
> since I've recently also experimented with using the Interactive
> Broker's API from the notebook. It's an interesting nontrivial
> problem. I hope to provide some demo code for Glenn once I work this
> out...
Yes, my comment was completely independent of his original issue.
- Robert
Thanks for the clarification, which makes sense. There are other
approaches. If one had tens of thousands of simultaneous users,
instead of having multiple threads one could assign users to a
separate process (that could handle up to n users max) when they first
connect. That could scale better than SMP threads, since it is
easier to distribute the load across servers. Maybe this is just
orthogonal though.
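A toy sketch of that assignment policy, assuming a simple first-fit rule (the function name assign_user and the list-of-lists representation are invented for the illustration; a real server would spawn an actual process per entry):

```python
def assign_user(user, workers, max_users):
    """Place user in the first worker with room, starting a new one if needed.

    workers is a list of lists of user names; returns the worker's index.
    """
    for index, users in enumerate(workers):
        if len(users) < max_users:
            users.append(user)
            return index
    workers.append([user])  # every existing worker is full
    return len(workers) - 1


workers = []
placement = {
    name: assign_user(name, workers, max_users=2)
    for name in ["ann", "bob", "cat", "dan", "eve"]
}
print(placement)
```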
Regarding the tar example, one solution might be to run it as a
separate process, then later check if that process finished -- it is
not necessary for the notebook server process to wait for a subprocess
doing tar to finish before continuing. Another possibility if the
tar'ing happens in the same process would be to use fork.
For fun, I just looked at Activity Monitor on OS X, and sorted the
tasks I'm running by number of threads. The top is Firefox and the
bottom is python.
-- William
Yep. If the worksheet data were backed by a (synchronized) database then
this would work well.
> Regarding the tar example, one solution might be to run it as a
> separate process, then later check if that process finished -- it is
> not necessary for the notebook server process to wait for a subprocess
> doing tar to finish before continuing. Another possibility if the
> tar'ing happens in the same process would be to use fork.
The point is that you have to do this manually, rather than just letting
that thread block for a while. (I think the right asynchronous way to do
it would be to set up a callback for when it's done, rather than
repeatedly coming back to check on it.)
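A minimal sketch of the callback variant, using only the standard library. A thread pool stands in here for the separate process, and the names tar_worksheets and on_done as well as the file layout are made up for the example:

```python
import os
import tarfile
import tempfile
from concurrent.futures import ThreadPoolExecutor


def tar_worksheets(src_dir, archive_path):
    """Long-running job: pack a directory of worksheets into a tarball."""
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(src_dir, arcname="worksheets")
    return archive_path


finished = []


def on_done(future):
    # Called once when the job completes; nobody has to poll for it.
    finished.append(future.result())


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "worksheets")
    os.mkdir(src)
    with open(os.path.join(src, "sheet1.txt"), "w") as handle:
        handle.write("1+1\n")
    archive = os.path.join(tmp, "worksheets.tar.gz")

    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(tar_worksheets, src, archive)
        future.add_done_callback(on_done)
        # The server would keep handling other requests here.

    with tarfile.open(archive) as tar:
        names = tar.getnames()
```

Leaving the pool's with block waits for the worker, so by then the callback has run; a real server would not wait at all and would simply react when on_done fires.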
> For fun, I just looked at Activity Monitor on OS X, and sorted the
> tasks I'm running by number of threads. The top is Firefox and the
> bottom is python.
Not surprising. Also, I forgot about the GIL, which truly limits the
performance benefits of threading in Python. If anything ever kills
Python, I bet it'll be the GIL (but I'm hopeful that it'll get removed
before it causes an untimely death...)
- Robert
Yes, this is one of the advantages of using a database (or a shared
filesystem like apache does!) to store the worksheet data.
>> For fun, I just looked at Activity Monitor on OS X, and sorted the
>> tasks I'm running by number of threads. The top is Firefox and the
>> bottom is python.
>
> Not surprising. Also, I forgot about the GIL, which truly limits the
> performance benefits of threading in Python. If anything ever kills
> Python, I bet it'll be the GIL (but I'm hopeful that it'll get removed
> before it causes an untimely death...)
Maybe you can remove it :-)
William
Has anyone here ever experimented with Stackless Python?
I've been wondering about it for a while.
Thanks,
Jason
>>> Not surprising. Also, I forgot about the GIL, which truly limits the
>>> performance benefits of threading in Python. If anything ever kills
>>> Python, I bet it'll be the GIL (but I'm hopeful that it'll get removed
>>> before it causes an untimely death...)
>>
>> Maybe you can remove it :-)
>>
>> William
>
> Of course, this is the penultimate reason that going multi-threaded in
> python is insane... not only do you get the opportunity to learn all
> about synchronization and thread management, you also enjoy non-
> deterministic bugs which only take days or weeks to solve whereas more
> conventional logic bugs take many many minutes, sometimes even hours.
> (From a Keynesian economics perspective, going multi-threaded is
> justified just by the added work)
My point was that there is a benefit to going multi-threaded: you don't
have to manually set up callbacks/fork every time you might block. Whether
this simplification is worth the other complexities depends on the program
at hand (and probably the programmer as well).
- Robert
This is easy to read for someone who already knows twisted, but I
have a hard time convincing myself that it's easier to read than
def doThings():
    result1 = something_that_blocks()
    print("doing ThingOne", result1)
    result2 = sqrt(result1)
    print("doing ThingTwo", result2)
    result3 = another_thing_that_blocks(result2)
    print("doing ThingThree", result3)
    return result3*3.1415927
I wonder if this has contributed to the lack of contributions to the
notebook relative to other parts of Sage. Also, the painful part is
often to write the "xxx_that_blocks," especially if it's "blocking"
because it's computationally expensive, rather than waiting on an
external event that naturally triggers a callback. One needs to
manually chop the computation into small enough bits that no one
piece takes too long, and this can't be hidden from the implementer.
(Either that, or run the computationally intensive part in a separate
thread/process and set up a callback, but then one isn't avoiding
multiple threads...) And the twisted model is non-deterministic
(hence harder to debug) as well.
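To make the "manually chop the computation into small enough bits" point concrete, here is a small sketch using plain Python generators and a round-robin loop as a stand-in for a reactor. It is not twisted code, and the names chunked_sum and run_round_robin are made up:

```python
from collections import deque


def chunked_sum(n, chunk=1000):
    """Sum range(n) in small pieces, giving up control after each piece."""
    total = 0
    for start in range(0, n, chunk):
        total += sum(range(start, min(start + chunk, n)))
        yield  # hand control back so other tasks can run
    return total


def run_round_robin(tasks):
    """Very small stand-in for a reactor: advance each task one step in turn."""
    results = {}
    queue = deque(tasks.items())
    while queue:
        name, task = queue.popleft()
        try:
            next(task)
        except StopIteration as stop:
            results[name] = stop.value  # the generator's return value
        else:
            queue.append((name, task))
    return results


results = run_round_robin({"big": chunked_sum(10000), "small": chunked_sum(2500)})
print(results)
```

The implementer of chunked_sum has to pick the chunk size and place the yield by hand, which is exactly the burden that cannot be hidden in this model.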
I have to admit I'm playing a bit of the devil's advocate here, I've
used both twisted and threads (though admittedly I've never used
Python threads) and the twisted folks have implemented a very nice
model for dealing with this kind of thing. When there's a lot of
synchronization or global state threading can be a pain. But I think
it's far from obvious that the twisted model is a better fit (though
the notebook is a controller for multiple processes and simple
shuffler of data back and forth, so there is a strong case here).
- Robert
> I wonder if this has contributed to the lack of contributions to the
> notebook relative to other parts of Sage.
The Notebook has a major problem. The pexpect code is opaque and
kludgy and includes strings of python code which get injected into the
spawned notebook process amongst other issues. There's also lots of
handling of "corner cases" which have evolved over time. In my not so
humble opinion, it should all be ripped out and replaced with a proper
asynchronous interface. This could be accomplished with very little
effort, but it hasn't been a priority and it would require William's
buy-in and assistance. Until very very recently, this wasn't
feasible.
Well, I suppose reasonable people can disagree.
(not that I'm reasonable... but I'm pretty sure I'm right)
In the current architecture, a twistd daemon spawns a notebook server
which is responsible for doing "sage" stuff. twistd is fully
asynchronous, but the notebook process itself is a pexpect based
blocking process connected with pipes to twistd. As such, the block
on read by pexpect precludes the sage process servicing asynchronous
events.
IMHO, this architecture is incorrect and limited... Perhaps this is
part of what is being rethought... if not, I believe it should be.
The Sage notebook is a lot like the command line tools bash or screen
or even ssh. The pexpect library is just a collection of Python
bindings to pseudotty that make it easy for one process to spawn and
run subprocesses.
Moreover, as long as the worksheet and the notebook server are
distinct processes (as they should be, IMHO), the difference between
using pexpect, or xmlrpc, or anything else, for them to communicate is
completely and totally irrelevant, since it is a black box to the
entire rest of the program.
Also, to correct another possible misconception, communication between
a process and a subprocess using pexpect is not blocking. The
master process can listen for however long it wants to the
subprocess, then stop listening. That's why when you do
for i in range(10):
    sleep(1)
    print(i)
in the Sage notebook, you see the output as it is computed. The
notebook server just uses pexpect to "peek" at the output of the
subprocess doing the actual work and look to see what has been output
so far.
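The same "listen briefly, then go do something else" pattern can be sketched with only the standard library. This is not pexpect and not the notebook's code: a reader thread and a queue stand in for the pseudo-tty, and the names start_child and CHILD are made up:

```python
import queue
import subprocess
import sys
import threading

# Child that produces output slowly, like the loop in the example above.
CHILD = "import time\nfor i in range(3):\n    time.sleep(0.05)\n    print(i, flush=True)\n"


def start_child():
    """Start the child and pump its stdout lines into a queue."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD], stdout=subprocess.PIPE, text=True
    )
    lines = queue.Queue()

    def pump():
        for line in proc.stdout:
            lines.put(line.rstrip("\n"))
        lines.put(None)  # sentinel: the child closed its output

    threading.Thread(target=pump, daemon=True).start()
    return proc, lines


proc, lines = start_child()
seen = []
polls = 0
while True:
    polls += 1
    try:
        item = lines.get(timeout=0.01)  # listen briefly, then give up
    except queue.Empty:
        continue  # nothing new yet; a server would do other work here
    if item is None:
        break
    seen.append(item)
proc.wait()
print(seen, "collected in", polls, "polls")
```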
Another misconception is that pexpect is restricted to local
processes. It's easy to control a process via pexpect over the
network via ssh. This has been in Sage since 2005, and can already
be used for worksheet subprocesses *now* as long as you have a shared
filesystem (just use the server_pool option). Here is an example on
the command line. I have ssh keys set up so I can do "ssh
sage.math.washington.edu" and log in without typing a password. I
start Sage on my laptop in a coffee shop, and make a connection to a
remote Sage that gets started running on sage.math, and I run a
calculation.
flat:sageuse wstein$ sage
----------------------------------------------------------------------
| Sage Version 4.1.1, Release Date: 2009-08-14 |
| Type notebook() for the GUI, and license() for information. |
----------------------------------------------------------------------
sage: s = Sage(server="sage.math.washington.edu")
No remote temporary directory (option server_tmpdir) specified, using
/tmp/ on sage.math.washington.edu
sage: s.eval("2+2")
'4'
sage: s.eval("os.system('uname -a')")
'Linux sage.math.washington.edu 2.6.24-23-server #1 SMP Wed Apr 1
22:14:30 UTC 2009 x86_64 GNU/Linux\n0'
sage:
The above used pexpect. You can even interact with remote objects:
sage: e = s("EllipticCurve([1..5])")
sage: e.rank()
1
You can do the same with Mathematica, etc. by the way:
sage: s = Mathematica(server="sage.math.washington.edu")
sage: s("Factorial[50]")
30414093201713378043612608166064768844377641568960512000000000000
Compare my laptop to sage.math's mathematica:
sage: s("Timing[Factorial[10^6]][[1]]") # sage.math
1.1099999999999999
sage: mathematica("Timing[Factorial[10^6]][[1]]") # laptop
0.8902620000000001
(I guess Mathematica 7.0 is faster at factorials than Mathematica 6.0.)
This tests latency:
sage: timeit('s.eval("2+2")') # over web via ssh
5 loops, best of 3: 56.3 ms per loop
sage: timeit('mathematica.eval("2+2")') # local
625 loops, best of 3: 209 µs per loop
Of course latency is long over the net, since I'm in a random coffee shop.
This remote server stuff has been in sage since 2005, and hasn't been
changed in the slightest bit since then. That's why I'm advertising
it now, since it would be cool to see some people work on it and
improve it. For example, for people without ssh keys, one could
*easily* make it so the following works:
sage: s = Mathematica(server="sage.math.washington.edu")
password: xxx
sage: s = Mathematica(server="w...@sage.math.washington.edu")
password: xxx
Scripted logins via pexpect are in fact the raison d'etre for pexpect
in the first place, and would be easy to add. There are also bound
to be all kinds of subtle issues with server=... that haven't been
found due to lack of use. A good test would be to try to force the
gap or maxima interfaces to run 100% remotely (by editing
interfaces/gap.py or interfaces/maxima.py), then try to run the Sage
test suite and see what goes wrong.
With respect to the notebook, there is currently some reliance on a
shared filesystem for the worksheet processes. I think this could be
easily fixed via some slight redesign, and I'll do this in October.
I could even make it so that there is an option for a given worksheet
(set in say a worksheet configuration pane) for that worksheet to run
as a given user on a given remote system. Then whenever you use that
worksheet, you would have to log in to the remote system to start it
running, and afterwards all computations would happen using the
default "sage" command on that remote system over ssh. I think
implementing this would be completely straightforward given the
current notebook design, and already this would provide a level of
flexibility and power that rivals anything the codenode design or
anybody else has suggested. In case the above wasn't clear, one
could go to, say, https://sagenb.org, log in, and then have persistent
worksheet processes that run on sage.math.washington.edu, or any other
powerful specific computer you have an account on. This would give
you access to your own build of Sage, commercial software on that
machine, etc.
So there is still some potential to the pseudotty approach to
controlling processes. The main drawback in my mind is that it
works differently (and maybe not so well) on Windows (though it does
actually work, but via the "Console API").
-- William
William,
Thanks for clarifying some of the details of pexpect. I do really want to understand this because I am starting to use the notebook more and currently IPython's parallel stuff works fine (there are a few things that need to be fixed on our side to make it easier though).
Moreover, as long as the worksheet and the notebook server are
distinct processes (as they should be, IMHO), the difference between
using pexpect, or xmlrpc, or anything else, for them to communicate is
completely and totally irrelevant, since it is a black box to the
entire rest of the program.
I agree with you that for the rest of the program (the notebook) this detail is completely hidden. But I guess I don't quite follow your statement that the differences of using pexpect/twisted to manage this are irrelevant. In my mind there is a big difference between pexpect and twisted:
* pexpect simply controls and observes the worksheet process (I now understand that this can be asynchronous). The worksheet process doesn't have *any* custom code to enable this to work and probably doesn't even import pexpect (unless it does so for a completely different reason - like talking to Mathematica, etc.).
* To get Twisted to make two processes talk over TCP/IP (I am ignoring Twisted's ability to talk to a process in the same manner as pexpect, which I think it might be able to do - are you thinking of this?) *both* processes must start the Twisted reactor. Thus, if you wanted the notebook server and a worksheet process to talk over xmlrpc or pb, the worksheet process must be re-designed to run the Twisted reactor. To use IPython or dsage in a context like that, the Twisted reactor and user code must be run in different threads. This is a super subtle aspect of using Twisted in a blocking manner like the dsage and IPython clients do. I can give more details about this aspect if needed.
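The "event loop in one thread, blocking user code in another" arrangement can be sketched as follows. This is only an analogy: it uses the standard library's asyncio loop as a stand-in for the Twisted reactor, and LoopThread and remote_eval are hypothetical names invented for the example, not anything from IPython, dsage, or Sage.

```python
import asyncio
import threading


class LoopThread:
    """Run an event loop in a background thread (stand-in for the reactor)."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self.loop.run_forever,
                                        name="event-loop", daemon=True)
        self._thread.start()

    def call(self, coro, timeout=5.0):
        """Block the calling (user) thread until coro finishes on the loop thread."""
        future = asyncio.run_coroutine_threadsafe(coro, self.loop)
        return future.result(timeout)

    def stop(self):
        # The loop may only be stopped from its own thread, hence the handoff.
        self.loop.call_soon_threadsafe(self.loop.stop)
        self._thread.join()
        self.loop.close()


async def remote_eval(expr):
    """Hypothetical remote call: pretend to send expr and await the answer."""
    await asyncio.sleep(0.01)  # stand-in for a network round trip
    value = sum(int(term) for term in expr.split("+"))
    return str(value), threading.current_thread().name


if __name__ == "__main__":
    worker = LoopThread()
    print(worker.call(remote_eval("2+2")))  # user code blocks; the loop does not
    worker.stop()
```

The subtle part is that every interaction with the loop has to be handed across the thread boundary (here via run_coroutine_threadsafe and call_soon_threadsafe); calling into the loop directly from user code is not safe.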
So there is still some potential to the pseudotty approach to
controlling processes. The main drawback in my mind is that it
works differently (and maybe not so well) on Windows (though it does
actually work, but via the "Console API").
Question then: are you planning on continuing to use pexpect to communicate between the notebook server and worksheet or are you planning on moving to Twisted for that?
I'm reviving this thread, since I just got very curious about how to
solve the problem you (=Robert Bradshaw) were alluding to above in a
particular case, since I'm working on the notebook all weekend. As a
reminder, here is the problem: In the Sage notebook, we want to have
a feature where the user can click "Download all worksheets" and the
notebook server will prepare a zip archive of all their worksheets,
then hand it to the user. Robert Bradshaw implemented a function to
do this a few months ago, but it is disabled on the public Sage
notebook servers. Why? Because while the zip archive is being
created the notebook server simply ignores all other requests. In
particular, let's say I have 500 worksheets and creating sws files and
zipping them all up takes 30 seconds (that's about how long it
actually takes), then when I click that "Download all" link the entire
http://sagenb.org will appear to be down to everybody in the world for
the next 30 seconds. Not good, especially given that
http://sagenb.org has over 4000 more users now than it did a month
ago...
The Sage Notebook is a Twisted application, and Twisted's "deferreds"
might seem like a good idea for solving the above problem. However,
they are actually *not* at all meant to solve the above sort of
problem, which is made I think very clear by the Twisted
documentation, which lists two types of async problems -- cpu bound
and "waiting for a resource" bound. The problem, at its simplest
level, is that no matter what you do with Twisted deferreds -- making the
zip file little by little -- everything happens in a single thread,
and a total of at least 30 seconds of CPU time has to be spent by the
Sage notebook server making that zip archive. And that's 30 seconds
that the notebook server isn't responding to users, so overall the
notebook is going to feel sluggish to users. Also, it just seems
dumb to slow the notebook server down like this, given that, e.g.,
sagenb.org is running on an 8-core multicore virtual machine.
Fortunately,
http://twistedmatrix.com/projects/core/documentation/howto/gendefer.html
gives an example similar to this problem, and explains
how to easily solve it in two lines using *threads*. So I took the
big chunk of scary blocking code that Robert Bradshaw wrote, put it in
a closure (a little nested function f), and added the following two
lines to the server:
    from twisted.internet import threads
    return threads.deferToThread(f)
That's it. It worked first try, and solves the problem. What
happens behind the scenes is that Twisted uses a separate thread to
run the one function f, then when f completes it returns the output of
f. So it wraps the idea of "do something in a thread" with a
deferred.
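For readers without Twisted at hand, the same pattern can be sketched with the standard library, where loop.run_in_executor plays roughly the role that deferToThread plays above. This is an analogy under stated assumptions, not the notebook's code: build_archive, heartbeat, the fake worksheets, and the sleep that stands in for slow export are all invented for the example.

```python
import asyncio
import io
import time
import zipfile


def build_archive(worksheets):
    """Blocking work: zip every worksheet into an in-memory archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, text in worksheets.items():
            time.sleep(0.05)  # stand-in for a slow per-worksheet export
            zf.writestr(name + ".txt", text)
    return buf.getvalue()


async def heartbeat(ticks):
    """Stands in for the server answering other users' requests."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def main():
    worksheets = {"ws%d" % i: "2+2" for i in range(4)}
    ticks = []
    beat = asyncio.ensure_future(heartbeat(ticks))
    loop = asyncio.get_running_loop()
    # Hand the blocking function to a worker thread and await its result.
    data = await loop.run_in_executor(None, build_archive, worksheets)
    beat.cancel()
    return data, len(ticks)


def demo():
    return asyncio.run(main())


if __name__ == "__main__":
    data, ticks = demo()
    print(len(data), "bytes zipped;", ticks, "heartbeats meanwhile")
```

The heartbeat keeps ticking while the archive is built, which is the whole point. One caveat that applies to threads in CPython generally: pure-Python CPU-bound work still contends for the interpreter lock, so the benefit is largest when the blocking code spends its time in I/O or in C routines that release the lock.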
Twisted experts -- please explain the drawbacks of this approach...
By the way, the Twisted documentation has got way way better than I
remember it being in 2006.
-- William
I'm not a Twisted expert, but I know a lot about threads and
subprocesses. The basic problem is that the user calls something
synchronous when he/she requests a zip, but behind the scenes it's
done asynchronously. If you have solved this, perfect. The only case you
should try to catch is when the subprocess takes a little bit longer
and the user thinks it didn't work and starts the zipping again. Does
it prevent double invocations, and is there feedback if the process
fails / takes too long (timeout)? These are the things that come to
my mind ... just in case you haven't already thought about them ;)
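One possible way to handle both concerns, sketched with the standard library only: keep at most one in-flight job per user, so a second click gets the same job back, and let the caller wait with a timeout. ArchiveJobs, slow_build, and the timings are hypothetical names and values for illustration; nothing here is taken from the notebook code.

```python
import concurrent.futures
import threading
import time


class ArchiveJobs:
    """Hand out at most one in-flight archive job per user."""

    def __init__(self, build, max_workers=2):
        self._build = build
        self._pool = concurrent.futures.ThreadPoolExecutor(max_workers=max_workers)
        self._lock = threading.Lock()
        self._jobs = {}

    def request(self, user):
        """Return the user's running job, or start a new one if none is running."""
        with self._lock:
            job = self._jobs.get(user)
            if job is None or job.done():
                job = self._pool.submit(self._build, user)
                self._jobs[user] = job
            return job

    def shutdown(self):
        self._pool.shutdown(wait=True)


def demo():
    calls = []

    def slow_build(user):
        calls.append(user)
        time.sleep(0.2)  # stand-in for zipping many worksheets
        return "%s.zip" % user

    jobs = ArchiveJobs(slow_build)
    first = jobs.request("alice")
    second = jobs.request("alice")  # impatient second click
    try:
        first.result(timeout=0.01)  # too short on purpose
        timed_out = False
    except concurrent.futures.TimeoutError:
        timed_out = True            # this is where the user would get feedback
    result = second.result(timeout=5)
    jobs.shutdown()
    return first is second, timed_out, result, calls


if __name__ == "__main__":
    print(demo())
```

A timeout on result only stops the waiting; the job itself keeps running in its thread, so real feedback to the user would have to say "still working" rather than "cancelled".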
H