avoiding startup time cost with multiple invocations

Berkeley Churchill

Jan 10, 2018, 6:49:33 AM
to sage-support
As part of a research project we're using sage as a subroutine for some matrix computations over rings.  We have hundreds or thousands of these computations, but each computation is fairly quick.  Right now, for each computation we write python code into a .sage file, and then start a new process with "sage somefile.sage > output".  This works fine, except each time we incur the startup time, which is about 2.5 seconds (whereas running a tiny python program is < 0.02s).  Since we call sage hundreds of times, this adds up pretty quickly.  

I'm wondering if there's a better way to do this.  Is there a recommended practice?  Otherwise, is it possible to start sage in one process and then reuse it?  Or, if there's a way to start up sage quickly (comparable to python's start time) and then load the modules we need, that would be cool too.  I'm hoping for a solution that doesn't require many changes to our code or setting up server infrastructure.  We also want every computation to be independent of the others.  For instance, we don't want an exception in one computation to cause problems for a future one.  Any suggestions are appreciated.  If it makes any difference, our code base is C++.

Jori Mäntysalo

Jan 10, 2018, 7:04:52 AM
to sage-support
On Wed, 10 Jan 2018, Berkeley Churchill wrote:

> As part of a research project we're using sage as a subroutine for some matrix computations over rings.  We have hundreds or thousands of these computations, but each computation
> is fairly quick.  Right now, for each computation we write python code into a .sage file, and then start a new process with "sage somefile.sage > output".  This works fine, except
> each time we incur the startup time, which is about 2.5 seconds (whereas running a tiny python program is < 0.02s).  Since we call sage hundreds of times, this adds up pretty
> quickly.  

In Sage - actually in Python - you can handle exceptions:

L = [2, 0, -3]
for x in L:
    try:
        print(1/x)
    except ZeroDivisionError:
        print("Can't compute inverse of %s." % x)

So why can't you just have your C++ program write, say, 10000 matrix
computations to a single .sage file, each wrapped in a try-except structure?

--
Jori Mäntysalo

Vincent Delecroix

Jan 10, 2018, 1:30:02 PM
to sage-s...@googlegroups.com
Have a look at

https://docs.python.org/2/extending/embedding.html

The following does work for me

1. Create a test.c file with
{{{
#include <Python.h>

int
main(int argc, char *argv[])
{
    Py_SetProgramName("python");  /* optional but recommended */
    Py_Initialize();
    /* Import Sage once, then run the computation in the same interpreter. */
    PyRun_SimpleString("from sage.all import *\n"
                       "print ZZ(3).is_prime()\n");
    Py_Finalize();
    return 0;
}
}}}

2. Then to compile you need to do

$ sage -sh
$ gcc $(python2-config --cflags) $(python2-config --ldflags) test.c

3. Run the program (still in the sage-sh shell)

$ ./a.out
True

4. Exit sage environment

$ exit

Vincent

Berkeley Churchill

Jan 11, 2018, 2:35:25 AM
to sage-support
Thanks Vincent, this seems like it's in the right direction.  One possible solution would be to combine the embedded python interpreter with some error handling (as Jori suggested) and wrap all the computations in functions to act as a namespace for variables.  Can you explain to me what the sage-sh shell is?  It seems like a tightly-coupled dependency to run the software in such a specific environment.  Can it be compiled and run without the special shell?  If it's mainly setting environment variables, can we find out what they are so that we can instead manage them with our other tools?
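A rough sketch of the "error handling plus a namespace per computation" idea, in plain Python 3. The function `run_isolated` and the convention that a snippet stores its answer in a variable called `result` are assumptions made for illustration; with the embedded interpreter, something like this could be defined once and then called for every computation.

```python
import traceback


def run_isolated(source):
    """Execute a code snippet in a fresh namespace.

    Returns (True, value of `result`) on success, or (False, traceback text)
    if the snippet raised an exception.
    """
    namespace = {}  # fresh dict: nothing leaks in from earlier snippets
    try:
        exec(source, namespace)
        return True, namespace.get("result")
    except Exception:
        return False, traceback.format_exc()


ok_first, value_first = run_isolated("x = 5\nresult = x + 1")
ok_second, text_second = run_isolated("result = x")  # x is not defined here
ok_third, text_third = run_isolated("result = 1 / 0")
print(ok_first, value_first)
print(ok_second, ok_third)
```

This only isolates variables and exceptions. A snippet that modifies global state inside an imported module, or crashes the interpreter, would still affect later computations.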

(@Jori: unfortunately starting one process and doing all the computations at once won't work for us because we need to dynamically generate the n+1st computation based on the output of the nth computation.  We could theoretically port all that logic to python/sage, but we don't really think that's worth it right now)

Jori Mäntysalo

Jan 11, 2018, 2:54:26 AM
to sage-support
On Wed, 10 Jan 2018, Berkeley Churchill wrote:

> (@Jori: unfortunately starting one process and doing all the
> computations at once won't work for us because we need to dynamically
> generate the n+1st computation based on the output of the nth
> computation.  We could theoretically port all that logic to python/sage,
> but we don't really think that's worth it right now)

Maybe you should use a pipe, then? First

mkfifo thepipe
./sage -q < thepipe > thepipe

and in another window I tested with

jm58660@j-op7010:~/sage$ echo 1+2 > thepipe
jm58660@j-op7010:~/sage$ read result < thepipe
jm58660@j-op7010:~/sage$ echo $result
sage: 3
jm58660@j-op7010:~/sage$ tmp=$(echo $result | cut -f 2 -d ' ')
jm58660@j-op7010:~/sage$ echo $tmp+3 > thepipe
jm58660@j-op7010:~/sage$ read anotherresult < thepipe
jm58660@j-op7010:~/sage$ echo $anotherresult
sage: 6
jm58660@j-op7010:~/sage$ echo quit > thepipe

Or maybe use Sage as the controller, i.e. call the C++ program from Sage
instead of calling Sage from the C++ program?
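For the pipe approach, the Sage side could also be a small worker loop of its own rather than the interactive prompt, which avoids having to strip the "sage: " prefix from each answer. The sketch below is plain Python 3 and only an illustration: `serve`, the "ok"/"error" reply format, and the "quit" sentinel are hypothetical choices, and the in-memory streams stand in for the real pipe.

```python
import io


def serve(requests, replies, namespace=None):
    """Read one expression per line, reply with one line per expression."""
    env = {} if namespace is None else namespace
    for line in requests:
        expr = line.strip()
        if expr == "quit":
            break
        if not expr:
            continue
        try:
            # Evaluate in a copy of the namespace so requests stay independent.
            replies.write("ok %r\n" % (eval(expr, dict(env)),))
        except Exception as exc:
            replies.write("error %s\n" % type(exc).__name__)
        replies.flush()  # the caller is blocked waiting for this line


# In-memory stand-ins for the pipe, so the sketch runs on its own.
requests = io.StringIO("1+2\n1/0\n3+3\nquit\n7\n")
replies = io.StringIO()
serve(requests, replies)
print(replies.getvalue())
```

In real use the call would presumably be `serve(sys.stdin, sys.stdout, namespace)` with the namespace filled from `sage.all`, started once and kept running while the C++ program sends requests.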

--
Jori Mäntysalo

Berkeley Churchill

Jan 11, 2018, 3:18:38 AM
to sage-s...@googlegroups.com
I'll look into using a pipe, that's a good suggestion and might be easier than the embedded interpreter.

Jori Mäntysalo

Jan 11, 2018, 3:24:21 AM
to sage-s...@googlegroups.com
On Thu, 11 Jan 2018, Berkeley Churchill wrote:

> I'll look into using a pipe, that's a good suggestion and might be
> easier than the embedded interpreter.

There are many ways to do this. Sage could read from a pipe and write to
files, and the other program could busy-wait for a file with a given name to
appear (of course Sage should first write to a temporary file and then rename
it). Or, instead of a busy-wait loop, Sage could write to a pipe when it's
done. inotify is another possible solution.
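The "write to a temporary file, then rename" step could look roughly like this in Python 3. The name `publish_result` and the file names are made up for the example; the point is that the reader either sees no file or a complete one, never a half-written one.

```python
import os
import tempfile


def publish_result(path, text):
    """Write text to a temporary file, then rename it to its final name."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
        os.replace(tmp_path, path)  # the file appears under its final name in one step
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "result_0001.txt")
    publish_result(target, "3\n")
    with open(target) as handle:
        content = handle.read()
    leftovers = os.listdir(workdir)
print(content.strip(), leftovers)
```

The waiting program should ignore the `.tmp` files and only react to the final names.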

In any case, after you get rid of the startup time I suppose the next
slowest part is converting data structures between Python and C.

--
Jori Mäntysalo

Dima Pasechnik

Jan 11, 2018, 4:21:35 AM
to sage-support


On Thursday, January 11, 2018 at 8:18:38 AM UTC, Berkeley Churchill wrote:
> I'll look into using a pipe, that's a good suggestion and might be easier than the embedded interpreter.

Making Sage the main "driver", calling your code wrapped as a Cython extension, would be another option, probably faster
and certainly easier to implement than an embedded Python.
In particular, if the main part of your code is already a dynamic library, this would not be much extra coding...

Jeroen Demeyer

Jan 12, 2018, 4:17:57 AM
to sage-s...@googlegroups.com
On 2018-01-10 19:27, Vincent Delecroix wrote:
> Have a look at
>
> https://docs.python.org/2/extending/embedding.html

Even simpler would be to use Cython to interface between C++ and
Python/Sage.

slelievre

Jan 14, 2018, 4:26:12 PM
to sage-support
Forking a Sage process might be a way to achieve your goal.

Maybe someone more knowledgeable could say more about
how to fork a Sage process and where it is documented.

William Stein

Jan 14, 2018, 6:20:00 PM
to sage-support
Type

parallel?

and

fork?

in Sage...
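To give an idea of what forking buys you, here is a sketch of the underlying technique in plain Python 3 on a POSIX system. It is not Sage's implementation of `fork` or `parallel`, and `run_in_child` is a hypothetical helper. Each call runs in a forked copy of the already-started process, so there is no startup cost, and neither an exception nor a change of state in the child reaches the parent.

```python
import os
import pickle


def run_in_child(func, *args):
    """Run func(*args) in a forked child and return ("ok", value) or ("error", text)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: compute, send the pickled outcome, exit immediately.
        status = 1
        try:
            os.close(read_fd)
            try:
                payload = ("ok", func(*args))
            except Exception as exc:
                payload = ("error", repr(exc))
            with os.fdopen(write_fd, "wb") as writer:
                pickle.dump(payload, writer)
            status = 0
        finally:
            os._exit(status)  # never fall through into the parent's code
    # Parent process: read the outcome, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        payload = pickle.load(reader)
    os.waitpid(pid, 0)
    return payload


calls = []


def compute(x):
    calls.append(x)  # this change happens only in the child's copy of memory
    return 1.0 / x


first = run_in_child(compute, 4)
second = run_in_child(compute, 0)
print(first, second, calls)
```

The result has to be picklable to travel back through the pipe, and a child that dies before replying shows up in the parent as an `EOFError` from `pickle.load`.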


--
William (http://wstein.org)