Running multiple sage sessions because of memory issues


Kevin Buzzard

Aug 4, 2014, 8:11:38 AM
to sage-s...@googlegroups.com
TL;DR: I am going to write a bash loop which loops through 1<=N<=10000 and feeds the number N into a function in a sage session, one session per N. Has anyone written a robust way of doing this already?

************************************************************

Gory details: I have memory management issues with sage, of a similar nature to memory management issues I had with magma 10 years ago, and I'm going to solve them by running multiple sage sessions because I am too lazy to learn about memory management issues.

Here is a simple loop:

N=Integer(1)
R.<t>=PolynomialRing(GF(5))
charpolys=[]
while N<=10000:
     f=ModularSymbols(N,2,1).cuspidal_subspace().hecke_operator(3).matrix().change_ring(GF(5)).charpoly('t')
     charpolys=charpolys+[f]
     print N,f
     N=N+1
     sage.modular.modform.constructor.ModularForms_clear_cache()

It's collecting characteristic polynomials of T_3 mod 5 on spaces of weight 2 cusp forms of level N=1,2,3,... and adding them to a list. (This is not what I actually want to do; it's just a simple example indicating the issues I'm having.) The point is that there's a complicated (but standard) modular forms calculation in the middle, but the result (and all I want to keep track of) is a polynomial of relatively small degree. The last line is a naive attempt on my part to persuade sage to forget all the modular form calculations that it has done.

On my machine, by the time N=1200, the underlying python session is occupying about 10 gigs of memory and my machine is swapping like crazy. On the other hand, performing the commands within the loop for N=1200 takes far less memory.

In my mind there are two ways to deal with this issue (because I would like to go far further than N=1200). The first is to actually understand what is going on and try to kill the spaces of modular forms that are presumably piling up in memory. For me this is not an ideal solution because it's the sort of thing where if next week I want to do some other loop of this nature I'll have to come and ask again.

The second is what I'm far keener on: instead of running the loop in sage, I want to run it in something else, e.g. a bash shell, and fire up lots of separate sage sessions, each of which just does the calculation for one value of N and then dies. This might sound like a far more amateur approach to some, but at the end of the day it's an extremely effective way of dealing with the memory management issues that I'm struggling with, because you 100% kill everything you're not interested in when the session dies, so the bottom line is that it will work.

When I did this before with magma I just wrote my own routines; they were rather amateurish but they worked. I would write all magma output to a file and then extract the lines I wanted (flagged in the magma output by my printing "THE NEXT LINE IS THE LINE YOU WANT" in my magma program just before printing out the polynomial I wanted to keep etc etc) and write it all to another file -- a real mishmash of bad shell script programming and bad magma programming, but because all of this stuff was only taking a few seconds per loop and the actual modular forms calculations were taking several minutes (or longer) it didn't really matter.

Has someone written such scripts already? In short, I have a program function.sage, defining a function m(N), and I want to loop from N=1 to 10000 in a bash shell and for each N, fire up a sage session, load function.sage, compute m(N) and write the result to a file (one file in total, for all values of N). This seems like a sufficiently generic problem that perhaps someone has solved it already; if no-one has done it I'll write some amateurish code to do it myself but I thought there was no harm in asking first.

Kevin
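
The one-session-per-N loop described above can also be driven from Python rather than bash. The following is only a minimal sketch under stated assumptions: `run_one`, `run_all`, and the output file name are hypothetical, and the demo launches a plain Python child that squares N as a stand-in for a real Sage session. For Sage one would presumably substitute a command that loads function.sage and prints m(N), but that command line is an assumption and does not appear in the thread.

```python
import os
import subprocess
import sys
import tempfile


def run_one(n, command_prefix):
    """Start a fresh process for one value of n and return its last stdout line."""
    proc = subprocess.run(
        command_prefix + [str(n)],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child exits non-zero
    )
    return proc.stdout.strip().splitlines()[-1]


def run_all(values, command_prefix, out_path):
    """Append one 'n result' line per value to a single output file."""
    with open(out_path, "a") as out:
        for n in values:
            out.write("%d %s\n" % (n, run_one(n, command_prefix)))
            out.flush()  # keep finished results on disk if a later run fails


# Stand-in child command (assumption): a Python one-liner that squares n.
demo_command = [sys.executable, "-c", "import sys; print(int(sys.argv[1]) ** 2)"]

out_path = os.path.join(tempfile.mkdtemp(), "results.txt")
run_all(range(1, 6), demo_command, out_path)
with open(out_path) as fh:
    lines = fh.read().splitlines()
```

Because every value gets its own process, whatever memory the child accumulated is returned to the operating system when it exits; the cost is one interpreter start-up per value.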

kcrisman

Aug 4, 2014, 8:46:48 AM
to sage-s...@googlegroups.com


> TL;DR: I am going to write a bash loop which loops through 1<=N<=10000 and feeds the number N into a function in a sage session, one session per N. Has anyone written a robust way of doing this already?

This may be naive, but would using the parallel module be helpful?  Is this a massively parallel thing? I don't know if that would solve your memory issues or not but it seems like what you are trying to do from a cursory reading of your post.


- kcrisman

William

Aug 4, 2014, 10:01:05 AM
to sage-s...@googlegroups.com
On Mon, Aug 4, 2014 at 5:11 AM, Kevin Buzzard <kevin.m...@gmail.com> wrote:
> TL;DR: I am going to write a bash loop which loops through 1<=N<=10000 and
> feeds the number N into a function in a sage session, one session per N. Has
> anyone written a robust way of doing this already?

Yes, I implemented a robust way to do this long ago.  Use the @fork
decorator, and do *NOT* try to mutate a global variable in the
function you're calling -- this makes no sense because it happens in a
subprocess.

@fork
def g(N):
     f=ModularSymbols(N,2,1).cuspidal_subspace().hecke_operator(3).matrix().change_ring(GF(5)).charpoly('t')
     print N,f; sys.stdout.flush()
     return f


N=Integer(1)
R.<t>=PolynomialRing(GF(5))
charpolys=[]
while N<=10000:
    charpolys.append(g(N))
    N += 1
    print get_memory_usage(), charpolys   # for testing
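
To illustrate why mutating a global inside a forked function has no effect, here is a rough plain-Python sketch of the general fork-and-pickle idea. It is not Sage's actual @fork implementation; `run_in_fork`, `square`, and `seen` are hypothetical names, and `os.fork` restricts it to Unix-like systems.

```python
import functools
import os
import pickle
import sys


def run_in_fork(func):
    """Run func in a forked child and send its return value back via pickle."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        sys.stdout.flush()  # avoid duplicating buffered output in the child
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:  # child process
            os.close(read_fd)
            try:
                payload = pickle.dumps(("ok", func(*args, **kwargs)))
            except BaseException as exc:
                payload = pickle.dumps(("error", repr(exc)))
            with os.fdopen(write_fd, "wb") as writer:
                writer.write(payload)
            sys.stdout.flush()
            os._exit(0)  # exit at once; all memory used by the child is freed
        os.close(write_fd)
        with os.fdopen(read_fd, "rb") as reader:
            data = reader.read()  # read until the child closes its end
        os.waitpid(pid, 0)
        status, value = pickle.loads(data)
        if status == "error":
            raise RuntimeError(value)
        return value
    return wrapper


seen = []


@run_in_fork
def square(n):
    seen.append(n)  # only changes the child's copy of the list
    return n * n


result = square(7)
```

After the call, `result` holds the pickled return value, while `seen` in the parent is still empty: the append happened in the child's copy of memory, which disappeared when the child exited.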


If you want to do several in parallel, you can easily do that too as
follows.  This will both completely eliminate memory leak issues, and
use all processors on your computer.

@parallel
def g(N):
     f=ModularSymbols(N,2,1).cuspidal_subspace().hecke_operator(3).matrix().change_ring(GF(5)).charpoly('t')
     print N,f; sys.stdout.flush()
     return f

R.<t>=PolynomialRing(GF(5))
charpolys={}
for x in g([1..10000]):
    N = x[0][0][0]
    charpolys[N] = x[1]
    # Save everything so far to a single file; load later with load('charpolys.sobj')
    save(charpolys, 'charpolys.sobj')


The save above will save the charpolys dict to disk each time you get
back another charpoly.
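
The same pattern (a few worker processes, results collected in a dict, and the dict rewritten to disk after each result) can be sketched with the standard library alone. This is an illustration under assumptions, not how @parallel is implemented: `compute`, `map_with_checkpoints`, and the file name are hypothetical, and the "fork" start method makes it Unix-only.

```python
import multiprocessing
import os
import pickle
import tempfile


def compute(n):
    """Placeholder for an expensive calculation (hypothetical)."""
    return (n * n) % 5


def _worker(func, n, queue):
    # Runs in a child process; only the small result travels back.
    queue.put((n, func(n)))


def map_with_checkpoints(func, inputs, path, workers=2):
    """One process per input, at most `workers` at a time, saving after each result."""
    ctx = multiprocessing.get_context("fork")
    queue = ctx.Queue()
    pending = list(inputs)
    total = len(pending)
    started = []
    running = 0
    results = {}
    while len(results) < total:
        while pending and running < workers:
            proc = ctx.Process(target=_worker, args=(func, pending.pop(0), queue))
            proc.start()
            started.append(proc)
            running += 1
        n, value = queue.get()  # would block forever if a worker crashed
        running -= 1
        results[n] = value
        with open(path, "wb") as fh:  # rewrite the whole dict each time
            pickle.dump(results, fh)
    for proc in started:
        proc.join()
    return results


checkpoint = os.path.join(tempfile.mkdtemp(), "charpolys.pkl")
results = map_with_checkpoints(compute, range(1, 9), checkpoint)
with open(checkpoint, "rb") as fh:
    reloaded = pickle.load(fh)
```

Rewriting the whole dict on every result is simple and fine when each result is small; for very many results, saving every k-th result or appending to a log would reduce disk traffic.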

Welcome to the modern world (though everything above just uses a few
Python functions from the late 1990s -- pickle and fork). Compared to
Magma, Sage is much, much better at this sort of stuff...

 -- William

Kevin Buzzard

Aug 4, 2014, 3:08:59 PM
to sage-s...@googlegroups.com
Ooh I'm _really_ glad I asked now. Many thanks William.

The first time I wanted such a loop, I was beta testing your magma modular symbols code in 2000 or so :-)

Kevin

William A Stein

Aug 4, 2014, 3:21:39 PM
to sage-support
On Mon, Aug 4, 2014 at 12:08 PM, Kevin Buzzard
<kevin.m...@gmail.com> wrote:
> Ooh I'm _really_ glad I asked now. Many thanks William.
>
> The first time I wanted such a loop, I was beta testing your magma modular
> symbols code in 2000 or so :-)

I discovered the programming language Python around then in order to
script running lots of Magma calculations on MECCAH (=mathematics
extreme computation cluster at Harvard).

By the way, there's another thread today on sage-support also about
@parallel, which you might want to read.



--
William Stein
Professor of Mathematics
University of Washington
http://wstein.org
wst...@uw.edu