Memory leak in loops (SageMath 8.1)

Marco Caselli

Dec 15, 2017, 7:06:18 AM
to sage-support
Hello there,
I am currently having trouble with memory management. I wrote some code that was not supposed to store anything, but it was actually using a massive amount of memory. At first glance, I thought it was related to some function in my code, such as factor() for polynomials in a multivariate ring over a finite field. I dug a bit deeper and the problem seems to occur at an even lower level:

def check_memory(k):
    a = get_memory_usage()
    silly_function(k)
    return get_memory_usage() - a

def silly_function(k):
    for i in range(10^k):
        2+2

Notice that silly_function neither stores nor returns anything, so the memory usage should be zero.
check_memory(9) returned 23446.37109375. This amount is in MB, so that is about 23 GB, which is consistent with the data from top. On a second run it returned just 972.62109375, which is still a lot.

Vincent Delecroix

Dec 15, 2017, 7:16:04 AM
to sage-s...@googlegroups.com
You can fill your memory with something simpler

sage: l = range(10**9)

As far as I can see it has nothing to do with Sage or loops. In Python 2
the range function constructs a list, and in the above example the
list is huge.
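
The difference between a lazy range and a fully built list can be seen with a small sketch. It assumes plain Python 3 on CPython (not the Sage prompt), where range() is lazy and list(range(n)) plays the role of Python 2's range(n). Note that sys.getsizeof counts only the container itself, not the integer objects a list points to, so the real footprint of the list is larger still.

```python
import sys

# Python 3: range() only stores start, stop and step, however long it is.
lazy = range(10 ** 9)

# Building the list explicitly mimics what Python 2's range() did.
# A much smaller n is used here so the example stays cheap to run.
eager = list(range(10 ** 5))

lazy_bytes = sys.getsizeof(lazy)    # small and independent of the length
eager_bytes = sys.getsizeof(eager)  # grows linearly with the length

print("lazy range of 10**9 items:", lazy_bytes, "bytes")
print("list of 10**5 items:", eager_bytes, "bytes (pointers only)")
```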

Vincent

John Cremona

Dec 15, 2017, 7:25:47 AM
to SAGE support
I think Marco (who works with me) took my suggestion to simplify his problem code as much as possible before posting a little too literally.  Marco, send in something closer to what you showed me yesterday (which was about factorization of polynomials of degree 4 in F[X,Y,Z] with F a quite small finite field).

Vincent, if you are still in Warwick today Marco could show you his code directly.

John

Dima Pasechnik

Dec 15, 2017, 7:27:22 AM
to sage-support


On Friday, December 15, 2017 at 12:16:04 PM UTC, vdelecroix wrote:
You can fill your memory with something simpler

   sage: l = range(10**9)

As far as I can see it has nothing to do with Sage or loops. In Python2
the range functions constructs a list. And in the above example, the
list is huge.

Indeed. To avoid this, use xrange() instead of range().
(This will not make your code Python 3-proof, though: range() in Python 3
is what xrange() is in Python 2; see e.g. https://stackoverflow.com/a/15015199/557937
on how to make it portable.)
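
One common idiom for getting a lazy range under both Python versions is sketched below. It is not necessarily what the linked answer recommends, and the name lazy_range is made up for this example. It is written in plain Python, so the exponent is spelled ** (at the Sage prompt the preparser turns ^ into exponentiation).

```python
# Pick whichever lazy range the running interpreter provides.
try:
    lazy_range = xrange  # Python 2: xrange yields values on demand
except NameError:
    lazy_range = range   # Python 3: range is already lazy

def silly_function(k):
    """Loop 10**k times without ever materializing a list of indices."""
    for i in lazy_range(10 ** k):
        2 + 2

silly_function(3)
```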

Marco Caselli

Dec 15, 2017, 9:51:37 AM
to sage-support
Thank you very much for your prompt replies. I was sure that range(n) creates an iterator rather than the list itself, my bad.
In any case, even if the function is creating this list, why is it still stored in memory after the function terminates? It is a local variable and is not returned, so it should be erased from RAM.


In my original code, I iterate over ternary polynomials over a finite field K and factor them. I construct them from the vector of coefficients, which is an element of K^n. To create the vector of coefficients, I iterate over K^n (which I think should support iteration, as K does, but after my mistake about range I am no longer so confident). Here is an example (it actually behaves badly even with binary polynomials):

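def ex_f(q):
    K = GF(q)
    P.<x,y,z> = PolynomialRing(K, 3)
    monomials = [x, y, z, x^2, y^2, z^2]
    for v in K^6:
        # v is the coefficient vector; c avoids rebinding v inside the comprehension
        f = 1 + sum([c*m for (c, m) in zip(v, monomials)])
        f.factor()  # the result is discarded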

With 7 as input it uses 283 MB. By "uses" I mean, as above, the output of the function check_memory, which should return the difference between the amounts of RAM allocated to the Sage process before and after running ex_f(7).
Running a similar function with univariate polynomials does not require memory, so I guess it is not an iterator-related problem.

Dima Pasechnik

Dec 15, 2017, 10:28:29 AM
to sage-support


On Friday, December 15, 2017 at 2:51:37 PM UTC, Marco Caselli wrote:
Thank you very much for your prompt replies. I was sure that range(n) creates an iterator instead of the list itself, my bad. 
In any case, even if the function is creating this list, why is it still stored in memory after the function terminates? It is a local variable, and is not returned so it should be erased from the RAM.

it need not happen, unless you explicitly trigger garbage collection

import gc
gc.collect()

Nils Bruin

Dec 15, 2017, 10:47:02 AM
to sage-support
On Friday, December 15, 2017 at 6:51:37 AM UTC-8, Marco Caselli wrote:
Thank you very much for your prompt replies. I was sure that range(n) creates an iterator instead of the list itself, my bad. 
In any case, even if the function is creating this list, why is it still stored in memory after the function terminates? It is a local variable, and is not returned so it should be erased from the RAM.

And undoubtedly it did (a list of integers has no circular references, so it can be deleted just based on reference counts), but obtaining memory from and returning memory to the operating system is an expensive operation, so python is probably reluctant to do so.
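
The distinction between reference counting and the cycle collector can be checked with a small sketch. It assumes CPython, and the Node class and both function names are made up for this illustration. A weak reference is used as a probe: it turns into None as soon as the object it points to has been freed.

```python
import gc
import weakref

class Node(object):
    """Hypothetical helper class; instances support weak references."""
    def __init__(self):
        self.other = None

def freed_without_gc():
    """No cycle: the object is freed by reference counting alone."""
    was_enabled = gc.isenabled()
    gc.disable()  # make sure the cycle collector cannot interfere
    try:
        node = Node()
        probe = weakref.ref(node)
        del node  # refcount drops to zero here
        return probe() is None
    finally:
        if was_enabled:
            gc.enable()

def cycle_needs_gc():
    """Two objects referring to each other survive until gc.collect()."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a  # reference cycle
        probe = weakref.ref(a)
        del a, b
        alive_before = probe() is not None
        gc.collect()
        freed_after = probe() is None
        return alive_before, freed_after
    finally:
        if was_enabled:
            gc.enable()

print(freed_without_gc())
print(cycle_needs_gc())
```

A list of integers falls in the first case, which is why an explicit gc.collect() should make no difference for it.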
 
get_memory_usage only reports how much memory is allocated to the python process, not how much of it python is actually considering as in use.
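
To look at what Python itself considers in use, rather than what the operating system has handed to the process, the standard-library tracemalloc module can be used. The following is a minimal sketch assuming plain Python 3; build_and_discard and measure are hypothetical names, and this is not how Sage's get_memory_usage works.

```python
import tracemalloc

def build_and_discard(n):
    """Create a large temporary list and let it go out of scope."""
    data = list(range(n))
    return len(data)

def measure(n):
    """Return (bytes still held afterwards, peak bytes held during the call)."""
    tracemalloc.start()
    try:
        before, _ = tracemalloc.get_traced_memory()
        build_and_discard(n)
        after, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before, peak - before

leftover, peak = measure(200000)
print("still held after the call:", leftover, "bytes")
print("peak during the call:", peak, "bytes")
```

The peak reflects the temporary list, while the amount still held afterwards should be tiny by comparison, even if the process size reported by the operating system does not shrink.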

Marco Caselli

Dec 18, 2017, 10:48:34 AM
to sage-support


On Friday, December 15, 2017 at 4:28:29 PM UTC+1, Dima Pasechnik wrote:
it need not happen, unless you explicitly trigger garbage collection

import gc
gc.collect()

 
Thanks! Now I know why the first function needed so much memory. I did not trigger garbage collection, but it seems to be enabled by default (at least this is the case for all of Sage's kernels installed in CoCalc). I do not see why it should be enabled by default.
Unfortunately this does not solve the problem with the function ex_f above: even though it does not return anything, it is still consuming memory. Could it be that the output of f.factor() is stored in the polynomial object, and that the object itself is not erased or overwritten when a new polynomial is generated? Or a simpler explanation could be that f.factor() calls Singular and the garbage collector is enabled there.

On Friday, December 15, 2017 at 4:47:02 PM UTC+1, Nils Bruin wrote:
And undoubtedly it did (a list of integers has no circular references, so it can be deleted just based on reference counts), but obtaining memory from and returning memory to the operating system is an expensive operation, so python is probably reluctant to do so.
 
get_memory_usage only reports how much memory is allocated to the python process, not how much of it python is actually considering as in use.

So there is a discrepancy between the amount of memory really in use and the amount allocated, but when I run a process, my hardware limits the allocated amount. So, for instance, I can evaluate ex_f(p) only for very small values of p, even if the amount of memory really in use is almost zero for any p.
 

Nils Bruin

Dec 18, 2017, 11:37:48 AM
to sage-support
On Monday, December 18, 2017 at 7:48:34 AM UTC-8, Marco Caselli wrote:

So there is a discrepancy between the real amount of memory in use and the one allocated but, when I run a process, my hardware limits the allocated one. So, for instance, I can evaluate ex_f(p) just for very small values of p even if the real amount of memory in use is almost zero for any p. 
 
No, not if there's no memory leak. Memory that is freed by python is available for re-use by python. That's actually the reason for python to not immediately return freed memory to the operating system: reusing it for new python objects is considerably faster.

If your routine ex_f runs out of memory then that would indicate a memory leak. If you look at the code for (K^6).__iter__ you'll see it really intends to be an iterator that uses limited memory.
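
What "an iterator that uses limited memory" means can be illustrated with a plain Python generator. This is only a sketch of the general idea, not Sage's implementation of (K^6).__iter__; the name iter_vectors is made up, and the integers 0..q-1 merely stand in for the field elements. Only one working vector is kept at any time, so memory use does not grow with the q^n vectors produced.

```python
def iter_vectors(q, n):
    """Yield every length-n tuple with entries in range(q), one at a time."""
    current = [0] * n
    while True:
        yield tuple(current)
        # Increment like an odometer, rightmost position first.
        pos = n - 1
        while pos >= 0 and current[pos] == q - 1:
            current[pos] = 0
            pos -= 1
        if pos < 0:
            return  # wrapped around: all q**n vectors have been produced
        current[pos] += 1

# Count the vectors without ever storing them all.
count = sum(1 for _ in iter_vectors(3, 4))
print(count)
```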

Marco Caselli

Dec 22, 2017, 6:07:13 AM
to sage-support


On Monday, December 18, 2017 at 5:37:48 PM UTC+1, Nils Bruin wrote:
If your routine ex_f runs out of memory then that would indicate a memory leak. If you look at the code for (K^6).__iter__ you'll see it really intends to be an iterator that uses limited memory.

Yes, it is definitely a memory leak. Running the same routine in Magma, I have not experienced any memory issues.

Nils Bruin

Dec 22, 2017, 12:07:19 PM
to sage-support
I can confirm that

R.<x,y,z>=GF(3)[]
f=x^2 + y^2 + z^2 + x + y + z - 1
while True: _=f.factor()

does appear to be leaking memory. It's not leaking in python, though! If I compare the python heap before and after with gc.get_objects I'm not seeing any serious accumulation of objects. So the leak must be somewhere else; probably in the factorization library or the code interfacing with it.
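
A before/after comparison of the Python heap along these lines can be sketched as follows. This is a guess at the general approach, not the exact commands used above; the helper names are made up, and work stands for any zero-argument callable (for instance one that calls f.factor() once). Only objects tracked by Python's garbage collector are counted, so memory lost inside a C library would not show up here, which is consistent with the conclusion drawn above.

```python
import gc
from collections import Counter

def object_counts():
    """Count gc-tracked objects by type name, after a full collection."""
    gc.collect()
    return Counter(type(obj).__name__ for obj in gc.get_objects())

def growth(before, after):
    """Return the type names whose object count increased, with the increase."""
    return {name: after[name] - before[name]
            for name in after if after[name] > before[name]}

def python_level_growth(work, repeats=100):
    """Run work() repeatedly and report which object types accumulated."""
    work()  # warm-up, so one-time caches are not mistaken for a leak
    before = object_counts()
    for _ in range(repeats):
        work()
    after = object_counts()
    return growth(before, after)

# Usage with a deliberately leaky stand-in function.
_leaked = []

def leaky():
    _leaked.append([0])  # keeps one extra list alive per call

print(python_level_growth(leaky, repeats=50).get("list", 0))
```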

We have some leaking problems with libsingular, see e.g.:
but this particular example doesn't seem to be exactly of the above reported kinds.

