memory management issue: deleted variables not released

Denis

May 11, 2016, 11:15:50 AM
to sage-devel

When I invoke a function like

BIG=myfunction(somevars)

where BIG gets to ~50% of RAM in a concrete case, re-running that function again results in a memory error. If I release BIG first, say put BIG=0, then it works. However, looking at top, I can see that only about half of the memory allocated in the first call was actually released when I clobbered BIG. So in the second case the memory usage first dropped to about 25% of RAM and then ended at about 75% (and produced the correct result btw). My RAM is 4G so we are talking about 1G gone missing after BIG=0.

I attach the memory failure report but I don't think it is really relevant here. Nothing was done before invoking myfunction(), except to load its definition. Of course it consists of lots of things calling each other, but the point here is that in the actual session, the only variable I hold on to is BIG itself. I think that releasing BIG should then involve a cascade of releasing the hidden variables.

I wonder if the developers think this is a bug. From the user's point of view it is certainly strange that one cannot invoke the same function call twice in a row.

Denis

memory_error.txt

Denis

May 11, 2016, 11:30:37 AM
to sage-devel

Sorry, forgot to mention: all this is happening under Sage 7.0. Under 7.1 it is much worse, I get a silent "Memory exhausted" crash the FIRST time round, without a report. So it seems something really broke between 7.0 and 7.1.

Dima Pasechnik

May 11, 2016, 11:35:15 AM
to sage-devel


On Wednesday, May 11, 2016 at 4:30:37 PM UTC+1, Denis wrote:

Sorry, forgot to mention: all this is happening under Sage 7.0. Under 7.1 it is much worse, I get a silent "Memory exhausted" crash the FIRST time round, without a report. So it seems something really broke between 7.0 and 7.1.

or, perhaps, some bugs were fixed, that made things worse.
Anyway, without seeing an example of code that you think should work it's very hard to say anything.
Could you provide such an example?

Thanks.
 

Nils Bruin

May 11, 2016, 1:16:59 PM
to sage-devel
On Wednesday, May 11, 2016 at 8:15:50 AM UTC-7, Denis wrote:
I wonder if the developers think this is a bug. From the user's point of view it is certainly strange that one cannot invoke the same function call twice in a row.

Memory leaks are usually bugs, although in some cases the caching behaviour that certain components of Sage prescribe means that what looks like a memory leak is actually behaviour according to spec.

Most memory leaks can be exhibited quite convincingly by running small cases in succession and showing that memory usage increases rather than staying roughly flat. Once you have such a case, finding which objects leak is usually fairly straightforward as well. How easy the fix is depends entirely ...

If you can find a variant of your code that shows increasing memory usage across many smallish cases where you expect flat memory use, it would really help if you could report it. Debugging a run that uses more than 1 GB of memory is less likely to produce results easily.

Denis

May 12, 2016, 5:08:49 AM
to sage-devel

Thanks for the suggestions, I will try to do that. In the meantime, I tried 7.1 on a machine with 100G of RAM. Now it works, but it costs me 4.3 G of RAM the first time round - that is the same calculation which used 1G in 7.0. Re-running it a few times with or without clobbering BIG - I am not being precise here - shows a steady creep upwards of RAM use in huge chunks, e.g. it climbed to 7G during the third calculation but dropped to 5.7 when it ended and stayed there after clobbering.

I would like to use 7.0 on the 100G machine because I am planning some much bigger calculations and cannot afford this factor of 4 in memory use, but the tarball is not found among the old sources - did someone forget to put it there?

Denis
 

Denis

May 12, 2016, 6:19:14 AM
to sage-devel

Sorry for the stupid question - I did not realize that some old sources are to be found among the new sources :)

Denis

Denis

Sep 10, 2016, 2:56:04 AM
to sage-devel

As far as I am concerned, development of Sage stopped at 7.0 - I tried 7.3 yesterday, and could not do a mid-level benchmark calculation, which I have been repeating since at least 6.4. The problem is a memory fault. 7.0 is the last version with which I can do it. I am prepared to collaborate with any developer who is prepared to go after this bug seriously. The calculation involves some manipulations over polynomial rings in nine variables. The largest object is a 3838x3838 dense matrix. My own coding is modest, just a bunch of small functions calling each other and invoking user-level Sage routines.

Samuel Lelievre

Sep 10, 2016, 3:38:08 AM
to sage-devel
Sat 2016-09-10 08:56:04 UTC+2, Denis on sage-devel:
Could you tell us your operating system, and how you installed Sage?

Can you reproduce your problem on SageCell [1] or SageMathCloud [2]?

Can you share your benchmark code, by sharing either of the following?
- a .py or .sage file with the code
- a Sage worksheet in the form of a .sws file or a .ipynb file
- a SageCell link
- a public worksheet on SageMathCloud

[2] SageMathCloud: https://cloud.sagemath.com

Samuel

Dima Pasechnik

Sep 10, 2016, 4:40:46 AM
to sage-devel


On Saturday, September 10, 2016 at 6:56:04 AM UTC, Denis wrote:

As far as I am concerned, development of Sage stopped at 7.0 - I tried 7.3 yesterday, and could not do a mid-level benchmark calculation, which I have been repeating since at least 6.4. The problem is a memory fault. 7.0 is the last version with which I can do it. I am prepared to collaborate with any developer who is prepared to go after this bug seriously. The calculation involves some manipulations over polynomial rings in nine variables. The largest object is a 3838x3838 dense matrix. My own coding is modest, just a bunch of small functions calling each other and invoking user-level Sage routines.

Let me repeat that without seeing an example that works in 7.0, but not in later versions, it is basically impossible to help you.
Please make such an example available.

Best,
Dima
 

Denis

Sep 11, 2016, 9:51:16 AM
to sage-devel
Thanks for the idea to post the example on the cloud. I will try to do that, but can't begin before Sept. 20.

Denis

Denis

Sep 21, 2016, 1:46:18 PM
to sage-devel

I tried to reproduce the issue in the cloud, but it cannot do it with the default settings. Although I can check that my code works, the benchmark calculation cannot complete because of the limitations of the free account. To make a realistic comparison I need 4G of RAM and unlimited timeout (on my laptop it completes in ~10 min with an Intel i7 but I do not know into how many CPU shares that translates). If anyone can arrange this for me please inform me.

William Stein

Sep 21, 2016, 1:58:14 PM
to sage-devel
On Wed, Sep 21, 2016 at 10:46 AM, Denis <denis...@gmail.com> wrote:

I tried to reproduce the issue in the cloud, but it cannot do it with the default settings. Although I can check that my code works, the benchmark calculation cannot complete because of the limitations of the free account. To make a realistic comparison I need 4G of RAM and unlimited timeout (on my laptop it completes in ~10 min with an Intel i7 but I do not know into how many CPU shares that translates). If anyone can arrange this for me please inform me.


With your SMC project opened, click on the "Help" button in the upper right and explain the situation...

--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+unsubscribe@googlegroups.com.
To post to this group, send email to sage-...@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.




Denis

Sep 27, 2016, 4:44:22 PM
to sage-devel

Tried but it didn't work out. MathCloud admins say they can't help. Tried also at SageCell but the calculation wouldn't end either way after several hours. Any ideas?

Denis

Jonathan Bober

Sep 27, 2016, 8:20:12 PM
to sage-...@googlegroups.com
I just noticed this thread because of your recent reply, and happened to read through. (I haven't regularly read sage-devel for a while.)

As to your original email: I think there is a subtle Python memory management issue there. If you run

sage: BIG=myfunction(somevars)
sage: BIG=myfunction(somevars)

then on the second invocation, I'm pretty sure that Python first computes the result of the function call and only then assigns it to the variable BIG. In between, the garbage collector will probably run sometimes, but because BIG has not yet been reassigned, the old value is still referenced and cannot be cleaned up. So it seems reasonable to me that

sage: BIG=myfunction(somevars)
sage: BIG = 0
sage: BIG=myfunction(somevars)

may behave differently.
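
A minimal sketch of this effect in plain Python (not Sage) is given here. The class Big and the helper run_demo are hypothetical stand-ins, not part of the original code, and the prompt release after "big = 0" relies on CPython's reference counting; other Python implementations may free the object later.

```python
import gc
import weakref


class Big(object):
    """Hypothetical stand-in for a large result object."""


def myfunction(previous):
    # 'previous' is a weak reference to the earlier result (or None).
    # Report whether that earlier result still exists while we compute.
    alive = previous is not None and previous() is not None
    return Big(), alive


def run_demo():
    big, _ = myfunction(None)
    ref = weakref.ref(big)

    # Case 1: rebind directly. The old object is still bound to 'big'
    # for the whole duration of the call.
    big, alive_direct = myfunction(ref)
    ref = weakref.ref(big)

    # Case 2: drop the reference first, then call again.
    big = 0
    gc.collect()
    big, alive_after_release = myfunction(ref)
    return alive_direct, alive_after_release


alive_direct, alive_after_release = run_demo()
print(alive_direct, alive_after_release)
```

In the first case the old result and the new one coexist until the assignment completes, so peak memory can approach twice the size of one result.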

Having said all that, it doesn't sound right that running the function once costs 50% of RAM, while running it twice (with BIG = 0 in between) costs 75%. However, there are certainly situations where that can happen. As was mentioned, Sage caches some computations, and that can occasionally lead to unwanted memory use. Additionally, when running this sort of short test, it seems a good idea to manually invoke the Python garbage collector (import gc; gc.collect()) before conclusively declaring that there is a memory leak.

The _best_ way to help (and get help) and to get attention, if there is really a memory leak, is to write a short loop that looks something like

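import gc

while True:
    x = some_simple_function()   # placeholder for the suspect computation
    gc.collect()                 # force a full collection before measuring
    print(get_memory_usage())    # Sage's memory-usage helper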

and outputs an increasing sequence of numbers.
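
For readers without a Sage session at hand, the same test can be sketched in plain Python 3 with the standard tracemalloc module standing in for get_memory_usage(). The functions leaky, clean and memory_trend, and the module-level _cache, are hypothetical names invented for this illustration. Note that tracemalloc only sees allocations made through Python's allocator, so memory held by C libraries such as PARI would not show up here.

```python
import gc
import tracemalloc

_cache = []  # hypothetical hidden cache: makes leaky() retain memory


def leaky():
    # Keeps every buffer alive by storing it in the module-level cache.
    _cache.append(bytearray(100000))


def clean():
    # Allocates the same amount but lets it go when the function returns.
    tmp = bytearray(100000)
    return len(tmp)


def memory_trend(func, rounds=5):
    """Return the traced memory (in bytes) measured after each call to func."""
    tracemalloc.start()
    readings = []
    for _ in range(rounds):
        func()
        gc.collect()
        current, _peak = tracemalloc.get_traced_memory()
        readings.append(current)
    tracemalloc.stop()
    return readings


leak_trend = memory_trend(leaky)
flat_trend = memory_trend(clean)
print(leak_trend)  # grows by roughly one buffer per round
print(flat_trend)  # stays roughly flat
```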

Going from some complicated code to a simple loop like that may be an arduous debugging task in itself, and is something I would consider a valuable service to Sage if it really finds a bug. In the intermediate regime, just sharing some code could be useful, if you are willing and able. There are at least a few people (such as myself, during the occasional periods when I am paying attention) with more than 4 GB of RAM and 10 minutes of CPU cycles to spare, who may be willing to help.

Finally (and this is the reason that I read through this thread and replied), there was a change in the way that Sage manages PARI memory usage (between 7.0 and 7.1, I think; see https://trac.sagemath.org/ticket/19883) which probably affects a very small number of users, but affects them very badly. (I know about this because it affects me.) If on your machine with 100 GB of RAM the output of 'cat /proc/sys/vm/overcommit_memory' is 2, then it affects you. Alternatively, if overcommit_memory is 0, then it is possible you are misreading the memory usage: the virtual memory usage will be high, but not the actual memory usage. The problem will hopefully be fixed by 7.4 (see https://trac.sagemath.org/ticket/21582), but the high virtual memory usage confusion will probably persist. Of course, it is also quite possible that you've found some other bad problem that popped up between 7.0 and 7.1.
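
A rough way to check both points from Python on Linux is sketched below. It reads the overcommit setting and compares virtual size (VmSize) with resident size (VmRSS) for the current process. The helper names parse_vm_fields and read_text are made up for this example, and the /proc files are assumed to exist only on Linux, so the sketch simply prints nothing elsewhere.

```python
import os


def parse_vm_fields(status_text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    fields = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            fields[key] = int(rest.split()[0])
    return fields


def read_text(path):
    """Return the file contents, or None when the path is unavailable."""
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        return handle.read()


overcommit = read_text("/proc/sys/vm/overcommit_memory")
status = read_text("/proc/self/status")
if overcommit is not None:
    # 0 = heuristic overcommit, 1 = always overcommit, 2 = strict accounting
    print("overcommit_memory =", overcommit.strip())
if status is not None:
    # A VmSize far above VmRSS means memory was reserved but not touched.
    print(parse_vm_fields(status))
```

Tools such as top report both numbers (VIRT and RES), and it is the resident figure that reflects RAM actually in use.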

Denis

Sep 28, 2016, 8:08:09 AM
to sage-devel

Jonathan, thank you for this thoughtful analysis. With a really simple function, I get the output

0
1066.03125
0
1066.03125
0
1066.1640625
0
1066.1640625
0
1066.29296875
0
1066.29296875
0
1066.29296875
0
1066.29296875
0
1066.421875
0
1066.421875
0
1066.421875
0
1066.421875
0
1066.421875
0
1066.421875
0
1066.421875
0
1066.5546875
0
1066.5546875
0
1066.5546875
.... and constant afterwards.

I do not know whether the initial rise is a problem but I suspect not. I get the same
behavior on the small (4G) machine, only with different numbers of course.

The output of overcommit_memory is zero, so that's not it.

I am quite willing to share my code with you and others who think they can help at a
similar level of commitment, but would prefer not to publish the code where anyone else
could read it. Please suggest a private channel if you are interested - I could open
a dedicated gmail address and post it here, unless you have a better idea.

Denis

Bill Hart

Sep 28, 2016, 8:26:09 AM
to sage-devel
You don't need to post all your code, just a small example that demonstrates the problem you are experiencing.

If your computation is using half the memory on the machine, the solution is likely going to be to find a way to make it use less memory or to get a machine with more memory. 

The behaviour you've described so far seems wholly consistent with how Python handles objects, how garbage collectors work and how various libraries used by Sage handle memory. So far it doesn't sound like a bug to me.

Bill.

Denis

Sep 28, 2016, 9:34:04 AM
to sage-devel


Bill: your comment about normal/expected behavior covers my initial post, but not the fact that the same code works in 7.0 and crashes in 7.3.

So the question is, might someone with developer's tools catch this problem quickly even with my vanilla code - it's just 277 lines total, plus the function invocation. I can't whittle it down easily because the interdependencies are pretty tight. The main problem is that I do not get a stack trace, the sage process simply exits, with the message "Fatal: memory exhausted" on one machine, and completely silently on the other. Actually I think that is a system message, i.e. Sage always crashes silently.

Of course, if someone can suggest some way at least to figure out which was the last function invoked, that would help.
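
One possible approach, offered only as a sketch and not as something proposed in this thread, is to log every Python-level function entry to a file and flush after each line, so that the last line written before the process dies names the last function entered. The names make_call_logger, inner, outer and the log file name are hypothetical; only functions implemented in Python are recorded this way, not calls inside compiled libraries, and the logging slows the run considerably.

```python
import os
import sys
import tempfile


def make_call_logger(stream):
    """Return a profile hook that writes each Python function entry to stream."""
    def hook(frame, event, arg):
        if event == "call":
            code = frame.f_code
            stream.write("%s (%s:%d)\n"
                         % (code.co_name, code.co_filename, frame.f_lineno))
            stream.flush()  # hand each line to the OS so it survives a crash
    return hook


def inner():
    return 1


def outer():
    return inner() + 1


log_path = os.path.join(tempfile.gettempdir(), "last_calls.log")
with open(log_path, "w") as log:
    sys.setprofile(make_call_logger(log))
    try:
        outer()  # replace with the real top-level call
    finally:
        sys.setprofile(None)
```

After a crash, the final line of the log file shows where the computation was when memory ran out.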

Denis

Denis Sunko

Aug 9, 2017, 11:15:53 AM
to sage-...@googlegroups.com
Hi all,

the problem went away when I switched from 32-bit debian jessie to 64-bit stretch. One would suspect a bug in some of the system libraries, because the version of Sage is the same (and compiled locally from source). Obviously the motivation to find it went down somewhat.

I thank everyone for their friendly input,

Denis

