save is a memory hog

Salvatore Stella

Aug 2, 2024, 5:49:03 AM
to sage-...@googlegroups.com
Dear All,
while working with some big dictionaries I realized that save consumes a lot
of RAM. Here is an example.

Suppose foo is a dictionary whose keys are integer polynomials in three
variables and whose values are integers. In my running example it has 342971
entries and, according to sys.getsizeof, it occupies about 10 MB in memory:

sage: import sys
sage: sys.getsizeof(foo)
10485848
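
Note that sys.getsizeof is shallow: for a dict it reports the size of the hash table only, not of the keys and values it refers to, so the figure above is a lower bound on what foo really occupies. A minimal pure-Python sketch of the difference, using a stand-in dictionary with tuple keys rather than polynomials (the helper name deep_size is made up for this illustration):

```python
import sys

def deep_size(d):
    """Rough size of a dict plus its keys and values (one level deep only)."""
    total = sys.getsizeof(d)
    for key, value in d.items():
        total += sys.getsizeof(key) + sys.getsizeof(value)
    return total

# Stand-in data: tuple keys instead of polynomials, large int values.
stand_in = {(i, i + 1, i + 2): 10**20 + i for i in range(1000)}

shallow = sys.getsizeof(stand_in)  # hash table only
deep = deep_size(stand_in)         # hash table + keys + values
print(shallow, deep)
```

The helper still undercounts, since it does not descend into the keys themselves, but it is enough to show that the shallow number says little about the real footprint.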

Before saving the file this is what top has to say about my running sage:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13099 VulK 20 0 19.1g 1.8g 84988 R 1.0 2.8 191:55.58 python3 /opt/sage/src/bin/sage-ipython -i

And here is a snapshot I got moments before saving ended:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13099 VulK 20 0 32.6g 15.2g 86140 R 100.3 24.2 197:58.38 python3 /opt/sage/src/bin/sage-ipython -i

I am using the following instruction:

sage: save(foo,"/tmp/bar")

A little poking around shows that, for objects without a save method, save
calls _base_save, which in turn calls _base_dumps:

sage: from sage.misc.persist import _base_dumps
sage: baz = _base_dumps(foo, compress=True)
sage: sys.getsizeof(baz)
175066342

baz is about 17 times bigger than foo, but that is still reasonable: it is
about 170 MB, which agrees with the size of the save file. What seems
unreasonable is that Sage used roughly 13 GB of memory to produce it.

Is this the expected behaviour? Can anyone provide me with some insights on
how pickling works in sage?
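
For reference, one way to see how much memory a pickling step needs on its own is to record the peak with tracemalloc. The sketch below is a stand-in under stated assumptions, not Sage's code: it uses plain pickle.dumps followed by zlib.compress as a generic pickle-then-compress step, a small dictionary of tuples in place of foo, and the helper names peak_of and dump_compressed are made up. tracemalloc only sees allocations made through Python's allocators, so memory allocated directly by C libraries would not show up here.

```python
import pickle
import tracemalloc
import zlib

def peak_of(func, *args):
    """Run func(*args); return (result, peak traced bytes during the call)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak

def dump_compressed(obj):
    # Generic pickle-then-compress step; an assumption, not Sage's implementation.
    return zlib.compress(pickle.dumps(obj))

stand_in = {(i, i + 1, i + 2): i for i in range(10_000)}
blob, peak = peak_of(dump_compressed, stand_in)
print(len(blob), peak)
```

Comparing peak with len(blob) gives a rough idea of how much temporary memory the dump needs on top of its output.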

Thanks
S.

julian...@fsfe.org

Aug 2, 2024, 6:02:09 AM
to sage-devel
Hi Salvatore,

I couldn't reproduce the problem that you are seeing.

sage: R.<a,b,c> = ZZ[]
sage: D = {R.random_element(): ZZ.random_element() for _ in range(2**18)}
sage: save(D, 'deleteme')

The above uses a bit of RAM but not the amounts that you are seeing.

Usually, I use memray to debug such problems. Maybe you can give it a try? Or feel free to send me a (compressed) pickle and I'll have a look. (I'm also over on sagemath.zulipchat.com if you need any help with memray.)

julian

Salvatore Stella

Aug 2, 2024, 12:14:21 PM
to sage-...@googlegroups.com
Hi Julian,
somehow reproducing this from the same dictionary loaded from a savefile
yields less dramatic results. Here is what I am doing:

sage: # this will download roughly 150 MB of data
sage: bar = load("http://people.disim.univaq.it/~salvatore.stella/tmp/bar.sobj")
Attempting to load remote file: http://people.disim.univaq.it/~salvatore.stella/tmp/bar.sobj
Loading started
Loading ended
sage: # make a new dictionary so that we do not use any cached data
sage: foo = dict(bar.items())
sage: save(foo,"/tmp/foo")

This uses "only" 5 GB of RAM.
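
One caveat, assuming the copy is meant to get rid of cached data: dict(bar.items()) builds a new dictionary but reuses the very same key and value objects, so anything cached on those objects stays attached to them. A small pure-Python sketch, with a made-up Key class standing in for a polynomial:

```python
class Key:
    """Made-up stand-in for a key object that carries a per-object cache."""

    def __init__(self, n):
        self.n = n
        self.cache = {}

    def __hash__(self):
        return hash(self.n)

    def __eq__(self, other):
        return isinstance(other, Key) and self.n == other.n

bar = {Key(i): i for i in range(3)}
foo = dict(bar.items())  # new dict object, same key objects

# Dicts preserve insertion order, so the keys can be compared pairwise.
same_keys = all(k1 is k2 for k1, k2 in zip(bar, foo))
print(foo is bar, same_keys)
```

So the copy is a different dictionary, but the keys are shared with the original.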

I will have a look at memray and report back but first I'll probably try to
reach you on zulip.
Thanks
S.
