Memory leak in `NumberField().class_group().order()`


Georgi Guninski

Sep 4, 2024, 9:09:37 AM
to sage-...@googlegroups.com
Probably this shares the same bug as [1]

Calling `NumberField().class_group().order()` in a loop of size N:
#10^3 leaks: 40.03 MB 40026112 pari= [7950, 1451665]
#10^4 leaks: 338.49 MB 338493440 pari= [83505, 19297360]

The leak appears to be in the pari heap.

Code (.sage):
====
# Author: Georgi Guninski, Wed Sep 4 12:58:18 PM UTC 2024
# 10^3 leaks: 40.03 MB 40026112 pari= [7950, 1451665]
# 10^4 leaks: 338.49 MB 338493440 pari= [83505, 19297360]

import gc
import psutil

ps = psutil.Process()
x = var('x')

def leaknf5(N=10**3):
    # Record resident memory before the loop so that only the growth is reported.
    gc.collect()
    base = ps.memory_info().rss
    for A2 in range(1, N):
        Kn = NumberField(x^2 + A2, 'w')
        m = Kn.class_group().order()
    # Collect once more so that only objects still referenced are counted.
    gc.collect()
    mem = ps.memory_info().rss - base
    print(f"{mem/1e6:.2f} MB ", mem, " pari=", pari.getheap())

leaknf5(10**4)
====

[1]: https://groups.google.com/g/sage-devel/c/fWBls6YbXmw
Memory leak in `EllipticCurve([n,0]).root_number()` and problem in
algebraic geometry

Marc Culler

Sep 4, 2024, 4:13:32 PM
to sage-devel
I think that here you are seeing caching taking place, rather than a memory leak.  This is what I tried:

sage: import cypari2
sage: pari = cypari2.Pari()
sage: def test(N):
....:     for a in range(1, N):
....:         K = NumberField(x^2+a, 'w')
....:         m = K.class_group().order
....:     print(pari.getheap())
....:
sage: %time test(10**3)
[7950, 1451665]
CPU times: user 2.86 s, sys: 45.5 ms, total: 2.91 s
Wall time: 2.92 s
sage: %time test(10**3)
[7950, 1451665]
CPU times: user 160 ms, sys: 2.65 ms, total: 163 ms
Wall time: 163 ms
sage: %time test(10**4)
[83505, 19297360]
CPU times: user 29.6 s, sys: 336 ms, total: 30 s
Wall time: 30.1 s
sage: %time test(10**4)
[83505, 19297360]
CPU times: user 1.33 s, sys: 4.32 ms, total: 1.33 s
Wall time: 1.33 s

I assume that the huge reduction in time for the second run of the test function is due to caching, and that the extra space on the heap is used by cached objects.

- Marc

Georgi Guninski

Sep 5, 2024, 4:00:42 AM
to sage-...@googlegroups.com
On Wed, Sep 4, 2024 at 11:13 PM Marc Culler <marc....@gmail.com> wrote:
>
> I think that here you are seeing caching taking place, rather than a memory leak. This is what I tried:
>
>
You call this caching, I call it a leak; it can be both.
It is natural to compute the class numbers of QQ[sqrt(-n)], and
it shouldn't take GBs of RAM, IMHO.

Default PARI/GP is significantly faster with a 40MB stack;
is there some nfinit vs bnfinit drama?:

allocatemem(40*10^6);
default(timer,1);
{f(N)=
  for(a=1,N,
    K=bnfinit('x^2+a);
    m=K.clgp.no;
  );
}

? f(10^4)
cpu time = 5,028 ms, real time = 5,057 ms.
? f(10^5)
cpu time = 1min, 14,328 ms, real time = 1min, 15,146 ms.

Marc Culler

Sep 5, 2024, 1:36:39 PM
to sage-devel
Caching uses memory intentionally, for the purpose of speeding up computations.

Leaks use memory unintentionally, for no purpose.

I don't know where the caching happens. I only deduce that it exists because, when running the same computation twice, the second time is faster.

However, the caching does not happen in cypari2 (nor cypari).  This is what I see with cypari2 in sage:

sage: import cypari2
sage: pari = cypari2.Pari()
sage: def test(N):
....:     for a in range(1, N):
....:         K = pari.bnfinit(pari("x^2 + %s" % a))
....:         m = K.bnf_get_no()
....:
sage: %time test(10**3)
CPU times: user 543 ms, sys: 29.4 ms, total: 572 ms
Wall time: 630 ms
sage: %time test(10**4)
CPU times: user 7.14 s, sys: 48.4 ms, total: 7.18 s
Wall time: 7.2 s
sage: %time test(10**5)
CPU times: user 2min 1s, sys: 854 ms, total: 2min 2s
Wall time: 2min 2s

That uses 190MB, not "GBs".

The same computation with cypari in ipython is a bit faster but not much:
In [1]: from cypari import pari
In [2]: def test(N):
   ...:     for a in range(1, N):
   ...:         K = pari.bnfinit(pari("x^2 + %s" % a))
   ...:         m = K.bnf_get_no()
In [3]: %time test(10**3)
CPU times: user 410 ms, sys: 4.75 ms, total: 415 ms
Wall time: 415 ms
In [4]: %time test(10**4)
CPU times: user 6.05 s, sys: 36.1 ms, total: 6.08 s
Wall time: 6.09 s
In [5]: %time test(10**5)
CPU times: user 1min 51s, sys: 846 ms, total: 1min 52s
Wall time: 1min 53s

That computation uses 51MB.  Also not "GBs".

There is no question that computations which run entirely on the PARI stack are faster than computations which move each PARI GEN to the heap and wrap it in a python object.  That is presumably the reason that the cypari2 project was trying to leave GENs on the stack as long as possible.  Unfortunately, their implementation of that idea caused huge memory leaks.

I think your complaints about the Sage NumberField class are not directly relevant to cypari or cypari2.

Your observation that PARI runs faster than cypari or cypari2 applies to the design of Sage's PARI interface, which goes back to the beginning of Sage.  I am sure that a better design would be welcomed, if you had one to offer.  Any such interface would incur some cost, but maybe it would be possible to do better.

- Marc


Georgi Guninski

Sep 6, 2024, 5:26:55 AM
to sage-...@googlegroups.com
This is not a complaint; it is an observation about a bug of the memory-leak type.
To leak about one GB, run the testcase `leaknf5` from the top of the
thread with argument N=3*10^4:

#3*10^4 leaks: 1084.55 MB in 1m35.208s

Marc Culler

Sep 8, 2024, 11:12:14 AM
to sage-devel
I agree that this is a bug.  I do not think it is the same issue as the leak you reported involving elliptic curves.  The reason I don't think so is that it is possible to compute class numbers with no memory leak using the PARI getno function in either cypari or cypari2.  There are many things that can cause the PARI heap to grow (and it happens in cypari2 with just ordinary vectors and matrices, as discussed in cypari2 issue #112).  One major cause of PARI heap GENs not getting freed is that those GENs are managed by Python Gen objects which are not being deallocated due to references being held by other Python objects.  When a Python Gen is dealloc'ed it should free the PARI GEN which it is managing if that GEN is on the PARI heap.  That was not happening with the t_VEC GEN describing an elliptic curve, even though the Gen object was calling the PARI gunclone function, because the gunclone function was not freeing the "lazy" components of that vector.  (That has been fixed in cypari.)

I think something else is causing Sage NumberField objects to leak memory (i.e. to not be deallocated) in your example.  The fact that both issues involve growth of the PARI heap does not mean that both issues have the same cause.  The statement that they "probably" have the same cause is not supported by any evidence and I do not believe that they do have the same cause.

- Marc

Marc Culler

Sep 8, 2024, 12:02:56 PM
to sage-devel
Below is evidence (again) that the "leak" you are reporting is actually caused by caching NumberField objects or related data.  You can see that when a calculation using a certain NumberField is repeated it does not increase the size of the PARI heap, although the first time that the calculation is done the size does increase.  Moreover, the second time that the calculation is done it runs much faster.  These two things indicate to me that NumberField objects, or at least some of the computed data regarding a particular number field, are being cached.  This behavior is surely intentional, which means that some caution should be used when applying the label "bug" to it.  Some bugs are programming errors.  Others are features with unexpected side effects.  I suspect this is the latter case, and that a decision was made that the cost of memory used by the caching is worth paying for the increase in speed that it allows.  (Incidentally, writing loops which create many number fields and then immediately destroy each one is probably not the use case that the designers of the Sage NumberField had in mind.)

Looking at the code in number_field.py I see that when a number field is created a key is generated to uniquely identify that field.  The NumberField is not a class, but a function which accesses:
    class NumberFieldFactory(UniqueFactory)
These things support my belief that aspects of number fields are being cached and that the caching causes GENs on the PARI heap to be retained for the life of the cache, but allows properties of a cached number field to be looked up rather than recomputed.

The file number_field.py does not contain any attribution, so I do not know who the author(s) may be.  But the author(s) would be much better able to explain the rationale behind their design than I would.  Maybe some of them read this list ...

- Marc

sage: import cypari2
sage: pari = cypari2.Pari()
sage: pari.getheap()
[1, 9]
sage: for A in range(1, 10):
....:   Kn = NumberField(x^2+A,'w')
....:   m = Kn.class_group().order()
....:
sage: pari.getheap()
[21, 284]
sage: for A in range(1, 10):
....:   Kn = NumberField(x^2+A,'w')
....:   m = Kn.class_group().order()
....:
sage: pari.getheap()
[21, 284]
sage: def test(N):
....:     for A in range(1, N):
....:         Kn = NumberField(x^2+A,'w')
....:         m = Kn.class_group().order()
....:
sage: pari.getheap()
[21, 284]
sage: %time test(100)
CPU times: user 217 ms, sys: 5.09 ms, total: 222 ms
Wall time: 222 ms
sage: pari.getheap()
[201, 4265]
sage: %time test(100)
CPU times: user 22.7 ms, sys: 1.48 ms, total: 24.2 ms
Wall time: 23 ms
sage: pari.getheap()
[201, 4265]
sage:

Nils Bruin

Sep 8, 2024, 12:03:54 PM
to sage-devel
This example is definitely leaving loads of stuff on the python heap, so if there is a leak onto the cython heap then it is not the only one. My guess would be an interaction with the coercion system or UniqueRepresentation, which both keep global weak references to objects. If the key information ends up containing objects that hold a reference to the values in a weakly valued dictionary, the garbage collector cannot remove the cycle. These things are hellish to debug. The code below does find the relevant objects on the heap, though, so you can plot the backwards reference graphs of some of these objects to see what kinds of links are involved. We have resolved some of these bugs in the past.

```
import gc
from collections import Counter

gc.collect()
pre={id(a) for a in gc.get_objects()}

for A2 in range(1, 1000):
    Kn=NumberField(x^2+A2,'w')
    m=Kn.class_group().order()
    del Kn, m

gc.collect()
gc.collect()
T=Counter(str(type(a)) for a in gc.get_objects() if id(a) not in pre)
T
```
Notes for the code above:
 - if you run it twice you get a nearly empty list from T, which is consistent with UniqueRepresentation objects remaining (they would not be recreated if they already exist).
 - this is confirmed by rerunning the loop but now with another name than 'w' for Kn. Now new objects do get created!
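
The same before/after comparison can be tried outside Sage. Below is a minimal plain-Python sketch, not Sage code: `Node`, `_registry`, `leaky` and `clean` are hypothetical stand-ins for number fields and for whatever keeps them alive. Unlike the snippet above, it keeps the first snapshot list alive so that an `id` cannot be recycled by a new object during the workload.

```python
import gc
from collections import Counter

def new_objects_by_type(workload):
    """Run workload() and count the gc-tracked objects that survive it, by type name."""
    gc.collect()
    before_objs = gc.get_objects()          # kept alive so ids are not reused
    before = {id(obj) for obj in before_objs}
    workload()
    gc.collect()
    gc.collect()
    survivors = Counter(
        type(obj).__name__
        for obj in gc.get_objects()
        if id(obj) not in before
    )
    del before_objs
    return survivors

class Node:
    """Hypothetical stand-in for an object created in the loop."""

_registry = []  # hypothetical global store that keeps objects alive

def leaky():
    for _ in range(100):
        _registry.append(Node())   # a reference survives the loop

def clean():
    for _ in range(100):
        Node()                     # no reference survives the loop

leaky_counts = new_objects_by_type(leaky)
clean_counts = new_objects_by_type(clean)
print(leaky_counts["Node"], clean_counts["Node"])
```

The counter also reports a few bookkeeping objects created by the helper itself (the snapshot list and set), so only types whose count scales with the loop size are of interest.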

Dima Pasechnik

Sep 8, 2024, 3:02:07 PM
to sage-...@googlegroups.com
Can this be reproduced in plain Python with cypari2 installed?
One would need to replace the call to NumberField with the corresponding cypari2 equivalent.
This would at least tell whether it's a leak in cypari2, or not.

Dima



Marc Culler

Sep 8, 2024, 4:18:40 PM
to sage-...@googlegroups.com
As I said above this does not happen with either cypari or cypari2 when using getno.

This is not a cypari issue.  The issue is that Sage creates a "unique" object for each new number field, where new means that the input parameters for the NumberField function have not been used before. The number field objects are stored in a cache indexed by a key generated from the input parameters.  Those cached objects live for the entire session.  This is by design. The design did not anticipate creating a billion number fields in one Sage session.  That was not necessarily a bad choice.

Perhaps there is a way of disabling caching or clearing the cache.

- Marc


Dima Pasechnik

Sep 8, 2024, 5:00:14 PM
to sage-...@googlegroups.com
I'd say that, normally speaking, a cache is something of limited size,
and managed - once it is full, the least used objects are removed to
make room for new objects. I don't know if there are CASs which use
such a design.

An unlimited size cache is easier and more efficient - as long as you have RAM.
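
For comparison, a bounded cache of the kind described above can be sketched with Python's standard `functools.lru_cache`, which evicts the least recently used entry once `maxsize` is reached. This only illustrates the idea and says nothing about how Sage caches number fields; `expensive` is a hypothetical stand-in for a costly computation.

```python
from functools import lru_cache

@lru_cache(maxsize=2)
def expensive(n):
    """Hypothetical stand-in for a costly computation keyed by n."""
    return n * n

expensive(1)   # miss, cached
expensive(2)   # miss, cached
expensive(1)   # hit, 1 becomes the most recently used entry
expensive(3)   # miss, evicts 2 (the least recently used entry)

info = expensive.cache_info()
print(info)
```

The memory held by such a cache is bounded by `maxsize`, at the price of recomputing evicted entries.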

Nils Bruin

Sep 8, 2024, 5:17:31 PM
to sage-devel
On Sunday 8 September 2024 at 13:18:40 UTC-7 marc....@gmail.com wrote:
> As I said above this does not happen with either cypari or cypari2 when using getno.
>
> This is not a cypari issue.  The issue is that Sage creates a "unique" object for each new number field, where new means that the input parameters for the NumberField function have not been used before. The number field objects are stored in a cache indexed by a key generated from the input parameters.  Those cached objects live for the entire session.  This is by design. The design did not anticipate creating a billion number fields in one Sage session.  That was not necessarily a bad choice.
>
> Perhaps there is a way of disabling caching or clearing the cache.

I don't think that number fields are designed to be immortal. They probably end up being that because of a combination of references in the coercion framework and the UniqueRepresentation store. The idea of UniqueRepresentation objects is that when an object is created *with the same creation parameters* as an object that already exists, then that identical object is returned. One main advantage (and originally one of the important reasons for introducing that design) is that equality can then be reduced to identity, which is a lightning-fast comparison. In the coercion framework this is quite important because arithmetic operations can easily end up in inner loops, so a fast path for deciding that coercion doesn't need to do anything is important. Parent equality indicates this!

The reference dictionary that keeps track of UniqueRepresentation objects by itself does NOT try to make the objects immortal: if a parent gets created and then loses all references, it should be garbage collectible, and many UniqueRepresentation objects are. Once the object is garbage collected, a new instance would be created. This is realized by using a WeakValueDictionary for the UniqueRepresentation storage: the global reference to the object held by the UniqueRepresentation constructor does not count towards the reference count. So if there are otherwise no references (or only other weak ones) the object becomes eligible for garbage collection.
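
The mechanism described above can be sketched in plain Python with the standard `weakref.WeakValueDictionary`. This is a simplified illustration under stated assumptions, not Sage's actual UniqueRepresentation code; `Parent`, `_cache` and `unique_parent` are hypothetical names.

```python
import gc
import weakref

class Parent:
    """Hypothetical stand-in for a parent such as a number field."""
    def __init__(self, key):
        self.key = key

# Values are held weakly: the cache alone does not keep a Parent alive.
_cache = weakref.WeakValueDictionary()

def unique_parent(key):
    """Return the live Parent for key, creating one only if none exists."""
    obj = _cache.get(key)
    if obj is None:
        obj = Parent(key)
        _cache[key] = obj
    return obj

a = unique_parent(("x^2+1", "w"))
b = unique_parent(("x^2+1", "w"))
same_object = a is b             # same parameters give the identical object

del a, b                         # drop the only strong references
gc.collect()
alive_after_del = len(_cache)    # the weak entry vanished with the object
print(same_object, alive_after_del)
```

Here the key is a plain tuple of strings, which cannot refer back to the constructed object, so nothing prevents collection.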

In practice, for such complicated objects (particularly objects that participate in the coercion framework), the reference count is very likely never zero. For one thing, rings generally cache their 0 and 1 and those elements hold a reference back to their parent. So parents almost always have to be garbage collected by the cycle-detecting mark-and-sweep.
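
A small sketch of that situation, with hypothetical `Ring` and `Element` classes standing in for a parent and its cached 0 and 1: dropping the last outside reference is not enough, and only the cycle collector frees the object.

```python
import gc
import weakref

class Element:
    """Hypothetical element that points back to its parent."""
    def __init__(self, parent, value):
        self.parent = parent
        self.value = value

class Ring:
    """Hypothetical parent that caches its 0 and 1."""
    def __init__(self):
        self.zero = Element(self, 0)
        self.one = Element(self, 1)

was_enabled = gc.isenabled()
gc.disable()                       # make sure only the explicit collect() runs

r = Ring()
probe = weakref.ref(r)             # lets us observe whether r is still alive
del r
alive_after_del = probe() is not None       # cycle keeps the refcount above zero

gc.collect()                       # cycle detection reclaims the ring
alive_after_collect = probe() is not None

if was_enabled:
    gc.enable()
print(alive_after_del, alive_after_collect)
```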

A classic example of how an object can become immortal through this process is if one of the construction parameters of an object ends up participating in a reference chain to the constructed object. This can happen through caching operations, for caches that are only intended to survive for the lifetime of the object. Now there is a global reference (held by the construction parameter that is stored in UniqueRepresentation) to the object, so the entry from the WeakValueDict is never removed. Normally this key info would get dereferenced once the value gets garbage collected, so those keys would normally not be immortal either. But the cycle prevents removal.
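
The failure mode just described can be reproduced in a few lines of plain Python. This is a sketch under stated assumptions, not Sage's code: `Param`, `Parent`, `_cache`, `unique_parent` and `make` are hypothetical, and `Param.memo` plays the role of a cache attached to a construction parameter.

```python
import gc
import weakref

class Param:
    """Hypothetical construction parameter, e.g. a defining polynomial."""
    def __init__(self, name):
        self.name = name
        self.memo = {}             # a cache living on the parameter itself
    def __hash__(self):
        return hash(self.name)
    def __eq__(self, other):
        return isinstance(other, Param) and self.name == other.name

class Parent:
    """Hypothetical object constructed from a Param."""
    def __init__(self, param):
        self.param = param

# Keys are held strongly by the dictionary, values only weakly.
_cache = weakref.WeakValueDictionary()

def unique_parent(param):
    obj = _cache.get(param)
    if obj is None:
        obj = Parent(param)
        _cache[param] = obj
    return obj

def make(name, leak):
    p = Param(name)
    obj = unique_parent(p)
    if leak:
        # The key now strongly references the value: global dict -> key -> value.
        p.memo["parent"] = obj

make("clean", leak=False)
make("leaky", leak=True)
gc.collect()
surviving = sorted(k.name for k in _cache.keys())
print(surviving)
```

The "clean" entry disappears as soon as its parent loses its last reference, while the "leaky" entry stays reachable from the global dictionary through its own key, so no amount of garbage collection removes it.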

There are other such side-effects of some of the global WeakValueDicts that are used, as well as the MonoDicts and TripleDicts used in the coercion framework. For UniqueRepresentation, it means one needs to be very disciplined about the way the construction parameters are stored. For the most part, these should be rather simple objects that cannot hold references to the kinds of objects that are instantiated from the parameters. But this is a very difficult and non-local thing to establish, so it's almost impossible to get programmers to adhere to such discipline [partly because it's difficult to properly codify into simple-to-apply rules, and hence this hasn't been done].

Georgi Guninski

Sep 9, 2024, 4:44:36 AM
to sage-...@googlegroups.com
On Sun, Sep 8, 2024 at 6:12 PM Marc Culler <marc....@gmail.com> wrote:
>
> I think something else is causing Sage NumberField objects to leak memory (i.e. to not be deallocated) in your example. The fact that both issues involve growth of the PARI heap does not mean that both issues have the same cause. The statement that they "probably" have the same cause is not supported by any evidence and I do not believe that they do have the same cause.
>

OK, looks like I was wrong about "probably", sorry.
I no longer claim anything about the origin of this leak.

Dima Pasechnik

Sep 15, 2024, 1:08:18 PM
to sage-...@googlegroups.com
For me this code is rather unpredictable, as ipython and prompt_toolkit kick in
and produce extra objects.

For consistency (at least then the output values are reproducible)
it looks better to experiment with Sage's python (./sage --python)
for which the code needs to be adjusted, with "^" replaced by "**", and
two lines added at the front:

from sage.all import *
var('x')

Nils Bruin

Sep 15, 2024, 1:17:25 PM
to sage-devel
On Sunday 15 September 2024 at 10:08:18 UTC-7 dim...@gmail.com wrote:
> For me this code is rather unpredictable, as ipython and prompt_toolkit kick in
> and produce extra objects.
>
> For consistency (at least then the output values are reproducible)
> it looks better to experiment with Sage's python (./sage --python)
> for which the code needs to be adjusted, with "^" replaced by "**", and
> two lines added at the front:
>
> from sage.all import *
> var('x')

Generally, to check if there's a leak one would look at how the swell of objects on the heap grows with iterations. That drowns out the constant overhead of the shell.
Or you can just wrap the whole measuring process in a function, so that it gets executed in an enclosed scope. There are generally some objects whose creation does scale with iteration and whose type is rather specific. You can grab one of those objects on the heap and look at its backreference graph. That generally gives a pretty good pointer to where the global references are held. It can actually be a fairly interesting challenge for someone with an interest in learning the nitty gritty details of the sagemath memory setup. It doesn't require brilliance; mostly persistence.
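
One way to look at growth per iteration rather than at a single before/after difference is sketched below in plain Python. The helper names `type_counts` and `growth_per_iteration`, and the `Node`/`_kept` stand-ins, are hypothetical; in Sage the `step` function would create one number field and compute its class number.

```python
import gc
from collections import Counter

def type_counts():
    """Count all gc-tracked objects by type name, after a full collection."""
    gc.collect()
    return Counter(type(obj).__name__ for obj in gc.get_objects())

def growth_per_iteration(step, rounds=3, batch=50):
    """Run step() in batches and report, per batch, which type counts changed."""
    deltas = []
    prev = type_counts()
    for _ in range(rounds):
        for _ in range(batch):
            step()
        cur = type_counts()
        deltas.append(
            {name: cur[name] - prev[name] for name in cur if cur[name] != prev[name]}
        )
        prev = cur
    return deltas

class Node:
    """Hypothetical stand-in for an object created on every iteration."""

_kept = []  # hypothetical global store that keeps every Node alive

def step():
    _kept.append(Node())

deltas = growth_per_iteration(step)
print([d.get("Node") for d in deltas])
```

A type whose count rises by the batch size in every round scales with the iteration count, while shell overhead shows up as small, roughly constant differences.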

Marc Culler

Sep 15, 2024, 1:46:06 PM
to sage-...@googlegroups.com
> You can grab one of those objects on the heap and look at its backreference graph

How does one do that?

- Marc

Nils Bruin

Sep 15, 2024, 1:58:46 PM
to sage-devel
On Sunday 15 September 2024 at 10:46:06 UTC-7 marc....@gmail.com wrote:
> You can grab one of those objects on the heap and look at its backreference graph

> How does one do that?

You can grab one of the objects:

next(a for a in gc.get_objects() if id(a) not in pre and str(type(a)) == "<type you want>")

(if your type doesn't nail down the objects completely you may want a little more than just the next object from this generator)

From there, see https://pypi.org/project/objgraph/ . The author has linked some blog posts that illustrate how to use these tools to find memory leaks.
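
As a first step before drawing a full graph, the standard library's `gc.get_referrers` already lists the direct referrers of an object. The sketch below uses only the standard library, not objgraph; `Node`, `Holder` and `referrer_summary` are hypothetical names.

```python
import gc

class Node:
    """Hypothetical stand-in for an object that survives collection."""

class Holder:
    """Hypothetical stand-in for whatever keeps the object alive."""
    def __init__(self):
        self.items = []

def referrer_summary(obj):
    """Return the type names of the gc-tracked objects that refer directly to obj."""
    return sorted({type(r).__name__ for r in gc.get_referrers(obj)})

holder = Holder()
target = Node()
holder.items.append(target)

summary = referrer_summary(target)
print(summary)
```

The result includes the list inside `holder` as well as the namespace that holds the name `target`; repeating the call on each referrer walks one level further up the back-reference graph.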
 

Georgi Guninski

Sep 16, 2024, 9:20:37 AM
to sage-...@googlegroups.com
> from there, https://pypi.org/project/objgraph/ . The author has linked some blogposts to illustrate how to use these tools to find memory leaks

Maybe fighting leaks should start at the developer level, then QA.
Waiting to see gigabytes go missing in a minute is a very crude way
to recognize a leak.

Nils Bruin

Sep 16, 2024, 11:48:58 AM
to sage-devel
On Monday 16 September 2024 at 06:20:37 UTC-7 Georgi Guninski wrote:
> Maybe fighting leaks should start at the developer level, then QA.
> Waiting to see gigabytes go missing in a minute is a very crude way
> to recognize a leak.

It's very common for computer algebra packages to have memory leaks, particularly because useful caching in one situation can be a memory leak in another -- it can really depend on the use case. Some memory leaks may end up unavoidable with certain designs. They usually arise from scenarios not considered by the original authors. It's great that you find them and hopefully it helps educate a new generation of developers.

In practice, almost all computer algebra systems seem to benefit (both in memory use and in performance) from frequent restarts, so from a practical perspective one should always look for ways to chop a computation into smaller blocks and extract meaningful intermediate results so that they can be recreated. This matches strategies that help in generating reproducible results, so this is slightly less of a burden than one might think initially.

Hunting memory leaks is definitely necessary to maintain a workable system but I don't think that completely eliminating them (or ensuring new ones don't appear!) is an attainable goal.

Georgi Guninski

Sep 17, 2024, 4:48:12 AM
to sage-...@googlegroups.com
> It's very common for computer algebra packages to have memory leaks

At least you are honest about the generalization for the whole
software development theater.

Bill Gates said: "If I had a cent every time Windows crashes,
I would have been billionaire. Oh, wait..."

Dima Pasechnik

Sep 17, 2024, 10:04:18 AM
to sage-...@googlegroups.com
On Tue, Sep 17, 2024 at 9:48 AM Georgi Guninski <ggun...@gmail.com> wrote:
>
> > It's very common for computer algebra packages to have memory leaks
>
> At least you are honest about the generalization for the whole
> software development theater.

that's the legacy of developing software in a platform-independent
machine assembly language
known as C or C++, or Fortran :-)
Then, as we know, Greenspun's tenth rule states that:

Any sufficiently complicated C or Fortran program contains an ad hoc,
informally-specified, bug-ridden, slow implementation of half of
Common Lisp.

(I'd say, contains something that resembles programming in a sane
functional programming language)

But that's only part of the story, the other parts are:

* type checking is either not done, or gets overridden all the time,
* that one has to connect these "ad hoc, informally-specified,
bug-ridden, slow implementations" to each other.
* etc...

cypari(2) is a good example - we have a poorly documented libpari,
with Pari doing its own poorly documented and executed
memory/GC management, and Python, with its own, rather different, GC
management needs to keep track of
Pari objects, with an optional extra headache imposed by the Sage type system.

Dima





>
> Bill Gates said: "If I had a cent every time Windows crashes,
> I would have been billionaire. Oh, wait..."
>

Nils Bruin

Sep 17, 2024, 11:40:16 AM
to sage-devel
On Tuesday 17 September 2024 at 07:04:18 UTC-7 dim...@gmail.com wrote:

> that's the legacy of developing software in a platform-independent
> machine assembly language
> known as C or C++, or Fortran :-)
> Then, as we know, Greenspun's tenth rule states that:
>
> Any sufficiently complicated C or Fortran program contains an ad hoc,
> informally-specified, bug-ridden, slow implementation of half of
> Common Lisp.

That's not actually the kind of bug I was thinking of. The memory leaks that have been particularly hard to find in sage were on python/cython level, which are generally memory-safe languages. They are about how to handle the essentially global data structure that arises from the coercion framework.

For instance, if you'd store on ZZ the coercion maps to any ring in the normal way, every ring now has a reference from an object that will never be garbage collected (ZZ) and hence will not be eligible for collection either. Sage doesn't do that, of course, but the coercion graph does need to hold this information somehow. Doing that in a way that still allows parts to be pruned once some objects are not referenced very much anymore is truly a delicate problem. The globally unique "UniqueParent"s are another example of such a hard-to-manage global data structure, where garbage removal is very delicate. From python's (or common lisp's) perspective these wouldn't be memory leaks: they are just globally accessible structures ineligible for collection.

John Cremona

Sep 18, 2024, 7:10:21 AM
to sage-devel
I don't have anything helpful to add but here is something I just ran into (with version 10.2).  Here, E = EllipticCurve('162a1') -- but rerunning in a fresh Sage did not trigger the error.  The number of bytes in the second line is rather more than my laptop has, I think (and pari.stacksize() is only 8000000).

sage: [E.reduction(p).abelian_group().invariants() for p in primes_first_n(50) if p%4==1 and p>3]
/usr/lib/python3.10/inspect.py:3186: RuntimeWarning: cypari2 leaked 140516603778792 bytes on the PARI stack
  return self._bind(args, kwargs)
ERROR: removing wrong instance of Gen
Expected: [Mod(1, 173), Mod(172, 173), Mod(0, 173), Mod(167, 173), Mod(8, 173), Mod(170, 173), Mod(161, 173), Mod(32, 173), Mod(113, 173), Mod(124, 173), Mod(120, 173), Mod(25, 173), Mod(55, 173), Vecsmall([3]), [173, [112, 94, [6, 164, 3, 0]]], [177, [[177], 1], [[Mod(146, 173), Mod(122, 173)]], [177, [3, 1; 59, 1]]]]
Actual:   [Mod(1, 113), Mod(112, 113), Mod(0, 113), Mod(107, 113), Mod(8, 113), Mod(110, 113), Mod(101, 113), Mod(32, 113), Mod(53, 113), Mod(71, 113), Mod(61, 113), Mod(22, 113), Mod(84, 113), Vecsmall([3]), [113, [4, 96, [6, 104, 3, 0]]], [129, [[129], 1], [[Mod(21, 113), Mod(32, 113)]], [129, [3, 1; 43, 1]]]]
ERROR: inconsistent avma when removing Gen from PARI stack
Expected: 0x7fcc924092a8
Actual:   0x7fcc9240d190
ERROR: removing wrong instance of Gen
Expected: [Mod(1, 109), Mod(108, 109), Mod(0, 109), Mod(103, 109), Mod(8, 109), Mod(106, 109), Mod(97, 109), Mod(32, 109), Mod(49, 109), Mod(79, 109), Mod(79, 109), Mod(27, 109), Mod(90, 109), Vecsmall([3]), [109, [47, 94, [6, 100, 3, 0]]], [99, [[33, 3], 3], [[Mod(30, 109), Mod(35, 109)], [Mod(91, 109), Mod(3, 109)]], [33, [3, 1; 11, 1]]]]
Actual:   [Mod(1, 173), Mod(172, 173), Mod(0, 173), Mod(167, 173), Mod(8, 173), Mod(170, 173), Mod(161, 173), Mod(32, 173), Mod(113, 173), Mod(124, 173), Mod(120, 173), Mod(25, 173), Mod(55, 173), Vecsmall([3]), [173, [112, 94, [6, 164, 3, 0]]], [177, [[177], 1], [[Mod(146, 173), Mod(122, 173)]], [177, [3, 1; 59, 1]]]]
/usr/local/sage/sage-10.2/src/sage/schemes/curves/projective_curve.py:221: RuntimeWarning: cypari2 leaked 14760 bytes on the PARI stack
  Curve_generic.__init__(self, A, X)

Georgi Guninski

Sep 19, 2024, 2:05:44 AM
to sage-...@googlegroups.com
I can't reproduce the PARI message on 10.4, but your code leaks memory.
This might be the opposite of a memory leak: a use after free.
Use after free is in general considered a security vulnerability.