UniqueRepresentation of Partitions

82 views
Skip to first unread message

Martin R

unread,
Nov 21, 2024, 1:44:00 AM11/21/24
to sage-devel
Dear all,

my intention was to fix a small shortcoming, reported in https://github.com/sagemath/sage/issues/38897, which surfaced a few further small problems.

One thing I'm unsure about is the following:

Partitions currently inherits from UniqueRepresentation.  However, it is frequently used to get the cardinality of certain subsets, e.g., Partitions of restricted length or restricted part size.  Thus, for each combination of restrictions we create an entry in the global cache. In the example code, *very* many such parents a created.

I just checked IntegerVectors, which also accepts such a wide range of combinations of parameters.  In this class, only certain subclasses have UniqueRepresentation.  Should we do the same for Partitions?

It is actually not completely clear to me what UniqueRepresentation is good for in this setting.  Wouldn't it make sense only for parents that are likely to be used very often?

Have a nice day,

Martin 

Nils Bruin

unread,
Nov 21, 2024, 12:06:30 PM11/21/24
to sage-devel
The original reason for making UniqueRepresentation parents was for the coercion framework: If parents are UniqueRepresentation you can use equality-by-id and hash them super-easy (by id). Hence they can be looked up quickly and we can key lookup of coercion-maps easily on fast hashes and identity tests.

I do not think they were designed as a memory-saving measure: scenarios where a lot of elements are created usually have an obvious parent in the vicinity and otherwise parent creation could have an ad-hoc cache.

In fact, in practice UniqueRepresentation quite regularly creates memory leaks, due to the fact that the objects live in a global cache, which makes them hard to garbage-collect (that the cache is a WeakValueDictionary helps a bit but it's still very easy to get reference loops). Another problem is that these objects are truly global, so you can't cut corners on keeping them immutable! Even caching stuff on them can lead to very complicated non-local side effects.

Creating a UniqueRepresentation object also comes with an overhead: the construction parameters must be processed and made into a key for the cache. If the object already exists, the creation-by-lookup may recoup some of that overhead. So parents that are unlikely to be recreated multiple times probably experience reduced performance if they are UniqueRepresentation.

I don't think that the memory overhead of having *many* UniqueRepresentation objects at the same time is very significant: it won't change the order of growth of memory use; just the multiplicative factor. And probably not by much: a key is not likely to be larger than a complete "parent" datastructure (although the processing of the construction parameters can make for strange keys!)

It may be that a lot of code has now grown to expect that parents are UniqueRepresentation. If your non-UniqueRepresentation parent were to come in contact with such code, you'd probably run into trouble. But I don't know if or where that is likely to happen.

Martin R

unread,
Nov 22, 2024, 4:09:11 AM11/22/24
to sage-devel
Thank you, in particular for the explanation!  I tried to remove UniqueRepresentation from Partitions, and indeed, the application I was doing this for becomes a lot faster.  However, there are many doctests failing, so I'll postpone this.

Martin
Reply all
Reply to author
Forward
0 new messages