Vector spaces with immutable vectors by default

215 views
Skip to first unread message

Nils Bruin

unread,
Aug 5, 2021, 7:25:08 PM8/5/21
to sage-devel

Hi all,

I am fully aware of why vectors *can* be mutable and that having them start out life mutable makes sense (since you cannot really change an immutable vector back to mutable), but there is a scenario where it's super annoying. I ran into this recently in an actual practical piece of code. I won't bore you with the details. This is the setup:

V = VectorSpace(GF(2),6)
D = dict()

I end up using elements of V as keys into the dictionary [in this case, theta characteristics, which are naturally indexed by such vectors]. I end up doing arithmetic with these indices, so I end up with expressions such as:

W = V.subspace([V.1,V.2,V.3])

sum( D[v0+w] for w in W)

except that that doesn't work, because v0+w ends up being a fresh vector and hence is mutable; so not hashable.

One can work around it with a little helper function

def imm(v):
    v.set_immutable()
    return v

and then one can write:

sum( D[imm(v0+w)] for w in W)

it's just super annoying to have to do that everywhere, and of course one ends up getting an error message each time before actually putting this in place.

So perhaps for convenience in interactive use, it would be useful to have

VectorSpace(R,n,immutable_vectors=True)

or something similar. It's particularly for these kinds of situations where one wants to use *vectors*, not tuples, for indexing, because one WANTS the vector arithmetic. It's not for high-performance scenarios (that vectors in other situations definitely need to be able to accommodate).

Ideas?

Kwankyu Lee

unread,
Aug 5, 2021, 9:42:55 PM8/5/21
to sage-devel
On Friday, August 6, 2021 at 8:25:08 AM UTC+9 Nils Bruin wrote:
I am fully aware of why vectors *can* be mutable and that having them start out life mutable makes sense (since you cannot really change an immutable vector back to mutable)

I didn't know that a fresh new vector is mutable. I have never used this feature. Why does it make sense that a new vector starts as mutable? Why a mutable vector is necessary?

In general, I am for that every output in Sage should be hashable and hence immutable by default, with exceptions explicitly declared as such.
 

Nils Bruin

unread,
Aug 5, 2021, 10:18:30 PM8/5/21
to sage-devel
On Thursday, 5 August 2021 at 18:42:55 UTC-7 Kwankyu Lee wrote:
I didn't know that a fresh new vector is mutable. I have never used this feature. Why does it make sense that a new vector starts as mutable? Why a mutable vector is necessary?
It can't start out immutable and then changed to mutable since ... well ... an immutable vector is immutable. There's a promise made that it won't change its content in its lifetime. We can't break that promise. So once you think mutable vectors should be an *option*, having them start out as mutable is an easy way of making them available.

Mutable vectors and matrices are very important for numpy-style efficient algorithms: modifying matrices in-place (and I guess vectors as well) saves a lot of memory allocation churn for consecutively-allocated "float" and "int64" arrays. For our RealField and Integer/Rational based vectors this is already much less the case, because generally basic arithmetic *already* generates memory churn. I suspect in those cases a bit of pointer copying may not actually be so costly in comparison, so the overhead of newly allocating memory for results might actually not be so significant. In numpy and Pandas (which, oddly DOES copy a lot), it can be essential to do a lot of in-place modification, because the data structures can be huge, if you're working with high-dimensional vectors or matrices.

In general, I am for that every output in Sage should be hashable and hence immutable by default, with exceptions explicitly declared as such.

So ... I suspect there's a lot of code in sage where immutable vectors wouldn't hurt, because a lot of operations will rely on the default arithmetic, which creates new vectors for results. However, there are some places where it would be a problem. Indeed, I think it would be quite reasonable to ask people to construct a vector explicitly as mutable if they need it. Perhaps an optional argument to _element_constructor_ "mutable" that is by default False?

This could break a LOT of code. It would fail with sensible error messages (hopefully), but it might be a great pain to trace down *where* a vector gets created from the location where a mutation attempt is made. To avoid this breakage, we could make the default behaviour a property of the parent. Then we can maintain the old behaviour by default, while allowing to get "default immutable" behaviour at the cost of having to specify it explicitly on the parent.

In terms of history: https://trac.sagemath.org/ticket/25509 introduced a "immutable" argument to "vector" (with a False default of course). I think there may be other tickets concerning this issue.

Kwankyu Lee

unread,
Aug 5, 2021, 10:53:25 PM8/5/21
to sage-devel
On Friday, August 6, 2021 at 11:18:30 AM UTC+9 Nils Bruin wrote:
On Thursday, 5 August 2021 at 18:42:55 UTC-7 Kwankyu Lee wrote:
I didn't know that a fresh new vector is mutable. I have never used this feature. Why does it make sense that a new vector starts as mutable? Why a mutable vector is necessary?
It can't start out immutable and then changed to mutable since ... well ... an immutable vector is immutable. There's a promise made that it won't change its content in its lifetime. We can't break that promise. So once you think mutable vectors should be an *option*, having them start out as mutable is an easy way of making them available.

Mutable vectors and matrices are very important for numpy-style efficient algorithms: modifying matrices in-place (and I guess vectors as well) saves a lot of memory allocation churn for consecutively-allocated "float" and "int64" arrays. For our RealField and Integer/Rational based vectors this is already much less the case, because generally basic arithmetic *already* generates memory churn. I suspect in those cases a bit of pointer copying may not actually be so costly in comparison, so the overhead of newly allocating memory for results might actually not be so significant. In numpy and Pandas (which, oddly DOES copy a lot), it can be essential to do a lot of in-place modification, because the data structures can be huge, if you're working with high-dimensional vectors or matrices.

I understand this for matrices; we use matrices like lists.They belong to exceptions. But I don't understand it for vectors. 
 
In general, I am for that every output in Sage should be hashable and hence immutable by default, with exceptions explicitly declared as such.

So ... I suspect there's a lot of code in sage where immutable vectors wouldn't hurt, because a lot of operations will rely on the default arithmetic, which creates new vectors for results. However, there are some places where it would be a problem. Indeed, I think it would be quite reasonable to ask people to construct a vector explicitly as mutable if they need it.

I agree.

Kwankyu Lee

unread,
Aug 6, 2021, 12:01:28 AM8/6/21
to sage-devel
So ... I suspect there's a lot of code in sage where immutable vectors wouldn't hurt, because a lot of operations will rely on the default arithmetic, which creates new vectors for results. However, there are some places where it would be a problem.

Precisely this many failures if the default is mutable=False: 

----------------------------------------------------------------------
sage -t --warn-long 116.5 --random-seed=0 src/sage/schemes/cyclic_covers/cycliccover_finite_field.py  # 29 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/plot/plot3d/shapes2.py  # 1 doctest failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/geometry/polyhedron/base.py  # 3 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/geometry/polyhedron/library.py  # 1 doctest failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/geometry/triangulation/point_configuration.py  # 4 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/geometry/hyperplane_arrangement/arrangement.py  # 4 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/modular/modform_hecketriangle/readme.py  # 1 doctest failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/quadratic_forms/genera/genus.py  # 1 doctest failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/structure/element.pyx  # 3 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/modular/abvar/abvar.py  # 9 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/modules/free_module_element.pyx  # 8 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/quadratic_forms/quadratic_form__neighbors.py  # 5 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/stats/distributions/discrete_gaussian_lattice.py  # 6 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/algebras/cluster_algebra.py  # 14 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/schemes/toric/chow_group.py  # 17 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/homology/homology_vector_space_with_basis.py  # 122 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/schemes/toric/variety.py  # 14 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/schemes/toric/divisor.py  # 64 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/tests/books/computational-mathematics-with-sagemath/linsolve_doctest.py  # 1 doctest failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/geometry/polyhedron/ppl_lattice_polytope.py  # 44 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/geometry/triangulation/element.py  # 2 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/combinat/finite_state_machine.py  # 1 doctest failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/crypto/block_cipher/present.py  # 34 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/geometry/polyhedron/parent.py  # 5 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/numerical/interactive_simplex_method.py  # 53 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/numerical/backends/interactivelp_backend.pyx  # 76 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/schemes/toric/weierstrass_covering.py  # 26 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/geometry/polyhedron/ppl_lattice_polygon.py  # 33 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/geometry/integral_points.pyx  # 9 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/schemes/toric/weierstrass.py  # 33 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/schemes/elliptic_curves/jacobian.py  # 17 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/rings/polynomial/real_roots.pyx  # 2 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/homology/homology_morphism.py  # 43 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/homology/algebraic_topological_model.py  # 15 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/modules/vector_double_dense.pyx  # 6 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/modules/vector_real_double_dense.pyx  # 2 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/schemes/toric/divisor_class.pyx  # 8 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/modules/fg_pid/fgp_element.py  # 2 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/homology/chain_homotopy.py  # 15 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/modules/vector_modn_dense.pyx  # 2 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/modules/vector_integer_dense.pyx  # 2 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/quadratic_forms/qfsolve.py  # 1 doctest failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/functions/spike_function.py  # 5 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/geometry/polyhedron/lattice_euclidean_group_element.py  # 3 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/modules/vector_complex_double_dense.pyx  # 2 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/topology/cubical_complex.py  # 4 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/topology/cell_complex.py  # 16 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/topology/simplicial_complex_morphism.py  # 12 doctests failed
sage -t --warn-long 116.5 --random-seed=0 src/sage/topology/simplicial_complex.py  # 4 doctests failed
---------------------------------------------------------------------- 

I looked into some. They use vectors like lists or mutable matrices.

Nils Bruin

unread,
Aug 6, 2021, 4:47:54 PM8/6/21
to sage-devel
On Thursday, 5 August 2021 at 21:01:28 UTC-7 Kwankyu Lee wrote:

I looked into some. They use vectors like lists or mutable matrices.

Yes, I expect that this will happen more with code in the future. People who have exposure to numpy et.al. tend to write code that way. I suspect that in many scenarios in sage, there's no benefit over working with a list except for some rare cases where one ALSO needs actual vector arithmetic. But I'd expect the ship has sailed on this: there is too much code out there already that uses sage vectors as arrays to make it practical to change that default. So how bad would it be to potentially double the number of vector spaces in existence by giving them an extra parameter determining whether fresh vectors (such as those resulting from vector arithmetic) should be mutable or not.

William Stein

unread,
Aug 6, 2021, 5:13:19 PM8/6/21
to sage-...@googlegroups.com
On Fri, Aug 6, 2021 at 1:47 PM Nils Bruin <nbr...@sfu.ca> wrote:
On Thursday, 5 August 2021 at 21:01:28 UTC-7 Kwankyu Lee wrote:

I looked into some. They use vectors like lists or mutable matrices.

Yes, I expect that this will happen more with code in the future. People who have exposure to numpy et.al. tend to write code that way. I suspect that in many scenarios in sage, there's no benefit over working with a list except for some rare cases where one ALSO needs actual vector arithmetic.

Possible benefits may include:

- potentially much faster: for similar reasons to numpy, a mutable sage vector can be much faster than a generic Python list.

- type safety: every element of a vector has the same parent ring; a list can have any elements in it

- length safety: you can't change the number of entries in a mutable list

But I'd expect the ship has sailed on this: there is too much code out there already that uses sage vectors as arrays to make it practical to change that default. So how bad would it be to potentially double the number of vector spaces in existence by giving them an extra parameter determining whether fresh vectors (such as those resulting from vector arithmetic) should be mutable or not.

--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/sage-devel/12f4e8a8-716e-4b0c-b26b-3ea708eb16f7n%40googlegroups.com.
--
-- William Stein

Kwankyu Lee

unread,
Aug 6, 2021, 6:59:27 PM8/6/21
to sage-devel
On Saturday, August 7, 2021 at 5:47:54 AM UTC+9 Nils Bruin wrote:
So how bad would it be to potentially double the number of vector spaces in existence by giving them an extra parameter determining whether fresh vectors (such as those resulting from vector arithmetic) should be mutable or not.

Having vector.immutable() method that returns the same vector but set immutable would be a simpler solution for your annoyance. No?
 

Nils Bruin

unread,
Aug 6, 2021, 7:15:40 PM8/6/21
to sage-devel
I don't think so. Between writing imm(v+w) and (v+w).immutable() I think the first version is much easier to write. However, the real annoyance is coming from the fact that what I would write every time is a dictionary like { v+w : ... } and every time it will cost me getting an error message before I correct it to { imm(v+w): ...}.
With an option on the parent, it would cost me one error message in the session before I change the parent construction to

V=VectorSpace(k,n, immutable_vectors=True)

and then I'm good for the rest of the session. I've been sceptical about such an option before and for library use I don't think it's such a big deal, but I've now encountered in real life an interactive scenario where it's super-annoying. It seriously made me consider moving the computation over to magma, where hashing sums of vectors is no problem at all (the syntax for associative arrays is a little less convenient, though...).

Kwankyu Lee

unread,
Aug 6, 2021, 7:16:17 PM8/6/21
to sage-devel
On Saturday, August 7, 2021 at 6:13:19 AM UTC+9 wst...@gmail.com wrote:
Possible benefits may include:

- potentially much faster: for similar reasons to numpy, a mutable sage vector can be much faster than a generic Python list.


sage: def build():
....:     v = vector(QQ, 1000)
....:     for i in range(1000):
....:         v[i] = i/(i+1)
....:     return v
....: 
....: 
sage: def build2():
....:     l = list()
....:     for i in range(1000):
....:         l.append(i/(i+1))
....:     return vector(QQ, l)
....: 
....: 
sage: 
sage: timeit('build()')
125 loops, best of 3: 2.93 ms per loop
sage: timeit('build2()')
125 loops, best of 3: 2.74 ms per loop
sage: timeit('build()')
125 loops, best of 3: 2.9 ms per loop
sage: timeit('build2()')
125 loops, best of 3: 2.73 ms per loop 

William Stein

unread,
Aug 6, 2021, 7:31:23 PM8/6/21
to sage-devel
On Fri, Aug 6, 2021 at 4:16 PM Kwankyu Lee <ekwa...@gmail.com> wrote:
>
>
>
> On Saturday, August 7, 2021 at 6:13:19 AM UTC+9 wst...@gmail.com wrote:
>>
>> Possible benefits may include:
>>
>> - potentially much faster: for similar reasons to numpy, a mutable sage vector can be much faster than a generic Python list.
>>

To clarify, by "similar reasons to numpy", I meant that you open up
the possibility of using Cython, vectorized operations, JITs like
numba, etc. Some of these provide order of magnitude speedups and
aren't an option with generic Python lists. Here's a little example:

n = 10^6

def build():
v = vector(RDF, n)
for i in range(n):
v[i] = i/(i+1)
return v


def build2():
l = list()
for i in range(n):
l.append(i/(i+1.0))
return l

a = build()
b = build2()

%time a.sum()
1.49ms
%time sum(b)
188ms

Now imagine you are implementing an algorithm that modifies a bunch of
entries of a, then computes a sum. You can replace "sum" by any other
method on vectors, e.g., a dot product. If there were no mutable
vectors in Sage, then you would make whole classes of algorithms
possibly 100x slower to implement. Numpy (and their ilk, e.g., numba
jit, cython support for numpy, etc.) make these algorithms very fast
to implement...

>
> sage: def build():
> ....: v = vector(QQ, 1000)
> ....: for i in range(1000):
> ....: v[i] = i/(i+1)
> ....: return v
> ....:
> ....:
> sage: def build2():
> ....: l = list()
> ....: for i in range(1000):
> ....: l.append(i/(i+1))
> ....: return vector(QQ, l)
> ....:
> ....:
> sage:
> sage: timeit('build()')
> 125 loops, best of 3: 2.93 ms per loop
> sage: timeit('build2()')
> 125 loops, best of 3: 2.74 ms per loop
> sage: timeit('build()')
> 125 loops, best of 3: 2.9 ms per loop
> sage: timeit('build2()')
> 125 loops, best of 3: 2.73 ms per loop
>
> --
> You received this message because you are subscribed to the Google Groups "sage-devel" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sage-devel/15652d60-1f36-43cf-bb64-94622d1438acn%40googlegroups.com.



--
William (http://wstein.org)

Nils Bruin

unread,
Aug 6, 2021, 7:32:17 PM8/6/21
to sage-devel
On Friday, 6 August 2021 at 14:13:19 UTC-7 wst...@gmail.com wrote:

Possible benefits may include:

- potentially much faster: for similar reasons to numpy, a mutable sage vector can be much faster than a generic Python list.

I wonder about that. The main reason why it's fast in numpy because in-place modification is really *in-place*. In the generic setting, your entries are just going to be references to independently allocated python objects. Mutations would most likely involve decreffing (and often deleting) the old object and allocating a new one. It would also be much faster to have mutable mpfr reals, because those could be manipulated without memory allocation activity, and yet we don't have those. For special implementations (RDF, GF(q) matrices etc) of course we have fully in-place storage versions, so there it works well. But to get the speed, you'd probably have to kick down to cython level, where the "immutable" status is advisory anyway. So I'm not so sure if we're really getting so much speed benefit from it.

In much user code, one would definitely start out writing vector expressions using expression notation, in which case the vectors aren't getting mutated anyway. Only once you find there's a bottleneck somewhere would you rewrite those bits ... and then making the relevant object mutable explicitly would then be quite natural.

I find it quite telling that pandas by default basically always makes copies. It has an option to make changes in-place instead, and if you're working with very large data sets you'd have to, but it is surprising how far you can get with just working with non-mutated objects. And for them it's not even motivated by wanting to have tables hashable (I guess they're not)

- type safety: every element of a vector has the same parent ring; a list can have any elements in it
yup
- length safety: you can't change the number of entries in a mutable list
an array would do that too.
 

Nils Bruin

unread,
Aug 6, 2021, 7:39:26 PM8/6/21
to sage-devel
On Friday, 6 August 2021 at 16:31:23 UTC-7 wst...@gmail.com wrote:
To clarify, by "similar reasons to numpy", I meant that you open up
the possibility of using Cython, vectorized operations, JITs like
numba, etc. Some of these provide order of magnitude speedups and
aren't an option with generic Python lists.

To clarify, this thread certainly didn't start out with the question if sage should have mutable vectors at all (although I do think that in most cases, an array would then work just as well). I think it is worth reconsidering whether all vectors need to start out their life being mutable, because that does come with a significant usability penalty in hash-related scenarios.

It is indeed the case that we could just spell these examples as {tuple(v+w): ... }, and probably that is what cached_function argument manglers do, but .... to me that just looks super ugly (and the annoying part is that it'll take me a error for every time I need to insert that).

Kwankyu Lee

unread,
Aug 6, 2021, 7:41:09 PM8/6/21
to sage-devel
Actually it depends:

sage: def build():
....:     v = vector(ZZ, 1000)
....:     for i in range(1000):
....:         v[i] = Integer(i)
....:     return v
....: 
....: 
sage: def build2():
....:     l = list()
....:     for i in range(1000):
....:         l.append(Integer(i))
....:     return vector(ZZ, l)
....: 
....: 
sage: timeit('build()')
625 loops, best of 3: 647 μs per loop
sage: timeit('build2()')
625 loops, best of 3: 926 μs per loop
sage: timeit('build()')
625 loops, best of 3: 653 μs per loop
sage: timeit('build2()')
625 loops, best of 3: 922 μs per loop

David Roe

unread,
Aug 6, 2021, 8:44:02 PM8/6/21
to sage-devel
I think Nils' original solution, of adding an immutable keyword option to the parent, is a good one.  I've run into a similar issue with matrices, where I had to make them immutable before using them as dictionary keys.
David

--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+...@googlegroups.com.

William Stein

unread,
Aug 6, 2021, 8:45:40 PM8/6/21
to sage-devel
On Fri, Aug 6, 2021 at 5:44 PM David Roe <roed...@gmail.com> wrote:
>
> I think Nils' original solution, of adding an immutable keyword option to the parent, is a good one. I've run into a similar issue with matrices, where I had to make them immutable before using them as dictionary keys.
> David
>

I agree with both of you. (My example above was just arguing not to
completely remove mutable vectors.)

> On Fri, Aug 6, 2021 at 7:39 PM Nils Bruin <nbr...@sfu.ca> wrote:
>>
>> On Friday, 6 August 2021 at 16:31:23 UTC-7 wst...@gmail.com wrote:
>>>
>>> To clarify, by "similar reasons to numpy", I meant that you open up
>>> the possibility of using Cython, vectorized operations, JITs like
>>> numba, etc. Some of these provide order of magnitude speedups and
>>> aren't an option with generic Python lists.
>>
>>
>> To clarify, this thread certainly didn't start out with the question if sage should have mutable vectors at all (although I do think that in most cases, an array would then work just as well). I think it is worth reconsidering whether all vectors need to start out their life being mutable, because that does come with a significant usability penalty in hash-related scenarios.
>>
>> It is indeed the case that we could just spell these examples as {tuple(v+w): ... }, and probably that is what cached_function argument manglers do, but .... to me that just looks super ugly (and the annoying part is that it'll take me a error for every time I need to insert that).
>>
>> --
>> You received this message because you are subscribed to the Google Groups "sage-devel" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+...@googlegroups.com.
>> To view this discussion on the web visit https://groups.google.com/d/msgid/sage-devel/430ef312-156b-4364-8b73-456429d9decan%40googlegroups.com.
>
> --
> You received this message because you are subscribed to the Google Groups "sage-devel" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sage-devel/CAChs6_nTdsobT2_5_ZAG6Fvt4d3yH9wn-YcJ8013Ws3oMjGEJg%40mail.gmail.com.



--
William (http://wstein.org)

Markus Wageringel

unread,
Aug 7, 2021, 4:58:42 AM8/7/21
to sage-devel
Generally, I would be in favor of having an option for immutable vectors by default, as I had similar uses before. Usually, I just convert everything to tuples.

However, I am not sure if having an additional flag on the parent is worth the extra burden on maintenance. Already the sparsity flag is not always handled correctly (#29359, #29360), immutability may be even easier to miss. In my own code, I tend to construct vectors by `vector(ring, [..])`, which relies on the uniqueness of the parent and will not preserve immutability of vectors. Though, it is fair to say that this code is already broken as it does not preserve sparsity either. On the other hand, code that is careful about preserving the parent could break if now the parent vector space does not allow for mutation. For example, in case of a (fabricated) function like

    def f(vec, i):
        w = vec.parent()(0)
        w[i] = vec[i]
        return w

one would have to convert the vector from an immutable vector space to one that allows for mutable elements before calling it. This just displaces the original problem.


Regarding the performance discussion, (although this was not the question) using vectors also has the potential of being much slower than using numpy arrays directly.

    sage: a = vector(CDF, [CDF(1, i/(i+1)) for i in range(10^6)])
    sage: %time c1 = a.conjugate()
    Wall time: 4.39 s
    sage: %time c2 = a.parent()(a.numpy().conj())
    Wall time: 11.8 ms

Matthias Koeppe

unread,
Aug 7, 2021, 1:37:03 PM8/7/21
to sage-devel
Related:
(Refined protocol for _element_constructor_ in element classes with mutability)

Nils Bruin

unread,
Aug 8, 2021, 3:15:58 AM8/8/21
to sage-devel
On Saturday, 7 August 2021 at 01:58:42 UTC-7 Markus Wageringel wrote:
Generally, I would be in favor of having an option for immutable vectors by default, as I had similar uses before. Usually, I just convert everything to tuples.

However, I am not sure if having an additional flag on the parent is worth the extra burden on maintenance. Already the sparsity flag is not always handled correctly (#29359, #29360), immutability may be even easier to miss.

It's also less important. As long as the usual arithmetic within the parent respects it, it's fine.
 
In my own code, I tend to construct vectors by `vector(ring, [..])`, which relies on the uniqueness of the parent and will not preserve immutability of vectors. Though, it is fair to say that this code is already broken as it does not preserve sparsity either.

And it's inefficient if you have the parent lying around anyway! The globar constructor has to mangle the construction parameters, use them as a key to look up in the global cache, and then get you the parent. If you have the desired parent already in hand, none of that is necessary! So I'd say that's just a bad habit.

Whether "vector" should by default choose a vector space with default mutable or immutable vectors is another matter (it's safer to have them immutable unless the user really wants them mutable, so that would be the sensible default, but it would go against legacy, so may be harder).

Another point is coercion: if we have

V = VectorSpace(GF(5),3,immutable_vectors = True)
W = VectorSpace(GF(5),3,immutable_vectors = False)

what should V.gen(1)+W.gen(1) be? Does V get priority?

Otherwise, I think having the flag will be a life-saver in various interactive scenarios and low-impact outside. So I think it should be fairly uncontroversial to implement it.

Regarding the performance discussion, (although this was not the question) using vectors also has the potential of being much slower than using numpy arrays directly.

    sage: a = vector(CDF, [CDF(1, i/(i+1)) for i in range(10^6)])
    sage: %time c1 = a.conjugate()
    Wall time: 4.39 s
    sage: %time c2 = a.parent()(a.numpy().conj())
    Wall time: 11.8 ms

Yes, while there are some valid vector-mutating algorithms, I think they would usually arise within linear algebra modules. On user level, a mutable vector is usually just an array and would be served perfectly by a numpy (or python) array.

Travis Scrimshaw

unread,
Aug 9, 2021, 1:14:41 AM8/9/21
to sage-devel
I think Nils' original solution, of adding an immutable keyword option to the parent, is a good one.  I've run into a similar issue with matrices, where I had to make them immutable before using them as dictionary keys.

I also ran into this when implementing Lie algebras over matrices. I had to do special casing and subclasses to get around the mutability that doesn't exist for other classes in Sage. I would like to see some way of working with immutable matrices/vectors. (Internally, we need to mutate stuff in place for efficiency, such as in echelonize(). However, I believe we are all in agreement about still supporting mutable matrices/vectors.)

Best,
Travis

Michael Jung

unread,
Aug 9, 2021, 6:54:03 PM8/9/21
to sage-devel
I like the idea proposed in https://trac.sagemath.org/ticket/29101 (as posted by Matthias). Namely, introducing the option mutable=False for the element constructor. Then your code could be rewritten to

sum( D[V(v0+w, mutable=False)] for w in W)

which should be fine, I guess.

Nils Bruin

unread,
Aug 9, 2021, 8:07:09 PM8/9/21
to sage-devel
On Monday, 9 August 2021 at 15:54:03 UTC-7 Michael Jung wrote:
I like the idea proposed in https://trac.sagemath.org/ticket/29101 (as posted by Matthias). Namely, introducing the option mutable=False for the element constructor. Then your code could be rewritten to

sum( D[V(v0+w, mutable=False)] for w in W)

which should be fine, I guess.

No, that is not an improvement at all. imm(v0+w) is shorter than that. Note that in my case, v0 and w are *already* elements of V, so Sage is perfectly capable of working out that their sum v0+w should lie in V again. It's just that currently V creates its new elements by default as unhashable, whereas it should be perfectly fine to use elements from a vector space as indices in a dictionary. What we need is a way to convince the parent V to create its new elements in a hashable way.

An almost-solution is to define

Vinf = PolynomialRing(GF(2), 'e')

Its elements are by default hashable and the space V should be the GF(2)-span of [1,e,e^2,e^3,e^4,e^5].  The space W would be the GF(2)-span of [1,e,e^2].

The problem is that polynomial rings do not have a concept of subsets that are modules over their base ring, so it's difficult to write the sum: you'd have to construct the index set W explicitly by hand. Plus your vectors would print in a weird way.

I think the fact that polynomial rings give an almost-but-not-quite solution is a good illustration that we're missing a part of fundamental infrastructure here: almost all arithmetic elements in sage have hashing by default; including things like real numbers, where it's probably a bad idea. It's just vector spaces and matrices where an extra action is currently always required before hashing works. It breaks an otherwise quite expressive notational system that allows python to almost read like mathematics.

It may well be that it's convenient internally to have element constructors to allow for a "mutable=True/False" flag. It may even help with implementing the feature under consideration here: making sure that vector spaces have an option to produce (immediately) hashable elements as a result from arithmetic operations in that vector space. But it does not replace the need for being able to set that flag on a parent.

Kwankyu Lee

unread,
Aug 9, 2021, 10:03:04 PM8/9/21
to sage-devel
On Tuesday, August 10, 2021 at 9:07:09 AM UTC+9 Nils Bruin wrote:
It may well be that it's convenient internally to have element constructors to allow for a "mutable=True/False" flag. It may even help with implementing the feature under consideration here: making sure that vector spaces have an option to produce (immediately) hashable elements as a result from arithmetic operations in that vector space. But it does not replace the need for being able to set that flag on a parent.

I thought we need a ticket and a chart to guide this discussion: 

Lorenz Panny

unread,
Aug 9, 2021, 10:43:29 PM8/9/21
to sage-...@googlegroups.com

Have we considered the idea to simply call self.set_immutable() whenever
__hash__ is invoked? I think that would solve Nils' original problem and
it would mean things run smoothly for users who don't know about mutable
and immutable objects. In that case, the exception for trying to modify
immutable vectors should probably explain why the vector was frozen.


On Fri, 6 Aug 2021 01:25:08 +0200, Nils Bruin <nbr...@sfu.ca> wrote:
> Ideas?

Nils Bruin

unread,
Aug 9, 2021, 11:01:31 PM8/9/21
to sage-devel
On Monday, 9 August 2021 at 19:43:29 UTC-7 Lorenz Panny wrote:
Have we considered the idea to simply call self.set_immutable() whenever
__hash__ is invoked? I think that would solve Nils' original problem
It would solve my problem and it would avoid a whole bunch of problems we'd have with implementation of "parallel" default mutable/default immutable parents, since they'd mess up the coercion graph considerably.

I suspect it could lead to rather nightmarish debugging scenarios where someone really thought they had a mutable vector and now have to track down which operation lost the problem. Especially in the "easier to ask forgiveness" programming paradigm that is common in Python I'd imagine caching code might work something like this:

try:
    h = hash(v)
except TypeError:
    h = hash(tuple(v))

and that would have perhaps surprising results with the proposal. I like it, though. It's a really lightweight solution. A little hackish, though. But very practical.
 

David Roe

unread,
Aug 9, 2021, 11:02:45 PM8/9/21
to sage-devel
I actually like this idea a lot.  My main concern is that it may not be immediately clear where a hash is being called for a user (for example, using it as an argument to a @cached_method will call __hash__).  So when they try to mutate the vector/matrix and it fails, it may be difficult for them to find out where in the code they need to remove a hash.

An awkward "solution" would be to save the stack when you set it immutable so that you can print it when the user tries to modify the object.  But I certainly don't want to add that amount of memory to each vector.  Maybe a debugging mode for the parent could implement this, but discovery would be a problem....
David

--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+...@googlegroups.com.

Matthias Koeppe

unread,
Aug 10, 2021, 12:46:44 PM8/10/21
to sage-devel
On Monday, August 9, 2021 at 7:43:29 PM UTC-7 Lorenz Panny wrote:
Have we considered the idea to simply call self.set_immutable() whenever
__hash__ is invoked? 

-1 on this; too complicated semantics

 

Nils Bruin

unread,
Aug 10, 2021, 6:39:14 PM8/10/21
to sage-devel
The semantics are actually quite straightforward: it's hashability-on-demand.  It makes the system as permissible as is possible while staying consistent. One would really have no business asking the hash of a structure that is supposed to be mutable, and I expect that there is very little code out there that would be hit by this (because, really, one has to be rather purposeful in sage to actually mutate a vector. You really end up using it in a rather non-mathematical way.

The only concern would be these accidental "try hash(v) except TypeError" scenarios, but I don't know if those actually occur in the sage library.

It would be interesting to see how much of the library breaks if we put this in place. I suspect it might be very little.

Matthias Koeppe

unread,
Aug 10, 2021, 6:46:10 PM8/10/21
to sage-devel
On Tuesday, August 10, 2021 at 3:39:14 PM UTC-7 Nils Bruin wrote:
On Tuesday, 10 August 2021 at 09:46:44 UTC-7 Matthias Koeppe wrote:
On Monday, August 9, 2021 at 7:43:29 PM UTC-7 Lorenz Panny wrote:
Have we considered the idea to simply call self.set_immutable() whenever
__hash__ is invoked? 

-1 on this; too complicated semantics
 
The semantics are actually quite straightforward: it's hashability-on-demand. 

... which are semantics for which there is absolutely no precedence in Python.

Kwankyu Lee

unread,
Aug 10, 2021, 9:03:57 PM8/10/21
to sage-devel
On Wednesday, August 11, 2021 at 7:39:14 AM UTC+9 Nils Bruin wrote:
The semantics are actually quite straightforward: it's hashability-on-demand. 

The demand is implicit and not explicitly demanded, like set_immutable(), by the user. We may imagine a situation that this causes a surprise to the user.

Travis Scrimshaw

unread,
Aug 10, 2021, 9:53:50 PM8/10/21
to sage-devel
I am -1 on hashability-on-demand for the suggested reason: it could lead to hard-to-track-down bugs in code where we end up calling a hash of something that could be *very* implicit, input to a UniqueRepresentation or a @cached_method/function. As Nils said, the user has no business hashing a mutable object, and we should explicitly tell the user that rather than silently changing it. It violates the principle of least surprise.

Imagine a bank that gave 50% of all your deposits to charity because it thought that is what you wanted to do because you checked out a charity website, but you only would find out about this when you checked your balance. I would not be happy with this bank.

Best,
Travis

Nils Bruin

unread,
Aug 10, 2021, 11:54:02 PM8/10/21
to sage-devel
On Tuesday, 10 August 2021 at 18:53:50 UTC-7 Travis Scrimshaw wrote:
I am -1 on hashability-on-demand for the suggested reason: it could lead to hard-to-track-down bugs in code where we end up calling a hash of something that could be *very* implicit, input to a UniqueRepresentation or a @cached_method/function. As Nils said, the user has no business hashing a mutable object, and we should explicitly tell the user that rather than silently changing it. It violates the principle of least surprise.

At the moment, it's just a conjecture that such scenarios *might* be possible. I think we need to see some real-world example where this problem arises before we further assess how much we need to mitigate against this.

But that brings us back to the problem that I did not ASK for a mutable object! I wanted an immutable one!

V=VectorSpace(GF(2),3)
v = V.gen(1)
w = V.gen(2)
v.set_immutable() #this is annoying, but  OK
w.set_immutable() #this is again annoying
D={}
D[v]=1
D[w]=2
D[v+w]=3 #the error here is uncalled for!

If we're going to insist on people expressing intent on whether vectors will need to be mutable or immutable, then in parallel with all other arithmetic structures, vectors should be by default immutable. If someone wants a mutable vector, they should ask for a mutable explicitly. If we'd start from scratch I'm pretty sure that's what we would arrive at. So if we're going for purity of design, I think we'd need to take the leap and go with that.

That's going to
 
Imagine a bank that gave 50% of all your deposits to charity because it thought that is what you wanted to do because you checked out a charity website, but you only would find out about this when you checked your balance. I would not be happy with this bank.

and I don't see the parallel with the topic here, so let's leave the straw men out of it.

William Stein

unread,
Aug 11, 2021, 12:10:43 AM8/11/21
to sage-...@googlegroups.com
Just to add to all the ideas, what about using a with statement, eg:

with Immutable():
    # the things Nils wishes were immutable are immutable by default here

--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+...@googlegroups.com.
--
-- William Stein

Nils Bruin

unread,
Aug 11, 2021, 12:48:24 AM8/11/21
to sage-devel
On Tuesday, 10 August 2021 at 20:54:02 UTC-7 Nils Bruin wrote:

V=VectorSpace(GF(2),3)
v = V.gen(1)
w = V.gen(2)
v.set_immutable() #this is annoying, but  OK
w.set_immutable() #this is again annoying
D={}
D[v]=1
D[w]=2
D[v+w]=3 #the error here is uncalled for!

It's actually worse than that! The stored generators of a vector space are actually immutable! So the session really goes:

V=VectorSpace(GF(2),3)
D={}
D[V.1]=1 #happy
D[V.2]=2 #still happy
D[V.1+V.2]=3 # Error??? !@#?! Why is the sum of two vectors now suddenly mutable??

This is a very surprising scenario. It happens in real life. I know because it happened to me.

Nils Bruin

unread,
Aug 11, 2021, 2:09:59 AM8/11/21
to sage-devel
On Tuesday, 10 August 2021 at 21:10:43 UTC-7 wst...@gmail.com wrote:
Just to add to all the ideas, what about using a with statement, eg:

with Immutable():
    # the things Nils wishes were immutable are immutable by default here
 
That would basically move the "unless otherwise instructed, new elements are constructed mutable/immutable" from a flag on the parent to global state. On the plus side, we don't need to duplicate parents or (worse!) have a toggle on parents that should be immutable. On the minus side, we now have truly global state that governs this behaviour. The "with" context manager hides it a little bit, but the implementation would control the toggle (globally!) based on time, not scope (because there is no such thing in python).
In fact for my scenario (an interactive session) I would not be able to write in a "with Immutable" environment, but that's fine: you can just take the context manager without the "with" and call the "__enter__" on it.

A big plus: since we'll have the option to flick between the two different default behaviours, perhaps it gives a gentler route towards "default immutable", where mutable vectors can be had, but need to be explicitly requested. Which is probably the better place for the long-time future.

So while global state is generally to be avoided, perhaps in this situation it is an instrument that can get us ahead eventually.

Vincent Delecroix

unread,
Aug 11, 2021, 2:27:35 AM8/11/21
to sage-...@googlegroups.com
Very tempting to use environment. It could even be refined to

with Immutable(V):
# elements of V are now immutable by default

But even then, some code elsewhere in sage might use the very same
parent (eg ZZ^2 is unique). This parent might want to deal with
mutable vectors. So toggling options on the parent does not look
like an option to me.

Matthias Koeppe

unread,
Aug 11, 2021, 1:20:10 PM8/11/21
to sage-devel
On Tuesday, August 10, 2021 at 11:27:35 PM UTC-7 vdelecroix wrote:

It could even be refined to

with Immutable(V):
# elements of V are now immutable by default

But even then, some code elsewhere in sage might use the very same
parent (eg ZZ^2 is unique). This parent might want to deal with
mutable vectors. So toggling options on the parent does not look
like an option to me.

Indeed. It's also not threadsafe.

Strong -1 on using context managers for this.

 

Matthias Koeppe

unread,
Aug 11, 2021, 1:41:04 PM8/11/21
to sage-devel
On Monday, August 9, 2021 at 5:07:09 PM UTC-7 Nils Bruin wrote:
On Monday, 9 August 2021 at 15:54:03 UTC-7 Michael Jung wrote:
I like the idea proposed in https://trac.sagemath.org/ticket/29101 (as posted by Matthias). Namely, introducing the option mutable=False for the element constructor. Then your code could be rewritten to

sum( D[V(v0+w, mutable=False)] for w in W)

which should be fine, I guess.

No, that is not an improvement at all. imm(v0+w) is shorter than that.

"imm(v0+w)" is only shorter *after* one defines Nils' local workaround function "imm" and doesn't count the definition toward the length.

Another spelling would be (v0+w).copy(mutable=False), by the way.

Overall my suggestion would be to work out the intended semantics for the element constructor and element methods (copy and arithmetic), as I started in https://trac.sagemath.org/ticket/29101 -- which has not received very substantial comments yet. 

The discussion of semantic innovations for the parents seems like a distraction to me. Note that in Python, the distinction between the mutable and immutable version of a type is usually expressed by different types: list/tuple, set/frozenset; and in libraries like SymPy, Matrix/ImmutableMatrix, ... Changing from mutable to immutable is done by creating a new object of the immutable version of the type. So the very operation "set_immutable" of Sage elements (mutating the mutability...) is already out of line with Python conventions; and in fact, this is what makes workaround functions like Nils' "imm" necessary if one wants to use functional notation in generator expressions etc.

For large vectors, one may of course be concerned with the overhead of making the immutable copy. Standard object ownership management techniques would come to the rescue. For example: Make the immutability flag 3-way: immutable/mutable/copy-on-write. Then mutable_element.copy(mutable=False) makes a shallow copy only in O(1) and sets mutable_element to copy-on-write. Mutating a copy-on-write element would make the actual copy of the vector.



Kwankyu Lee

unread,
Aug 11, 2021, 8:59:55 PM8/11/21
to sage-devel
On Thursday, August 12, 2021 at 2:41:04 AM UTC+9 Matthias Koeppe wrote:
The discussion of semantic innovations for the parents seems like a distraction to me. Note that in Python, the distinction between the mutable and immutable version of a type is usually expressed by different types: list/tuple, set/frozenset; and in libraries like SymPy, Matrix/ImmutableMatrix, ... Changing from mutable to immutable is done by creating a new object of the immutable version of the type. So the very operation "set_immutable" of Sage elements (mutating the mutability...) is already out of line with Python conventions; and in fact, this is what makes workaround functions like Nils' "imm" necessary if one wants to use functional notation in generator expressions etc.

At this point, I want to advertise the ticket


which aims to create parents with immutable elements  living alongside with parents with mutable elements. This would amount to elevate mutability to the level of sparsity in Sage. Depending on how we implement this, we may lose efficiency at some point, but I think we may implement it to be backward compatible.  

Matthias Koeppe

unread,
Sep 2, 2021, 12:47:57 PM9/2/21
to sage-devel
On Wednesday, August 11, 2021 at 10:41:04 AM UTC-7 Matthias Koeppe wrote:

Overall my suggestion would be to work out the intended semantics for the element constructor and element methods (copy and arithmetic), as I started in https://trac.sagemath.org/ticket/29101

Reply all
Reply to author
Forward
0 new messages