# Should matrix spaces keep violating the basic assumption on hash functions, or should they be unique parents?

11 views

### Simon King

Jan 11, 2012, 4:25:47 AM1/11/12
to sage-devel
Hi!

Matrix spaces of dense and sparse matrices are equal and are thus not
unique parents:
sage: M1 = MatrixSpace(ZZ,2,sparse=True)
sage: M2 = MatrixSpace(ZZ,2,sparse=False)
sage: M1 is M2
False
sage: M1==M2
True

Fine. But what should be really bad: They violate the assumption that
equal objects have equal hash.
sage: hash(M1)==hash(M2)
False

By consequence, the two equal matrix spaces will correspond to
different items of dictionaries:
sage: D = {M1:1,M2:2}
sage: len(D)
2

I think that it is an apparent bug. Therefore, at #12290, I am fixing
the hash values (and moreover I speed-up the hash of matrix spaces
considerably).

However, it seems that the coercion framework relies on the bug.
Namely, when fixing it, one gets
sage: A = matrix(ZZ, 5, range(30), sparse=False)
sage: B = matrix(ZZ, 5, range(30), sparse=True)
sage: C = matrix(QQ, 5, range(30), sparse=True)
sage: A.elementwise_product(C).is_dense()
True
sage: B.elementwise_product(C).is_sparse()

---------------------------------------------------------------------------
TypeError Traceback (most recent
call last)
...
TypeError: no common canonical parent for objects with parents:
'Full MatrixSpace of 5 by 6 sparse matrices over Integer Ring' and
'Full MatrixSpace of 5 by 6 sparse matrices over Rational Field'

The problem can be fixed by making matrix spaces unique parents (but I
didn't check yet whether it creates other problems).

Question to you: Do you see an obvious reason why matrix spaces should
not be unique parents?

Best regards,
Simon

### Nicolas M. Thiery

Jan 11, 2012, 5:50:36 AM1/11/12
On Wed, Jan 11, 2012 at 01:25:47AM -0800, Simon King wrote:
> Matrix spaces of dense and sparse matrices are equal and are thus not
> unique parents:
> sage: M1 = MatrixSpace(ZZ,2,sparse=True)
> sage: M2 = MatrixSpace(ZZ,2,sparse=False)
> sage: M1 is M2
> False
> sage: M1==M2
> True
>
> Fine. But what should be really bad: They violate the assumption that
> equal objects have equal hash.
> sage: hash(M1)==hash(M2)
> False
>
> By consequence, the two equal matrix spaces will correspond to
> different items of dictionaries:
> sage: D = {M1:1,M2:2}
> sage: len(D)
> 2
>
> I think that it is an apparent bug.

+1

> Therefore, at #12290, I am fixing the hash values (and moreover I
> speed-up the hash of matrix spaces considerably).

Cool.

> However, it seems that the coercion framework relies on the bug.
> Namely, when fixing it, one gets
> sage: A = matrix(ZZ, 5, range(30), sparse=False)
> sage: B = matrix(ZZ, 5, range(30), sparse=True)
> sage: C = matrix(QQ, 5, range(30), sparse=True)
> sage: A.elementwise_product(C).is_dense()
> True
> sage: B.elementwise_product(C).is_sparse()
>
> ---------------------------------------------------------------------------
> TypeError Traceback (most recent
> call last)
> ...
> TypeError: no common canonical parent for objects with parents:
> 'Full MatrixSpace of 5 by 6 sparse matrices over Integer Ring' and
> 'Full MatrixSpace of 5 by 6 sparse matrices over Rational Field'
>
> The problem can be fixed by making matrix spaces unique parents (but I
> didn't check yet whether it creates other problems).
>
> Question to you: Do you see an obvious reason why matrix spaces should
> not be unique parents?

If at all possible, I for myself would rather have them as unique
parents.

On the other hand, I remember it was argued against it for
VectorSpaces: indeed users might want to create several copies of a
given vector space, in particular to use different scalar products.
I don't know if there are similar needs/use for matrix spaces.

In the case of VectorSpaces, I would be more in favor of having unique
parents, and making the scalar product a part of the data to construct
the vector space. That is:

VectorSpace(QQ,3, scalar_product=...)

would model QQ^3, *endowed* with the given scalar product, which is
different from QQ^3.

Also, we could still allow the user to create his/her private vector
space by specifying an extra key to the constructor (that's what
CombinatorialFreeModule does). But that might require some refactoring
and might break backward compatibility.

Cheers,
Nicolas
--
Nicolas M. Thi�ry "Isil" <nth...@users.sf.net>
http://Nicolas.Thiery.name/

### Simon King

Jan 11, 2012, 6:46:00 AM1/11/12
to sage-devel
Hi Nicolas, hi all,

On 11 Jan., 11:50, "Nicolas M. Thiery" <Nicolas.Thi...@u-psud.fr>
wrote:
> > Question to you: Do you see an obvious reason why matrix spaces should
> > not be unique parents?
>
> If at all possible, I for myself would rather have them as unique
> parents.
>
> On the other hand, I remember it was argued against it for
> VectorSpaces: indeed users might want to create several copies of a
> given vector space, in particular to use different scalar products.
> I don't know if there are similar needs/use for matrix spaces.

Here, the only extra structure is "sparse versus dense". Namely,
currently, a matrix space is determined as unique object by
base_ring, nrows, ncols, sparse
but equality is tested by
base_ring, nrows, ncols
only.

Of course, "dense versus sparse" is not a mathematical property, but
only concerns implementation (so, it is different from a scalar
product, which carries mathematical information).

From my perspective, it only matters whether one can do arithmetic
involving a sparse and a dense matrix -- that's to say, whether the
coercion model is easily able to handle it. And for the coercion
model, it is easier when the parents are unique.

Two remarks:
1) With my patch from #12290, the time for returning the hash of a
matrix space drops from 13.8 µs to 547 ns. But when using
UniqueRepresentation as a base class of matrix spaces, then one even
comes to 489 ns. So, if you mainly care for speed then using
UniqueRepresentation is the way to go.

2) A more general consideration:
The coercion model prefers to have unique parents. But many people
think that "A == B" should mean "A and B are canonically isomorphic",
and not just "A is B". That could be solved by making the coercion
model consequently use containers that compare by "is" and not by
"==". For example, consequently use
sage.structure.coerce_dict.TripleDict and not dict.

In that way, we could both strengthen the coercion model and make the
people happy. But it would probably involve some tedious work...

Best regards,
Simon

### Marco Streng

Jan 11, 2012, 7:30:07 AM1/11/12
On 11/01/2012 11:46, Simon King wrote:
> 2) A more general consideration: The coercion model prefers to have
> unique parents. But many people think that "A == B" should mean "A
> and B are canonically isomorphic", and not just "A is B". That could
> be solved by making the coercion model consequently use containers
> that compare by "is" and not by "==".

I'm not sure what you mean by ``making the coercion model use containers that compare by "is" and not by "=="'', but I'll just chime in as I think it could be related to something that I found.

Currently, if two parents A and B are equal according to "==" and no good coercion map is found, then A.coerce_map_from(B) returns A._generic_convert_map(B). In case of number fields, this means that if two number fields have same generator name and same defining polynomial, then the generators are identified by coercion. There is a NumberField example where this is incompatible with maps that I would consider natural. So in the NumberField case, I would prefer not to have a coercion in this case, but only a conversion.

Marco

PS. More specifically, the example is

sage: K.<a> = NumberField(x^2-2, embedding=-1)
sage: L.<b> = NumberField(x^2-2, embedding=1)
sage: xK = K['x'].gen()
sage: xL = L['x'].gen()
sage: M.<c> = NumberField(xK^2-3)
sage: N.<d> = NumberField(xL^2-3)
sage: O = M.absolute_field('e')
sage: P = N.absolute_field('e')
sage: b_in_a = K(0)+b
sage: map1 = O.structure()
sage: map2 = P.structure()
sage: b_in_a in map1.domain()
# True
sage: b in map2.domain()
# True
sage: map1(b_in_a) - map2(b)
# e^3 - 9*e, which is non-zero, so the maps don't commute!

The maps O <--> M <-- K <--> L --> N <--> P are natural (structure morphisms and coercions).
The fields O and P have same generator name and same defining polynomials, hence O==P is True. However, O.structure() and P.structure() are distinct, which makes O._generic_convert_map(P) incompatible with the natural maps O <--> M <-- K <--> L --> N <--> P

So I would prefer there not to be a coercion from O to P. I'm fine with conversion.

### Simon King

Jan 11, 2012, 8:25:54 AM1/11/12
to sage-devel
Hi Marco,

On 11 Jan., 13:30, Marco Streng <marco.str...@gmail.com> wrote:
> I'm not sure what you mean by ``making the coercion model use containers
> that compare by "is" and not by "=="'',

If you have a dictionary D with keys A and B such that "A==B" but "A
is not B", then you will always have "D[A] is D[B]" -- of course
unless you screw that hash functions of A and B, which currently is
the case for matrix spaces. This is simply how Python dictionaries
work.

However, sage.structure.coerce_dict.TripleDict has the property that
(1) each key is a triple and (2) the three parts of the key are
compared by identity and not by equality. Hence, "D[A,A,A] is not
D[A,B,A]".

> but I'll just chime in as I
> think it could be related to something that I found.

The topic is certainly (weakly, at least :) related.

> sage: K.<a> = NumberField(x^2-2, embedding=-1)
> sage: L.<b> = NumberField(x^2-2, embedding=1)
> sage: xK = K['x'].gen()
> sage: xL = L['x'].gen()
> sage: M.<c> = NumberField(xK^2-3)
> sage: N.<d> = NumberField(xL^2-3)
> sage: O = M.absolute_field('e')
> sage: P = N.absolute_field('e')

Yep, non-uniqueness strikes again:
sage: O==P
True
sage: O is P
False

But I think in this case there should be other ways to ensure that the
conversion from O to P is really just used as conversion and not as
coercion.
Cheers,
Simon

### Simon King

Jan 11, 2012, 8:44:35 AM1/11/12
to sage-devel
Hi!

#12290 is now ready for review.

For reference, the ticket contains one not-totally-finished patch that
only improves the hash.
However, the patch that is supposed to be reviewed goes beyond and
introduces UniqueRepresentation as a base class of matrix spaces.

First feature: All tests pass, the hash problem is fixed and in
addition the hash is even faster than with the old patch.

Second feature: Coercion of matrices is a lot nicer now, IMHO!!

Namely, in unpatched sage, the parent of the sum of a dense and a
sparse matrix implicitly depends on the summation order:
sage: M = MatrixSpace(ZZ, 10)
sage: N = MatrixSpace(ZZ, 10,sparse=True)
sage: a = M.random_element()
sage: b = N.random_element()
sage: (a+b).parent()
Full MatrixSpace of 10 by 10 dense matrices over Integer Ring
sage: (b+a).parent()
Full MatrixSpace of 10 by 10 sparse matrices over Integer Ring

With my patch, the parent of b+a is the same as the parent of a+b.

Cheers,
Simon