Equality, hashing and polynomial rings

61 views
Skip to first unread message

Jean-Pierre Flori

unread,
Aug 19, 2014, 11:13:36 AM8/19/14
to sage-...@googlegroups.com
Hi,

I'm currently wrapping FLINT fq modules into new Sage classes for finite fiels and got strange coercion errors when running non trivial code.
Indeed, the polynomial ring constructors use a dictionary and so relies on the hash of finite fields (the parents).
By some chance, the hashes of two previous implementations of finite fields using PARI where different because some variable name was replaced by something else in some object and this prevented this problem to occur.

I read http://www.sagemath.org/doc/developer/coding_in_python.html#the-hash-special-method but would welcome any advice.

Though (non-prime) finite fields with the same order and defining polynomial are considered equal, I'd say the hashes should be different.
So, let's add the string of the implementation of the finite field into it's hash (which is cached by the way).
Does that seem sensible?

Or should I modify the finite field code so that these hashes are equal whatever the implementation is?
Then we'd need coercion between all of the implementations (which is not currently the case but should be done anyway), but that would surely yield a non-negligible speed penalty, and you'll never know what the implementation of the base ring of the polynomial ring will be as it will depend on the order finite fields and their polynomial rings are created.

Or should we modify the caching system of the polynomial rings not to use hashes but ids?

Best,
JP

Peter Bruin

unread,
Aug 19, 2014, 1:39:18 PM8/19/14
to sage-...@googlegroups.com
Hi Jean-Pierre,

I'm not sure I understand correctly; do you mean that the problem is caused by fields comparing equal even if the implementations are different?  I think we should in any case make FiniteField inherit from WithEqualityById, so two instances compare equal if and only if they are identical.  Would this be enough?

Peter

Jean-Pierre Flori

unread,
Aug 19, 2014, 2:08:33 PM8/19/14
to sage-...@googlegroups.com
Not sure.
I seem to be able to produce things with different hashes but equal...
sage: GF(13**5, 'a', impl='pari_ffelt') == GF(13**5, 'a', impl='flint_fq')
True
Though I have:
sage: GF(13**5, 'a', impl='pari_ffelt') == GF(13**5, 'a', impl='pari_mod')
False
Maybe that's because of the __cmp__ functions I copied/pasted/slightly modified.
Or maybe I forgot to implement some methods as the new classes are cdefed.
I'll check that.

Even if it boils down to a different matter, I don't feel that the current other __hash__ method are good enough as well to tell between different finite field.
It mostly work by chance.
And indeed whe copy/paste/modify'ing it, it gave me similar hashes for different implems.
Not sure for what __hash__ is used directly.
In most cases the EqualityById stuff might do the trick.

Jean-Pierre Flori

unread,
Aug 19, 2014, 2:26:13 PM8/19/14
to sage-...@googlegroups.com
I also see that some double underscore methods rather than single underscore ones are implemented, that might be one of the problems here.

Jean-Pierre Flori

unread,
Aug 19, 2014, 2:28:50 PM8/19/14
to sage-...@googlegroups.com


On Tuesday, August 19, 2014 8:26:13 PM UTC+2, Jean-Pierre Flori wrote:
I also see that some double underscore methods rather than single underscore ones are implemented, that might be one of the problems here.
Or the other way around...

Jean-Pierre Flori

unread,
Aug 19, 2014, 2:53:32 PM8/19/14
to sage-...@googlegroups.com
I guess _cmp_ should be used for cython parents and __cmp__ for python ones.
There is in parent.pyx:
            # Both are parents -- but need *not* have the same type.
            if HAS_DICTIONARY(left):
                r = left.__cmp__(right)
            else:
                r = left._cmp_(right)

Simon King

unread,
Aug 19, 2014, 3:05:15 PM8/19/14
to sage-...@googlegroups.com
Hi Jean-Pierre,

On 2014-08-19, Jean-Pierre Flori <jpf...@gmail.com> wrote:
> Though (non-prime) finite fields with the same order and defining
> polynomial are considered equal, I'd say the hashes should be different.

Absolutely no way. If they are considered equal, then the hashes must be
the same.

It may, however, be a possibility to make them compare unequal, but with
coercion maps in both directions.

> So, let's add the string of the implementation of the finite field into
> it's hash (which is cached by the way).
> Does that seem sensible?

No.

> Or should I modify the finite field code so that these hashes are equal
> whatever the implementation is?
> Then we'd need coercion between all of the implementations (which is not
> currently the case but should be done anyway), but that would surely yield
> a non-negligible speed penalty,

Why should that be the case? Once a coercion is established, it will be
looked up from cache, which by the way uses a special kind of dictionary
that compares parents by identity, not equality.

> and you'll never know what the
> implementation of the base ring of the polynomial ring will be as it will
> depend on the order finite fields and their polynomial rings are created.

So what? If possible, avoid making your code depend on types.

> Or should we modify the caching system of the polynomial rings not to use
> hashes but ids?

This might indeed be a possibility. As mentioned above, the coercion
system uses such caches.

Let F1, F2 be two different implementation of "the same" finite field.
Currently, F1[x] and F2[x] are identical objects, which may indeed
constitute a problem. Apart from making the cache use comparison by
identity, there is the possibility to be explicit about the
implementation of the polynomial ring (or do you still want to use the default
implementation for the polynomial rings if there are different implementations
of the base ringst?). Then your cache problem would vanish, since the
implementation is part of the cache key (the PolynomialRing constructor
has an "implementation" optional argument).

Best regards,
Simon


Peter Bruin

unread,
Aug 19, 2014, 3:10:38 PM8/19/14
to sage-...@googlegroups.com
Hi Jean-Pierre,

> I'm not sure I understand correctly; do you mean that the problem is
> caused by fields comparing equal even if the implementations are
> different? I think we should in any
>
> Not sure.
> I seem to be able to produce things with different hashes but equal...
> sage: GF(13**5, 'a', impl='pari_ffelt') == GF(13**5, 'a', impl='flint_fq')
> True

This should probably be False; even though the fields are canonically
isomorphic, pretending that they are equal sounds like asking for
trouble to me...

> Though I have:
> sage: GF(13**5, 'a', impl='pari_ffelt') == GF(13**5, 'a', impl='pari_mod')
> False
> Maybe that's because of the __cmp__ functions I copied/pasted/slightly
> modified.
> Or maybe I forgot to implement some methods as the new classes are cdefed.
> I'll check that.

I'm testing a patch that makes FiniteField behave like WithEqualityById
(unfortunately, inheriting from it is not possible because Cython does
not support multiple inheritance) and removes all hashing and comparison
methods from other finite field classes.

By the way, there is no point in relying on hashes being different for
"different" objects if they still compare equal; there is always some
small probability of a hash collision. (I just saw that Simon King
wrote something to the same effect.)

Peter


Simon King

unread,
Aug 19, 2014, 3:13:07 PM8/19/14
to sage-...@googlegroups.com
Hi Peter,

On 2014-08-19, Peter Bruin <pjb...@gmail.com> wrote:
> I'm not sure I understand correctly; do you mean that the problem is caused
> by fields comparing equal even if the implementations are different? I
> think we should in any case make FiniteField inherit from WithEqualityById,
> so two instances compare equal if and only if they are identical. Would
> this be enough?

I think this would make sense---of course with bidirectional coercions,
as mentioned in my previous mail.

Cheers,
Simon

Jean-Pierre Flori

unread,
Aug 19, 2014, 3:13:51 PM8/19/14
to sage-...@googlegroups.com


On Tuesday, August 19, 2014 9:05:15 PM UTC+2, Simon King wrote:
Hi Jean-Pierre,

On 2014-08-19, Jean-Pierre Flori <jpf...@gmail.com> wrote:
> Though (non-prime) finite fields with the same order and defining
> polynomial are considered equal, I'd say the hashes should be different.

Absolutely no way. If they are considered equal, then the hashes must be
the same.
You mean that it should be this way (for parents, or elements, or any class whether from cython or python)?
I assume this is for parents.

It may, however, be a possibility to make them compare unequal, but with
coercion maps in both directions.

> So, let's add the string of the implementation of the finite field into
> it's hash (which is cached by the way).
> Does that seem sensible?

No.

Ok.
> Or should I modify the finite field code so that these hashes are equal
> whatever the implementation is?
> Then we'd need coercion between all of the implementations (which is not
> currently the case but should be done anyway), but that would surely yield
> a non-negligible speed penalty,

Why should that be the case? Once a coercion is established, it will be
looked up from cache, which by the way uses a special kind of dictionary
that compares parents by identity, not equality.

I mean actually applying the coercion might be expensive.
That's why I'd like that when building a polynomial ring over a finite field using a given implementation it uses the same implementation to "represent" its coefficients.

Jean-Pierre Flori

unread,
Aug 19, 2014, 3:26:27 PM8/19/14
to sage-...@googlegroups.com
Thanks for all the answers.

It seems my issues was caused by defining the _cmp_ method with the wrong number of underscores in my cython extension classes.
Do you confirm that for parents defined in a usual Python class __cmp__ with two underscores should be used and for a cython extension class _cmp_ with only one underscore?

I've put back my original __hash__ functions which evaluate equal on the two different classes and have now nice polynomial rings as I wanted.
Do you confirm this "hash collision" is not a problem?

Thanx,
JP

Jean-Pierre Flori

unread,
Aug 19, 2014, 4:04:24 PM8/19/14
to sage-...@googlegroups.com
On top of that, Python3 want have cmp anymore :)

Peter Bruin

unread,
Aug 19, 2014, 4:38:45 PM8/19/14
to sage-...@googlegroups.com
Hi Simon,

>> I'm not sure I understand correctly; do you mean that the problem is
>> caused by fields comparing equal even if the implementations are
>> different? I think we should in any case make FiniteField inherit
>> from WithEqualityById, so two instances compare equal if and only if
>> they are identical. Would this be enough?
>
> I think this would make sense---of course with bidirectional
> coercions, as mentioned in my previous mail.

This is now implemented in #16855 (ready for review).

Peter

Nils Bruin

unread,
Aug 20, 2014, 8:46:06 PM8/20/14
to sage-...@googlegroups.com


On Tuesday, August 19, 2014 12:13:07 PM UTC-7, Simon King wrote:
I think this would make sense---of course with bidirectional coercions,
as mentioned in my previous mail.
 
Bidirectional coercion implies memory leak: The strong references to the codomains on the coercion maps will keep both fields alive. The weakly keyed global coercion dict will therefore keep the coercion maps accessible, which keep the fields alive. We've had these leaks before.
Reply all
Reply to author
Forward
0 new messages