61 views

Skip to first unread message

Aug 19, 2014, 11:13:36 AM8/19/14

to sage-...@googlegroups.com

Hi,

I'm currently wrapping FLINT fq modules into new Sage classes for finite fiels and got strange coercion errors when running non trivial code.

Indeed, the polynomial ring constructors use a dictionary and so relies on the hash of finite fields (the parents).

By some chance, the hashes of two previous implementations of finite fields using PARI where different because some variable name was replaced by something else in some object and this prevented this problem to occur.

I read http://www.sagemath.org/doc/developer/coding_in_python.html#the-hash-special-method but would welcome any advice.

Though (non-prime) finite fields with the same order and defining polynomial are considered equal, I'd say the hashes should be different.

So, let's add the string of the implementation of the finite field into it's hash (which is cached by the way).

Does that seem sensible?

Or should I modify the finite field code so that these hashes are equal whatever the implementation is?

Then we'd need coercion between all of the implementations (which is not currently the case but should be done anyway), but that would surely yield a non-negligible speed penalty, and you'll never know what the implementation of the base ring of the polynomial ring will be as it will depend on the order finite fields and their polynomial rings are created.

Or should we modify the caching system of the polynomial rings not to use hashes but ids?

Best,

JP

I'm currently wrapping FLINT fq modules into new Sage classes for finite fiels and got strange coercion errors when running non trivial code.

Indeed, the polynomial ring constructors use a dictionary and so relies on the hash of finite fields (the parents).

By some chance, the hashes of two previous implementations of finite fields using PARI where different because some variable name was replaced by something else in some object and this prevented this problem to occur.

I read http://www.sagemath.org/doc/developer/coding_in_python.html#the-hash-special-method but would welcome any advice.

Though (non-prime) finite fields with the same order and defining polynomial are considered equal, I'd say the hashes should be different.

So, let's add the string of the implementation of the finite field into it's hash (which is cached by the way).

Does that seem sensible?

Or should I modify the finite field code so that these hashes are equal whatever the implementation is?

Then we'd need coercion between all of the implementations (which is not currently the case but should be done anyway), but that would surely yield a non-negligible speed penalty, and you'll never know what the implementation of the base ring of the polynomial ring will be as it will depend on the order finite fields and their polynomial rings are created.

Or should we modify the caching system of the polynomial rings not to use hashes but ids?

Best,

JP

Aug 19, 2014, 1:39:18 PM8/19/14

to sage-...@googlegroups.com

Hi Jean-Pierre,

I'm not sure I understand correctly; do you mean that the problem is caused by fields comparing equal even if the implementations are different? I think we should in any case make FiniteField inherit from WithEqualityById, so two instances compare equal if and only if they are identical. Would this be enough?

Peter

Aug 19, 2014, 2:08:33 PM8/19/14

to sage-...@googlegroups.com

Not sure.

I seem to be able to produce things with different hashes but equal...

sage: GF(13**5, 'a', impl='pari_ffelt') == GF(13**5, 'a', impl='flint_fq')

True

Though I have:

sage: GF(13**5, 'a', impl='pari_ffelt') == GF(13**5, 'a', impl='pari_mod')

False

Maybe that's because of the __cmp__ functions I copied/pasted/slightly modified.

Or maybe I forgot to implement some methods as the new classes are cdefed.

I'll check that.

Even if it boils down to a different matter, I don't feel that the current other __hash__ method are good enough as well to tell between different finite field.

It mostly work by chance.

And indeed whe copy/paste/modify'ing it, it gave me similar hashes for different implems.

Not sure for what __hash__ is used directly.

In most cases the EqualityById stuff might do the trick.

I seem to be able to produce things with different hashes but equal...

sage: GF(13**5, 'a', impl='pari_ffelt') == GF(13**5, 'a', impl='flint_fq')

True

Though I have:

sage: GF(13**5, 'a', impl='pari_ffelt') == GF(13**5, 'a', impl='pari_mod')

False

Maybe that's because of the __cmp__ functions I copied/pasted/slightly modified.

Or maybe I forgot to implement some methods as the new classes are cdefed.

I'll check that.

Even if it boils down to a different matter, I don't feel that the current other __hash__ method are good enough as well to tell between different finite field.

It mostly work by chance.

And indeed whe copy/paste/modify'ing it, it gave me similar hashes for different implems.

Not sure for what __hash__ is used directly.

In most cases the EqualityById stuff might do the trick.

Aug 19, 2014, 2:26:13 PM8/19/14

to sage-...@googlegroups.com

I also see that some double underscore methods rather than single underscore ones are implemented, that might be one of the problems here.

Aug 19, 2014, 2:28:50 PM8/19/14

to sage-...@googlegroups.com

On Tuesday, August 19, 2014 8:26:13 PM UTC+2, Jean-Pierre Flori wrote:

I also see that some double underscore methods rather than single underscore ones are implemented, that might be one of the problems here.

Or the other way around...

Aug 19, 2014, 2:53:32 PM8/19/14

to sage-...@googlegroups.com

I guess _cmp_ should be used for cython parents and __cmp__ for python ones.

There is in parent.pyx:

# Both are parents -- but need *not* have the same type.

if HAS_DICTIONARY(left):

r = left.__cmp__(right)

else:

r = left._cmp_(right)

There is in parent.pyx:

# Both are parents -- but need *not* have the same type.

if HAS_DICTIONARY(left):

r = left.__cmp__(right)

else:

r = left._cmp_(right)

Aug 19, 2014, 3:05:15 PM8/19/14

to sage-...@googlegroups.com

Hi Jean-Pierre,

On 2014-08-19, Jean-Pierre Flori <jpf...@gmail.com> wrote:

> Though (non-prime) finite fields with the same order and defining

> polynomial are considered equal, I'd say the hashes should be different.

Absolutely no way. If they are considered equal, then the hashes must be

the same.

It may, however, be a possibility to make them compare unequal, but with

coercion maps in both directions.

> So, let's add the string of the implementation of the finite field into

> it's hash (which is cached by the way).

> Does that seem sensible?

No.

> Or should I modify the finite field code so that these hashes are equal

> whatever the implementation is?

> Then we'd need coercion between all of the implementations (which is not

> currently the case but should be done anyway), but that would surely yield

> a non-negligible speed penalty,

Why should that be the case? Once a coercion is established, it will be

looked up from cache, which by the way uses a special kind of dictionary

that compares parents by identity, not equality.

> and you'll never know what the

> implementation of the base ring of the polynomial ring will be as it will

> depend on the order finite fields and their polynomial rings are created.

So what? If possible, avoid making your code depend on types.

> Or should we modify the caching system of the polynomial rings not to use

> hashes but ids?

This might indeed be a possibility. As mentioned above, the coercion

system uses such caches.

Let F1, F2 be two different implementation of "the same" finite field.

Currently, F1[x] and F2[x] are identical objects, which may indeed

constitute a problem. Apart from making the cache use comparison by

identity, there is the possibility to be explicit about the

implementation of the polynomial ring (or do you still want to use the default

implementation for the polynomial rings if there are different implementations

of the base ringst?). Then your cache problem would vanish, since the

implementation is part of the cache key (the PolynomialRing constructor

has an "implementation" optional argument).

Best regards,

Simon

On 2014-08-19, Jean-Pierre Flori <jpf...@gmail.com> wrote:

> Though (non-prime) finite fields with the same order and defining

> polynomial are considered equal, I'd say the hashes should be different.

the same.

It may, however, be a possibility to make them compare unequal, but with

coercion maps in both directions.

> So, let's add the string of the implementation of the finite field into

> it's hash (which is cached by the way).

> Does that seem sensible?

> Or should I modify the finite field code so that these hashes are equal

> whatever the implementation is?

> Then we'd need coercion between all of the implementations (which is not

> currently the case but should be done anyway), but that would surely yield

> a non-negligible speed penalty,

looked up from cache, which by the way uses a special kind of dictionary

that compares parents by identity, not equality.

> and you'll never know what the

> implementation of the base ring of the polynomial ring will be as it will

> depend on the order finite fields and their polynomial rings are created.

> Or should we modify the caching system of the polynomial rings not to use

> hashes but ids?

system uses such caches.

Let F1, F2 be two different implementation of "the same" finite field.

Currently, F1[x] and F2[x] are identical objects, which may indeed

constitute a problem. Apart from making the cache use comparison by

identity, there is the possibility to be explicit about the

implementation of the polynomial ring (or do you still want to use the default

implementation for the polynomial rings if there are different implementations

of the base ringst?). Then your cache problem would vanish, since the

implementation is part of the cache key (the PolynomialRing constructor

has an "implementation" optional argument).

Best regards,

Simon

Aug 19, 2014, 3:10:38 PM8/19/14

to sage-...@googlegroups.com

Hi Jean-Pierre,

> I'm not sure I understand correctly; do you mean that the problem is

> caused by fields comparing equal even if the implementations are

> different? I think we should in any

>

> Not sure.

> I seem to be able to produce things with different hashes but equal...

> sage: GF(13**5, 'a', impl='pari_ffelt') == GF(13**5, 'a', impl='flint_fq')

> True

This should probably be False; even though the fields are canonically

isomorphic, pretending that they are equal sounds like asking for

trouble to me...

> Though I have:

> sage: GF(13**5, 'a', impl='pari_ffelt') == GF(13**5, 'a', impl='pari_mod')

> False

> Maybe that's because of the __cmp__ functions I copied/pasted/slightly

> modified.

> Or maybe I forgot to implement some methods as the new classes are cdefed.

> I'll check that.

I'm testing a patch that makes FiniteField behave like WithEqualityById

(unfortunately, inheriting from it is not possible because Cython does

not support multiple inheritance) and removes all hashing and comparison

methods from other finite field classes.

By the way, there is no point in relying on hashes being different for

"different" objects if they still compare equal; there is always some

small probability of a hash collision. (I just saw that Simon King

wrote something to the same effect.)

Peter

> I'm not sure I understand correctly; do you mean that the problem is

> caused by fields comparing equal even if the implementations are

> different? I think we should in any

>

> Not sure.

> I seem to be able to produce things with different hashes but equal...

> sage: GF(13**5, 'a', impl='pari_ffelt') == GF(13**5, 'a', impl='flint_fq')

> True

isomorphic, pretending that they are equal sounds like asking for

trouble to me...

> Though I have:

> sage: GF(13**5, 'a', impl='pari_ffelt') == GF(13**5, 'a', impl='pari_mod')

> False

> Maybe that's because of the __cmp__ functions I copied/pasted/slightly

> modified.

> Or maybe I forgot to implement some methods as the new classes are cdefed.

> I'll check that.

(unfortunately, inheriting from it is not possible because Cython does

not support multiple inheritance) and removes all hashing and comparison

methods from other finite field classes.

By the way, there is no point in relying on hashes being different for

"different" objects if they still compare equal; there is always some

small probability of a hash collision. (I just saw that Simon King

wrote something to the same effect.)

Peter

Aug 19, 2014, 3:13:07 PM8/19/14

to sage-...@googlegroups.com

Hi Peter,

On 2014-08-19, Peter Bruin <pjb...@gmail.com> wrote:

> I'm not sure I understand correctly; do you mean that the problem is caused

> by fields comparing equal even if the implementations are different? I

> think we should in any case make FiniteField inherit from WithEqualityById,

> so two instances compare equal if and only if they are identical. Would

> this be enough?

I think this would make sense---of course with bidirectional coercions,

as mentioned in my previous mail.

Cheers,

Simon

On 2014-08-19, Peter Bruin <pjb...@gmail.com> wrote:

> I'm not sure I understand correctly; do you mean that the problem is caused

> by fields comparing equal even if the implementations are different? I

> think we should in any case make FiniteField inherit from WithEqualityById,

> so two instances compare equal if and only if they are identical. Would

> this be enough?

as mentioned in my previous mail.

Cheers,

Simon

Aug 19, 2014, 3:13:51 PM8/19/14

to sage-...@googlegroups.com

On Tuesday, August 19, 2014 9:05:15 PM UTC+2, Simon King wrote:

Hi Jean-Pierre,

On 2014-08-19, Jean-Pierre Flori <jpf...@gmail.com> wrote:

> Though (non-prime) finite fields with the same order and defining

> polynomial are considered equal, I'd say the hashes should be different.

Absolutely no way. If they are considered equal, then the hashes must be

the same.

You mean that it should be this way (for parents, or elements, or any class whether from cython or python)?

I assume this is for parents.

I assume this is for parents.

It may, however, be a possibility to make them compare unequal, but with

coercion maps in both directions.

> So, let's add the string of the implementation of the finite field into

> it's hash (which is cached by the way).

> Does that seem sensible?

No.

Ok.

> Or should I modify the finite field code so that these hashes are equal

> whatever the implementation is?

> Then we'd need coercion between all of the implementations (which is not

> currently the case but should be done anyway), but that would surely yield

> a non-negligible speed penalty,

Why should that be the case? Once a coercion is established, it will be

looked up from cache, which by the way uses a special kind of dictionary

that compares parents by identity, not equality.

I mean actually applying the coercion might be expensive.

That's why I'd like that when building a polynomial ring over a finite field using a given implementation it uses the same implementation to "represent" its coefficients.

That's why I'd like that when building a polynomial ring over a finite field using a given implementation it uses the same implementation to "represent" its coefficients.

Aug 19, 2014, 3:26:27 PM8/19/14

to sage-...@googlegroups.com

Thanks for all the answers.

It seems my issues was caused by defining the _cmp_ method with the wrong number of underscores in my cython extension classes.

Do you confirm that for parents defined in a usual Python class __cmp__ with two underscores should be used and for a cython extension class _cmp_ with only one underscore?

I've put back my original __hash__ functions which evaluate equal on the two different classes and have now nice polynomial rings as I wanted.

Do you confirm this "hash collision" is not a problem?

Thanx,

JP

It seems my issues was caused by defining the _cmp_ method with the wrong number of underscores in my cython extension classes.

Do you confirm that for parents defined in a usual Python class __cmp__ with two underscores should be used and for a cython extension class _cmp_ with only one underscore?

I've put back my original __hash__ functions which evaluate equal on the two different classes and have now nice polynomial rings as I wanted.

Do you confirm this "hash collision" is not a problem?

Thanx,

JP

Aug 19, 2014, 4:04:24 PM8/19/14

to sage-...@googlegroups.com

On top of that, Python3 want have cmp anymore :)

Aug 19, 2014, 4:38:45 PM8/19/14

to sage-...@googlegroups.com

Hi Simon,

>> I'm not sure I understand correctly; do you mean that the problem is

>> caused by fields comparing equal even if the implementations are

>> different? I think we should in any case make FiniteField inherit

>> from WithEqualityById, so two instances compare equal if and only if

>> they are identical. Would this be enough?

>

> I think this would make sense---of course with bidirectional

> coercions, as mentioned in my previous mail.

This is now implemented in #16855 (ready for review).

Peter

>> I'm not sure I understand correctly; do you mean that the problem is

>> caused by fields comparing equal even if the implementations are

>> different? I think we should in any case make FiniteField inherit

>> from WithEqualityById, so two instances compare equal if and only if

>> they are identical. Would this be enough?

>

> I think this would make sense---of course with bidirectional

> coercions, as mentioned in my previous mail.

Peter

Aug 20, 2014, 8:46:06 PM8/20/14

to sage-...@googlegroups.com

On Tuesday, August 19, 2014 12:13:07 PM UTC-7, Simon King wrote:

I think this would make sense---of course with bidirectional coercions,

as mentioned in my previous mail.

Bidirectional coercion implies memory leak: The strong references to the codomains on the coercion maps will keep both fields alive. The weakly keyed global coercion dict will therefore keep the coercion maps accessible, which keep the fields alive. We've had these leaks before.

Reply all

Reply to author

Forward

0 new messages

Search

Clear search

Close search

Google apps

Main menu