Hi sage-dev,
There is an ongoing discussion at #18529 about defining __eq__ for manifolds and whether one should
use UniqueRepresentation (which implements equality by id). If anybody has any opinion on this,
she/he is welcome to join the discussion.
Hi Eric,
Are you serious? Certainly you are aware that there is no algorithm that
would be able to test whether two arbitrary compact 4-manifolds are
homeomorphic. And if I am not mistaken, the Rubinstein-Thompson
algorithm for the recognition of the 3-sphere is still state of the art,
but it involves solving a double-exponential number (in terms of number of
tetrahedra of a one-vertex triangulation) of NP-complete problems.
In such a situation, I believe a UniqueRepresentation does not make
sense. You may have different input data yielding homeomorphic
manifolds, and it is not feasible to solve a computationally super-hard
problem upon creation of each manifold (basically comparing the new
to-be-created manifold with all previously created manifolds, unless you
come up with a normal form for manifolds).
What you COULD use is CachedRepresentation. Then, identical input data
would result in identical output. But still, non-identical input data
would result in two manifolds that may or may not be homeomorphic.
And using manifolds as dictionary keys seems to me like a very bad idea,
given the above-mentioned problems.
- Or IF you really want to use them as dictionary keys, then just let
manifolds compare by identity (so, you allow that two homeomorphic
manifolds evaluate as non-equal), and have a separate method
M.is_homeomorphic(N)
that does the real thing (but is only invoked if the user asks for
it).
In that way, you would have super-fast equality-by-identity in the
obvious cases, but would still have a framework for the difficult cases,
where you postpone all unsolvable problems until the user wants them
solved.
On Wednesday, November 4, 2015 at 2:10:51 PM UTC-8, Eric Gourgoulhon wrote:Hi sage-dev,
There is an ongoing discussion at #18529 about defining __eq__ for manifolds and whether one should
use UniqueRepresentation (which implements equality by id). If anybody has any opinion on this,
she/he is welcome to join the discussion.
Avoid UniqueRepresentation if you can. It requires expensive processing of construction parameters
and hence introduces bad overhead and it introduces "global variables" in a way that is much worse than global variables.
The main reason why UniqueRepresentation is necessary in some cases is because the coercion framework requires it to figure out certain things efficiently. However, this mainly comes in for coercion discovery. I would not expect manifolds to act in any non-trivial way in coercions, so I'm pretty sure this is not forcing you to have it.
In general, I don't think there is a real problem if
A = manifold(<construction parameter set 1)
B = manifold(<construction parameter set 1)
has (A is B) equal to False. I would think it's not really a problem if it has A == B equal to False.
Indeed, this would mean that manifolds stored in separate pickles would be non-equal. However, pickle is smart enough to ensure that
loads(dumps([A,A]))
would again be a sequence of two identical manifolds (they might just be distinct from the A you had originally).
So I'm somewhat inclined to use EqualityById and let the pickling be broken (and redefining _test_pickling to do nothing (i.e., passes) with a docstring warning that it is intentionally broken). I'm starting to become really convinced that this is the way to go.
So we see that the objects used as keys are charts, vector frames and manifolds.
On Thursday, November 5, 2015 at 7:22:10 AM UTC-8, Travis Scrimshaw wrote:So I'm somewhat inclined to use EqualityById and let the pickling be broken (and redefining _test_pickling to do nothing (i.e., passes) with a docstring warning that it is intentionally broken). I'm starting to become really convinced that this is the way to go.
I completely agree. There's really no problem if calling "TwoSphere()" two times yields non-equal two-spheres: You've just created two manifolds that are homeomorphic, but since they are different manifolds, there is not a canonical homeomorphism between them, so treating them as distinct objects is quite justified. If you want to keep talking about *the same* manifold.
The fact that in sage QQ['x','y'] and QQ['x','y'] gives back equal polynomial rings is because the correspondence in naming of the generators suggests a distinguished isomorphism between the two polynomial rings (well, that's a mathematical justification. It's mainly convenient). Sage goes a step further by ensuring that the rings are *identical*, which has turned out to have pros and cons.
Clearly, for a manifold there is not a small set of parameters given at construction that, when in agreement, suggest a distinguished isomorphism.
So I think the basis for any "CachedRepresentation" is rather weak (and will lead to code the behaviour of which is hard to predict).
Incidentally, I guess your "charts" need to be immutable if you use then as keys,
but do your manifolds have to be?
EqualityById objects have relatively little use for being hashable. Certainly as dictionary keys they are hardly useful:
D[eq_by_id_obj] could always be spelled as eq_by_id_obj.D , since you have to have the exact object in hand to look it up in a dictionary anyway.
If you allow for atlasses to be refined and updates, it would be more natural to leave manifolds mutable and hence unhashable (because clearly you're mutating them).
By making manifolds hashable you're entering into a contract that doesn't sound so natural (although EqualityById guarantees you'll keep your contract).
Yes charts are immutable: a chart is defined by two parameters, which are passed to the constructor: its domain (a manifold) and its coordinates (a list of symbolic variables). These two parameters do not change during the object life. I think that using UniqueRepresentation for charts is then well justified. Same thing for vector frames.
EqualityById objects have relatively little use for being hashable. Certainly as dictionary keys they are hardly useful:
D[eq_by_id_obj] could always be spelled as eq_by_id_obj.D , since you have to have the exact object in hand to look it up in a dictionary anyway.
I am afraid I don't quite understand the above.
Considering your example (taking into account that U is an eq-by-id object), how could we spell a._restrictions[U] as something like U.(a_restrictions) ?
Technically speaking, yes, we are mutating them, by adding charts and vector frames. But if we make them eq-by-id, we guarantee that they don't change: they are recognized only by their id, which is constant during all their lives. So why not using them as dictionary keys as in the example above?By making manifolds hashable you're entering into a contract that doesn't sound so natural (although EqualityById guarantees you'll keep your contract).
Yes we keep the contract. If this does not sound so natural, which alternative would you propose to deal with the restrictions of a vector field to parallelizable open subsets?
Certainly, if "A is B" then "A==B" will be True. Actually I am not sure
if A.__eq__ would be invoked at all if you compare to identical objects
(i.e., I don't know what Python does).
On Thursday, November 5, 2015 at 2:35:57 PM UTC-8, Eric Gourgoulhon wrote:Yes charts are immutable: a chart is defined by two parameters, which are passed to the constructor: its domain (a manifold) and its coordinates (a list of symbolic variables). These two parameters do not change during the object life. I think that using UniqueRepresentation for charts is then well justified. Same thing for vector frames.
It indicates they could be if they needed to be. However, making things UniqueRepresentation or CachedRepresentation has a very real cost and makes the job of the garbage collector *much* harder. You can get very subtle memory leaks via weak references once global caches are involved.
So what specifically do you *gain* from making them UniqueRepresentation?
Is it ever the case that two charts are constructed independently from each other (with the same construction parameters!) and that it is important that the resulting charts are actually identical?
Given that one of the construction parameters is a manifold, which is going to be EqualityById anyway, I expect this will not happen. It'll be much more straightforward to look the chart up on the manifold (where it'll be stored anyway, right?
) if you need to retrieve it.
I won't claim that doing so is a good idea (in fact, it looks like a horrible idea). However, there are other scenarios where this transposition of index and indexee can be fruitful to consider.
On 2015-11-05, Travis Scrimshaw <tsc...@ucdavis.edu> wrote:
> I would then be advocating for using UniqueRepresentation if that was
> the only issue.
It really depends whether in comparing manifolds you would prefer to do *some*
heuristics to detect homeomorphic manifolds, or prefer to consider
manifolds unequal if they were created with different input data.
However, your example with NAN in the reply to Simon shows that dictionary lookup shortcuts equality testing on identical keys. If I understand correctly, this means that even if we had a slow __eq__ for charts, as soon as we write
point._coordinates[X] with the chart X that is the same object as the one used to create the entry in the dictionary point._coordinates, then we would have fast access to the coordinates? Could this depend on the Python version/implementation? (i.e. can we rely on this for the future?). If yes, we may even abandon EqualityById for charts.
Furthermore, if X and Y happen to be non-identical but "equal" charts, would you want points._coordinates[X] be identical to points_coordinates[Y] ? Would that ever be useful?
EqualityById is easy and cheap to have. It basically restores the default equality- and hashing properties of Python's objects. I don't think it's something you should particularly avoid having.
>> > I would then be advocating for using UniqueRepresentation if that was
>> > the only issue.
>>
>> It really depends whether in comparing manifolds you would prefer to do
>> *some*
>> heuristics to detect homeomorphic manifolds, or prefer to consider
>> manifolds unequal if they were created with different input data.
>>
>
> IMO, such checks are too expensive to be done at creation. For example, you
> want do so some tests over all n-spheres.
I did not say anything about whether the tests should be done at
creation time or not. I was talking about "comparing manifolds", which
obviously means "comparing manifolds AFTER creation"-
If you want that two distinct homeomorphic manifolds may evaluate equal then
you must not use UniqueRepresentation. That's all.
> Granted, there are nice and well-defined inputs for 2d and 3d
> manifolds, which would make good sense to do Cached/UniqueRepresentation
> around these structures. For a general (topological) manifold where there
> are not necessarily such nice inputs, what would you give as input data?
Whatever you currently give. An atlas. A simplicial complex. A system of
multivariate polynomial equations.
Here is an example in terms of "polynomial equations" as input data.
The first question is to you whether you want that the same set of equations
yields identical manifolds or distinct manifolds. The second question is whether
you want that the manifolds are identical if and only if the two systems of
equations have the same solutions. It depends on your answers whether to
use CachedRepresentation, UniqueRepresentation or nothing.
Best,
Travis
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+...@googlegroups.com.
To post to this group, send email to sage-...@googlegroups.com.
Visit this group at http://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
I think Nils' point about the downsides of having a cache is one reason not to use UniqueRepresentation, both because it can generate references to objects you might want to forget, and because sometimes you just want to start with two three dimensional manifolds, specifying distinct atlases as you go. If you use unique representation, you might accidentally get the same object.Why not just inherit from :class:`~sage.misc.fast_methods.WithEqualityById` ? That seems to have what you need. Yes, the standard pickling test will be broken, but I think that's fine for your application.David