A modest proposal for typed expressions


Dan Buchoff

Dec 17, 2014, 12:32:49 PM
to sy...@googlegroups.com
This proposal is a little far-reaching, but I think it solves some real issues with the ease of use of refine and provides a reasonable way forward for typed symbols.

The problems with refine are:
(1) It's wordy. Assumptions on symbols are succinct and easy.
(2) It requires explicit invocation, so the developer must refine before simplifying, and simplify can't take advantage of refinements.
(3) You must track your simplifying assumptions, and since these are separate from the expression, it's too easy to refine on assumptions that no longer apply.

Introduce a new type: Guard
Guard(expr, axiom) - an expression which has meaning if the axiom is true
Expr.with(axiom) - turns an expression into a Guard based on the given axiom

This solves the problem by:
(1) still allowing assumptions attached to symbols, handling them in a sane way
(2) tracking your assumptions so they can be used while simplifying
(3) requiring you to explicitly detach an expression from its assumptions

For the simplest case, this provides typed expressions:
(x**2).with(Q.integer(x))
---> Guard(x**2, Q.integer(x))

Axioms distribute over application, so each expression only needs one set of axioms:
a1.with(p1) + a2.with(p2)
---> Guard(a1 + a2, p1 & p2)

x.with(Q.positive(x)) + x.with(Q.real(x))
---> Guard(2*x, Q.positive(x) & Q.real(x))

Axioms are taken to be the current assumption context, so rewrite rules can be applied more immediately.
sqrt(x**2).with(Q.nonnegative(x))
---> Guard(x, Q.nonnegative(x))

This elegantly solves the issue of typed expressions by unifying their assumptions, possibly resulting in a contradiction (which may be implemented as a special value or a ContradictionError):
x.with(Q.positive(x)) + x.with(Q.negative(x))
---> x.with(Q.positive(x) & Q.negative(x))
simplify(_)
---> Contradiction
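To make the proposal concrete, here is a minimal, hypothetical sketch of how a Guard-like type might merge axioms over addition. Plain strings stand in for SymPy expressions and predicates; `Guard`, `with_axioms`, and their behavior are assumptions of this sketch, not existing SymPy API.

```python
from dataclasses import dataclass

# Hypothetical sketch of the proposed Guard type.  Strings stand in for
# expressions and axioms so the axiom-merging behaviour is visible.

@dataclass(frozen=True)
class Guard:
    expr: str            # the guarded expression
    axioms: frozenset    # the axioms under which it has meaning

    def __add__(self, other):
        # Axioms distribute over application: the sum is guarded by the
        # conjunction (here: set union) of both operands' axioms.
        if isinstance(other, Guard):
            return Guard(f"({self.expr}) + ({other.expr})",
                         self.axioms | other.axioms)
        return Guard(f"({self.expr}) + ({other})", self.axioms)

def with_axioms(expr, *axioms):
    """Stand-in for the proposed Expr.with(axiom) method."""
    return Guard(expr, frozenset(axioms))

a = with_axioms("x", "Q.positive(x)")
b = with_axioms("x", "Q.real(x)")
print((a + b).axioms == frozenset({"Q.positive(x)", "Q.real(x)"}))  # True
```

A real implementation would of course also need to simplify the guarded expression and detect contradictory axiom sets, which this sketch does not attempt.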

Thoughts?

Aaron Meurer

Dec 17, 2014, 1:20:12 PM
to sy...@googlegroups.com
I agree that the new assumptions can be cumbersome. We should
definitely keep support for the old-style assumptions syntax
(Symbol('x', positive=True)). The downside is that it doesn't work for
expressions. Your idea of basically tying assumptions to an
expression is interesting. How would Guard (which probably needs a
better name) work in Python? Would it subclass the expression type? If
so, we'd need to fix https://github.com/sympy/sympy/issues/6751.

OTOH, it is also often very useful to keep assumptions completely
separate from expressions, and we should not disallow that use-case.

Also, I should point out that you can't use "with" as a method name as
it's a reserved keyword in Python.

> x.with(Q.positive(x)) + x.with(Q.real(x))
> ---> Guard(2*x,Q.positive(x) & Q.real(x))

This seems problematic to me. Currently, the way the old assumptions
work, which is fairly sane, is that symbols with the same name but
different assumptions compare different. So Symbol('x', positive=True)
+ Symbol('x', real=True) will result in x + x, with the two x's being
distinct symbols.

I think figuring out the correct thing to do in this case is the key
to getting something like this to work.

Aaron Meurer

Richard Fateman

Dec 18, 2014, 6:47:44 PM
to sy...@googlegroups.com
Please forgive me if this is off track -- I've never used assumptions in sympy, and
don't know much about sympy implementation.

But here's a problem that has come up in a number of other computer algebra systems
that are (in some way) linked to a programming language.

If you say something about a symbol s, are you saying something about
the locally lexically bound symbol s, a binding of the symbol s as the result of
(say) pattern matching, a globally bound symbol (in some global symbol table),
or something else?
Can any program variable in Python have assumptions?
Or is it only declared vars that can have assumptions?
Are vars bound in some nice discipline in which they can be shadowed and layered in
some way? Are there trees of contexts in which assumptions live (which is
what Maxima does, but only if you ask)?

There is a whole crowd of people. Well, not such a big crowd really, but a bunch
of people who write about MKM (mathematical knowledge management).

If you are embarking on a path to (in effect) re-invent this stuff, probably it is
a good idea to read about other peoples' (sometimes quite ineffective) approaches.

But I may be all wet on whether this is related to sympy.

RJF

Sergey Kirpichev

Dec 19, 2014, 8:04:50 AM
to sy...@googlegroups.com
On Friday, December 19, 2014 2:47:44 AM UTC+3, Richard Fateman wrote:
> Please forgive me if this is off track -- I've never used assumptions in sympy, and
> don't know much about sympy implementation.

Actually, we have two:
sympy/core/assumptions.py (the old one)
and
sympy/assumptions/ (the new one)

Let's consider the old assumptions first.

> Can any program variable in python have assumptions?
> Or are declared vars what can have assumptions?

Python variables can't have assumptions. Values (objects, in this case Symbols) can:

>>> v = Symbol('r', real=True)
>>> v.is_real
True
>>> v.assumptions0
{'commutative': True, 'complex': True, 'hermitian': True, 'imaginary': False, 'real': True}

Objects are immutable, and every Symbol is created with some
set of assumptions which determine certain properties of symbolic objects
and can have three possible values: True, False, or None.

We can deduce assumptions about expressions, like this:
>>> (v**2).is_nonnegative
True

Aaron Meurer

Dec 19, 2014, 11:02:04 PM
to sy...@googlegroups.com
On Thu, Dec 18, 2014 at 5:47 PM, Richard Fateman <fat...@gmail.com> wrote:
> Please forgive me if this is off track -- I've never used assumptions in
> sympy, and
> don't know much about sympy implementation.
>
> But here's a problem that has come up in a number of other computer algebra
> system
> that are (in some way) linked to a programming language.
>
> If you say something about a symbol s, are you saying something about
> the locally lexically bound symbol s, a binding of the symbol s as the
> result of
> (say) pattern matching, a globally bound symbol (in some global symbol
> table)
> or something else.

Something to note here is that because SymPy is built on top of
Python, which is already a programming language in its own right,
there is a clear distinction between a SymPy Symbol and a Python
variable. A SymPy Symbol is an object with a string attached to it. A
Python variable is the name of an identifier used in code to point to
some Python object. So

a = 1
a = "hello"
a = Symbol('a')
a = Symbol('x')

are all valid ways of assigning different things to the Python
variable a. In the last two cases, a will point to a SymPy Symbol. In
the second to last, the name of the Symbol will be the same as the
Python variable name, but this will be a coincidence. In fact, there's
not even a way for Symbol to know what variable name it is attached
to, and the question doesn't even make sense since any object in
Python can be bound to any number of variable names (including 0).

So there is no binding question. Two Symbol objects are considered the
same if they have the same name and assumptions, regardless of how or
where they were created. And indeed, since they are immutable and
SymPy has caching, it's possible (though not guaranteed) that they may
even end up being the exact same object in memory (this can be tested
with the Python "is" operator).
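The caching behavior described here can be illustrated with a toy interning cache. This is not SymPy's actual implementation, just a sketch of the idea that objects constructed with the same name and assumptions can come back as the very same object:

```python
# Toy sketch (not SymPy's real code): a cache keyed on the constructor
# arguments, so equal (name, assumptions) pairs yield the same object.

_cache = {}

class Sym:
    def __new__(cls, name, **assumptions):
        key = (name, frozenset(assumptions.items()))
        if key not in _cache:
            obj = super().__new__(cls)
            obj.name = name
            obj.assumptions = assumptions
            _cache[key] = obj
        return _cache[key]

a = Sym('x')
b = Sym('x')
c = Sym('x', positive=True)
print(a is b)   # True  -- same name, same assumptions: cached object
print(a is c)   # False -- different assumptions, a distinct symbol
```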

It's an important distinction to make semantically, as it's a very big
point of confusion for new users to SymPy, especially ones who are
also new to Python, or who may be coming from a language where this
distinction is not made (i.e., most other computer algebra systems).
I'm not entirely clear which you meant by "symbols" and "vars" in your
question. I prefer to refer to SymPy Symbol with a capital S, since
that is the name of the object in the code, and "Python variable" to
make it clear that it has nothing to do with SymPy as a Python
library.

I'd say SymPy is better for being able to make this distinction. SymPy
Symbols aren't scoped. They are just objects that float around. They
don't even know where they live. Python variables, which are what
actually comprise the code, are what are scoped.

If in a language like Lisp I were to take any quoted atom and think of
that as a mathematical variable, that would lead to a lot of
confusion. What is 'a? Is it a real number? Is it a complex number? Is
it a string? Is it a list of lists? It can be any of those things,
depending on what scope it is in in the language.

> Can any program variable in python have assumptions?
> Or are declared vars what can have assumptions?
> Are vars bound in some nice discipline in which they can be shadowed and
> layered in
> some way? Are there trees of contexts in which assumptions live (which is
> what Maxima does, but only if you ask.)

The answer is not so easy because as Sergey notes, there are two
independent systems in SymPy, the so-called old assumptions, which are
almost as old as SymPy itself (at least as far as I can tell), and the
new assumptions, which aren't really new anymore, but still lack the
ability to deduce most things, since most development effort has gone
into improving the old system.

In the old system, to *apply* an assumption, you put it on a Symbol,
like Symbol('x', positive=True). You cannot apply an assumption to
anything other than a symbol in the old assumptions. This creates a
distinct symbol from Symbol('x'). To ask about an assumption, you
check an attribute of the object, like x.is_real or
(2*x).is_nonnegative. This gives True if it is known to be true, False
if it is known to be False, or None if it can't be deduced (either
because the algorithms aren't implemented or because it could be
either). The possible assumptions are hard-coded into the system, and
the deduction system is fairly simple (though quite fast). Basically,
deductions are made by computing all possible facts more or less from
a "transitive closure" of the inferences (it's alpha and beta
deduction).
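A rough sketch of that closure-style deduction, using a tiny made-up rule set rather than SymPy's real fact tables:

```python
# Forward-chaining over implication rules until a fixed point: a toy
# version of the "transitive closure" deduction described above.
# The rule fragment below is assumed for illustration only.

RULES = {
    "integer": {"rational"},
    "rational": {"real"},
    "real": {"complex", "hermitian"},
    "positive": {"real", "nonnegative", "nonzero"},
}

def closure(facts):
    """Repeatedly apply implications until no new fact appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for fact in list(known):
            for implied in RULES.get(fact, ()):
                if implied not in known:
                    known.add(implied)
                    changed = True
    return known

print(sorted(closure({"positive"})))
# ['complex', 'hermitian', 'nonnegative', 'nonzero', 'positive', 'real']
```

With a fixed, fairly small rule table, this closure can be computed quickly, which is consistent with the old system being simple but fast.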

In the new assumptions, the assumptions are not really applied. They
are just facts that you can write down about an expression. You can
write down a fact about any kind of expression, not just a symbol.
These facts are like Q.positive(x), Q.negative(x + 1), or
And(Q.positive(x), Q.negative(x)). To ask about an assumption, you use
something like ask(Q.negative(x), Q.positive(-x)). This translates to
"is x negative given that -x is positive?". The output is
True/False/None again. This uses a combination of some hard-coded
lookups and a SAT solver.

Note that writing down Q.positive(x) doesn't assume that x is
positive. It just creates an object that represents that logical
predicate. That object can later be used in the appropriate places to
assume this, or to ask about it, or to create large predicates.

There is also some very simple notion of "global" assumptions, i.e.,
if you use assume(Q.positive(x)), it will enter the fact into some
global assumptions table, which is used automatically by ask(). You
can also nest these contexts with nested with blocks like

with assuming(Q.positive(x)):
    with assuming(Q.whatever(x)):
        ...

This is still proof-of-concept more than anything though. In fact,
anything "new assumptions" is more or less proof of concept, because
the actual code in SymPy that implements facts and the code that uses
the assumptions, e.g., in the solvers, all use the old assumptions
system.

>
> There is a whole crowd of people. Well, not such a big crowd really, but a
> bunch
> of people who write about MKM mathematical knowledge manipulation.
>
> If you are embarking on a path to (in effect) re-invent this stuff, probably
> it is
> a good idea to read about other peoples' (sometimes quite ineffective)
> approaches.

Some of these approaches have already been reinvented by SymPy, I'm
sure. This is part of the reason we have two assumptions systems
(three if you count my approach in
https://github.com/sympy/sympy/pull/2508, which effectively takes the
new assumptions system and uses the SAT solver exclusively, which
gives some very powerful possibilities).

Dan Buchoff

Dec 20, 2014, 12:17:08 AM
to sy...@googlegroups.com
I got around to making a very lightweight proof of concept in https://github.com/sympy/sympy/pull/8664

Richard Fateman

Dec 20, 2014, 11:39:37 PM
to sy...@googlegroups.com
Thanks for the detailed explanation. I hope my further elaboration
will be of some use...

OK, given no previous context, consider solve(y=x^2, x).

Presumably you get 2 answers, x=sqrt(y) and x=-sqrt(y).

x and y are both symbols. (This is true in almost any CAS.)

But if you then say something like plot(sqrt(x), {some range}), then
x is presumably first a symbol, but then something happens to make it
a variable, in the nature of "for x from 0 to 10 by 0.1 do something with sqrt(x)",
and then x is a variable. You can do something like substitution, or something
like evaluation. In some systems there is no distinction.

> I prefer to refer to SymPy Symbol with a capital S, since
> that is the name of the object in the code, and "Python variable" to
> make it clear that it has nothing to do with SymPy as a Python
> library.

Good distinction to make if you can.

> I'd say SymPy is better for being able to make this distinction. SymPy
> Symbols aren't scoped.

OK, this is important.

> They are just objects that float around. They
> don't even know where they live. Python variables, which are what
> actually comprise the code, are what are scoped.

So here's the question.  It is presumably possible to have two python vars
with the same name, x in different scopes.

It is apparently NOT possible to have  two SymPy symbols with the same
name, because they are possibly cached to the same object.

It seems to me this can cause problems if, inside some scope, you need
to create or manipulate SymPy symbols for local
usage. (It can be done by "gensym" in Lisp, so maybe you do this?)
For example, I might wish to use a name, say "lambda", as a symbol inside
a symbolic eigenvalue program. I don't want to overwrite some global
"lambda" symbol which might have different properties, assumptions, type, etc.

> If in a language like Lisp I were to take any quoted atom and think of
> that as a mathematical variable, that would lead to a lot of
> confusion. What is 'a? Is it a real number?

Maybe the confusion is less than you think. If the name A is
in the (global or some package) symbol table, then it has
a value cell, a property-list cell, a function cell, a print-name cell,
and a package cell. I think that's all.
It doesn't have a type, though one could store type info on the
property list, or someplace else.

The value cell can contain a number, or a pointer to a structure which
might be an array, an atom, a list, a file descriptor, etc. During the
lifetime of an object, it could be changed. The type of A is the apparent
type of the item in its value cell. The type of 'A is symbol.


> Is it a complex number? Is
> it a string? Is it a list of lists? It can be any of those things,
> depending on what scope it is in in the language.

Actually, the type of 'A in any context is symbol. The type of
A is the type of the object in its value cell. If A is locally bound
to an object, the type is the type of that object. A locally bound
variable A does not have a property list, etc. In fact the
name may be quite gone (compiled away), and all one has is
a stack slot (or register) for the value.
Putting an assumption on the contents of a register requires some
extra work. Putting an assumption on a global symbol table entry
can be done by putting something on the property list of that atom.


> Can any program variable in python have assumptions?
> Or are declared vars what can have assumptions?
> Are vars bound in some nice discipline in which they can be shadowed and
> layered in
> some way?  Are there trees of contexts in which assumptions live (which is
> what Maxima does, but only if you ask.)

> The answer is not so easy because as Sergey notes, there are two
> independent systems in SymPy, the so-called old assumptions, which are
> almost as old as SymPy itself (at least as far as I can tell), and the
> new assumptions, which aren't really new anymore, but still lack the
> ability to deduce most things, since most development effort has gone
> into improving the old system.

Deduction systems are potential performance black holes. How much
work do you want to do given something like "if (a>b) then ... else ..."?

> In the old system, to *apply* an assumption, you put it on a Symbol,
> like Symbol('x', positive=True). You cannot apply an assumption to
> anything other than a symbol in the old assumptions. This creates a
> distinct symbol from Symbol('x'). To ask about an assumption, you
> check an attribute of the object, like x.is_real or
> (2*x).is_nonnegative. This gives True if it is known to be true, False
> if it is known to be False, or None if it can't be deduced (either
> because the algorithms aren't implemented or because it could be
> either). The possible assumptions are hard-coded into the system, and
> the deduction system is fairly simple (though quite fast). Basically,
> deductions are made by computing all possible facts more or less from
> a "transitive closure" of the inferences (it's alpha and beta
> deduction).

It can be fast if you are doing simple-enough type inference.
Solving (non-linear? non-algebraic?) inequalities, not so fast.
Actually, not even computable in general.

> In the new assumptions, the assumptions are not really applied. They
> are just facts that you can write down about an expression. You can
> write down a fact about any kind of expression, not just a symbol.
> These facts are like Q.positive(x), Q.negative(x + 1), or
> And(Q.positive(x), Q.negative(x)). To ask about an assumption, you use
> something like ask(Q.negative(x), Q.positive(-x)). This translates to
> "is x negative given that -x is positive?". The output is
> True/False/None again. This uses a combination of some hard-coded
> lookups and a SAT solver.

Maybe all your problems will be simple instances.
Let's hope so.

> Note that writing down Q.positive(x) doesn't assume that x is
> positive. It just creates an object that represents that logical
> predicate. That object can later be used in the appropriate places to
> assume this, or to ask about it, or to create large predicates.
>
> There is also some very simple notion of "global" assumptions, i.e.,
> if you use assume(Q.positive(x)), it will enter the fact into some
> global assumptions table, which is used automatically by ask(). You
> can also nest these contexts with with blocks like
>
> with assuming(Q.positive(x)):
>   with assuming(Q.whatever(x)):
>     ...
>
> This is still proof-of-concept more than anything though. In fact,
> anything "new assumptions" is more or less proof of concept, because
> the actual code in SymPy that implements facts and the code that uses
> the assumptions, e.g., in the solvers, all use the old assumptions
> system.

I suspect the concept doesn't need proving, in some sense.

>
> There is a whole crowd of people.  Well, not such a big crowd really, but a
> bunch
> of people who write about MKM  mathematical knowledge manipulation.
>
> If you are embarking on a path to (in effect) re-invent this stuff, probably
> it is
> a good idea to read about other peoples' (sometimes quite ineffective)
> approaches.

> Some of these approaches have already been reinvented by SymPy, I'm
> sure. This is part of the reason we have two assumptions systems
> (three if you count my approach in
> https://github.com/sympy/sympy/pull/2508, which effectively takes the
> new assumptions system and uses the SAT solver exclusively, which
> gives some very powerful possibilities).

I understand that SAT solvers have a new respectability, but I still feel
uncomfortable with such a dependence.

Sergey Kirpichev

Dec 21, 2014, 6:19:28 AM
to sy...@googlegroups.com
On Sunday, December 21, 2014 7:39:37 AM UTC+3, Richard Fateman wrote:
> It seems to me this can cause problems if, inside some scope, you need
> to create or manipulate SymPy symbols for local
> usage. (It can be done by "gensym" in Lisp, so maybe you do this?)

You can use Dummy:
x1 = Dummy('x')
x2 = Dummy('x')

x1 and x2 are different objects.
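A toy version of the Dummy idea, using a class-level counter for identity. This is hypothetical illustration code, not SymPy's exact implementation:

```python
import itertools

# Sketch of a Dummy-style symbol: equal names, distinct identities,
# via a per-class counter that feeds equality and hashing.

class Dummy:
    _count = itertools.count()

    def __init__(self, name):
        self.name = name
        self.index = next(Dummy._count)  # unique per instance

    def __eq__(self, other):
        return isinstance(other, Dummy) and self.index == other.index

    def __hash__(self):
        return hash(self.index)

x1 = Dummy('x')
x2 = Dummy('x')
print(x1 == x2)  # False: same name, but different dummies
```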

Aaron Meurer

Dec 22, 2014, 10:29:35 PM
to sy...@googlegroups.com
I'm not sure what the distinction would be, except maybe that
evaluation always gives you a float. Strictly speaking, in SymPy if you
plot a function, it is first rewritten as a numerical function using a
different library (generally NumPy) and evaluated with that.

>
>
>>
>> . I prefer to refer to SymPy Symbol with a capital S, since
>> that is the name of the object in the code, and "Python variable" to
>> make it clear that it has nothing to do with SymPy as a Python
>> library.
>>
> good distinction to make if you can.
>
>>
>> I'd say SymPy is better for being able to make this distinction. SymPy
>> Symbols aren't scoped.
>
>
> OK, this is important.
>
>>
>> They are just objects that float around. They
>> don't even know where they live. Python variables, which are what
>> actually comprise the code, are what are scoped.
>
>
> So here's the question. It is presumably possible to have two python vars
> with the same name, x in different scopes.
>
> It is apparently NOT possible to have two SymPy symbols with the same
> name, because they are possibly cached to the same object.

Assumptions also play a role (Symbol('x') != Symbol('x', real=True)), but yes.

>
> It seems to me this can cause problems if, inside some scope, you need
> to create or manipulate SymPy symbols for local
> usage. (It can be done by "gensym" in Lisp, so maybe you do this?)
> For example, I might wish to use a name, say "lambda" as a symbol inside
> a symbolic eigenvalue program. I don't want to over-write some global
> "lambda" symbol which might have different properties, assumptions, typ etc.

As Sergey pointed out, we have a special subclass of Symbol called Dummy
which creates new objects each time. I think the way it's implemented
is that the Dummy class has an internal counter that is incremented
each time an instance is created and used for equality comparison.
Although I'm not sure why it doesn't just use the memory address (id()
in Python). Maybe caching screws that up.

>>
>>
>> If in a language like Lisp I were to take any quoted atom and think of
>> that as a mathematical variable, that would lead to a lot of
>> confusion. What is 'a? Is it a real number?
>
>
> Maybe the confusion is less than you think. If the name A is
> in the (global or some package) symbol table, then it has
> a value cell, a property-list cell, a function cell, a print-name cell,
> a package cell. I think that's all.
> It doesn't have a type, though one could store type info on the
> property list, or someplace else.
>
> The value cell can contain a number, or a pointer to a structure which
> might be an array, an atom, a list, a file descriptor, etc. During the
> lifetime of an object, it could be changed. The type of A is the apparent
> type of the item in its value cell. The type of 'A is symbol.

I guess what I was trying to say is you should use distinct types for
the programming language AST and CAS symbols.

For example, Julia allows you to create quoted expressions, but if I
were to write a CAS in Julia, I would create a new CASSymbol type, not
reuse Symbol (which is the type of :x, the quoted symbol x). Of
course, it should be possible to use quoting as syntactic sugar to
create expressions if you have it, but reusing them leads to type
confusion.

I don't know what any of the popular CAS systems written in languages
with quoting do. I'm just making this observation.

Yeah, I'm starting to notice that. So far, the best things I can think
of to avoid it are one, to try faster (but less powerful) deductions
first, and two, avoid as much as possible automatic evaluation
(evaluation that happens when an object is created, as opposed to when
some simplification function is called on it), especially automatic
evaluation that uses the deduction system (e.g., exp(2*pi*I) -> 1 is
probably fine, but exp(2*n*pi*I) -> 1 where n is deduced to be integer
is a bad idea).

>
>>
>>
>> In the old system, to *apply* an assumption, you put it on a Symbol,
>> like Symbol('x', positive=True). You cannot apply an assumption to
>> anything other than a symbol in the old assumptions. This creates a
>> distinct symbol from Symbol('x'). To ask about an assumption, you
>> check an attribute of the object, like x.is_real or
>> (2*x).is_nonnegative. This gives True if it is known to be true, False
>> if it is known to be False, or None if it can't be deduced (either
>> because the algorithms aren't implemented or because it could be
>> either). The possible assumptions are hard-coded into the system, and
>> the deduction system is fairly simple (though quite fast). Basically,
>> deductions are made by computing all possible facts more or less from
>> a "transitive closure" of the inferences (it's alpha and beta
>> deduction).
>
>
> . It can be fast if you are doing simple-enough type inference.
> Solving (non-linear? non-algebraic?) inequalities, not so fast.
> Actually, not even computable in general.

Inequalities are an area where SymPy has very little implemented. It
would be nice to have some powerful algorithms to implement, and maybe
a GSoC student to do it. The main algorithm I know about is
cylindrical algebraic decomposition.

"Proof of concept" meaning it is implemented enough to show that it
works, but not much more than that :)

>>
>>
>> >
>> > There is a whole crowd of people. Well, not such a big crowd really,
>> > but a
>> > bunch
>> > of people who write about MKM mathematical knowledge manipulation.
>> >
>> > If you are embarking on a path to (in effect) re-invent this stuff,
>> > probably
>> > it is
>> > a good idea to read about other peoples' (sometimes quite ineffective)
>> > approaches.
>>
>> Some of these approaches have already been reinvented by SymPy, I'm
>> sure. This is part of the reason we have two assumptions systems
>> (three if you count my approach in
>> https://github.com/sympy/sympy/pull/2508, which effectively takes the
>> new assumptions system and uses the SAT solver exclusively, which
>> gives some very powerful possibilities).
>>
> I understand that SAT solvers have a new respectability, but I still feel
> uncomfortable with such a dependence.

A SAT solver is what we already have. If we had some other kind of
deductive "solver" implemented that was better suited, we could use it
instead.

SAT solvers are very nice, though. They are only slow in worst-case
situations, and most things you encounter in real life are far from
worst case (at least that's true for the use-cases I've seen.
Hopefully it remains true for computer algebra). They usually just
need a few simple deductions. They're also general enough in what they
can solve (basically any kind of logical predicate that you can put in
CNF) that you can think of the solver as a black box.
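For illustration, here is a very small DPLL-style solver showing the "black box" CNF interface described above. The encoding of predicates into integer literals is just an assumed convention of this sketch:

```python
# Compact DPLL sketch: clauses are tuples of integer literals, where
# -n is the negation of variable n.  Returns a satisfying assignment
# (a set of literals) or None if the clauses are unsatisfiable.

def dpll(clauses, assignment=frozenset()):
    # Drop clauses already satisfied by the partial assignment.
    clauses = [c for c in clauses if not any(l in assignment for l in c)]
    # Remove literals whose negation is assigned (they can't help).
    clauses = [tuple(l for l in c if -l not in assignment) for c in clauses]
    if not clauses:
        return assignment              # all clauses satisfied
    if any(not c for c in clauses):
        return None                    # empty clause: conflict
    lit = clauses[0][0]                # branch on an unassigned literal
    return (dpll(clauses, assignment | {lit})
            or dpll(clauses, assignment | {-lit}))

# Encode "x is positive" as 1 and "x is negative" as 2, with the
# exclusion clause (not positive or not negative): asserting both
# at once is unsatisfiable, i.e. a contradiction.
print(dpll([(1,), (2,), (-1, -2)]))       # None
print(dpll([(1,), (-1, 2)]) is not None)  # True
```

Real SAT solvers add unit propagation, clause learning, and smarter branching, but the interface is the same: CNF in, satisfiability out.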

Aaron Meurer

Joachim Durchholz

Dec 23, 2014, 5:06:14 AM
to sy...@googlegroups.com
Am 23.12.2014 um 04:29 schrieb Aaron Meurer:

> [...] we have a special subclass of Symbol called Dummy
> which creates new objects each time. I think the way it's implemented
> is that the Dummy class has an internal counter that is incremented
> each time an instance is created and used for equality comparison.
> Although I'm not sure why it doesn't just use the memory address (id()
> in Python). Maybe caching screws that up.

Using id to identify objects tends to break in unexpected ways, and as
people get bitten by this, the consensus to consider this a bad code
smell grows.

Code typically relies on id being unique and constant. The problem is
that both aspects might not hold anymore in unusual circumstances:
- Serialization and deserialization makes it nonconstant.
- If objects from different processes are compared, ids may be nonunique
(if binary representations are compared) or nonconstant (if serialized
representations are compared).
- VM implementors hate ids because they make it much harder to reap the
benefits of using a moving garbage collector (you can't use the memory
address as an id anymore, so you need to store it explicitly and waste
memory on a feature that is irrelevant for 99% of all data objects).
- id can return "not equal" for objects that we want to consider equal.
This does not throw off caching itself but might affect code that was
written without caching in mind. You have to check all direct and
indirect uses of id to be sure that no bugs hide, and the need to check
indirect uses (through function calls) means that id breaks the
abstraction barriers (the FPL folks say "id is not referentially
transparent").

Using a counter avoids some of these aspects but not all.
For SymPy, I think the last aspect is the most important one, but there,
the counter makes no difference at all.
So... I'm not sure that using a counter is giving us anything. OTOH the
point is precisely that nobody can be sure.