
rich comparisons, version 0.2


David Ascher

Apr 20, 1998

Since I got a little bit of feedback, I'm putting up a revised version of
the proposal which would allow you to do:

data[data > 255] = 255

instead of

data = where(umath.greater(data, 255), 255, data)

up on my newly redesigned website:

http://starship.skyport.net/~da/proposals/richcmp.html
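The machinery such an assignment relies on can be sketched with a toy class (all names here are hypothetical and not the Numeric API of the time; modern NumPy implements exactly this idiom):

```python
class Arr:
    def __init__(self, values):
        self.values = list(values)

    def __gt__(self, scalar):
        # rich comparison: return an elementwise mask, not one boolean
        return [v > scalar for v in self.values]

    def __setitem__(self, mask, scalar):
        # masked assignment: replace elements where the mask is true
        self.values = [scalar if m else v
                       for v, m in zip(self.values, mask)]

data = Arr([100, 300, 255, 999])
data[data > 255] = 255
print(data.values)  # -> [100, 255, 255, 255]
```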

Changes:

- folks preferred the fortran/perl names (__le__ instead of
__less_equal__) by an overwhelming margin ((guido, tim_one) > da)

- more discussion of how to save chained comparisons.

Followup-To: main list.


Jim Fulton

Apr 22, 1998

David Ascher has made a proposal at:

http://starship.skyport.net/~da/proposals/richcmp.html

for fixing Python's comparison model.

I agree that the current model is broken, but I don't like
David's proposal because I feel strongly that comparison operators
should *always* return values that have meaningful *boolean*
interpretations.

I don't dispute that it is useful to have operations that operate
on arrays and return arrays based on elementwise comparison. I
just don't want to see the comparison operators overloaded for this
purpose.

Here is a (high-level) counterproposal. (I was originally going to
simply state some general views, but realized that what I ended up
with smelled a lot more like a proposal. :)

New comparison model:

1. Comparison operators should return meaningful boolean values.

2. Order relationships on values are not necessary.

3. It should be possible for comparisons to raise exceptions
(i.e., be unsupported).

4. It should be possible that the expression:

a < b or a > b or a==b

returns false.

5.
"a <= b" should be equivalent to "a < b or a == b",
"a >= b" should be equivalent to "a > b or a == b", and
"a != b" should be equivalent to "not (a == b)"

(The expression:

not (a < b or a == b) == not (a <= b),
not (a > b or a == b) == not (a >= b),
not not (a == b) == not (a != b)

should always return true, true, true.)

6.
a. From a purist point of view, there should be three new
slots in *both* types (tp_eq, tp_lt, tp_gt) and classes
(__eq__, __lt__, __gt__).

The interpreter should evaluate "a <= b" by calling
a.__lt__(b), and if necessary a.__eq__(b).

b. However, for efficiency sake, there will probably need to be
five new slots in *both* types (tp_eq, tp_lt, tp_le, tp_gt,
tp_ge) and classes (__eq__, __lt__, __le__, __gt__, __ge__).

I can live with combining the type slots into a single slot
with a flag as suggested by David, however, if this is done, then
I'd prefer to see the same thing done in Python, for consistency.

7. Currently (at least in 1.4) there is a restriction that
tp_cmp for non-numeric types is only called if objects being
compared are of the same type. This restriction should be removed.
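Points 1-6a can be sketched in a few lines (toy class, hypothetical helper names; per point 5, the remaining operators are derived from the three primitives rather than defined):

```python
class Version:
    # only the three "purist" slots are defined (point 6a)
    def __init__(self, n):
        self.n = n
    def __eq__(self, other):
        return self.n == other.n
    def __lt__(self, other):
        return self.n < other.n
    def __gt__(self, other):
        return self.n > other.n

# point 5: <=, >= and != are derived, not defined
def le(a, b):
    return a < b or a == b      # "a <= b"

def ge(a, b):
    return a > b or a == b      # "a >= b"

def ne(a, b):
    return not (a == b)         # "a != b"

print(le(Version(1), Version(2)))  # -> True
print(ne(Version(1), Version(1)))  # -> False
```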

Jim

--
Jim Fulton mailto:j...@digicool.com
Technical Director (540) 371-6909 Python Powered!
Digital Creations http://www.digicool.com http://www.python.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list without my
permission. Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.

Tim Hochberg

Apr 22, 1998

Jim Fulton wrote in message <353DF4...@digicool.com>...


>David Ascher has made a proposal at:
>
> http://starship.skyport.net/~da/proposals/richcmp.html)
>
>for fixing Python's comparison model.
>
>I agree that the current model is broken, but I don't like
>Davids proposal because I feel strongly that comparison operators
>should *always* return values that have meaningful *boolean*
>interpretations.
>
>I don't dispute that it is useful to have operations that operate
>on arrays and return arrays based on elementwise comparison. I
>just don't want to see the comparison operators overloaded for this
>purpose.


I have a feeling that where people come down on this will depend on how much
of their Python use involves the Numeric module. I use it a lot, so take a
guess what I'm going to say...

Maybe the following suggestions will make David Ascher's proposal somewhat
more palatable to people who have concerns similar to Jim Fulton[JF]'s:

[JF doesn't like nonboolean values being returned from the comparison
operators]

If the concern is with constructions such as:

>>> if x < y:  # or while
...     # do something

Then one possibility is to modify PyObject_IsTrue so that objects that are
not meaningful boolean values raise an exception (where meaningful
boolean values would have to include numbers, mappings, and sequences, for
backward compatibility). For Numeric arrays (and other types that implement
the Numeric protocol), it would probably be sufficient to have nb_nonzero
return -1.
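A rough sketch of that behavior in class terms (hypothetical names; modern NumPy arrays raise a similar error when truth-tested):

```python
class Mask:
    """Result of an elementwise comparison (hypothetical type)."""
    def __init__(self, bits):
        self.bits = bits
    def __bool__(self):
        # the analogue of nb_nonzero returning -1: truth-testing an
        # elementwise result is an error, not a guess
        raise TypeError("truth value of an elementwise mask is ambiguous")

m = Mask([True, False, True])
try:
    if m:           # the "if x < y:" construction above
        pass
except TypeError as exc:
    print("raised:", exc)
```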

[Jim Fulton would like to see <= implemented as < or =]

It seems that this functionality could be folded into DA's proposal, for
example if <= is not defined, < would be called and if false, = would be
called. However, if <= was defined it would be called directly.

[JF doesn't like the inconsistency between the C method's rich comparisons
and the python equivalents (in C there is one method with a flag, in python
there are __lt__, __eq__, etc.)]

My gut feeling is that I would rather see this last slot point to another
method table similar to that occupied by the numbers, e.g.,

typedef struct {
    binaryfunc cm_lt;
    binaryfunc cm_le;
    /* ... */
} PyCompareMethods;

This would be consistent with the python side of things, and might be more
efficient for the case where you wanted to omit methods (e.g., <=).

My apologies to DA and JF if I've misrepresented their proposals.

____
/im

David Ascher

Apr 22, 1998

On Wed, 22 Apr 1998, Jim Fulton wrote:

> David Ascher has made a proposal at:
>
> http://starship.skyport.net/~da/proposals/richcmp.html)
>
> for fixing Python's comparison model.

Thanks for reading it.

I wrote a fairly lengthy reply to this, but it self-destructed. Here's
why:

I agree with Jim that 'a < b' should be the same as not 'a >= b' (even for
arrays). Alas, this is impossible for non-{0,1} values given the way 'not'
works. Similar objections exist for meaningful values of:

a < b < c vs. a < b and b < c
a < b or a < c vs. a < max(b,c)
etc.

regardless of how you do chained comparisons. This has to do with the way
'and', 'or' and 'not' work.

The only way around that is to expand the type interface to customize the
behaviors of 'not', 'and' and 'or', which I don't think I have a hope of
getting past you =).

--david

<aside>

> 3. It should be possible for comparisons to raise exceptions
> (ie be unsupported).

FYI, this is currently possible (within the tp_cmp/__cmp__ protocol), and
I've done it in an unreleased NumPy. I didn't specify it in my proposal,
but the patches I have right now are such that if a rich comparison raises
an exception, the old comparison is tried, which can in turn raise an
exception.

</aside>

Jim Fulton

Apr 22, 1998

David Ascher wrote:
>
> On Wed, 22 Apr 1998, Jim Fulton wrote:
>
> > David Ascher has made a proposal at:
> >
> > http://starship.skyport.net/~da/proposals/richcmp.html)
> >
> > for fixing Python's comparison model.
>
> Thanks for reading it.

You're welcome. :-)

> I wrote a fairly lengthy reply to this, but it self-destructed. Here's
> why:

???



> I agree with Jim that 'a < b' should be the same as not 'a >= b' (even for
> arrays). Alas, this is impossible for non-{0,1} values given the way 'not'
> works. Similar objections exist for meaningful values of:
>
> a < b < c vs. a < b and b < c
> a < b or a < c vs. a < max(b,c)
> etc.
>
> regardless of how you do chained comparisons. This has to do with the way
> 'and', 'or' and 'not' work.
>
> The only way around that is to expand the type interface to customize the
> behaviors of 'not', 'and' and 'or', which I don't think I have a hope of
> getting past you =).

We miscommunicated somehow. I did not say that 'a < b' should be the
same as 'not (a >= b)'. In fact, I said what amounts to the opposite,
which is that it should be possible for the expression
'a < b or a > b or a==b' to be false.

I also don't think I said anything about transitivity of < or >.

> <aside>
>
> > 3. It should be possible for comparisons to raise exceptions
> > (ie be unsupported).
>
> FYI, this is currently possible (within the tp_cmp/__cmp__ protocol), and
> I've done it in an unreleased NumPy. I didn't specify it in my proposal,
> but the patches I have right now are such that if a rich comparison raises
> n exception, the old comparison is tried, which can in turn raise an
> exception.
>
> </aside>

I knew that. I just included it for completeness.

Note that my numbered points were not responses to your proposals, but
rather features that I thought a new comparison model should have.

David Ascher

Apr 22, 1998

On Wed, 22 Apr 1998, Jim Fulton wrote:
>
> > I agree with Jim that 'a < b' should be the same as not 'a >= b' (even for
> > arrays). Alas, this is impossible for non-{0,1} values given the way 'not'
> > works. Similar objections exist for meaningful values of:
> >
> > a < b < c vs. a < b and b < c
> > a < b or a < c vs. a < max(b,c)
> > etc.
> >
> > regardless of how you do chained comparisons. This has to do with the way
> > 'and', 'or' and 'not' work.
> >
> > The only way around that is to expand the type interface to customize the
> > behaviors of 'not', 'and' and 'or', which I don't think I have a hope of
> > getting past you =).
>
> We misscommunicated somehow. I did not say that 'a < b' should be the
> same as 'not (a >= b)'. In fact, I said what amounts to the opposite,
> which is that it should be possible for the expression
> 'a < b or a > b or a==b' to be false.
>
> I also don't think I said anything about transitivity of < or >.

What, miscommunication on usenet? Nahhh, never. =)

You're right, you didn't say those things. However, with the proposal as
it is on my web page, if you include the chained comparisons discussion,
it is not possible for "not (a >= b)" to be anything but 0, which is bad.

Let me rephrase: My proposal was motivated by my desire to be able to do:

(0) positivemask = 0 < mydata # comparisons returning things
(1) myData[0<mydata<128] = myData+128 # chained comparisons
(2) myData[not 0<myData<255] = -1 # logical operations on comparisons

---------

(0), I believe, is solvable with the rich comparison I propose and if
arrays raise exceptions on truth-testing.

(1) is possible if you include the rich comparison mod for chained
comparisons, which requires that comparison arrays always test true;
even without that requirement it works if you accept a try/except
in the opcodes for chained comparisons:

'a < b < c' generating code which would correspond to:

try:
    temp1 = a < b
    if temp1:
        return b < c
    else:
        return temp1
except:
    return boolean_and(a<b, b<c)

(2) is impossible with the current 'not' behavior, but even worse, is
*wrong* if comparison arrays always test true... If arrays raised
exceptions, then at least (2) would raise an exception, but that's
a shame.

In summary, (0) is easily possible, but you don't think it's wise. (1) is
possible at the cost of generating extra opcodes for every chained
comparison, which Guido didn't seem too keen on (personal communication, as
we say in science =), and (2) is not at all possible without changes to
the and/or/not mechanism, which I don't see happening. My hope for a
clean syntax for all comparisons on arrays is shot, which makes me less
interested in a partial solution.

Out of curiosity, can you be more explicit about why you feel so strongly
that comparisons should only return 0 or 1?

--david

PS: at least I understand the implementation much better after this
project. =)


Tim Peters

Apr 23, 1998

[Jim Fulton, with an alternative proposal]

Just picking on one point here:

> New comparison model:
> ...


> 5.
> "a <= b" should be equivelent to "a < b or a == b",
> "a >= b" should be equivelent to "a > b or a == b", and

Useful notions of comparison exist where these don't hold. E.g., suppose a
and b each contain fields f and g, of some ordinary numeric type, and I
would like to define:

a == b iff a.f == b.f and a.g == b.g
a < b iff a.f < b.f and a.g < b.g
a > b iff a.f > b.f and a.g > b.g
a <= b iff a.f <= b.f and a.g <= b.g
a >= b iff a.f >= b.f and a.g >= b.g

i.e.

a rop b iff a.f rop b.f and a.g rop b.g

for rop in {==, <, >, <=, >=}. Then a <= b does not imply a < b or a == b
(it merely implies that for each field h, a.h < b.h or a.h == b.h, and which
of those two holds may vary as h varies); similarly for a >= b.

Concretely, suppose a salesman has a monthly quota to meet, and a year's
worth of monthly sales is modeled by S while a year's worth of monthly
quotas is modeled by Q. Then S >= Q *naturally* means the guy (at least)
met his quota each month, but that doesn't imply he either exceeded his
quota each month (if S > Q: guy.raiseQuotas()), or sold exactly his quota
each month (if S == Q: guy.sendToCloserClass()).

You need at least three relational operators to model that kind of stuff
faithfully, but they're {==, <, <=}, not {==, <, >}.
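The product ordering Tim describes, as a runnable sketch (hypothetical Pair class; note that <= holds while neither < nor == does):

```python
class Pair:
    # each operator applies fieldwise, independently of the others
    def __init__(self, f, g):
        self.f, self.g = f, g
    def __eq__(self, o):
        return self.f == o.f and self.g == o.g
    def __lt__(self, o):
        return self.f < o.f and self.g < o.g
    def __le__(self, o):
        return self.f <= o.f and self.g <= o.g

a, b = Pair(1, 2), Pair(1, 3)
print(a <= b)           # -> True: both fields are <=
print(a < b or a == b)  # -> False: neither strictly less nor equal
```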

So I dislike this part of the proposal.

I'm not sure the set of non-controversial boolean-valued-comparison
identities goes beyond these two:

a < b iff b > a
a <= b iff b >= a

Note that even "a == a" is a mistake to insist on (prevents modeling
IEEE-754 arithmetic comparisons). Although last I looked, Python did assume
it (compared object identities first and skipped cmp if the arguments to ==
were the same object).
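The IEEE-754 point is easy to demonstrate in present-day Python, where a float NaN compares unequal to itself:

```python
# IEEE-754 NaN is not equal to itself, so a comparison model
# cannot insist on "a == a" holding for every type.
nan = float("nan")
print(nan == nan)  # -> False
print(nan != nan)  # -> True
```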

> "a != b" should be equivelent to "not (a == b)"

I have a lot more sympathy for that one, but it would preclude defining

a != b iff a.f != b.f and a.g != b.g

in the original example, and that can be natural too.

I'm in favor of letting people define all six separately. Or none <0.503
wink>.

personally-enjoys-overloading-but-objects-when-
other-people-do-it-ly y'rs - tim

Tim Peters

Apr 23, 1998

Random thoughts re:

http://starship.skyport.net/~da/proposals/richcmp.html

Expansion slot:

> The last available slot in the PyTypeObject, reserved
> up to now for future expansion, is used to optionally
> store a pointer to a new comparison function, of type
> richcmpfunc defined by:
>
> typedef PyObject *(*richcmpfunc)
> Py_PROTO((PyObject *, PyObject *, int));

Taking "the last" of anything is a big step. How does Guido feel about
this? The usual escape when it's felt that Yet Another escape may
eventually be needed is to use "the last" slot for a pointer to Yet Another
Structure containing the above + some number of other not-yet-used slots.
Things that never use any of the eventual enhancements can leave the
original "last slot" null.


cmp:

> The builtin cmp() is still used for simple comparisons.
> For rich comparisons, it is called with a third argument,
> one of "<", "<=", ">", ">=", "==", "!=", "<>" (the last
> two have the same meaning). When called with one of these
> strings as the third argument, cmp() can return any Python
> object. Otherwise, it can only return -1, 0 or 1 as before.

Unclear whether "otherwise" means:

a) cmp is called with exactly two arguments.

or

b) cmp is called with exactly two arguments, or with a third argument that
is not one of the seven listed strings.

#a makes more sense to me. E.g., the proposal later mentions a desire to
pass something else as the third cmp argument to get a __boolean_and__ done.
And personally being more interested in abusing all this for partial
orderings <wink>, I'd like to be able to define a class __cmp__ that e.g.
accepts a third argument of "<=>" to mean "comparable" (but Python doesn't
need to know about that! it just needs to pass on whatever cmp's third
argument is). But the intended relationship between cmp() and __cmp__()
isn't addressed in the proposal (& should be).


Other so-far unaddressed issues:

+ What do min and max do in the new world? E.g., if hasattr(x, '__lt__'),
does max(x, y) use x.__lt__ to decide which is greater? Presumably you
*don't* want x.__lt__ invoked for NumPy arrays in the max/min context, but
someone else *will* want x.__lt__ invoked for e.g. their partially-ordered
scalar class. And etc: what if x doesn't define __lt__ but does define
__gt__, or defines neither but does define __le__, or ...? Python will have
to get very explicit about how some of the built-ins are implemented.

+ Ditto list.sort().

+ Mostly ditto for dict[key]: if hasattr(key, '__eq__'), does dict[key] use
key.__eq__ when chasing hash chains? Another where NumPy may want a
different answer than other clients.

+ Ditto list.index(x), list.remove(x), and list.count(x).

+ If a class C defines both (& only) __gt__ and __cmp__, and class D defines
no comparison operators, what does C() < D() do? Invoke C.__cmp__(C(), D(),
"<")? Seems like it should, but the implementation seems messy given the
next one too:

+ As above, but C only defines __cmp__: presumably then C() < D() must
*not* pass a third argument to C.__cmp__ lest current code break.


Patches:

> Patches
> I have patches for the first part, but not for the rich
> comparison mods.
>
> Test for an example:
>
> x = 3
>
> other test:
>
> y = 3
>
> Also:
>
> foo = 3::
>
> x = 3

I found this part of the page difficult -- or perhaps too easy <wink>.


Overall, the proposal looks good! I'm most concerned about the "unaddressed
issues" above.

but-that's-mostly-because-they're-unaddressed<wink>-ly y'rs - tim

John B. Williston

Apr 23, 1998

Jim Fulton wrote:

>I agree that the current model is broken, but I don't like
>Davids proposal because I feel strongly that comparison operators
>should *always* return values that have meaningful *boolean*
>interpretations.
>
>I don't dispute that it is useful to have operations that operate
>on arrays and return arrays based on elementwise comparison. I
>just don't want to see the comparison operators overloaded for this
>purpose.


Unless I misunderstood David's proposal, which I might add seems a terribly
ingenious solution to a somewhat pernicious problem, your concern here seems
nugatory. That is, it seems to me that even the "extended" return values
David suggests do have meaningful boolean interpretations. If this is
correct (David--could you perhaps verify this?), then what does it matter if
"savvy" code "knows" how to use the extended stuff, while "dumb" code
"knows" only about its boolean behavior? Of course, if I am not correct, it
still does not seem a difficult thing to mandate that any extended return
value be required to provide a boolean interpretation of itself, does it?

John

P.S. I am forwarding this message to David Ascher in the hopes that he might
comment upon it.

http://www.netcom.com/~wconsult
___ ___
\ \ __ / / Williston Consulting
\ \/ \/ / __________ makes software worth buying.
\ /\ / / _______/ wcon...@ix.netcom.com
\_/ \_/ / /
/ /_______
/__________/


Jim Fulton

Apr 23, 1998

I agree that you need a special operation to model this. I question
whether it should be >=.

I think this boils down to what the interpretation of >= (<=) is.
I read this as greater(less)-than-or-equal-to, and give it a
corresponding logical interpretation. Perhaps I'm being arbitrary.

> So I dislike this part of the proposal.
>
> I'm not sure the set of non-controversial boolean-valued-comparison
> identities goes beyond these two:
>
> a < b iff b > a
> a <= b iff b >= a
>
> Note that even "a == a" is a mistake to insist on (prevents modeling
> IEEE-754 arithmetic comparisons). Although last I looked, Python did assume
> it (compared object identities first and skipped cmp if the arguments to ==
> were the same object).

Good points.



> > "a != b" should be equivelent to "not (a == b)"
>
> I have a lot more sympathy for that one, but it would preclude defining
>
> a != b iff a.f != a.g and b.f != b.g
>
> in the original example, and that can be natural too.

I guess it depends on what you mean by natural. ;-)

> I'm in favor of letting people define all six separately. Or none <0.503
> wink>.
>
> personally-enjoys-overloading-but-objects-when-
> other-people-do-it-ly y'rs - tim

Hee hee.

Jim Fulton

Apr 23, 1998

John B. Williston wrote:
>
> Jim Fulton wrote:
>
> >I agree that the current model is broken, but I don't like
> >Davids proposal because I feel strongly that comparison operators
> >should *always* return values that have meaningful *boolean*
> >interpretations.
> >
> >I don't dispute that it is useful to have operations that operate
> >on arrays and return arrays based on elementwise comparison. I
> >just don't want to see the comparison operators overloaded for this
> >purpose.
>
> Unless I misunderstood David's proposal, which I might add seems a terribly
> ingenious solution to a somewhat pernicious problem, your concern here seems
> nugatory.

:-[

> That is, it seems to me that even the "extended" return values
> David suggests do have meaningful boolean interpretations.

Meaningful is the operative word here. What is the Boolean meaning
of (0,0,0), (1,1,1), or (1,0,1)? How about ()?

My first assumption would be that empty arrays are false and non-empty
arrays are true.

Note that I don't object to having comparisons return non-integers, and
of course, all Python objects are, in a sense, boolean. Not all
objects have meaningful boolean values.

> If this is correct (David--could you perhaps verify this?),

I think that David's intention was that "array1 comparison array2"
should return an array of boolean values. From reading the proposal,
I didn't see any indication that the result of a comparison should
have a boolean interpretation.

But wait, the proposal says: "Thus, objects returned by rich
comparisons should always test true, ...".
Oops, a boolean value that is always true seems meaningless to
me. Perhaps your notion of "meaning" is different than mine.

> then what does it matter if
> "savvy" code "knows" how to use the extended stuff, while "dumb" code
> "knows" only about its boolean behavior?

If the results of comparisons always returned meaningful boolean values,
then it wouldn't matter at all.

> Of course, if I am not correct, it
> still does not seem a difficult thing to mandate than any extended return
> value be required to provide a boolean interpretation of itself, does it?

Exactly. I don't think that David's proposal is in
line with this.

Jim Fulton

Apr 23, 1998

David Ascher wrote:
>
> On Wed, 22 Apr 1998, Jim Fulton wrote:
> >
> > > I agree with Jim that 'a < b' should be the same as not 'a >= b' (even for
> > > arrays).

(snip)

> > We misscommunicated somehow.

(snip)

> You're right, you didn't say those things. However, with the proposal as
> it is on my web page, if you include the chained comparisons discussion,
> it is not possible for "not (a >= b)" to be anything but 0, which is bad.

Yes.

>
> Let me rephrase: My proposal was motivated by my desire to be able to do:
>
> (0) positivemask = 0 < mydata # comparisons returning things
> (1) myData[0<mydata<128] = myData+128 # chained comparisons
> (2) myData[not 0<myData<255] = -1 # logical operations on comparisons

I don't have any problem with doing something like this. I don't even
mind (that much) if there are operators for it. I just don't
like overriding the comparison operators for this purpose.

Why not invent some new operators for this? I propose we add new
operators, called array smileys: :>, <:, <=:, :=>, :=, and =:.

(0) positivemask = 0 <: mydata          # comparisons returning things
(1) myData[0<:mydata<:128] = myData+128 # chained comparisons
(2) myData[not 0<:myData<:255] = -1     # logical operations on comparisons

<:

(snip)



> Out of curiosity, can you be more explicit about why you feel so strongly
> that comparisons should only return 0 or 1?

I didn't say that either. :>

I said that comparisons should always return *meaningful* *boolean*
values.
I don't give a whit whether the return values are integers.

Jim Fulton

Apr 23, 1998

To some degree, the current discussion of comparison operators
turns on people's attitudes toward operator overloading. Should
operators have a clear meaning to which classes that overload
them should conform? Or are they purely syntactic sugar?

I liked the way Smalltalk let me define a large number of operators
(by simply using one- or two-letter combinations from a set of
non-alphanumeric characters). The language provided almost no
interpretation for operators. For example, the expression:

2+3*4

yields 20 in Smalltalk, not 14.

I have come to prefer Python's more restricted operator overloading
machinery. Certain operators are "numeric" operators and are
treated differently by the language. Others are
"sequence" or "mapping" operators. I think this is a good thing.

One of Python's major strengths is clarity. I also don't think it
is sooo bad to spell out what we mean with method names to make
sure that the reader understands what's being done. I'd rather
not sacrifice clarity for the sake of less typing. (People who
know me can tell you that I hate typing. :)

Guido van Rossum

Apr 23, 1998

> > The last available slot in the PyTypeObject, reserved
> > up to now for future expansion, is used to optionally
> > store a pointer to a new comparison function, of type
> > richcmpfunc defined by:
> >
> > typedef PyObject *(*richcmpfunc)
> > Py_PROTO((PyObject *, PyObject *, int));
>
> Taking "the last" of anything is a big step. How does Guido feel about
> this? The usual escape when it's felt that Yet Another escape may
> eventually be needed is to use "the last" slot for a pointer to Yet Another
> Structure containing the above + some number of other not-yet-used slots.
> Things that never use any of the eventual enhancements can leave the
> original "last slot" null.

Not to worry -- I can add more spares at the end of the structure.
The point of having some spares (always initialized to zero) is binary
compatibility with extensions compiled for older versions, but I only
see this as a temporary measure -- eventually, old binaries will fail
for other reasons.

--Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum

Apr 23, 1998

The most problematic area of Dave's proposal is whether the array
object returned by "array1 cmpop array2" should return true or raise
an exception when asked for its truth value. This is because in
different contexts, you want different things. In this example:

a = <some array>
b = <another array>
if a < b: # <----------------
    a = a+b

clearly you want this to raise an exception at the indicated line to
explain to the programmer that you can't test an array of Booleans for
its truth value.

On the other hand, about the only viable approach to correctly
implementing chained comparisons, e.g. a<b<c, is to generate (pseudo)
code like this:

temp1 = a<b
if temp1:
    temp2 = b<c
    return __combine__(temp1, temp2)
else:
    return false # or perhaps "return temp1"

Here, __combine__() is a postulated new operation used specifically
for this situation. It is defined to return its second argument for
regular objects (by default), but overloaded for array objects to do
elementwise 'and'.
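How such a __combine__ hook might behave can be sketched as a plain function (entirely hypothetical; no such slot exists):

```python
def combine(first, second):
    # default: behave like today's chained comparison and hand back
    # the second result; mask-like objects combine elementwise
    if hasattr(first, "elementwise_and"):
        return first.elementwise_and(second)
    return second

class Mask:
    def __init__(self, bits):
        self.bits = bits
    def elementwise_and(self, other):
        return Mask([x and y for x, y in zip(self.bits, other.bits)])

print(combine(True, 42))                               # -> 42
print(combine(Mask([1, 0, 1]), Mask([1, 1, 0])).bits)  # -> [1, 0, 0]
```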

The problem is that on the one hand, we want the array of booleans
(returned by an array comparison) to always return true, so that
chained comparisons can work right; but on the other hand, we want
them to always raise an exception when used in other boolean
contexts. :-(

There are some ways out, but I think of them as hacks. For example,
there could be two different truth testing operations, one raising an
exception, one returning true. Or, an array of all zeros could return
false, and other arrays could be true; this would work with the
chained comparisons as long as the else branch is modified to read
"return temp1". I think Dave had a different idea in mind but I can't
recall it.

Paul F Dubois

Apr 23, 1998, to Jim Fulton

Maybe I'm dense but I don't see the problem.

If you let a<b return anything you like, including a bit vector, then it
is useful to people in various contexts. David's examples of how much
simpler NumPy gets are one instance.

If the problem then is, well, what happens when I say:

if a < b:

then the answer seems clear to me. You do the comparison, and you use
upon the result the function slot that defines whether or not an object
is "true". If that slot is not defined, exception.

Why is this hard?

Guido van Rossum

Apr 23, 1998

Perhaps because in older versions of Python, *every* object had to
have a truth value, and there was no way for the truth test to raise
an exception. This has been fixed in 1.5, which is why Dave Ascher
started working on this again.

I think I agree with Paul and Dave, and disagree with Jim Fulton (who
wants "pure" comparison operators that can only return booleans) --
this is no reason to attack the proposal.

David Ascher

Apr 23, 1998

On Thu, 23 Apr 1998, Jim Fulton wrote:

> I have come to prefer Python's more restricted operator overloading
> machinery. Certain operators are "numberic" opertors and are
> treated differently by the language. Others are
> "sequence" or "mapping" operators. I think this is a good thing.

Mostly, I agree -- I do think it breaks down at the edges. For example,
arrays are numeric sometimes, sequences other times. I have other objects
which are somewhat sequences and somewhat mappings. In fact, even Guido
seems to change his definition of what a sequence is =).

> One of Python's major strengths is clarity. I also don't think it
> is sooo bad to spell out what we mean with method names to make
> sure that the reader understands what's being done. I'd rather
> not sacrifice clarity for the sake of less typing. (People who
> know me can tell you that I hate typing. :)

My feeling is that people who are used to array languages will in fact
find
s[not data > 100] = 100

much clearer than
s[logical_not(greater(data, 100))] = 100

----

You seem to argue (I won't claim to know what you think today!) that a<b
means "the logical truth of whether a less than b" whereas I'd like to see
it as meaning "the value of a<b", just like in Python right now, 3*a means
"the value of 3*a", which may or may not be the same as "the number three
multiplied by a". In fact, for sequences it means "three reduplications
of a".

Thus * has a meaning which is conditioned on the type of its arguments.
I think that's all I want for comparisons. *Of course*, it can lead to
bad usage. But that's just as true for all the other operators:

def __add__(self, other):
    return self + random.randint(int(time.time()))

=)

--david

David Ascher

Apr 23, 1998

On Thu, 23 Apr 1998, Guido van Rossum wrote:
> On the other hand, about the only viable approach to correctly
> implementing chained comparisons, e.g. a<b<c, is to generate (pseudo)
> code like this:
>
> temp1 = a<b
> if temp1:
> temp2 = b<c
> return __combine__(temp1, temp2)
> else:
> return false # or perhaps "return temp1"

I know that's what we said in private email, but I think that it's wrong
now, because it still requires that arrays are truth-testable. I do think
that with the exception-raising variety, if we use a try/except rule, we
can get it to work.

We can also have 'not' changed to call a __boolean_not__ method if the
truth testing raises an exception. Similarly for 'or' and 'and'.

    def not(f):
        try:
            if istrue(f):
                return 0
            else:
                return 1
        except TruthException:
            return f.__boolean_not__()

    def or(f, g):
        try:
            if istrue(f):
                return f
            else:
                return g
        except TruthException:
            return f.__boolean_or__(g)

etc.

Yes, it does require more opcodes, and might slow things down a bit
(although there might be ways around those problems).
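David's `TruthException` fallback can be tried out as ordinary functions in today's Python (where `__bool__` plays the role of the truth test). `Mask`, `__boolean_not__`, and `__boolean_or__` are the hypothetical names from his post, sketched here for illustration only -- real Python never adopted these hooks:

```python
class TruthException(Exception):
    """Raised when an object has no meaningful boolean interpretation."""

class Mask:
    """Toy elementwise-boolean object (stand-in for an array comparison result)."""
    def __init__(self, bits):
        self.bits = bits
    def __bool__(self):            # modern spelling of the truth test
        raise TruthException("an elementwise mask has no single truth value")
    def __boolean_not__(self):
        return Mask([not b for b in self.bits])
    def __boolean_or__(self, other):
        return Mask([a or b for a, b in zip(self.bits, other.bits)])

def logical_not(f):
    # David's 'not': the normal truth test, with a fallback hook
    try:
        return 0 if f else 1
    except TruthException:
        return f.__boolean_not__()

def logical_or(f, g):
    # David's 'or' (g is evaluated eagerly here, unlike real 'or')
    try:
        return f if f else g
    except TruthException:
        return f.__boolean_or__(g)

print(logical_not(0))                                   # -> 1
print(logical_not(Mask([True, False])).bits)            # -> [False, True]
print(logical_or(Mask([False, True]), Mask([True, True])).bits)  # -> [True, True]
```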

> There are some ways out, but I think of them as hacks.

What about what I just proposed above?

> Or, an array of all zeros could return false, and other arrays could be
> true; this would work with the chained comparisons as long as the else
> branch is modified to read "return temp1".

I don't think that arrays should have truth values, since they don't have
"meaningful boolean interpretations", which is the only kind of thing 'if'
should be applied to.

--david


David Ascher

Apr 23, 1998

On Thu, 23 Apr 1998, Tim Peters wrote:

> Taking "the last" of anything is a big step. How does Guido feel about
> this?

I actually got conditional approval before doing this. =)

> > The builtin cmp() is still used for simple comparisons.
> > For rich comparisons, it is called with a third argument,
> > one of "<", "<=", ">", ">=", "==", "!=", "<>" (the last
> > two have the same meaning). When called with one of these
> > strings as the third argument, cmp() can return any Python
> > object. Otherwise, it can only return -1, 0 or 1 as before.
>
> Unclear whether "otherwise" means:
>
> a) cmp is called with exactly two arguments.
>
> or
>
> b) cmp is called with exactly two arguments, or with a third argument that
> is not one of the seven listed strings.

#a is what my current implementation does.

> #a makes more sense to me. E.g., the proposal later mentions a desire to
> pass something else as the third cmp argument to get a __boolean_and__ done.
> And personally being more interested in abusing all this for partial
> orderings <wink>, I'd like to be able to define a class __cmp__ that e.g.
> accepts a third argument of "<=>" to mean "comparable" (but Python doesn't
> need to know about that! it just needs to pass on whatever cmp's third
> argument is). But the intended relationship between cmp() and __cmp__()
> isn't addressed in the proposal (& should be).

Good point. In my current implementation:

    cmp(a, b)      --> a.__cmp__(b)  # with the usual occasional __rcmp__
    cmp(a, b, '>') -->
        if hasattr(a, '__gt__'): return a.__gt__(b)
        elif hasattr(b, '__lt__'): return b.__lt__(a)
        else: usual magic with a.__cmp__(b), b.__rcmp__(a) and smarts.

In other words, __cmp__ is never called with three arguments, since that
would break existing code.
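The dispatch above can be sketched as a runnable function in today's Python. Everything here (`rich_cmp`, `OPS`, `REFLECT`, `Interval`) is a hypothetical name invented for this sketch, not David's actual implementation:

```python
OPS = {'<': '__lt__', '<=': '__le__', '>': '__gt__',
       '>=': '__ge__', '==': '__eq__', '!=': '__ne__'}
REFLECT = {'<': '>', '<=': '>=', '>': '<', '>=': '<=',
           '==': '==', '!=': '!='}

def rich_cmp(a, b, op=None):
    if op is None:
        # classic two-argument cmp(): -1, 0 or 1
        return (a > b) - (a < b)
    # try the operator method on the left operand
    meth = type(a).__dict__.get(OPS[op])
    if meth is not None:
        return meth(a, b)                      # a.__gt__(b), etc.
    # then the reflected method on the right operand
    rmeth = type(b).__dict__.get(OPS[REFLECT[op]])
    if rmeth is not None:
        return rmeth(b, a)                     # b.__lt__(a), etc.
    # "the usual magic": derive the answer from the three-way result
    c = rich_cmp(a, b)
    return {'<': c < 0, '<=': c <= 0, '>': c > 0,
            '>=': c >= 0, '==': c == 0, '!=': c != 0}[op]

class Interval:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def __lt__(self, other):                   # only < is defined
        return self.hi < other.lo

a, b = Interval(0, 1), Interval(2, 3)
print(rich_cmp(a, b, '<'))    # a.__lt__(b) -> True
print(rich_cmp(b, a, '>'))    # no __gt__; falls back to a.__lt__ -> True
```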

> Other so-far unaddressed issues:
>
> + What do min and max do in the new world? E.g., if hasattr(x, '__lt__'),
> does max(x, y) use x.__lt__ to decide which is greater? Presumably you
> *don't* want x.__lt__ invoked for NumPy arrays in the max/min context, but
> someone else *will* want x.__lt__ invoked for e.g. their partially-ordered
> scalar class. And etc: what if x doesn't define __lt__ but does define
> __gt__, or defines neither but does define __le__, or ...? Python will have
> to get very explicit about how some of the built-ins are implemented.

Good point. Right now all of these use (I assume) PyObject_Compare,
which I can't modify, since I'd have to change its signature. Instead,
I provide a new C API function PyObject_RichCompare which takes a third
argument.

I've been assuming that for the sake of ordering relations, a class
would define __cmp__ and __rcmp__ as before, and that for actually using
comparisons __lt__ and __gt__ etc. would be defined. This is probably
too naive.

> [ list.sort, dict[key], list.index, remove, count, ]


> + If a class C defines both (& only) __gt__ and __cmp__, and class D defines
> no comparison operators, what does C() < D() do? Invoke C.__cmp__(C(), D(),
> "<")? Seems like it should, but the implementation seems messy given the
> next one too:

> + As above, but C only defines __cmp__: presumably then C() < D() must
> *not* pass a third argument to C.__cmp__ lest current code break.

All good points.



> I found this part of the page difficult -- or perhaps too easy <wink>.

The problem of automatic website mirroring at 4am and playing with mods to
StructuredText before leaving work...



> Overall, the proposal looks good! I'm most concerned about the "unaddressed
> issues" above.

Right, I think they can all be addressed. I think it makes sense to
first settle whether comparisons should return 0/1 or PyObject *.

--david


Jim Fulton

Apr 23, 1998

Guido van Rossum wrote:
>
> The most problematic area of Dave's proposal is whether the array
> object returned by "array1 cmpop array2" should return true or raise
> an exception when asked for its truth value. This is because in
> different contexts, you want different things. In this example:
>
> a = <some array>
> b = <another array>
> if a < b: # <----------------
> a = a+b
>
> clearly you want this to raise an exception at the indicated line to
> explain to the programmer that you can't test an array of Booleans for
> its truth value.
>
> On the other hand, about the only viable approach to correctly
> implementing chained comparisons, e.g. a<b<c, is to generate (pseudo)
> code like this:
>
> temp1 = a<b
> if temp1:
>     temp2 = b<c
>     return __combine__(temp1, temp2)
> else:
>     return false # or perhaps "return temp1"
>
> Here, __combine__() is a postulated new operation used specifically
> for this situation. It is defined to return its second argument for
> regular objects (by default), but overloaded for array objects to do
> elementwise 'and'.
>
> The problem is that on the one hand, we want the array of booleans
> (returned by an array comparison) to always return true, so that
> chained comparisons can work right; but on the other hand, we want
> them to always raise an exception when used in other boolean
> contexts. :-(
>
> There are some ways out, but I think of them as hacks. For example,
> there could be two different truth testing operations, one raising an
> exception, one returning true. Or, an array of all zeros could return

> false, and other arrays could be true; this would work with the
> chained comparisons as long as the else branch is modified to read
> "return temp1". I think Dave had a different idea in mind but I can't
> recall it.

I interpret this as evidence for my position that comparisons should
return boolean values. This looks to me like one hack piled
on another for the syntactic sugar of being able to override
'<' and its ilk with operations that do similar but different things.

Put another way, giving the proposed semantics to comparisons is
a bit like putting a square peg in a round hole.

Jim Fulton

Apr 23, 1998

David Ascher wrote:
>
> On Thu, 23 Apr 1998, Jim Fulton wrote:
>
> > I have come to prefer Python's more restricted operator overloading
> > machinery. Certain operators are "numeric" operators and are
> > treated differently by the language. Others are
> > "sequence" or "mapping" operators. I think this is a good thing.
>
> Mostly, I agree -- I do think it breaks down at the edges. For example,
> arrays are numeric sometimes, sequences other times. I have other objects
> which are somewhat sequences and somewhat mappings. In fact, even Guido
> seems to change his definition of what a sequence is =).
>
> > One of Python's major strengths is clarity. I also don't think it
> > is sooo bad to spell out what we mean with method names to make
> > sure that the reader understands what's being done. I'd rather
> > not sacrifice clarity for the sake of less typing. (People who
> > know me can tell you that I hate typing. :)
>
> My feeling is that people who are used to array languages will in fact
> find
>
>     s[not data > 100] = 100
>
> much clearer than
>
>     s[logical_not(greater(data, 100))] = 100

This is obviously a place where the language designer gets to pick
the "right" interpretation for comparison operators, which could
include no interpretation at all (a la Smalltalk). I've stated my
opinion.

> ----
>
> You seem to argue (I won't claim to know what you think today!) that a<b
> means "the logical truth of whether a is less than b" whereas I'd like to see
> it as meaning "the value of a<b",

Exactly. ;-)

> just like in Python right now, 3*a means
> "the value of 3*a", which may or may not be the same as "the number three
> multiplied by a". In fact, for sequences it means "three reduplications
> of a".
>
> Thus * has a meaning which is conditioned on the type of its arguments.
> I think that's all I want for comparisons. *Of course*, it can lead to
> bad usage. But that's just as true for all the other operators:
>
> def __add__(self, other):
> return self + random.randint(int(time.time()))

It's more true for some operators than others. IMO, '+' and '*' are
flawed because of their dual meaning.

Jim Kraai

Apr 23, 1998

Greetings,

(Apologies for length of post)

Making Python w/o Tkinter works just fine, but enabling Tkinter doesn't.

Here's a look at Modules/Setup:

# The _tkinter module.
#
# The TKPATH variable is always enabled, to save you the effort.
TKPATH=:lib-tk

# The command for _tkinter is long and site specific. Please
# uncomment and/or edit those parts as indicated. If you don't have a
# specific extension (e.g. Tix or BLT), leave the corresponding line
# commented out. (Leave the trailing backslashes in! If you
# experience strange errors, you may want to join all uncommented
# lines and remove the backslashes -- the backslash interpretation is
# done by the shell's "read" command and it may not be implemented on
# every system.

# *** Always uncomment this (leave the leading underscore in!):
_tkinter _tkinter.c tkappinit.c -DWITH_APPINIT \
# *** Uncomment and edit to reflect where your X11 header files are:
-I/usr/lpp/X11/include \
# *** Or uncomment this for Solaris:
# -I/usr/openwin/include \
# *** Uncomment and edit to reflect where your Tcl/Tk headers are:
-I/usr/local/include \
# *** Uncomment and edit for Tix extension only:
# -DWITH_TIX -ltix4.1.8.0 \
# *** Uncomment and edit for BLT extension only:
# -DWITH_BLT -I/usr/local/blt/blt8.0-unoff/include -lBLT8.0 \
# *** Uncomment and edit for PIL (TkImaging) extension only:
# -DWITH_PIL -I../Extensions/Imaging/libImaging tkImaging.c \
# *** Uncomment and edit for TOGL extension only:
# -DWITH_TOGL togl.c \
# *** Uncomment and edit to reflect where your Tcl/Tk libraries are:
-L/usr/local/lib \
# *** Uncomment and edit to reflect your Tcl/Tk versions:
-ltk8.0 -ltcl8.0 \
# *** Uncomment and edit to reflect where your X11 libraries are:
-L/usr/lpp/X11/lib \
# *** Or uncomment this for Solaris:
# -L/usr/openwin/lib \
# *** Uncomment these for TOGL extension only:
# -lGL -lGLU -lXext -lXmu \
# *** Always uncomment this; X11 libraries to link with:
-lX11

End of Modules/Setup


Here's a transcript of the death of make:

expr `cat buildno` + 1 >buildno1
mv -f buildno1 buildno
cc -c -O -I. -DHAVE_CONFIG_H -DBUILD=`cat buildno` ./Modules/getbuildinfo.c
ar cr libpython1.5.a getbuildinfo.o
ranlib libpython1.5.a
cd Modules; make OPT="-O" VERSION="1.5" prefix="/" exec_prefix="/" LIBRARY=../libpython1.5.a link
./makexp_aix python.exp "" ../libpython1.5.a; cc -O -Wl,-bE:./python.exp -lld python.o ../libpython1.5.a -lm -lm -lm -lBLT8.0 -L/usr/local/lib -ltk8.0 -ltcl8.0 -L/usr/lpp/X11/lib -lX11 -ldl -lm -o python
ld: 0711-317 ERROR: Undefined symbol: .XCreateIC
ld: 0711-317 ERROR: Undefined symbol: .XFilterEvent
ld: 0711-317 ERROR: Undefined symbol: .XmbLookupString
ld: 0711-317 ERROR: Undefined symbol: .XOpenIM
ld: 0711-317 ERROR: Undefined symbol: .XGetIMValues
ld: 0711-317 ERROR: Undefined symbol: .XDestroyIC
ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more information.
make: 1254-004 The error code from the last command is 8.

Stop.
make: 1254-004 The error code from the last command is 2.

Stop.

End of End of Make


When I go poking around in the X11 stuff, I find that XCreateIC is
defined in Xlib.h, but am not talented enough to find a direct
reference to Xlib.h in any of the Python source. So that got me
nowhere, but it might be a hint to the smart people here. (Is it
part of the tcl/tk stuff? If so, why do all of my tcl/tk scripts/
tools work? <Sigh>)

What have I done wrong here?

Any help would be much appreciated, I'm on a hot project & flying
solo.

Thank you,

----------------jim kraai--j...@mci.com----------------
Believe -> Perceive -> Perform -> Inform

Konrad Hinsen

Apr 23, 1998

David Ascher <d...@skivs.ski.org> writes:

> I don't think that arrays should have truth values, since they don't have
> "meaningful boolean interpretations", which is the only kind of thing 'if'
> should be applied to.

For consistency with other sequence objects, I'd expect the truth
value of an array to be defined by the truth value of len(array).
(Note that this is not a test for emptiness; an array might have
shape (5, 0), being empty but having a non-zero length).

That rule gives a clear interpretation to code like
if a < b: ...
This interpretation might not be intuitively clear, but then many
non-number comparisons in Python share that problem.
--
-------------------------------------------------------------------------------
Konrad Hinsen | E-Mail: hin...@ibs.ibs.fr
Laboratoire de Dynamique Moleculaire | Tel.: +33-4.76.88.99.28
Institut de Biologie Structurale | Fax: +33-4.76.88.54.94
41, av. des Martyrs | Deutsch/Esperanto/English/
38027 Grenoble Cedex 1, France | Nederlands/Francais
-------------------------------------------------------------------------------

hug...@cnri.reston.va.us

Apr 23, 1998
to d...@skivs.ski.org


First off, I want to throw in my vote with David, Paul, et al. for
implementing this change in Python. As soon as Guido agrees it's a good idea
I'd love to build it into JPython! This feature would be tremendously useful
for NumPy, Fredrick Lund has mentioned that he'd find it valuable for PIL, and
I know it would make building Mathematica-like tools for Python far easier.

I've been complaining about this particular issue since the first Python
conference I attended about 2.5 years ago. The "special" treatment of
comparison operators in Python has always seemed strange to me. I never
grasped a good reason why I could override __add__ and yet I couldn't do the
same thing with __lt__. Anyway, on to the issue at hand:

> On Thu, 23 Apr 1998, Guido van Rossum wrote:
> > On the other hand, about the only viable approach to correctly
> > implementing chained comparisons, e.g. a<b<c, is to generate (pseudo)

a<b<c -> a<b and b<c (this is exactly how things currently work I believe)

The change is in how to implement "a and b". I would do this as follows:

    if hasattr(a, '__booland__'):
        return a.__booland__(b)
    elif a:
        return b
    else:
        return a

The only difference between this and the current implementation is the first
hasattr clause. This lookup would clearly have to be implemented using the
tricks for __getattr__ in order to ensure that performance was not a
significant issue, but with that in mind I think this would work well.

This proposal allows a.__nonzero__() to raise an exception for objects where
truth testing doesn't make a whole lot of sense (like arrays).

It would add two more methods to the new method set:

__booland__ and __boolor__.

This set of operators would complete the goal of David's proposal (at least
what I think the goal was ;-) and make all of Python's boolean operations
overridable for those classes where it is appropriate.
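Jim's proposed `a and b` can be sketched in today's Python. `__booland__` is the hypothetical hook from his post (not a real Python method), and `b` is passed as a thunk here so that short-circuiting still works for ordinary objects:

```python
def bool_and(a, b_thunk):
    # Jim's proposed implementation of "a and b"; b_thunk delays
    # evaluation of b so that short-circuiting is preserved.
    if hasattr(type(a), '__booland__'):
        return a.__booland__(b_thunk())
    elif a:
        return b_thunk()
    else:
        return a

class Mask:
    """Toy elementwise-boolean object, standing in for a NumPy array."""
    def __init__(self, bits):
        self.bits = bits
    def __booland__(self, other):
        return Mask([x and y for x, y in zip(self.bits, other.bits)])

m = bool_and(Mask([1, 0, 1]), lambda: Mask([1, 1, 0]))
print(m.bits)                      # -> [1, 0, 0]
print(bool_and(0, lambda: 5))      # ordinary objects short-circuit -> 0
```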

Another thing that I like about this proposal (and I think that this really
addresses Jim Fulton's complaints, also in my own opinion ;-) is that:

if a < b:

will throw a big ugly exception if the result of a < b is not something that
is expected to be truth tested, so naive users will get a fairly clear
response that they're doing something wrong. I've been out of touch with
NumPy lately, but in my last release this sort of thing would behave in
fairly meaningless ways, which can only be considered a very BAD thing.

Jim Hugunin - hug...@python.org


Guido van Rossum

Apr 23, 1998

> On Thu, 23 Apr 1998, Guido van Rossum wrote:
> > On the other hand, about the only viable approach to correctly
> > implementing chained comparisons, e.g. a<b<c, is to generate (pseudo)
> > code like this:
> >
> > temp1 = a<b
> > if temp1:
> >     temp2 = b<c
> >     return __combine__(temp1, temp2)
> > else:
> >     return false # or perhaps "return temp1"
>
> I know that's what we said in private email, but I think that it's wrong
> now, because it still requires that arrays are truth-testable.

Yes, that's what I was pointing out.

> I do think
> that with the exception-raising variety, if we use a try/except rule, we
> can get it to work.

I looked at this a little bit, but I think the try-except will
generate way too much code (the best variant I could try generated 24
instructions for "a or b" instead of 4 now!). The problem is that the
Python compiler doesn't know whether a and b are arrays or not, so it
has to generate code that works regardless of array-ness for all
occurrences of "a or b" (etc.).

> We can also have 'not' changed to call a __boolean_not__ method if the
> truth testing raises an exception. Similarly for 'or' and 'and'.
>
> def not(f):
>     try:
>         if istrue(f):
>             return 0
>         else:
>             return 1
>     except TruthException:
>         return f.__boolean_not__()

OK.

> def or(f, g):
>     try:
>         if istrue(f):
>             return f
>         else:
>             return g
>     except TruthException:
>         return f.__boolean_or__(g)
>
> etc.
>
> Yes, it does require more opcodes, and might slow things down a bit
> (although there might be ways around those problems).

While some of this could be alleviated with some specialized opcodes,
I think the code bloat is real, and too much.

> > There are some ways out, but I think of them as hacks.
>

> What about what I just proposed above?

Apart from the code bloat, I think that catching an exception is a
hack (but then again, a for statement catches IndexError -- it
just doesn't use a try-except statement, but here I think you can't
avoid that).

> > Or, an array of all zeros could return false, and other arrays could be
> > true; this would work with the chained comparisons as long as the else
> > branch is modified to read "return temp1".
>

> I don't think that arrays should have truth values, since they don't have
> "meaningful boolean interpretations", which is the only kind of thing 'if'
> should be applied to.

It's not that simple. Traditionally, Python has given every object
type a truth value, and for sequences and dictionaries, it has the
general rule that empty==false and all others are true. There's a
fair expectation among Python users that this is the case, and we'll
have to give a better argument than "I don't think so" to break their
fair expectation. (Not that I think that "fair expectations" should
never be broken -- just not without a very good reason.)

I'm tempted to back down a bit and forget about fancy implementation
of chained comparisons or boolean operators. I would support
overloading individual comparison operators (don't forget "in" and
"not in" by the way!), and asking the truth value of arrays would
raise an exception. This should make Jim happy too -- "a < b" needn't
return a truth value, but if it doesn't, you can't use it in a context
that needs one.

David Ascher

Apr 23, 1998

On 23 Apr 1998, Konrad Hinsen wrote:

> For consistency with other sequence objects, I'd expect the truth
> value of an array to be defined by the truth value of len(array).
> (Note that this is not a test for emptiness; an array might have
> shape (5, 0), being empty but having a non-zero length).
>
> That rule gives a clear interpretation to code like
> if a < b: ...
> This interpretation might not be intuitively clear, but then many
> non-number comparisons in Python share that problem.

I'd argue that this interpretation would not only be not intuitive, but
highly unexpected for most users. len(a) is zero iff the dimension of a
along rank-0 is 0, right?

>>> a = zeros((0,))
>>> a.shape
(0,)
>>> a.shape = (0,20)
>>> len(a)
0
>>> a.shape = (20,0)
>>> len(a)
20

In other words, changing the shape would be changing the truth value of an
array? That's just weird if you ask me.

--da


Steven D. Majewski

Apr 23, 1998


I'm probably in the camp that comparison of two arrays can yield
an array, but in a boolean context of IF, the value is arbitrary,
just as comparisons of other sequences.

[ But I sure wish I had a time machine to go back and beg Guido
NOT to use "+" for concatenation! Having some sequences interpret
"+" as concatenation and other interpret it as addition seems
to be treading on dangerous ground. ]


XlispStat's vectorized operations give a similar capability, and I
find it very useful; however, Lisp doesn't have Python's semantics
that all types are comparable. Only numerics are comparable with
">", "<", et al. This limits and avoids some of the complications
discussed here for Python.

In XlispStat you can say:

> ( > 5 ( iseq 10 )) # compare scalar and sequence
(T T T T T NIL NIL NIL NIL NIL)
> ( > ( + 2 (iseq 10)) ( iseq 10 )) # compare sequence and sequence
(T T T T T T T T T T)


However, as in Python, a list of False/Nil values is not the
same as a False/Nil value:

> ( > ( iseq 10 ) ( iseq 10 ))
(NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL)

> ( if ( > ( iseq 10 ) ( iseq 10 ))
       ( print T )
       ( print NIL ))

T


There is also a useful pair of functions, SELECT, which takes
a sequence and a list of indices, and WHICH, which returns the
indexes of the non-null values of a sequence, thus:

> ( setf jseq ( iseq 10 ))
(0 1 2 3 4 5 6 7 8 9)
> ( select jseq ( which ( > jseq 4 )))
(5 6 7 8 9)

I've also run into the problem that IEEE NaNs are very useful for
missing values and some other things -- they function as numbers,
in that they don't raise exceptions when used in numeric operations,
but they are "contagious" - i.e. 1.0 + NaN = NaN

However, the one problem is that Xlisp, like Python, does its
comparisons internally with a function that returns one of
( -1, 0, +1 ) and this doesn't map into using NaN in these
functions, where you want NaNs to not be ordered. ( i.e. they
should be not equal, not less than, and not greater than
any value including NaN. )


You may not agree that the above semantics are the "right" ones --
both the IEEE FP and the ANSI Lisp and C standards purposely
sidestep some of these issues. However, the combination of
the comparison yielding ( -1, 0, +1 ) and some platform differences
in IEEE FP support ends up making the results of NaN comparisons
to be, in practice, buggy and platform dependent. ( It's probably
more polite to say "undefined" ;-)
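With hindsight, modern Python does give NaN exactly the unordered behaviour described above -- something a three-way -1/0/+1 cmp() cannot express. A quick check:

```python
# IEEE-754 NaN is unordered: every ordering comparison with it is
# false, and it is not even equal to itself.
nan = float("nan")

print(nan < 1.0)    # False
print(nan > 1.0)    # False
print(nan == 1.0)   # False
print(nan == nan)   # False: NaN is not equal to itself
print(nan != nan)   # True
```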


---| Steven D. Majewski (804-982-0831) <sd...@Virginia.EDU> |---
---| Department of Molecular Physiology and Biological Physics |---
---| University of Virginia Health Sciences Center |---
---| P.O. Box 10011 Charlottesville, VA 22906-0011 |---
"Nature and Nature's laws lay hid in night:
God said, let Newton be! and all was light." - pope


Guido van Rossum

Apr 23, 1998

Jim Hugunin wrote:

> a<b<c -> a<b and b<c (this is exactly how things currently work I believe)
>
> The change is in how to implement "a and b". I would do this as follows:
>
> if hasattr(a, '__booland__'):
> return a.__booland__(b)
> elif a:
> return b
> else:
> return a

This is better than catching the exception. It's still a problem
though. Let's suppose we want to generate code for

f() < g() < h()

I don't want to type long bytecode listings, so I'm manually
translating the stack-based bytecode to temporary-variable-based
pseudo Python. Here is the code that currently gets implemented; t1,
t2, t3 and c1, c2 are temporaries; result is the final result of the
expression.

    t1 = f()
    t2 = g()
    c1 = t1 < t2
    if c1:
        t3 = h()
        c2 = t2 < t3
        result = c2
    else:
        result = c1

With Jim's proposal, this same thing becomes

    t1 = f()
    t2 = g()
    c1 = t1 < t2
    if hasattr(c1, '__booland__'):
        t3 = h()
        c2 = t2 < t3
        result = c1.__booland__(c2)
    else:
        if c1:
            t3 = h()
            c2 = t2 < t3
            result = c2
        else:
            result = c1

The problem here is that the evaluation of h() appears twice in this
version. In practice, this may be an arbitrarily complex expression.
I don't see a way to avoid this, since it is essential that h() not be
evaluated at all when c1 is false. NumPy users may not write very
long expressions as comparison operands (I don't know), but other
people may write long expressions, and still get this (unused)
duplication of code.

BTW, I presume that for consistency, the expression f() and g() should
also translated to

    t1 = f()
    if hasattr(t1, '__booland__'):
        t2 = g()
        result = t1.__booland__(t2)
    else:
        if not t1:
            result = t1
        else:
            t2 = g()
            result = t2
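Written as an ordinary function with f and g passed as thunks, Guido's translation can be run and the evaluation-count property checked directly (the bytecode problem is precisely that an inlined expression can't share the code for g between branches). `and_expr` and `__booland__` are hypothetical names from this thread, not real Python:

```python
def and_expr(f, g):
    # Guido's translation of "f() and g()"; f and g are thunks, so
    # g's code appears once per branch and runs at most once.
    t1 = f()
    if hasattr(type(t1), '__booland__'):
        t2 = g()
        return t1.__booland__(t2)
    elif not t1:
        return t1
    else:
        return g()

calls = [0]
def g():
    calls[0] += 1
    return 5

print(and_expr(lambda: 0, g))   # -> 0; g is never evaluated
print(calls[0])                 # -> 0
print(and_expr(lambda: 1, g))   # -> 5
print(calls[0])                 # -> 1
```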

Jim Hugunin

Apr 23, 1998

Guido van Rossum wrote:
>
> Jim Hugunin wrote:
>
> > a<b<c -> a<b and b<c (this is exactly how things currently work I believe)
> >
> > The change is in how to implement "a and b". I would do this as follows:

<skip my non-working version and go on to Guido's versions>

> The problem here is that the evaluation of h() appears twice in this
> version. In practice, this may be an arbitrarily complex expression.
> I don't see a way to avoid this, since it is essential that h() not be
> evaluated at all when c1 is false. NumPy users may not write very
> long expressions as comparison operands (I don't know), but other
> people may write long expressions, and still get this (unused)
> duplication of code.

I hadn't thought about this at all, but of course you're right. Including the
code to evaluate h() twice in the generated bytecodes is almost certainly
unacceptable. I do have a suggestion below though.

> BTW, I presume that for consistency, the expression f() and g() should
> also translated to
>
> t1 = f()
> if hasattr(t1, '__booland__'):
>     t2 = g()
>     result = t1.__booland__(t2)
> else:
>     if not t1:
>         result = t1
>     else:
>         t2 = g()
>         result = t2

Yes, this is exactly what I'm proposing (and "f() or g()" should be translated
similarly). I also like this as a good place to discuss the implementation
because it's the simplest code snippet that includes all of the complicated
issues. Here's my proposed rewriting of this to avoid evaluating g() twice:

    t1 = f()
    booland = hasattr(t1, '__booland__')
    if not booland and not t1:
        result = t1

    t2 = g()
    if booland:
        result = t1.__booland__(t2)
    else:
        result = t2


I'm using temporary variables here as real temporary variables and not always
as a way of rewriting stack elements. This is easy in my JPython
implementation. I don't know enough about CPython's implementation to comment
on what troubles this might cause.

-Jim

Jim Kraai

Apr 23, 1998

Left out some diagnostic info:
AIX 4.2.1.0 (yes, it's 4.2, not 4.1)
I really, really believe that the path to libX11.a is correct

DejaNews searches indicate that these types of errors
are the result of the linker not being able to find
libX11.a due to absence (during OS installation) or
the linker's inability to find the file (bad -l
option).

I've really done my best to confirm that the files are
where they should be, and that the link option
references them correctly. Oddity--which will surely
further display my ignorance--when I do a strings on
libX11.a, I don't see the symbols that the make is
failing on.

Again, any help would be much appreciated.

--jim

------------------------
From: Jim Kraai <Jim....@mci.com>
Subject: Help: AIX Python + Tkinter Compile Problems
Date: Thu, 23 Apr 1998 12:35:38 -0600
To: pytho...@cwi.nl

Guido van Rossum

Apr 23, 1998

On some platforms, several other libraries containing optional parts
of X11 must also be linked. The link order may be tricky too.

Guido van Rossum

Apr 23, 1998

[Jim]

> I hadn't thought about this at all, but of course you're right.
> Including the code to evaluate h() twice in the generated bytecodes
> is almost certainly unacceptable. I do have a suggestion below
> though.

[me]


> > BTW, I presume that for consistency, the expression f() and g() should
> > also translated to
> >
> > t1 = f()
> > if hasattr(t1, '__booland__'):
> >     t2 = g()
> >     result = t1.__booland__(t2)
> > else:
> >     if not t1:
> >         result = t1
> >     else:
> >         t2 = g()
> >         result = t2

> Yes, this is exactly what I'm proposing (and "f() or g()" should be
> translated similarly). I also like this as a good place to discuss
> the implementation because its the simplest code snippet that
> includes all of the complicated issues. Here's my proposed
> rewriting of this to avoid evalualting g() twice:

> t1 = f()
> booland = hasattr(t1, '__booland__')
> if not booland and not t1:
>     result = t1

Of course you want to skip over the rest here, so insert an "else:"
here and indent the rest by one tab stop:

> t2 = g()
> if booland:
>     result = t1.__booland__(t2)
> else:
>     result = t2

This is definitely more promising. I'm still worried about the code
bloat. Could you implement this (or something like it) in JPython and
report on the resulting increase in Java bytecode (or .class file) size
for some typical code? A speed test would also be nice. (How about
pystone?)

> I'm using temporary variables here as real temporary variables and
> not always as a way of rewriting stack elements. This is easy in my
> JPython implementation. I don't know enough about CPython's
> implementation to comment on what troubles this might cause.

Don't worry. I think the swap, dup, pop and rot3 opcodes are enough
to effectively squirrel a copy of booland away on the stack and
unearth it when it's needed.

pappus

Apr 24, 1998

Guido van Rossum wrote:
>
> On some platforms, several other libraries containing optional parts
> of X11 must also be linked. The link order may be tricky too.
>
> --Guido van Rossum (home page: http://www.python.org/~guido/)

(It really is the original poster, I'm at home now, so ...)

I do not mean to sound indignant, but this build is out-of-the-box.

Being out of the box, I usually find that the (in my opinion)
brilliant and hard-working people that I depend on have done their
homework _for_ me on those tricky linking-order issues.

Aren't there other Pythoners out there using AIX? OK, AIX 4.2.x
might be a lot less common, but not for long.

Anything I can do to fix this can go right back to the
keeper-of-all-source-that-is-good (that's you, Guido) to help keep
this bazaar moving along.

In light of your remarks, I've tried to eliminate one level of
complexity by removing the reference to the BLT library.

No symptomatic changes.

This appears to be the line where the failure is happening:

cc -O -Wl,-bE:./python.exp -lld python.o ../libpython1.5.a -lm
   -L/usr/local/lib -ltk8.0 -ltcl8.0 -L/usr/lpp/X11/lib -lX11 -ldl -lm
   -o python

Are you suggesting it would make a difference if I had the
-L/usr/lpp/X11/lib -lX11
appear earlier in the command line than the tcl/tk stuff?

I can certainly try that in the morning.

Thank you much for your help so far. I look forward to resolving this
and getting a patch in a workable form to contribute back to the sources.

--jim

Tim Peters

Apr 24, 1998

[tim]
> ...

> You need at least three relational operators to model that kind of
> stuff faithfully, but they're {==, <, <=}, not {==, <, >}.

[Jim Fulton]


> I agree that you need a special operation to model this. I question
> whether it should be >=.
>
> I think this boils down to what the interpretation of >= (<=) is.
> I read this as greater(less)-than-or-equal-to, and give it a
> corresponding logical interpretation. Perhaps I'm being arbitrary.

Not so much arbitrary as that your head is stuck in "scalar mode" <0.9
wink>. When objects have internal structure (i.e., are non-scalar), you
have to worry about "where the quantifiers go" -- addressing that is the
crux of what a "logical interpretation" *means*.

When we have non-scalar A and B objects with internal fields f, your vision
of <= places the quantifiers like this:

for all fields f
A.f < B.f
or
for all fields f
A.f == B.f

while a different vision puts them like so:

for all fields f
A.f < B.f or A.f == B.f

Now does it make more sense to say "less or equal" means "(all less) or (all
equal)", or that it means "all (less or equal)"?

Well, it's a bogus question: from a purely logical point of view they're
both arbitrary, while from an application view "it depends" (and for things
like Python's tuple comparisons, neither of the above is appropriate, while
NumPy wants something else entirely).

Do note that, in scalar mode (i.e. if there's only one internal field), both
formulations above are equivalent (and are also equivalent to Python's
lexicographic comparisons, and to NumPy's as well!) -- they can mean
different things only if there's more than one field. Having multiple
fields creates possibilities and problems that don't come up in scalar mode.
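To make the two placements concrete, here is a sketch in present-day Python
(the helper names are invented for illustration), modeling objects as plain
tuples of fields:

```python
def le_all_lt_or_all_eq(a, b):
    """Reading 1: (for all f: a.f < b.f) or (for all f: a.f == b.f)."""
    return (all(x < y for x, y in zip(a, b))
            or all(x == y for x, y in zip(a, b)))

def le_all_le(a, b):
    """Reading 2: for all f: (a.f < b.f or a.f == b.f)."""
    return all(x <= y for x, y in zip(a, b))

A, B = (1, 2), (2, 2)   # first field less, second field equal
print(le_all_lt_or_all_eq(A, B))  # False: neither all-less nor all-equal
print(le_all_le(A, B))            # True: every field is <=

# In "scalar mode" (one field) the two readings always agree:
assert le_all_lt_or_all_eq((3,), (5,)) == le_all_le((3,), (5,))
```

So for multi-field objects the two readings really do give different answers.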

In textbook arithmetic, "<=" is usually *defined* as "< or =" -- but
textbook arithmetic isn't wrestling with these other problems either. To
the extent that operator overloading exists to enable programmers to model
real application domains using natural syntax, a language has to be as loose
as possible in restricting what that syntax is allowed to mean, lest it
artificially restrict the domains in which it's useful.

I.e., if we have to live with operator overloading, best to throw *all*
"common sense" out the window <0.9 wink -- but after IEEE's NaN != NaN,
there's no "common sense" left to preserve>.

> ...


> I guess it depends on what you mean by natural. ;-)

Exactly so, Jim -- and we're telling you we don't want Python to predefine
"natural" (or "common sense") for us.

>> personally-enjoys-overloading-but-objects-when-
>> other-people-do-it-ly y'rs - tim
>
> Hee hee.

Ha! The absence of a "<wink>" on that sig was no oversight <wink>.

but-then-i'm-old-and-hate-everything-ly y'rs - tim

Tim Peters

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

[guido and jim hugunin converge on (after applying
patches & optimizing a bit ...) this for "f() and g()"]

result = f()
booland = hasattr(result, '__booland__')
if booland || result: # short-circuit "||" crucial
t = result
result = g()
if booland:
result = t.__booland__(result)

Presumably also (for "f() or g()"):

result = f()
boolor = hasattr(result, '__boolor__')
if boolor || not result: # short-circuit "||" crucial
t = result
result = g()
if boolor:
result = t.__boolor__(result)

And also (for "not f()"), for consistency and because David really, really
wants it <wink>:

t = f()
if hasattr(t, '__boolnot__'):
result = t.__boolnot__()
else:
result = not t
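A sketch of how the (purely hypothetical) __booland__ hook would behave, with
the compiler's expansion simulated by an ordinary function and a toy
elementwise class:

```python
class EW:
    """Toy elementwise-boolean container for the hypothetical __booland__ hook."""
    def __init__(self, flags):
        self.flags = list(flags)
    def __booland__(self, other):
        # elementwise "and" over the two flag sequences
        return EW(a and b for a, b in zip(self.flags, other.flags))
    def __repr__(self):
        return 'EW(%r)' % self.flags

def rich_and(f, g):
    """Simulate the proposed expansion of "f() and g()"."""
    result = f()
    booland = hasattr(result, '__booland__')
    if booland or result:          # the short-circuiting "||" of the proposal
        t = result
        result = g()
        if booland:
            result = t.__booland__(result)
    return result

print(rich_and(lambda: EW([1, 0, 1]), lambda: EW([1, 1, 0])))  # EW([1, 0, 0])
print(rich_and(lambda: 0, lambda: 99))                         # 0 (short-circuits)
```

Note that the second call never evaluates g() at all, which is exactly the
short-circuit behavior the expansion has to preserve.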


[guido]


> This is definitely more promising.

Oh yes! Good show.

> I'm still worried about the code bloat.

As a rough rule of thumb, I generally see .pyc files turn out to be about
the same size as their corresponding .py files. In a world where I now
routinely download 8 Mb *patches*, Python code could get 100x bigger and
still be insignificant <0.5 wink>. Not that I'm in favor of code bloat --
but I'm more worried about slowing down e.g.

while i < n and a[i] == b[i]:

> Could you implement this (or something like it) in
> JPython and report on the resulting increase in Java
> bytecode (or .class file) size for some typical code?

> A speed test would also be nice. (How about pystone?)

Alas, for this change pystone is interesting only as a gross sanity check:
it contains exactly one instance of "and", and it's not that important.
"or" is a little more interesting, because while pystone contains only one
instance of that too, it accounts for half(!) the computation done by Proc4.
But it's a start ...

good-time-to-get-back-to-paul-prescod-about-
type-declarations-ly y'rs - tim

Konrad Hinsen

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

David Ascher <d...@skivs.ski.org> writes:

> I'd argue that this interpretation would not only be not intuitive, but
> highly unexpected for most users. len(a) is zero iff the dimension of a
> along rank-0 is 0, right?

Right. Same for lists.

> In other words, changing the shape would be changing the truth value of an
> array? That's just weird if you ask me.

I tend to see arrays as generalizations of nested lists, and lists
behave in the same way: [] is false, but [[]] is true.

I doubt there is any "expected" interpretation of truth values for
arrays; everyone would have different expectations. But that's already
true for other Python objects; I suspect many newcomers would
interpret

a = 'yes'
if a:
...

in a different way than the Python interpreter.

Konrad Hinsen

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

Jim Kraai <Jim....@mci.com> writes:

> Left out some diagnostic info:
> AIX 4.2.1.0 (yes, it's 4.2, not 4.1)
> I really, really believe that the path to libX11.a is correct

Here's the relevant part of Modules/Setup in a version that works
with AIX 4.3:

# *** Always uncomment this (leave the leading underscore in!):
_tkinter _tkinter.c tkappinit.c -DWITH_APPINIT \
# *** Uncomment and edit to reflect where your X11 header files are:

-I/usr/include/X11 \


# *** Or uncomment this for Solaris:
# -I/usr/openwin/include \
# *** Uncomment and edit to reflect where your Tcl/Tk headers are:
-I/usr/local/include \
# *** Uncomment and edit for Tix extension only:
# -DWITH_TIX -ltix4.1.8.0 \
# *** Uncomment and edit for BLT extension only:
# -DWITH_BLT -I/usr/local/blt/blt8.0-unoff/include -lBLT8.0 \
# *** Uncomment and edit for PIL (TkImaging) extension only:
# -DWITH_PIL -I../Extensions/Imaging/libImaging tkImaging.c \
# *** Uncomment and edit for TOGL extension only:
# -DWITH_TOGL togl.c \
# *** Uncomment and edit to reflect where your Tcl/Tk libraries are:
-L/usr/local/lib \
# *** Uncomment and edit to reflect your Tcl/Tk versions:
-ltk8.0 -ltcl8.0 \
# *** Uncomment and edit to reflect where your X11 libraries are:

-L/usr/lib/X11 \


# *** Or uncomment this for Solaris:
# -L/usr/openwin/lib \
# *** Uncomment these for TOGL extension only:
# -lGL -lGLU -lXext -lXmu \

# *** Uncomment for AIX:
-lld \


# *** Always uncomment this; X11 libraries to link with:
-lX11

In addition to a different path for the X11 libs, there's the
additional "-lld" that's claimed to be necessary for AIX.
I don't know whether that is relevant, though; I am rather new at AIX.

Vladimir Marangozov

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to Jim....@mci.com

Jim Kraai wrote:
>
> Left out some diagnostic info:
> AIX 4.2.1.0 (yes, it's 4.2, not 4.1)
> I really, really believe that the path to libX11.a is correct

Jim,

Your compile & link commands are correct. I strongly suspect a path
problem. Make sure (again) that the path to *all* X11 libraries you're
using is, in your case "/usr/lpp/X11/lib" (and not something like
"/usr/local/X11/lib" or "/usr/lib/X11/" or simply "/usr/lib") and you
give the corresponding path to the include files.

> references them correctly. Oddity--which will surely
> further display my ignorance--when I do a strings on
> libX11.a, I don't see the symbols that the make is
> failing on.

Issue an "nm /usr/lpp/X11/lib/libX11.a | grep <symbol name>". You should
have an output similar to this one:

nm /usr/local/X11R6/lib/libX11.a | grep XCreateIC
.XCreateIC T 288224
XCreateIC D 26144 12

which attests that the symbol XCreateIC is defined.

--
Vladimir MARANGOZOV | Vladimir....@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252

Jim Fulton

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

hug...@cnri.reston.va.us wrote:
>
> First off, I want to throw in my vote with David, Paul, et al. for
> implementing this change in Python. As soon as Guido agrees its a good idea
> I'd love to build it into JPython! This feature would be tremendously useful
> for NumPy, Fredrick Lund has mentioned that he'd find it valuable for PIL, and
> I know it would make building Mathematica-like tools for Python far easier.

I'm surprised that a change that does not make any new computation
possible, but that merely changes the way existing computation is spelled:

a<b

rather than:

less(a,b)

is "tremendously useful".

> I've been complaining about this particular issue since the first Python
> conference I attended about 2.5 years ago. The "special" treatment of
> comparision operators in Python has always seemed strange to me. I never
> grasped a good reason why I could override __add__ and yet I couldn't do the
> same thing with __lt__. Anyway, on to the issue at hand:

Of course, I (and probably everyone) agree with this statement.

Jim Fulton

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

There seems to be an overwhelming view that comparisons
on some types should yield results that have no boolean meaning.

For example, people that work with arrays want comparisons
to yield arrays of results of elementwise comparisons.
Does this mean that a==b will yield an array too?

Any type/class for which equality comparisons will not yield
meaningful results will not be usable as dictionary keys.
They will also not be usable in computation requiring equality
tests, at least without requiring some alternate spelling.
Do the proponents of this approach consider this to be a good
thing?

Konrad Hinsen

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

Jim Fulton <j...@digicool.com> writes:

> I'm surprised that a change that does not make any new computation
> possible,
> but that merely changes the way existing computation is spelled:
>
> a<b
>
> rather than:
>
> less(a,b)
>
> is "tremendously useful".

It makes an enormous difference in readability of complex expressions.
Here's an example from real-life code:

return N.logical_and(N.logical_and(N.less_equal(a, x), N.less_equal(x, b)),
N.less_equal(x, c))

And that's just one line out of many - the whole subroutine looks rather
intimidating. Compare that to

return a <= x and x <= b and x <= c

In addition, the shorter code would work for a much larger range of
data types.
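As an aside, once the comparison operators are overloadable, the bitwise `&`
operator (which, unlike `and`, already is overloadable) gives a spelling close
to the short form. A minimal sketch with an invented `Arr` class, in
present-day Python:

```python
class Arr:
    """Minimal elementwise array supporting <= and & (a sketch, not NumPy)."""
    def __init__(self, data):
        self.data = list(data)
    def __le__(self, other):
        return Arr(int(x <= y) for x, y in zip(self.data, other.data))
    def __and__(self, other):
        # elementwise logical "and", spelled with the bitwise operator
        return Arr(int(x and y) for x, y in zip(self.data, other.data))
    def __repr__(self):
        return 'Arr(%r)' % self.data

a, b, c = Arr([1, 5]), Arr([4, 4]), Arr([9, 9])
x = Arr([2, 6])
print((a <= x) & (x <= b) & (x <= c))  # Arr([1, 0])
```

Compare that line with the nested logical_and/less_equal spelling above.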

Konrad Hinsen

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

Jim Fulton <j...@digicool.com> writes:

> For example, people that work with arrays want comparisons
> to yield arrays of results of elementwise comparisons.
> Does this mean that a==b will yield an array too?

Of course! There's still "a is b" for identity tests.

> Any type/class for which equality comparisons will not yield
> meaningful results will not be usable as dictionary keys.

Arrays can't be used as keys anyway because they are mutable.

> They will also not be usable in computation requiring equality
> tests, at least without requiring some alternate spelling.
> Do the proponents of this approach consider this to be a good
> thing?

Yes. The current equality test (based on address equality) is
useless for arrays anyway, so it can only become better.
And there is no single generally useful definition of "equality"
for arrays.

All array languages implement comparisons elementwise, so it
can't be so bad!

Drew Csillag

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

pappus wrote:
>
> Guido van Rossum wrote:
> >
> > On some platforms, several other libraries containing optional parts
> > of X11 must also be linked. The link order may be tricky too.
> >
> > --Guido van Rossum (home page: http://www.python.org/~guido/)

<rant>
The way the AIX linker works (at least it did in 4.1.x), order doesn't
matter, it just links EVERYTHING in (unlike every other UNIX linker on
the planet...), unless you specify -qtwolink (or something like it) so
that it will do a two pass link and only link in what it needs. I
suspect 4.2.x is no different.
</rant>

Anyway, you may want to see what Tk did at its link step (for wish) to
see what you're missing, or what is different. BTW: In previous
versions of AIX, I always had /usr/lib/X11 as the link path, not
/usr/lpp/X11/lib (not that it should matter any, but the AIX linker is a
strange animal...).

Drew Csillag
drew_c...@geocities.com

Jim Kraai

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

(another long path, er post)

Vladimir & Jody Winston suggested:


>Issue an "nm /usr/lpp/X11/lib/libX11.a | grep <symbol name>". You should
>have an output similar to this one:
>
>nm /usr/local/X11R6/lib/libX11.a | grep XCreateIC
>.XCreateIC T 288224
>XCreateIC D 26144 12
>
>which attests that the symbol XCreateIC is defined.

BINGO!

# nm /usr/lpp/X11/lib/libX11.a | grep XCreateIC

(no output)

# nm *.a | grep XCreateIC

(again, no output)

# nm /usr/lpp/X11/lib/libX11.a | tail
sys_errlist d 34400 4
sys_nerr U -
sys_nerr d 34396 4
tolower U -
tolower d 34232 4
write U -
write d 34292 4
writev U -
writev d 34364 4
xsendevent.c f -

(Stuff is in the file, just the wrong stuff)

It's not defined!!!

Why in the world did Tk compile?

I went to the Tk distribution and found the final wish link line.
There is NO suggestion to the compiler where to find the X11 libs.

I went back to Modules/Setup and commented out the include and lib
X11 lines before and after the refs to tcl/tk, and the compile
WORKED!!!

Now to find the _real_ libX11.a:

# find / -name libX11.a -print 2> /dev/null
/usr/lpp/X11/lib/R5/inst_updt/libX11.a
/usr/lpp/X11/lib/R5/libX11.a
/usr/lpp/X11/lib/R4/libX11.a
/usr/lpp/X11/lib/libX11.a
/usr/lib/netls/ark/lib/libX11.a
/usr/lib/libX11.a
# cd /usr/lib
# nm libX11.a | grep XCreateIC
.XCreateIC T 136168
XCreateIC D 124492 12

Hey! Now I feel _really_ dumb, but really happy that this is working.

My fantasy won't come true today, I won't get to contribute to the
Python sources. :(

Thanks to Guido, Konrad, Vlad, and Jody for their help!
(My employer would thank you, but I'm not going to tell him unless
someone wants to apply to work for MCI. It's fun! We're in the
middle of a merger. <hysterical, psychotic laughter> Vlad, still
looking for a job in the U.S.?)

--jim

------------------------
From: Vladimir Marangozov <Vladimir....@inrialpes.fr>
Subject: Re: Help: AIX Python + Tkinter Compile Problems
Date: Fri, 24 Apr 1998 11:42:30 +0200
To: pytho...@cwi.nl
Newsgroups: comp.lang.python

Jim,

---------------End of Original Message-----------------

Mike Miller

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

>>>>> "Guido" == Guido van Rossum <gu...@CNRI.Reston.Va.US> writes:
> NumPy users may not write very long expressions as
> comparison operands (I don't know)

I do - and I'd probably write even longer comparisons if it was
simpler. I see the main benefit of a simplified comparison
syntax (from the typist's, ummm, coder's perspective) is that it
/is/ simpler. At least simpler to read and write, and simpler to
explain to my colleagues.

Mike

--
Michael A. Miller mil...@uiuc.edu
Department of Physics, University of Illinois, Urbana-Champaign
PGP public key available on request

John B. Williston

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

Jim Fulton wrote in message <353F3D...@digicool.com>...

>Meaningful is the operative word here. What is the Boolean meaning
>of (0,0,0), (1,1,1), or (1,0,1)? How about ()?
>
>My first assumption would be that empty arrays are false and non-empty
>arrays are true.
>
>Note that I don't object to having comparisons return non-integers, and
>of course, all Python objects are, in a sense, boolean. Not all
>objects have meaningful boolean values.

You raise a good point, but again it seems nugatory to me. The "Boolean
meaning" of an array is a pretty slippery concept, indeed. There is perhaps
some precedent to think that an array of all zeroes is false, while anything
else is true (i.e., most programming languages seem to interpret zero as
false and anything else as true; thus, to expand this to cover an array of
zeroes does not seem unprecedented), but I can see why one might object.

However, any objection seems to miss the point: doing a boolean comparison
of an array to something other than an array is meaningless to begin
with--regardless of what the "Boolean meaning" of the array is. Or, perhaps
I should say only that I cannot understand how it could be meaningful.
Consider the following:

if (a < b):
# do something useful here.

If 'a' refers to a 3 x 3 array while 'b' refers to a dictionary, the very
comparison itself is meaningless, and the programmer deserves what he gets
for doing something so silly. At the same time, however, David's proposal
adds value to operations within the proper context (i.e., dealing strictly
with arrays). The only reason at all to allow an array to have a "Boolean
meaning" (as I understand his proposal) is to avoid exceptions.

Now, I may well be in over my head here, and perhaps that is why I do not
see how your objection obtains. But if I understand David's proposal
correctly, the additional functionality is terribly useful for numeric
Python with the single side effect that meaningless results are produced
for meaningless comparison operations.

>Perhaps your notion of "meaning" is different than mine.

Good Lord, I hope not (grin).

John


http://www.netcom.com/~wconsult
___ ___
\ \ __ / / Williston Consulting
\ \/ \/ / __________ makes software worth buying.
\ /\ / / _______/ wcon...@ix.netcom.com
\_/ \_/ / /
/ /_______
/__________/


Tim Peters

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

> There seems to be an overwhelming view that comparisons
> on some types should yield results that have no boolean
> meaning.

Well, there's a handful (about 5 so far) of NumPy people explicitly in favor
of that. Steve M weighed in yesterday on the "should return arbitrary
result" side. You're clearly against it. I haven't declared one way or the
other, because I don't understand the full implications yet (see bottom).

> For example, people that work with arrays want comparisons
> to yield arrays of results of elementwise comparisons.
> Does this mean that a==b will yield an array too?

If __eq__ were user-definable, that would be up to them to decide. I'm not yet
a NumPy user, but used to live in that application world, and based on that
my best guess is that they would *greatly* prefer that a==b yield an array,
and have to write e.g. a.same_shape_and_elements(b) in the much rarer case
they need to compare two arrays for sequence equality (few matrix algorithms
are of the "while a != b:" form, and even when they are they usually need a
fuzzy version of equality to tolerate roundoff errors).

> Any type/class for which equality comparisons will not
> yield meaningful results will not be usable as dictionary
> keys.

So far that's unclear, and in an interrupted thread with David Ascher I'm
trying to tease that out. In short, dict lookup doesn't involve an explicit
user-coded "==", so it's not necessarily the case that dict lookup cares
about __eq__. Similarly, the built-in "min" and "max" don't involve
explicit user "<" or ">" or etc, so __gt__ etc may be irrelevant to them
too. Similarly for list.count() etc. Guido replied to the message in which
I brought those up, but skipped over all issues of that ilk, so I figured he
thought the resolution was so obvious it wasn't worth the time to explain it
<0.9 wink>.

> They will also not be usable in computation requiring
> equality tests, at least without requiring some alternate
> spelling. Do the proponents of this approach consider this
> to be a good thing?

Surely it's a tradeoff. So far most attention has been on the "good stuff",
and believe it or not it really is *very* important to them to be able to
write "a < b" and get back an array. The exact nature of the "bad stuff" is
still unclear, but may not be as bad as you currently think.

or-indeed-as-i-fear<0.5-wink>-ly y'rs - tim

Tim Peters

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

[Jim Fulton]

> I'm surprised that a change that does not make any new
> computation possible, but that merely changes the way
> existing computation is spelled:
>
> a<b
>
> rather than:
>
> less(a,b)
>
> is "tremendously useful".

But Jim! This is *Python*! If we didn't care what code looked like, most
of us would probably be hacking in some version of Lisp -- which already
covered most of Python's abstract *semantics* way back when Guido was just a
wee snakelet frolicking in the lush Amsterdam jungle. C.f. Steve Majewski's
posting yesterday with the

( > ( + 2 (iseq 10)) ( iseq 10 ))

etc XlispStat gems. The day beauty and elegant polymorphism cease to be
legitimate debating points here is likely the day Guido reverts to basking
in the sun swallowing the occasional rat.

he's-worked-so-hard-to-become-human<wink>-ly y'rs - tim

Jim Hugunin

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

Guido van Rossum wrote:
[me]

> > t1 = f()
> > booland = hasattr(t1, '__booland__')
> > if not booland && not t1:
> > result = t1
>
> Of course you want to skip over the rest here, so insert an "else:"
> here and indent the rest by one tab stop:
>
> > t2 = g()
> > if booland:

> > result = t1.__booland__(t2)
> > else:
> > result = t2
>
> This is definitely more promising. I'm still worried about the code
> bloat. Could you implement this (or something like it) in JPython and

> report on the resulting increase in Java bytecode (or .class file) size
> for some typical code? A speed test would also be nice. (How about
> pystone?)

I'd be happy to do this. It'll have to wait until I get the next release of
JPython out the door though...

-Jim

Phil Austin

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to


> Jim Fulton <j...@digicool.com> writes:
>
> > I'm surprised that a change that does not make any new computation
> > possible,
> > but that merely changes the way existing computation is spelled:
> >
> > a<b
> >
> > rather than:
> >
> > less(a,b)
> >
> > is "tremendously useful".
>

> It makes an enormous difference in readability of complex expressions.
> Here's an example from real-life code:
>
> return N.logical_and(N.logical_and(N.less_equal(a, x), N.less_equal(x, b)),
> N.less_equal(x, c))
>

As someone who has just taught two upper year courses that use
Python (Atmos. Sci. 301: Atmospheric Radiation and Remote Sensing, and
Atmos. Sci. 405: Cloud Physics and Chemistry), I can confirm that the
current notation is a significant problem for even the better B.Sc.
students. I think that this change (and the development of a
Win95/NT graphics package as capable as gist or plplot) sets
the stage for a slim book with coverage of "Numerical Python"
by NumPy people (perhaps DA and FL?), and the broad use of Python in the
Faculty of Science--at UBC at least.

Phil


Paul F Dubois

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

Everyone keeps repeating the mantra that boolean arrays don't have a
meaningful truth value. Too much is being made of this. There are
several meaningful choices:

a. true if not empty
b. true if all elements true
c. true if some element true

The fact that someone could then do something dumb is nothing new.
Someone can do a lot more dumb things than that without throwing
exceptions.

Every array language of this type has something equivalent to land(b)
and lor(b) to actually convert conditions (b) and (c) to a simple
boolean. Nothing stops us from doing what we want except the singular
case that the comparison won't let us return an object of our choice.

The consequences of deciding that somehow preventing accidental testing
of truth values for arrays is a major design goal is silly. You're
throwing the baby out with the bath water.

My view is that a Python programmer expects that for a sequence object
the truth value is equivalent to len(s) <> 0. Since any choice will do,
let it be that one. Then Python can proceed in blissful ignorance,
taking the truth value of any object if it wants to.
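Each of the three choices is a one-liner in present-day Python (the truth_*
names here are invented for illustration, not a proposed spelling):

```python
def truth_nonempty(a):
    """Choice (a): true if not empty."""
    return len(a) != 0

def truth_all(a):
    """Choice (b), i.e. land(b): true if all elements true."""
    return all(a)

def truth_any(a):
    """Choice (c), i.e. lor(b): true if some element true."""
    return any(a)

b = [1, 0, 1]
print(truth_nonempty(b), truth_all(b), truth_any(b))  # True False True
```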

-- Paul

Guido van Rossum

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

> Of course! There's still "a is b" for identity tests.

Note that 'is' is not overloadable -- it checks whether a and b are
the same object.

I do worry about the requirement to write something like

if alltrue(a == b): ...

instead of

if a == b: ...

I agree that for numeric computation this would be silly, but the
universal array module supports other kinds of computations too (image
processing, for instance) and I don't know if it is a bad idea
everywhere.

Steven D. Majewski

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

On Fri, 24 Apr 1998, Tim Peters wrote:

> > There seems to be an overwhelming view that comparisons
> > on some types should yeild results that have no boolean
> > meaning.
>
> Well, there's a handful (about 5 so far) of NumPy people explicitly in favor
> of that. Steve M weighed in yesterday on the "should return arbitrary
> result" side. You're clearly against it. I haven't declared one way or the
> other, because I don't understand the full implications yet (see bottom).
>
> > For example, people that work with arrays want comparisons
> > to yeild arrays of results of elementwise comparisons.
> > Does this mean that a==b will yield an array too?

Well -- my vote was rather provisional, as I don't understand all of
the implications either. I was pointing out that it seems to work in
XlispStat, and that Python shares with Lisp the notion that there are
not so much Boolean values as Boolean contexts, and both of them
automatically map many values into True and False, sometimes arbitrarily.

I am concerned about having different semantics for arrays versus
other sequence types. ( Do I understand correctly that this divergence
is part of the proposal, and it doesn't include changing lists and
tuples to conform to array semantics ? ) I think that is a big problem
and a potential hole for newbies to fall into if NumPy has different
semantics than the rest of Python. I'm a lot more concerned
about that than the issue of how to interpret "if array1 > array2".
And as I already noted, I'm already concerned about "+" meaning
addition, except for sequences where it means concatenation, except
for arrays where it means addition ( except on alternate Fridays ? :-)
I'm all in favor of overloading of operators, but where it starts to
become a problem rather than a solution is when the same names are used
for radically different operations.

I doubt that there is a perfect AND backward compatible solution
( especially throwing generic sequence operations and NaNs into the mix),
so the question is what is the least confusing and arbitrary ?


BTW: I was looking thru my notes from the XlispStat NaN discussion:
There are two types of IEEE NaNs -- signaling and non-signaling.
The signaling type are *really* Not-A-Number -- "1+NaN" should
signal an error. But non-signaling NaNs are really misnamed: they
ARE numbers, ( and in fact in XlispStat ( numberp #.Not-a-number )
is T ) but they are numbers with indefinite values, which like
IEEE positive and negative infinity are "contagious" when used in
computations. Maybe "numeric bottom" is a better name ?

---| Steven D. Majewski (804-982-0831) <sd...@Virginia.EDU> |---
---| Department of Molecular Physiology and Biological Physics |---
---| University of Virginia Health Sciences Center |---
---| P.O. Box 10011 Charlottesville, VA 22906-0011 |---
"Nature and Nature's laws lay hid in night:
God said, let Newton be! and all was light." - pope


Konrad Hinsen

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

Guido van Rossum <gu...@CNRI.Reston.Va.US> writes:

> I do worry about the requirement to write something like
>
> if alltrue(a == b): ...
>
> instead of
>
> if a == b: ...
>
> I agree that for numeric computation this would be silly, but the
> universal array module supports other kinds of computations too (image
> processing, for instance) and I don't know if it is a bad idea
> everywhere.

OK, let's sum up the pros and cons:

In favor of returning an array:
- no other operator permits elementwise equality test
- consistency with other comparison operators
- that's what all array languages do

In favor of returning True iff all elements are equal:
- easier to test for all elements being equal

Seems like a clear choice to me.

Steven D. Majewski

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

On Fri, 24 Apr 1998, Guido van Rossum wrote:


> I do worry about the requirement to write something like
>
> if alltrue(a == b): ...
>
> instead of
>
> if a == b: ...
>

BTW: Lisp has SOME and EVERY [and NOTANY, NOTEVERY] ( ANY instead
of SOME would have been more consistent. ) which take a mandatory
predicate as an arg as well as the sequence. If we're going to
have something like this in Python, I'd make it an optional
trailing predicate arg. ( and the default test for true. )

The other question is whether 'alltrue( a == b )' requires a
sequence of values, or is intended to use where you you might have
a single boolean value and if you don't you want a sequence of
booleans coerced into a single value. I think the latter case also
trails off into a swamp of ambiguity under BOOLEAN CONTEXT not
BOOLEAN VALUE semantics.
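The suggested signature -- an optional trailing predicate, defaulting to the
ordinary truth test -- is easy to sketch (names borrowed from Lisp; this is
an illustration, not a proposed builtin):

```python
def every(seq, pred=bool):
    """Lisp EVERY: true iff the predicate holds for all elements."""
    for x in seq:
        if not pred(x):
            return 0
    return 1

def some(seq, pred=bool):
    """Lisp SOME: true iff the predicate holds for at least one element."""
    for x in seq:
        if pred(x):
            return 1
    return 0

print(every([1, 2, 3]))                  # 1
print(every([1, 0, 3]))                  # 0
print(some([0, 0, 4], lambda x: x > 3))  # 1
```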

Guido van Rossum

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

Generally, I don't mind having things in the language that baffle the
uninitiated for a short while. Most newbies seem to be willing to
find "X doesn't work" on the first time they try X, learn or invent
another way to get the desired effect, and go on with their business.

What potentially concerns me is cases where a newbie can stare at a
seemingly okay piece of code for a looooooooong time, diddle with it
until they feel stupid, and still not understand it. In such cases,
where it makes sense, I'd like to see an exception, since an exception
is the way of Python to tell the user "X doesn't work".

I have a feeling that truth testing of arrays could be one of those
cases, so I vote that array truth testing raise an exception. But
it's up to the NumPy implementers -- Python (already!) doesn't
restrict them in the choice of how to define truth testing.

Yonik Seeley

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

In article <twhg3jf...@lmspc1.ibs.fr>,

Konrad Hinsen <hin...@ibs.ibs.fr> wrote:
>It makes an enormous difference in readability of complex expressions.
>Here's an example from real-life code:

Yeah, yeah, try telling that to the Java people..... someone could
overload operator< to erase your hard drive :-)

-Yonik

David Ascher

unread,
Apr 24, 1998, 3:00:00 AM4/24/98
to

On Fri, 24 Apr 1998, Tim Peters wrote:

> So far that's unclear, and in an interrupted thread with David Ascher
> I'm trying to tease that out. In short, dict lookup doesn't involve an
> explicit user-coded "==", so it's not necessarily the case that dict
> lookup cares about __eq__. Similarly, the built-in "min" and "max"
> don't involve explicit user "<" or ">" or etc, so __gt__ etc may be
> irrelevant to them too. Similarly for list.count() etc. Guido replied
> to the message in which I brought those up, but skipped over all issues
> of that ilk, so I figured he thought the resolution was so obvious it
> wasn't worth the time to explain it <0.9 wink>.

Tim asked a bunch of good questions about how the new comparison
mechanism should interact with the rest of the Python infrastructure.

What I've implemented in my patches is a simple addition, which only
interacts with the Python comparison operators (<, <=, >, >=, !=, ==)
and with the builtin cmp() function if it is called with a third
argument. Everything else in Python which has anything to do with
comparisons still uses the *exact same* mechanism as is currently in
place. Meaning that objects are sorted in lists, tested for
membership in sequences, etc. based on their tp_cmp/__cmp__/__rcmp__
behavior.

Currently I find references to PyObject_Compare in:

classobject.c, dictobject.c, funcobject.c, listobject.c,
methodobject.c, object.c, tupleobject.c, cPickle.c, arraymodule.c

in addition to the ones in object.c, ceval.c, compile.c and
bltinmodule.c (which are sort of obvious =). There are obviously lots
of calls to it in extension modules around the galaxy.

It is an interesting question how much the new behavior should be used
by the existing code. In the spirit of minimal changes, the new
proposal doesn't require anything to be modified. The cost, however,
is that list.sort() is no longer intuitively explainable by appealing
to the '<' operators. In fact, with the proposal as it stands it is
possible to have objects which give one order if compared with
"cmp(a,b)" (+ the usual interpretation of the return codes) and
another order if compared with "a < b".
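A contrived sketch of that divergence (the class is invented; `__cmp__` is the old 1.5-era three-way protocol, written here without the `cmp()` builtin so it runs anywhere):

```python
# Nothing in the proposal forces __cmp__ and __lt__ to agree, so the
# "cmp order" and the "< order" of the same objects can diverge.

class Weird:
    def __init__(self, v):
        self.v = v

    def __cmp__(self, other):
        # old-style three-way compare: ascending by v
        return (self.v > other.v) - (self.v < other.v)

    def __lt__(self, other):
        # rich compare: deliberately *descending* by v
        return self.v > other.v

a, b = Weird(1), Weird(2)
# a.__cmp__(b) == -1 says "a precedes b", yet a < b is False:
# the two protocols define two different orderings.
```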

Calls to cmp() with three operands and PyObject_RichCompare on objects
which don't support them raise exceptions (in my implementation).

There are three possible approaches I can think of:

1) Leave everything as is
+ minimal changes to the core
+ everything works the way it does and is documented, etc, except
for the operators and the cmp() function (and the
PyObject_RichCompare in the C/API).
+ allows rich comparisons for objects that need it
- some discipline is required to make sure that objects compare
in reasonable ways. type/class designers must make sure to
implement tp_cmp/__cmp__/__rcmp__ if they want their objects to
have an ordering relation, and the __lt__ etc. for operators.

This introduces a split between the ordering relation and object
comparison from within the Python core on one side and the object
comparison from the Python code and rich-comparison aware future
extension code. The split between the ordering relation and
comparisons doesn't bother me deeply, but the split between
Python-internal and Python-scripts does worry me.

2) Change PyObject_Compare to call PyObject_RichCompare and do a
conversion to the old return codes (0,1,-1) based on some arbitrary
(but probably reasonable) protocol. Something like (in a Python
dialect of C):

PyObject_Compare_Old = PyObject_Compare

def PyObject_Compare(a, b):
    if hasattr(type(a), 'tp_richcmp') or hasattr(a, '__lt__'):  # =)
        if PyObject_IsTrue(PyObject_RichCompare(a, b, '<')):
            return -1
        elif PyObject_IsTrue(PyObject_RichCompare(a, b, '>')):
            return 1
        elif PyObject_IsTrue(PyObject_RichCompare(a, b, '==')):
            return 0
        else:
            raise ComparisonException  # ?
    else:
        return PyObject_Compare_Old(a, b)  # use old version

Ignoring for now the performance questions (Jim Hugunin can be
called in if that's a concern =), this means that existing code
will use the new comparison operators instead of the old ones if
they're available. Note that the performance concern is very real
for list sorting for example.

+ Changes localized to the same files as #1
+ Allows smoother migration path towards the new protocol?
- Possibly significant performance hit

3) Rework the core to use the rich comparison mechanism everywhere
explicitly, leaving the old mechanism (using the mechanism in #2)
for preexisting extension types and classes.

In other words, replace calls to PyObject_Compare which are used to
test for equality with calls to PyObject_RichCompare(,,'==') and
the subsequent test for truth. This might get really messy.

+ purest solution [clean, consistent in the long term, efficient?]
- purest solution [in the short term, a lot of headaches]

A blend is also possible, so that the two notions of ordering (as used
in list.sort) and comparisons are neatly split between the two
mechanisms (see below).

Note that these issues arise *regardless* of whether return values
have meaningful boolean interpretations (Jim Fulton's point), and in
fact regardless of whether they are PyObject*'s or (0,1) (my
mischaracterization of Jim Fulton's point), but in fact occur due to
the split of a single comparison function into multiple functions.
The truth-testability of the return values of comparisons affects how
error handling in comparison works, which affects all the code which
calls it, but dealing with exceptions and comparisons is an issue
which has existed since 1.5 allowed comparisons to raise exceptions.

Orthogonally, one can look at things on a case by case basis as Tim
started to:

* list.sort(): My guess about this is that tp_cmp should be used if it
exists, since it does the job that's needed. So it should be:
    if hasattr(type(foo), 'tp_cmp') or hasattr(foo, '__cmp__'):
        return cmp(foo, bar)
    else:
        the rigamarole described in #2 above.

* Similarly for dict[key], list.index, list.remove, list.count --
since the issue is simply one of equality, tp_cmp/__cmp__ should be
given the chance to give a quick answer. If it isn't defined, then
__eq__ should be tried.

* Comparison between instances isn't a real problem I don't think --
if it's done by the comparison operators or cmp() with three
arguments, the rich comparison is tried, and if it fails the old
comparison is used. While the code is a bit messy, I don't think
your (Tim) cases are really problematic (C defining __lt__ and __cmp__,
D not defining anything, etc.). But yes, the rules need to be made
explicit, no doubt about that. The current rules re: when __rcmp__
is called are pretty bizarre already!

In other words, I think there's a pretty clear split between the cases
where either equality or ordering relations are needed, and where true
'explicitly specified' comparisons are being used. I'm sure some will
disagree with that last statement if not the rest of the post...

--david

PS: This is orthogonal to numpy-specific questions, such as whether arrays
should have a truth value and if so what it should be.

I think it would be beneficial to keep the Python-core issues (rich
comparison) separate from the numpy-specific issues (how arrays should
work).

PPS: I was wondering why no one commented on my first proposal while
the only slightly modified version caused this flood. I've
decided it was the redesign of the web page, or maybe the fact
that I asked Jim Fulton for his comments...

Guido van Rossum

Apr 24, 1998

> Tim asked a bunch of good questions about how the new comparison
> mechanism should interact with the rest of the Python infrastructure.
>
> What I've implemented in my patches is a simple addition, which only
> interacts with the Python comparison operators (<, <=, >, >=, !=, ==)
> and with the builtin cmp() function if it is called with a third
> argument. Everything else in Python which has anything to do with
> comparisons still uses the *exact same* mechanism as is currently in
> place. Meaning that objects are sorted in lists, tested for
> membership in sequences, etc. based on their tp_cmp/__cmp__/__rcmp__
> behavior.

I don't recall the details of your proposal, but I think that
PyObject_Compare() should see if the object implements <, <=
etc. (tp_richcmp) iff it doesn't have a tp_cmp pointer. The question
is which rich operations to use, in which order to test them, and how
to map between them; e.g. if a>b is not defined, do you use b<a, or
"not a<=b" ? Types for which this mapping is not the right thing to
do should define all six operators or at least tp_cmp. My choice: if
a>b is not defined, first try b<a, then try not a<=b. (This order is
because the "not" operator loses information in case the result is not
a Boolean.) Rules for the other three inequality ops should be
obvious now. If a==b is not defined, first try not a!=b, then "not
(a<b or a>b)" (with the customary mappings for < and > if needed).
(Here the order is because we prefer to make one call rather than
two.) Concluding, you can get away with only defining one of < <= >
>= to fully define a comparison -- but only if your type implements a
total ordering. (This is all subject to scrutiny by Tim Peters. :-)

Hm, there may be additional complexity because of the coercion rules
for binary operators. Later, please!
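The fallback order above can be sketched outside the interpreter. This is only an illustration of the rules as stated, not CPython code; `has_own`, `rich_gt`, and `OnlyLt` are invented names, with `has_own` standing in for "the type defines the slot itself":

```python
# Sketch of the fallback rules for a > b: try __gt__, then b < a,
# then "not a <= b" (the 'not' comes last because it loses information
# when the result isn't a plain Boolean). All helper names are invented.

def has_own(obj, name):
    # does the object's class itself define this method?
    return any(name in c.__dict__
               for c in type(obj).__mro__ if c is not object)

def rich_gt(a, b):
    if has_own(a, '__gt__'):
        return a.__gt__(b)
    if has_own(b, '__lt__'):        # preferred: no truth-losing 'not'
        return b.__lt__(a)
    if has_own(a, '__le__'):
        return not a.__le__(b)
    raise TypeError('no ordering defined')

class OnlyLt:
    def __init__(self, v): self.v = v
    def __lt__(self, other): return self.v < other.v
```

With only `<` defined, `rich_gt(OnlyLt(2), OnlyLt(1))` is answered via `b < a` -- which is the "get away with defining one operator" property, valid only for a total ordering.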

> Currently I find references to PyObject_Compare in:
>
> classobject.c, dictobject.c, funcobject.c, listobject.c,
> methodobject.c, object.c, tupleobject.c, cPickle.c, arraymodule.c
>
> in addition to the ones in object.c, ceval.c, compile.c and
> bltinmodule.c (which are sort of obvious =). There are obviously lots
> of calls to it in extension modules around the galaxy.
>
> It is an interesting question how much the new behavior should be used
> by the existing code. In the spirit of minimal changes, the new
> proposal doesn't require anything to be modified. The cost, however,
> is that list.sort() is no longer intuitively explainable by appealing
> to the '<' operators. In fact, with the proposal as it stands it is
> possible to have objects which give one order if compared with
> "cmp(a,b)" (+ the usual interpretation of the return codes) and
> another order if compared with "a < b".

So we need to define which comparison list.sort() uses. On the one
hand, it should use cmp(), since its optional argument is a function
with the same signature. On the other hand, it could use < with very
little change -- internally, it always uses < (or not >=) except in
insertionsort(), which could easily be recoded to also use <. I think
it would be nice if a type would be sortable as long as it defined the
< operator. (I presume that if richcompare finds no tp_richcmp, it
falls back on using tp_cmp.)
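As it happens, this wish is exactly how list.sort() behaves in today's Python: defining only `<` is enough. A minimal sketch (the `Point` class is invented):

```python
# "A type should be sortable as long as it defines the '<' operator":
# a class with __lt__ and nothing else sorts fine. Invented example.

class Point:
    def __init__(self, x):
        self.x = x

    def __lt__(self, other):      # the only comparison defined
        return self.x < other.x

pts = [Point(3), Point(1), Point(2)]
pts.sort()                        # uses only '<' internally
```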

> Calls to cmp() with three operands and PyObject_RichCompare on objects
> which don't support them raise exceptions (in my implementation).

I think this should be changed so that PyObject_RichCompare() falls
back on PyObject_Compare(), and vice versa (they should watch out for
recursion :-).

This is sort of what I proposed. If performance is wanted, a type
should implement all operations -- end of problem.

> + Changes localized to the same files as #1
> + Allows smoother migration path towards the new protocol?
> - Possibly significant performance hit
>
> 3) Rework the core to use the rich comparison mechanism everywhere
> explicitly, leaving the old mechanism (using the mechanism in #2)
> for preexisting extension types and classes.
>
> In other words, replace calls to PyObject_Compare which are used to
> test for equality with calls to PyObject_RichCompare(,,'==') and
> the subsequent test for truth. This might get really messy.

Hm... Maybe two C level APIs are needed -- one that makes a rich
comparison and returns a truth value (or an exception); and one that
makes a rich comparison and returns an object. The VM uses the latter
when comparison operators are used; list.sort() and min() and the code
for "object in sequence" can use the former.

> + purest solution [clean, consistent in the long term, efficient?]
> - purest solution [in the short term, a lot of headaches]
>
> A blend is also possible, so that the two notions of ordering (as used
> in list.sort) and comparisons are neatly split between the two
> mechanisms (see below).

I think that the mutual fallback between PyObject_RichCompare and
PyObject_Compare would neatly take care of a gradual transition.

> Note that these issues arise *regardless* of whether return values
> have meaningful boolean interpretations (Jim Fulton's point), and in
> fact regardless of whether they are PyObject*'s or (0,1) (my
> mischaracterization of Jim Fulton's point), but in fact occur due to
> the split of a single comparison function into multiple functions.
> The truth-testability of the return values of comparisons affects how
> error handling in comparison works, which affects all the code which
> calls it, but dealing with exceptions and comparisons is an issue
> which has existed since 1.5 allowed comparisons to raise exceptions.

Indeed. Experiments with trying to support chained comparisons and
Boolean operators (and, or, not) on arrays can be postponed till a
later point, since these cause code bloat and performance loss.

> Orthogonally, one can look at things on a case by case basis as Tim
> started to:
>
> * list.sort(): My guess about this is that tp_cmp should be used if it
> exists, since it does the job that's needed. So it should be:
>     if hasattr(type(foo), 'tp_cmp') or hasattr(foo, '__cmp__'):
>         return cmp(foo, bar)
>     else:
>         the rigamarole described in #2 above.

I guess this is what I discussed above. (Sorry for the random access
treatment of your message :-( )

> * Similarly for dict[key], list.index, list.remove, list.count --
> since the issue is simply one of equality, tp_cmp/__cmp__ should be
> given the chance to give a quick answer. If it isn't defined, then
> __eq__ should be tried.

Actually, since for some types __eq__ can be implemented cheaper than
__cmp__, I would propose to try __eq__ first.
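An invented container illustrating why `__eq__` can be cheaper than a three-way `__cmp__` (the `Bag` class and its method bodies are made up for this sketch):

```python
# Why __eq__ can be cheaper than __cmp__: equality can bail out on a
# length mismatch in O(1), while an ordering comparison must walk the
# elements lexicographically. Everything here is for illustration.

class Bag:
    def __init__(self, items):
        self.items = list(items)

    def __eq__(self, other):
        if len(self.items) != len(other.items):
            return False            # cheap early exit
        return self.items == other.items

    def __cmp__(self, other):
        # old-style three-way compare: no length shortcut possible,
        # since ordering is lexicographic over the elements
        for a, b in zip(self.items, other.items):
            if a != b:
                return (a > b) - (a < b)
        n, m = len(self.items), len(other.items)
        return (n > m) - (n < m)
```

For `Bag([1]) == Bag([1, 2])`, `__eq__` answers from the lengths alone, which is why trying it first is attractive for dict lookup and list.count().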

> * Comparison between instances isn't a real problem I don't think --
> if it's done by the comparison operators or cmp() with three
> arguments, the rich comparison is tried, and if it fails the old
> comparison is used. While the code is a bit messy, I don't think
> your (Tim) cases are really problematic (C defining __lt__ and __cmp__,
> D not defining anything, etc.). But yes, the rules need to be made
> explicit, no doubt about that. The current rules re: when __rcmp__
> is called are pretty bizarre already!

:-) :-(

> In other words, I think there's a pretty clear split between the cases
> where either equality or ordering relations are needed, and where true
> 'explicitly specified' comparisons are being used. I'm sure some will
> disagree with that last statement if not the rest of the post...

I think we're pretty close to being able to produce a working
implementation!

> --david
>
> PS: This is orthogonal to numpy-specific questions, such as whether arrays
> should have a truth value and if so what it should be.

As I said before, I think they should raise an exception. At least
that way a<b<c on arrays raises an exception instead of silently
returning b<c (which can be very tricky to detect).

> I think it would be beneficial to keep the Python-core issues (rich
> comparison) separate from the numpy-specific issues (how arrays should
> work).

I think the above proposal takes very good care of this.

> PPS: I was wondering why no one commented on my first proposal while
> the only slightly modified version caused this flood. I've
> decided it was the redesign of the web page, or maybe the fact
> that I asked Jim Fulton for his comments...

Yes -- we're all attracted to a good flame war :-)

David Ascher

Apr 25, 1998

On Fri, 24 Apr 1998, Guido van Rossum wrote:

> I think we're pretty close to being able to produce a working
> implementation!

I'll try and implement what we've apparently settled on over the weekend,
and I'm sure that will raise a whole other set of issues.

--david


Jim Fulton

Apr 25, 1998

Frankly, I'm amazed. I should be speechless. I almost am. :)

One final plea. Please consider creating new operators
for "rich comparison". I think that operators should only be used
when the semantics are very clear. Operator overloading should
preserve semantics of operators. (I think that providing
multiple *meanings* for '+' and '*' was a mistake.)
The proposal for rich comparisons strips clear meaning from
comparisons. By creating new operators you will
still get the terseness that the numeric folks want, without
compromising the meaning of traditional comparisons.

(Of course, I think that the current comparison mechanism
needs to be cleaned up, as I indicated in my original
counter proposal. I am not advocating the status quo.)

Jim

Jim Fulton

Apr 25, 1998

David Ascher wrote:
>
> PPS: I was wondering why no one commented on my first proposal while
> the only slightly modified version caused this flood. I've
> decided it was the redesign of the web page, or maybe the fact
> that I asked Jim Fulton for his comments...

I wasn't originally going to comment, due to lack of bandwidth on
my part. When you asked, I felt I owed it to you to share my
strongly held views, which elicited a strong response.

Jim Fulton

Apr 25, 1998

Guido van Rossum wrote:
>
> > PPS: I was wondering why no one commented on my first proposal while
> > the only slightly modified version caused this flood. I've
> > decided it was the redesign of the web page, or maybe the fact
> > that I asked Jim Fulton for his comments...
>
> Yes -- we're all attracted to a good flame war :-)

Was this a flame war? What constitutes a flame war? Isn't there,
or shouldn't there be, a difference between a discussion about differing
views on a subject and a flame war?

Gordon McMillan

Apr 25, 1998

David Ascher wrote:

> PPS: I was wondering why no one commented on my first proposal while
> the only slightly modified version caused this flood. I've
> decided it was the redesign of the web page, or maybe the fact
> that I asked Jim Fulton for his comments...

Naw, the whole reason was to stuff my mailbox with 525 messages (I
was out of town for a week). I've scanned this thread, but not being
a Numpy user (I know it doesn't stand for "Not under my piano,
Yoko"), I'm rather bewildered.

Random comments:

For thingies that are naturally well-ordered, overriding _cmp_ and
only _cmp_ is rather cool - all kinds of things automatically work.
But where the thingies only have a partial ordering, _cmp_ doesn't
seem to cut it. OTOH, for my needs, this is rare.

Steve Majewski comments that the overrides of * and + for sequences
don't seem natural. To me, they come close enough to the "definition"
of these operators to seem natural. Addition has an identity element
E, for which A + E = A (so E is an empty sequence). Multiplication
behaves properly with both 0 and 1.

I'll agree that "s[not data < 100] = 100" is more readable than
"s[logical_not(logical_lt(data, 100))] = 100", but neither of them
are comprehensible to me. As a NAU (non-array-user), I expect
comparison results to behave like booleans.

I'll certainly agree that comparisons are nearly always problematic:
"equal" really means "in some sense the same, but maybe not the sense
you were hoping for", or, more succinctly, "is confused with".

I think I would prefer that some other notation be used for this
something-that-is-in-some-way-reminiscent-of-a-normal-comparison
operation. OTOH, many advances in mathematics have grown out of
something that started as abuse of notation.

So, as long as it doesn't interfere with my normal, pedestrian usage
of comparisons, the proposal is fine by me. And if it helps me deal
with those rare situations in which I want to deal with a partial
ordering, that's dandy.

But I do find it rather perverse. My mama told me you can't compare
apples and oranges, and here these people are, not only comparing
them, but returning grapes from the comparison. Sheesh.

- Gordon

Konrad Hinsen

Apr 25, 1998

yse...@Xenon.Stanford.EDU (Yonik Seeley) writes:

> Konrad Hinsen <hin...@ibs.ibs.fr> wrote:
> >It makes an enormous difference in readability of complex expressions.
> >Here's an example from real-life code:
>
> Yeah, yeah, try telling that to the Java people..... someone could
> overload operator< to erase your hard drive :-)

Any feature can be misused. As long as misuse is not encouraged
or even enforced by some other language features, I don't care.

Konrad Hinsen

Apr 25, 1998

Jim Fulton <j...@digicool.com> writes:

> The proposal for rich comparisons strips clear meaning from
> comparisons. By creating new operators you will

Do they? I'd say they are a straightforward generalization of the
current behaviour/possibilities.

The proposed semantics for array comparisons are a different matter;
there would indeed be different semantics for arrays and other
sequences. But that's already true for arithmetic (+ and *).

> still get the terseness that the numeric folks want, without
> compromising the meaning of traditional comparisons.

A new set of operators for "elementwise comparison" is certainly
not a bad idea. I just can't think of suitable character sequences
that would still keep Python code distinct from Perl code ;-)

David Ascher

Apr 25, 1998

On Sat, 25 Apr 1998, Jim Fulton wrote:

> One final plea. Please consider creating new operators for "rich
> comparison".

I certainly will, for one. I don't consider this settled by any means.
Keep pleading (I seriously doubt you're the only person who feels the way
you do on this issue).

I do have one question -- if comparisons can return anything, how can the
infrastructure enforce that they have meaningful boolean interpretations?

If there were two sets of operators, those which do "standard" comparison
and those which do the array comparisons (aka comparisons with
type-defined semantics), it would seem to me that it might make sense to
require the former to always return 0 or 1 (or exceptions), and let the
latter have the PyObject * return values. This would allow speeding up of
the algorithm Guido proposed for 'smarts' in the comparison functions, and
would allow using the signature of the functions to enforce the semantics
of "logical" comparison. The logical extension of this split is to also
split the 'not', 'and' and 'or' and allow some new version with the
semantics defined by the types they are applied to.

It's a change to the syntax, but it's one I'd be willing to live with. I
don't particularly care whether I have to type 'a < b' or 'a <_ b'. It is
the distinction between operator and function which I think makes a big
human factors difference.

I'm proposing <_, >_, <=_, >=_, ==_, !=_ as operators -- I think of them
as using the TeX-like '_' subscript character, which will be familiar to
at least some of the target population, and gets at the notion that the
operators apply to the 'subscripted' elements of the operands.

I guess there should also be not_, and_ and or_, and maybe the special
methods should be __sub_lt__, __sub_le__, etc., and __sub_and__,
__sub_or__ etc.

--david

PS: No, I don't think it's been a flame war at all. In fact, I think
we've made real progress regardless of the contentious issues on
defining a common ground for a reworked comparison protocol.

David Ascher

Apr 25, 1998

On Sat, 25 Apr 1998, David Ascher wrote:

> I'm proposing <_, >_, <=_, >=_, ==_, !=_ as operators -- I think of them
> as using the TeX-like '_' subscript character, which will be familiar to
> at least some of the target population, and gets at the notion that the
> operators apply to the 'subscripted' elements of the operands.

Except that they're probably ambiguous: is (a<_b) (a < _b) or (a <_ b)?

So how about:
<[], >[], <=[], ==[], !=[]
or
[<], [<=], [==], or [!=]
or
*<, *<=, *==, *!=
?

--da


Konrad Hinsen

Apr 26, 1998

David Ascher <d...@skivs.ski.org> writes:

> So how about:
> <[], >[], <=[], ==[] !=[]
> or
> [<], [<=], [==], or [!=]
> or
> *<, *<=, *==, *!=
> ?

I like the middle one best - it indicates rather clearly what's going
on.

Tim Peters

Apr 27, 1998

[David Ascher suggests ...]

> I'm proposing <_, >_, <=_, >=_, ==_, !=_ as operators -- I
> think of them as using the TeX-like '_' subscript character,
> which will be familiar to at least some of the target
> population, and gets at the notion that the operators apply
> to the 'subscripted' elements of the operands.

[and then flames himself <wink>]


> Except that they're probably ambiguous:
> is (a<_b) (a < _b) or (a <_ b)?

Yes, that won't fly. Subtler but similar problem with the next one:

> So how about:
> <[], >[], <=[], ==[], !=[]

I.e., "a<[]" already has a meaning on its own, and it would likely be too
hard for the tokenizer *not* to "see" that meaning given "a<[]b". Ah ...
and "a<[][:]" would be truly ambiguous.

> or
> [<], [<=], [==], or [!=]

Those are beautiful! And mnemonic. And easy to remember. And they'd fly
(easy for the tokenizer).

> or
> *<, *<=, *==, *!=

Bleech. Those all require using a shift key <wink>.

I've got no theoretical or esthetic objections to comparisons returning
arbitrary objects, but if the tide favors new operator symbols, the [<]
flavor gets my vote.

elegance-is-its-own-reward-ly y'rs - tim

Tim Peters

Apr 27, 1998

[Steven D. Majewski]
> ...

> I am concerned about having different semantics for arrays
> versus other sequence types.

Frankly, I don't know what "sequence type" *means* anymore -- and that was
before rich comparisons came up. I've written many of my own "sequence
type" classes, and all they have in common is __getitem__. So 3/4ths of my
brain no longer believes "sequence type" means more than that a thing
implements the (conceptual) Indexable and ForInIterable interfaces.

Python *really* needs to get a firmer grip on its "folklore" type system
<0.7 wink>.

But in the meantime, I've lost any belief that a "sequence type" needs to
support, e.g., some notion of addition, or "meaningful" comparison. In
particular, the notion that a NumPy array is of a sequence type strikes me
as pragmatic fiction embraced for lack of sharper terminology, so it doesn't
bother me at all if NumPy arrays fail to act like strings in any respect
whatsoever <0.3 wink>. Why on earth should they? If it's because the type
system says "because they're both sequence types", then the type system (or
our abuse of it) is the problem.

Anyway, the suggestion is that you temporarily forget that an array is a
sequence type, and then see if it still looks so concerning.

> ( Do I understand correctly that this divergence is part
> of the proposal, and it doesn't include changing lists and
> tuples to conform to array semantics ? )

Right, it couldn't possibly include changing the semantics of list, tuple or
string comparisons -- that would break enormous wads of code all over the
world. David wouldn't do that to us; and Guido would kill him if he tried
<wink>.

> I think that is a big problem and a potential hole for newbies


> to fall into if NumPy has different semantics than the rest of
> Python.

I don't. Everything's a disconnected jumble to newbies; the notion that an
array has-- or even *could* have --semantics in common with a tuple is a
sophisticated view. For NumPy newbies in particular, they really are aiming
to make NumPy act more like Every Other Array Language on Earth, and in that
way are clearly trying to *reduce* newbie floundering.

> I'm a lot more concerned about that than the issue of how
> to interpret "if array1 > array2". And as I already noted,
> I'm already concerned about "+" meaning addition, except
> for sequences where it means concatenation, except for arrays
> where it means addition ( except on alternate Fridays ? :-)

But Steven! You're a multi-lingual programmer of no small accomplishment.
You routinely use "+" to mean two dozen things in four languages every week.
What's the problem? You're not unique in your ability to grasp that the
same symbol can mean very different things depending on context -- no
programmer can succeed without that twist of mind. I can't say I had any
difficulty at all with the notion that [1,2,3] + [4,5,6] doesn't return
[5,7,9], and won't believe you if you claim that you did <wink>.

> I all in favor of overloading of operators, but where
> it starts to become a problem rather than a solution
> is when the same names are used for radically different
> operations.

This is interesting. My problem with overloading is exactly the opposite:
when the same name is used for *similar*-- but subtly different --purposes.
E.g., if "ab" + "cd" returned "abcd" but [1,2] + [3,4] returned [(1,2),
(3,4)], I could never keep it straight. If they did *radically* different
things, I'd have no trouble at all provided only that the things they did
made good sense for their operand types. E.g., it wouldn't have bothered me
if [1,2] + [3,4] *did* return [4,6] -- but it would have driven me nuts if
"ab" + "cd" returned

string.join([chr(ord('a')+ord('c')),
chr(ord('b')+ord('d'))])

Maybe that's a difference in perspective: when I see an operator symbol, I
think of it as a message to its operands, and as always it's the operands
that decide what to do with the message. The message is just a name, and
like any name is entitled to be ambiguous as hell, while nevertheless
*suggesting* its purpose in life (catenation is a *fine* thing for "+" to
mean; subtraction isn't). The notion that an operator symbol has some
concrete meaning independent of its operand types is both foreign to me and
seems to be the way everyone else thinks of it <wink/frown>.

> ...


> There are two types of IEEE NaNs -- signaling and non-signaling.
> The signaling type are *really* Not-A-Number -- "1+NaN" should
> signal an error.

More precisely, it should signal the Invalid Operation exception, and the
user is supposed to be able to decide "globally" whether or not an Invalid
Operation exception actually triggers a visible error. Most of this stuff
wasn't intended for casual, or even end-user, use, BTW -- a lot of the 754
hair is there for the benefit of library implementers. You'll do the person
who follows you a huge favor by leaving most of it alone <stern wink>.

> But non-signaling NaNs are really misnamed: they
> ARE numbers,

Arbitrary -- define "number".

> ( and in fact in XlispStat ( numberp #.Not-a-number )
> is T )

Agreed that's one possible answer <wink>.

> but they are numbers with indefinite values, which like
> IEEE positive and negative infinity are "contagious"
> when used in computations.

The infinities aren't contagious except in that they often appear that way
due to their large size <wink>. E.g., 1/+Inf = +0, and Inf - Inf = NaN.

Oddly enough, it's a subtle error to think of NaNs being contagious too:
the consensus among libm writers (blessed, as I recall, by Kahan Himself) is
that

pow(x, 0) == 1

even when x is NaN.

Here's a cute one with Python 1.5.1 on Win95:

>>> inf = 1e300 ** 2
>>> inf
1.#INF
>>> nan = inf - inf
>>> nan
-1.#IND
>>> pow(nan, 0)
1.0
>>> import math
>>> math.pow(nan, 0)
Traceback (innermost last):
File "<stdin>", line 1, in ?
ValueError: math domain error
>>>

Not even consistent within the same language for two functions named "pow":
now *that's* what I consider to be a *baffling* kind of overloading <0.9
wink>!

> Maybe "numeric bottom" is a better name ?

Except for crud like the preceding, it sure would have been <0.1 wink>.

every-case-is-special-ly y'rs - tim

Tim Peters

Apr 27, 1998

[Guido]

> ...
> Yes -- we're all attracted to a good flame war :-)

[Jim]


> Was this a flame war?

I've been trying now, for years and years, to teach you people by FINE
EXAMPLE that ":-)" is too easy to overlook in the heat of a flame war
<wink>. I've been considering expanding it to something like
WINK HERE --> ***<0.97 !!!WINK!!!>*** <-- THAT WAS A WINK
because even a full 1.0 "<wink>" gets missed too often. What Guido meant to
write was:

> Yes -- we're all attracted to a good flame war WINK HERE -->
> ***<FULL 1.0 !!!WINK!!!>*** <-- THAT WAS A FULL WINK

> What constitutes a flame war?

Repetition, exponentially increasing volume, ad hominem attacks, massively
indiscriminate quoting and/or cross-posting, SHOUTING, meta-discussion, and
a dearth of <wink>s are indications, but not normative. You're not really
there until somebody who isn't a lawyer does a bad impression of one ("your
lies about me are slander under US Title blah blah blah" -- really an
extreme case of meta-discussion).

> Isn't there, or shouldn't there be, a difference between
> a discussion about differing views on a subject and a
> flame war?

Sorry, Jim: that's meta-discussion. Stop fanning the flames before I sue
you for libel ;-)

"differing-views"-ha!-that'll-be-the-day-ly y'rs - tim

Tim Peters

Apr 27, 1998

[Steven D. Majewski]
> ...
> I've also run into the problem that IEEE NaNs are very useful
> for missing values and some other things

Indeed they are!

> -- they function as numbers, in that they don't raise
> exceptions when used in numeric operations,
> but they are "contagious" - i.e. 1.0 + NaN = NaN
>
> However, the one problem is that Xlisp, like Python, does its
> comparisons internally with a function that returns one of:
> ( -1, 0, +1 ) and this doesn't map into using NaN in these
> functions, where you want NaNs to not be ordered. ( i.e. they
> should be not equal, not less than, and not greater than
> any value including NaN. )

As you note later, there is no way to deal with NaNs short of tricks
specific to language+vendor combinations. For that matter, there's no
better way to deal with any of the other 754 features either ...

Note too that 754 wants more comparison outcomes than just {<, >, ==,
unordered}: most of the supported ones are supposed to come in two flavors,
depending on whether or not they raise an Invalid Operation exception if fed
a *quiet* NaN (as opposed to a *signaling* NaN, which is always supposed to
gripe).

In my previous job, I had to figure out what the hell the compiler should do
for Fortran's 3-branch "arithmetic IF" when it was fed a quiet NaN
<wink/snort -- of course there isn't a non-insane answer to that one; I
can't even remember what I settled on>.

Would have been much more useful had 754 defined NaN to be greater than any
other value when it was stored at the right end of an array, but smaller if
stored at the left <wink>.

> You may not agree that the above semantics are the "right"
> ones --

Seriously, in practice the "non-orderedness" of NaNs appears useless even if
you do use a language drizzled with 754 extensions -- about all I've seen it
used for is the sickening trick of using "x == x" to determine non-NaN-ness
and "x != x" to determine NaN-ness. That much could have been accomplished
with an isNaN() function without violating all common sense.

> both the IEEE FP and the ANSI Lisp and C standards
> purposely sidestep some of these issues.
> ...

Not true of IEEE-754 -- it doesn't sidestep anything, although it's
important to realize that 754 says nothing at all about language bindings.
It just defines the arithmetic.

denormally y'rs - tim
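
The unordered behavior Tim describes is easy to demonstrate on any
IEEE-754 host; a small sketch (assuming Python 3 and its stdlib
math.isnan):

```python
import math

nan = float("nan")

# A quiet NaN is unordered: every ordering comparison against it is false,
print(nan < 1.0, nan > 1.0, nan == 1.0)   # False False False

# ...including against itself -- the "x != x" trick:
print(nan != nan)                         # True

# The isNaN()-style spelling Tim asks for does exist now:
print(math.isnan(nan))                    # True
```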

Fredrik Lundh

unread,
Apr 27, 1998, 3:00:00 AM4/27/98
to

>I've been trying now, for years and years, to teach you people by FINE
>EXAMPLE that ":-)" is too easy to overlook in the heat of a flame war
><wink>. I've been considering expanding it to something like
> WINK HERE --> ***<0.97 !!!WINK!!!>*** <-- THAT WAS A WINK
>because even a full 1.0 "<wink>" gets missed too often. What Guido meant to
>write was:
>
>> Yes -- we're all attracted to a good flame war WINK HERE -->
>> ***<FULL 1.0 !!!WINK!!!>*** <-- THAT WAS A FULL WINK

<wink strength=0.5>
IMO, the only solution is to switch to HTML-mail, and use
software that fully implement the &lt;wink&gt; tag:

&lt;wink [strength=value]&gt; potentially controversial text &lt;/wink&gt;

Like our new MIOW system, which among other things include a wink
tolerance setting (if you set it to zero, you won't see any controversial
text), and flame-avoiding reply functions ("are you sure you really want
to reply to a half-serious post?")
</wink>

Cheers /F

Just van Rossum

unread,
Apr 27, 1998, 3:00:00 AM4/27/98
to

>[Steven D. Majewski]

>> I think that is a big problem and a potential hole for newbies
>> to fall into if NumPy has different semantics than the rest of
>> Python.

[Tim Peters]


>I don't. Everything's a disconnected jumble to newbies; the notion that an
>array has-- or even *could* have --semantics in common with a tuple is a
>sophisticated view. For NumPy newbies in particular, they really are aiming
>to make NumPy act more like Every Other Array Language on Earth, and in that
>way are clearly trying to *reduce* newbie floundering.

I was a total NumPy newbie 1.5 weeks ago (I am only a 96.5% NumPy
newbie now ;-), and about the first thing I tried was to concatenate two
arrays using +. A brief "huh?" was soon followed by an aha-ish "doh!".
It hasn't bothered me since that + means two totally different things for
strings/lists/tuples and arrays. To the contrary!
(I just wish I knew how to really concatenate NumPy arrays...)
(and it *does* bother me that array1[:1] = array2[:4] doesn't
work...)

Just

krod...@tdyryan.com

unread,
Apr 27, 1998, 3:00:00 AM4/27/98
to

In article <Pine.SUN.3.96.980425...@skivs.ski.org>,

David Ascher <d...@skivs.ski.org> wrote:
> I'm proposing <_, >_, <=_, >=_, ==_, !=_ as operators -- I think of them
> as using the TeX-like '_' subscript character, which will be familiar to
> at least some of the target population, and gets at the notion that the
> operators apply to the 'subscripted' elements of the operands.

The two array-oriented languages I am most familiar with - Xmath and Matlab -
use "." to indicate elementwise operations instead of normal matrix
operations; maybe ".<", ".>", etc. ?
--
Kevin Rodgers Teledyne Ryan Aeronautical krod...@tdyryan.com

-----== Posted via Deja News, The Leader in Internet Discussion ==-----
http://www.dejanews.com/ Now offering spam-free web-based newsreading

danny_...@gmo.com

unread,
Apr 27, 1998, 3:00:00 AM4/27/98
to

I am extremely interested in the outcome of the rich comparison debate (I'm
a big vectorization fan and love parts of Matlab for this reason) and would
like to see something implemented. I have also kept my mouth shut for the
better part of the debate for good reason.

I think defining new operators for the rich comparisons is an acceptable idea.
My only comment is that python does not live in a vacuum and in the interest
of notational uniformity (there are lots of Matlab users out there) I would
suggest a slight notational change from David Ascher's proposal. Matlab
precedes element-wise operations with a "." as in ".*" and ".^". While
Matlab does not have (or need) dotted element-wise comparisons I think we should stay
consistent with that notation so that when people switch from Matlab to NumPy
or use both together (as I do, via COM) there is a minimum of syntactic
difference.

Therefore I propose .<, .>, .<=, etc...

Danny

Steven D. Majewski

unread,
Apr 27, 1998, 3:00:00 AM4/27/98
to

On Mon, 27 Apr 1998, Tim Peters wrote:

> I've been trying now, for years and years, to teach you people by FINE
> EXAMPLE that ":-)" is too easy to overlook in the heat of a flame war
> <wink>. I've been considering expanding it to something like
> WINK HERE --> ***<0.97 !!!WINK!!!>*** <-- THAT WAS A WINK
> because even a full 1.0 "<wink>" gets missed too often. What Guido meant to
> write was:
>
> > Yes -- we're all attracted to a good flame war WINK HERE -->
> > ***<FULL 1.0 !!!WINK!!!>*** <-- THAT WAS A FULL WINK
>

Maybe you need to post in HTML, Tim, so you can use a <BLINK><WINK> !

hug...@cnri.reston.va.us

unread,
Apr 27, 1998, 3:00:00 AM4/27/98
to

In article <Pine.SUN.3.96.980425...@skivs.ski.org>,
David Ascher <d...@skivs.ski.org> wrote:
>
> On Sat, 25 Apr 1998, David Ascher wrote:
>
> > I'm proposing <_, >_, <=_, >=_, ==_, !=_ as operators -- I think of them
> > as using the TeX-like '_' subscript character, which will be familiar to
> > at least some of the target population, and gets at the notion that the
> > operators apply to the 'subscripted' elements of the operands.
>
> Except that they're probably ambiguous: is (a<_b) (a < _b) or (a <_ b)?
>
> So how about:
> <[], >[], <=[], ==[] !=[]
> or
> [<], [<=], [==], or [!=]
> or
> *<, *<=, *==, *!=

This has generated enough positive responses that I wanted to throw in my own
two cents on the other side.

I really dislike the idea of adding a significant number of new operators to
the Python language. Once you agree on [<], I think that I could make a good
case for why you need [*] (to settle the old matrix vs. element-wise
multiplication issue), and etc...

My primary reason for wanting to be able to overload "<", ">", ... is that it
seems consistent with the rest of Python's design. If there's an __add__
method to implement "a+b", then there should be an "__lt__" method to
implement "a<b". After reading enough of Jim Fulton's arguments, I'm
beginning to think he'd like to get rid of "__add__" methods from the
language if it was possible <0.5 wink>.

Jim Hugunin - hug...@python.org
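
Hugunin's __lt__ analogy is easy to make concrete; a toy sketch (a
hypothetical Vec class, not NumPy) of a "<" that is dispatched just
like "+" and returns an elementwise result rather than a boolean:

```python
class Vec:
    """Toy sequence whose __lt__ returns an elementwise result."""
    def __init__(self, data):
        self.data = list(data)

    def __lt__(self, other):
        # "a < b" dispatches to a.__lt__(b), just as "a + b" does to
        # __add__; here it yields a Vec of 0/1 flags.
        return Vec(int(x < other) for x in self.data)

    def __repr__(self):
        return "Vec(%r)" % self.data

print(Vec([1, 5, 3]) < 4)   # Vec([1, 0, 1])
```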

Steven D. Majewski

unread,
Apr 27, 1998, 3:00:00 AM4/27/98
to


Tim and Just:

I concede that it's difficult to recapture the "newbie" frame of mind
and figure out what will be clear or confusing if you didn't know
what you already know. ( Maybe I'm the one having trouble forgetting
what I think I know about sequences that just isn't so. )

Although it might have appeared that I was arguing for some sort of
theoretical purity, I did say that the pragmatic question at the
moment is what is the _least_ arbitrary and confusing choice. In the
absence of a grant to do a double-blind test on a hundred newbies,
we have to guess from anecdote and rules of thumb. ( And it seems
that we agree about the original "rich comparison" issue. )

As to *my* problems with the syntax -- I agree that it isn't a big
deal. I'm quite comfortable believing a half dozen arbitrary
and inconsistent things every morning. But I do think that the load
is cumulative. It is, in fact, the collective cognitive burden of many
tiny little inconsistencies in both Perl and Lisp, to give two very
different examples, that makes me prefer Python to either. One of
the implicit promises of O-O was to bring some order and reduce
the complexity and cognitive load of large Common-Lisp-sized libraries.

Guido van Rossum

unread,
Apr 27, 1998, 3:00:00 AM4/27/98
to

I've tried to keep my mouth shut in the recent debate (is that a good
way to refer to it? :-). But the various calls for special operators
disturb me. Please, think again. Adding extra operators to the
language is a lot of hassle and won't happen any time soon. I promise
:-). Make do with the existing ones. Overloading '<' is okay.
Raising an exception from it is okay. Adding a third argument to
cmp() is okay. Adding a new extended comparison C API is okay.
Adding a new field to the type structure is okay. But adding new
syntax to the language is NOT OK. OK?

Danny Shevitz

unread,
Apr 27, 1998, 3:00:00 AM4/27/98
to

Guido,

thanks for the letter. I think overloading is a much better way to go,
too. I guess I wasn't clear, but my point was simply that if operators
had to be added, a syntax consistent with Matlab is better.

Privately, given my druthers, I'd rather use the syntax:

vec( 1<vec<10 ) = 5

than some funkified new operators. And if chained comparisons turned out
to be too much of a headache, I would easily and gladly settle for

mask = 1<vec & vec<10
vec(mask) = 5

which is as concise as matlab can be made.

Since this is a personal letter, I do hope you do something to enable
syntactically concise "rich comparisons". I will defer to the public
will as to what it is, aka, keep my mouth shut, but I hope SOMETHING
happens. Vectorized comparisons would be a BIG benefit to the language.

Danny
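
Danny's vec(mask) = 5 wish is what NumPy fancy indexing now spells
vec[(1 < vec) & (vec < 10)] = 5; a pure-Python toy sketch (a
hypothetical MVec class, not a real library) of mask-indexed
assignment:

```python
class MVec:
    """Toy vector supporting elementwise comparison and mask assignment."""
    def __init__(self, data):
        self.data = list(data)

    def __gt__(self, other):
        return MVec([int(x > other) for x in self.data])

    def __lt__(self, other):
        return MVec([int(x < other) for x in self.data])

    def __and__(self, other):
        # Combine two 0/1 masks elementwise, as in "mask1 & mask2".
        return MVec([a & b for a, b in zip(self.data, other.data)])

    def __setitem__(self, mask, value):
        # Assign `value` wherever the mask is true.
        self.data = [value if m else x
                     for x, m in zip(self.data, mask.data)]

v = MVec([0, 3, 7, 12])
v[(v > 1) & (v < 10)] = 5
print(v.data)   # [0, 5, 5, 12]
```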

Amit Patel

unread,
Apr 27, 1998, 3:00:00 AM4/27/98
to

Tim Peters <tim...@msn.com> wrote:
|
| This is interesting. My problem with overloading is exactly the opposite:
| when the same name is used for *similar*-- but subtly different --purposes.
| E.g., if "ab" + "cd" returned "abcd" but [1,2] + [3,4] returned [(1,2),
| (3,4)], I could never keep it straight. If they did *radically* different
| things, I'd have no trouble at all provided only that the things they did
| made good sense for their operand types. E.g., it wouldn't have bothered me
| if [1,2] + [3,4] *did* return [4,6] -- but it would have driven me nuts if
| "ab" + "cd" returned
|
| string.join([chr(ord('a')+ord('c')),
| chr(ord('b')+ord('d'))])
|
| Maybe that's a difference in perspective: when I see an operator symbol, I
| think of it as a message to its operands, and as always it's the operands
| that decide what to do with the message. The message is just a name, and
| like any name is entitled to be ambiguous as hell, while nevertheless
| *suggesting* its purpose in life (catenation is a *fine* thing for "+" to
| mean; subtraction isn't). The notion that an operator symbol has some
| concrete meaning independent of its operand types is both foreign to me and
| seems to be the way everyone else thinks of it <wink/frown>.

I'm the same way, especially when it comes to programming languages.
I find it hard to keep Emacs Lisp and Scheme apart, because they look
similar but they work differently. I also find it hard to remember
the differences between C and the C subset of C++. On the other hand I have
no problem keeping Python and Scheme separate, because they're
_enough_ different that my brain doesn't ever confuse the two.


similarity-is-confusing-ly - Amit

(did that last line follow the Tim Protocol?)


Konrad Hinsen

unread,
Apr 28, 1998, 3:00:00 AM4/28/98
to

Just van Rossum <ju...@letterror.com> writes:

> (I just wish I knew how to really concatenate NumPy arrays...)

Along the first axis:

Numeric.concatenate((array1, array2))

Along the last axis:

Numeric.concatenate((array1, array2), -1)

Along the n-th axis:

Numeric.concatenate((array1, array2), n)

> (and it *does* bother me that array1[:1] = array2[:4] doesn't
> work...)

As long as the shapes agree, it should work. But you can't insert
items via assignment.
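
Konrad's calls use the old Numeric module; in today's NumPy the same
operations are spelled with an axis keyword (a sketch assuming a
modern NumPy install):

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Along the first axis (the default, as in Numeric):
print(np.concatenate((a, b)).shape)           # (4, 2)

# Along the last axis:
print(np.concatenate((a, b), axis=-1).shape)  # (2, 4)

# And, as Konrad notes, slice assignment works when the shapes agree:
c = np.zeros(4)
c[:4] = np.arange(4)
print(c)
```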

Jim Fulton

unread,
Apr 28, 1998, 3:00:00 AM4/28/98
to

Tim Peters wrote:
>
> [Steven D. Majewski]
> ...
> > I think that is a big problem and a potential hole for newbies
> > to fall into if NumPy has different semantics than the rest of
> > Python.
>
> I don't. Everything's a disconnected jumble to newbies; the notion that an
> array has-- or even *could* have --semantics in common with a tuple is a
> sophisticated view. For NumPy newbies in particular, they really are aiming
> to make NumPy act more like Every Other Array Language on Earth, and in that
> way are clearly trying to *reduce* newbie floundering.

I don't agree. Newbies try to provide some organization to new
information. They expect things that "look" more or less the same
to have more or less the same meaning. This is especially true for
operators. Consistency is one of the primary tools of good
user-interface design. A programming language *is* a user interface.
While consistency is only one of many goals, and one that must
sometimes be traded off against others, it is an important goal
just the same.



> > I'm a lot more concerned about that than the issue of how
> > to interpret "if array1 > array2". And as I already noted,
> > I'm already concerned about "+" meaning addition, except
> > for sequences where it means concatenation, except for arrays
> > where it means addition ( except on alternate Fridays ? :-)
>
> But Steven! You're a multi-lingual programmer of no small accomplishment.
> You routinely use "+" to mean two dozen things in four languages every week.
> What's the problem? You're not unique in your ability to grasp that the
> same symbol can mean very different things depending on context -- no
> programmer can succeed without that twist of mind. I can't say I had any
> difficulty at all with the notion that [1,2,3] + [4,5,6] doesn't return
> [5,7,9], and won't believe you if you claim that you did <wink>.

I don't think Steve was worried about himself so much as about his
less nimble colleagues and students.



> > I all in favor of overloading of operators, but where
> > it starts to become a problem rather than a solution
> > is when the same names are used for radically different
> > operations.
>
> This is interesting. My problem with overloading is exactly the opposite:
> when the same name is used for *similar*-- but subtly different --purposes.
> E.g., if "ab" + "cd" returned "abcd" but [1,2] + [3,4] returned [(1,2),
> (3,4)], I could never keep it straight. If they did *radically* different
> things, I'd have no trouble at all provided only that the things they did
> made good sense for their operand types. E.g., it wouldn't have bothered me
> if [1,2] + [3,4] *did* return [4,6] -- but it would have driven me nuts if
> "ab" + "cd" returned
>
> string.join([chr(ord('a')+ord('c')),
> chr(ord('b')+ord('d'))])
>
> Maybe that's a difference in perspective: when I see an operator symbol, I
> think of it as a message to its operands, and as always it's the operands
> that decide what to do with the message. The message is just a name, and
> like any name is entitled to be ambiguous as hell, while nevertheless
> *suggesting* its purpose in life (catenation is a *fine* thing for "+" to
> mean; subtraction isn't). The notion that an operator symbol has some
> concrete meaning independent of its operand types is both foreign to me and
> seems to be the way everyone else thinks of it <wink/frown>.

Ideally, IMO, two messages with the same name should have the same
meaning but possibly different implementations. Of course, "meaning"
is somewhat relative, but the notion that two messages with the same
name should have the same "meaning" is very useful. In general, the
language should not get mixed up in this. For example, the language
should not dictate the meaning of:

a.foo(b)

but, in the case of operators, I think the language *should* take a
stand. To some degree it has to take a stand, since, at least in
Python's case, it applies precedence and syntactic rules to the
interpretation of operator symbols.

> ...

(shows different behavior of __builtin__.pow and math.pow)



> Not even consistent within the same language for two functions named "pow":
> now *that's* what I consider to be a *baffling* kind of overloading <0.9
> wink>!

But these sorts of things *are* baffling to many people. Python is a
very high-level language with clear and easy to understand syntax and
semantics, at least for the most part. An important audience
for Python is and, IMO should be, people who are often baffled by
inconsistencies.

Jim

Jim Fulton

unread,
Apr 28, 1998, 3:00:00 AM4/28/98
to

Konrad Hinsen wrote:

>
> David Ascher <d...@skivs.ski.org> writes:
>
> > So how about:
> > <[], >[], <=[], ==[] !=[]
> > or
> > [<], [<=], [==], or [!=]
> > or
> > *<, *<=, *==, *!=
> > ?
>
> I like the middle one best - it indicates rather clearly what's going

Sounds good to me. :-)

Jim Fulton

unread,
Apr 28, 1998, 3:00:00 AM4/28/98
to

hug...@cnri.reston.va.us wrote:
>
> In article <Pine.SUN.3.96.980425...@skivs.ski.org>,
> David Ascher <d...@skivs.ski.org> wrote:
> >
> > On Sat, 25 Apr 1998, David Ascher wrote:
> >
> > > I'm proposing <_, >_, <=_, >=_, ==_, !=_ as operators -- I think of them
> > > as using the TeX-like '_' subscript character, which will be familiar to
> > > at least some of the target population, and gets at the notion that the
> > > operators apply to the 'subscripted' elements of the operands.
> >
> > Except that they're probably ambiguous: is (a<_b) (a < _b) or (a <_ b)?
> >
> > So how about:
> > <[], >[], <=[], ==[] !=[]
> > or
> > [<], [<=], [==], or [!=]
> > or
> > *<, *<=, *==, *!=
>
> This has generated enough positive responses that I wanted to throw in my own
> two cents on the other side.
>
> I really dislike the idea of adding a significant number of new operators to
> the Python language. Once you agree on [<], I think that I could make a good
> case for why you need [*] (to settle the old matrix vs. element-wise
> multiplication issue), and etc...

I'm not wild about adding lots of new operators either, but I'd like
the operators that do exist to have a clear meaning. If this means
adding more operators then I'd rather do that.

> My primary reason for wanting to be able to overload "<", ">", ... is that it
> seems consistent with the rest of Python's design. If there's an __add__
> method to implement "a+b", then there should be an "__lt__" method to
> implement "a<b".

I agree whole-heartedly. I think that the comparison machinery should
be changed. I'd like to see the machinery changed in a way that
preserves the meaning of the operators though.

> After reading enough of Jim Fulton's arguments, I'm
> beginning to think he'd like to get rid of "__add__" methods from the
> language if it was possible <0.5 wink>.

Not at all. I *would* like to change the operator symbol for
__concat__ and __repeat__ though, if it was possible, which it isn't.

Jim

Jim Fulton

unread,
Apr 28, 1998, 3:00:00 AM4/28/98
to

David Ascher wrote:
>
> On Sat, 25 Apr 1998, Jim Fulton wrote:
>
> > One final plea. Please consider creating new operators for "rich
> > comparison".
>
> I certainly will, for one. I don't consider this settled by any means.
> Keep pleading (I seriously doubt you're the only person who feels the way
> you do on this issue).
>
> I do have one question -- if comparisons can return anything, how can the
> infrastructure enforce that they have meaningful boolean interpretations?

I don't think that the infrastructure *has* to enforce everything.
I do think that the *interface* of comparison operators should define a
meaning for them that is clear. I'm willing to trust new class/type
authors to obey the interface. It is very useful that 'and' and 'or'
do not simply return integers.

> If there were two sets of operators, those which do "standard" comparison
> and those which do the array comparisons (aka comparisons with
> type-defined semantics), it would seem to me that it might make sense to
> require the former to always return 0 or 1 (or exceptions), and let the
> latter have the PyObject * return values.

I don't have an opinion on this. All I care about is that the
(non-rich) comparison operators should return "meaningful boolean
values" or raise an exception.

> This would allow speeding up of
> the algorithm Guido proposed for 'smarts' in the comparison functions, and
> would allow using the signature of the functions to enforce the semantics
> of "logical" comparison. The logical extension of this split is to also
> split the 'not', 'and' and 'or' and allow some new version with the
> semantics defined by the types they are applied to.

Right.



> It's a change to the syntax, but it's one I'd be willing to live with. I
> don't particularly care whether I have to type 'a < b' or 'a <_ b'. It is
> the distinction between operator and function which I think makes a big
> human factors difference.

Cool.

Vladimir Marangozov

unread,
Apr 28, 1998, 3:00:00 AM4/28/98
to

A bit late, but anyway...

Jim Fulton wrote:
>
> Frankly, I'm amazed. I should be speechless. I almost am. :)


>
> One final plea. Please consider creating new operators
> for "rich comparison".

In support of Jim (who felt alone in the dark): I also had mixed
feelings, after reading the original proposal, about providing the
standard comparison operators (<, <=, !=, etc.) with richer semantics.

Creating new operators (I love David's "second line", i.e. [<], [<=],
etc. -- nice!) will certainly gain my "yes" vote.

David Ascher wrote:
>
> PPS: I was wondering why no one commented on my first proposal while
> the only slightly modified version caused this flood.
>

Frankly, I felt a bit too far removed from it. The intensive
discussion warned me, however, 'cause I didn't want to remain clueless
after an exception resulting from a subtle erroneous comparison in my code.

-think-that-I've-never-compared-whole-arrays-<wink-here>-a-la-Tim'ly y'rs
--
Vladimir MARANGOZOV | Vladimir....@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252

John B. Williston

unread,
Apr 28, 1998, 3:00:00 AM4/28/98
to

Guido van Rossum wrote in message
<1998042717...@eric.CNRI.Reston.Va.US>...

>I've tried to keep my mouth shut in the recent debate (is that a good
>way to refer to it? :-). But the various calls for special operators
>disturb me. Please, think again. Adding extra operators to the
>language is a lot of hassle and won't happen any time soon. I promise
>:-). Make do with the existing ones. Overloading '<' is okay.
>Raising an exception from it is okay. Adding a third argument to
>cmp() is okay. Adding a new extended comparison C API is okay.
>Adding a new field to the type structure is okay. But adding new
>syntax to the language is NOT OK. OK?

Heh. Guido is well named. This has all the subtlety of a mob hit (grin). But
I am forced to agree. Overloading operators is not nearly as inscrutable as
comments to the contrary imply, but adding new operators represents a
significant change to the language syntax. Of course, some might observe
that overloading operators is a change to syntax, but in my estimation it
actually makes the language's overloading capabilities more consistent.

John


http://www.netcom.com/~wconsult
___ ___
\ \ __ / / Williston Consulting
\ \/ \/ / __________ makes software worth buying.
\ /\ / / _______/ wcon...@ix.netcom.com
\_/ \_/ / /
/ /_______
/__________/

Tim Peters

unread,
Apr 29, 1998, 3:00:00 AM4/29/98
to

> [Steven D. Majewski]
> ...
> I think that is a big problem and a potential hole for newbies
> to fall into if NumPy has different semantics than the rest of
> Python.

[Tim P]
> I don't. Everything's a disconnected jumble to newbies ...

[Jim Fulton]


> I don't agree. Newbies try to provide some organization to
> new information. They expect things that "look" more or less
> the same to have more or less the same meaning. This is
> especially true for operators. Consistency is one of the
> primary tools of good user-interface design. A programming
> language *is* a user interface. While consistency is only
> one of many goals, and one that must sometimes be traded off
> against others, it is an important goal just the same.

I don't argue against the principles there, but it's very muddy to me how
you think they apply to the specific issue at hand.

Perhaps you could articulate what the specific current properties of
comparisons are that you're trying to preserve, and why those in particular
are worth preserving while others aren't?

E.g., "<" always returns a little integer now (or raises an exception), but
you've said you don't think *that's* important to preserve. You've also
said you don't think trichotomy (one of "a < b", "a == b", "a > b" returns
true) is important to preserve. I believe you also at least implied that
you don't think transitivity (a<b and b<c implies a<c) is important. But
the violation of any one of those throws consistency with Python's built-in
types straight out the window, as well as consistency with what comparisons
mean in almost any other scalar language. So why is it that those don't
matter, but "a<=b iff a<b or a==b" is important? Is that last one (with its
mechanical siblings) the only thing we need to "be consistent"? And if so,
consistent with *what* <0.1 wink>?

You've also said that comparisons should return "meaningful boolean values",
but haven't defined that phrase. It would help to know what it means to
you. E.g., does "5 < [5]" return a meaningful boolean value today? If so,
"meaningful" was probably the wrong adjective <0.9 wink>; but if not, are
you also proposing to change what that does?

I don't see any consistency in your position: the criteria by which you
discard some properties and embrace others is a mystery to me. My position
is very consistent: throw it *all* out the window <wink>.

[steven]
> ...


> I'm already concerned about "+" meaning addition, except

> [when it doesn't]

[tim]
> ...


> You routinely use "+" to mean two dozen things in four

> languages every week. ... You're not unique in your ability


> to grasp that the same symbol can mean very different things
> depending on context -- no programmer can succeed without
> that twist of mind. I can't say I had any difficulty at all
> with the notion that [1,2,3] + [4,5,6] doesn't return [5,7,9],
> and won't believe you if you claim that you did <wink>.

[jim]


> I don't think Steve was worried about himself so much as
> about his less nimble colleagues and students.

That's why I belabored the point that Steven isn't unique (well, he *is*,
but not in context <smile>). The "of course, while *I* have no problem with
this at all, it's surely too much for a lesser being" flavor of argument
always rings hollow to me. Are you personally confused by the meanings for
"+" that exist today? *Objecting* to the variations is a different story;
I'm wondering whether you personally stumble over them in practice. I
don't; Steven doesn't; I doubt that you do either. I'm betting that almost
*nobody* ever does, in which case those "less nimble colleagues and
students" must be supernaturally feeble to merit such concern.


> ... [various stuff about overloading] ...

[tim]


> Maybe that's a difference in perspective: when I see an
> operator symbol, I think of it as a message to its operands,
> and as always it's the operands that decide what to do with
> the message. The message is just a name, and like any name
> is entitled to be ambiguous as hell, while nevertheless
> *suggesting* its purpose in life (catenation is a *fine*
> thing for "+" to mean; subtraction isn't). The notion that
> an operator symbol has some concrete meaning independent of

> its operand types is ... foreign to me ...

[jim]


> Ideally, IMO, two messages with the same name should have
> the same meaning but possibly different implementations.
> Of course, "meaning" is somewhat relative, but the notion
> that two messages with the same name should have the same
> "meaning" is very useful.

Like clothes.launder() vs money.launder(), or shape.draw() vs blood.draw(),
or matrix.norm() vs hi.norm() <wink>? I'm afraid English thrives on puns,
and the same word routinely means radically different things across
application areas. Therefore, to insist that a word have "one true meaning"
in a programming language is insisting that the language cater to one true
application domain. The example at issue is that when array folk write "a <
b" in an array language, they're not *saying* "compute an elementwise bit
matrix containing true where the scalar relation obtains" to themselves;
they're saying "a less than b" to themselves. And they're not being
perverse by doing this: they're being sociable -- the pun has a
conventional meaning in their domain.

If your objection is that it's not *your* conventional meaning, this won't
get anywhere even faster than it isn't now <wink>.

> In general, the language should not get mixed up in this. For
> example, the language should not dictate the meaning of:
>
> a.foo(b)

Indeed, when I design *my* killer language, the identifiers "foo" and "bar"
will be reserved words, never used, and not even mentioned in the reference
manual. Any program using one will simply dump core without comment.
Multitudes will rejoice.

> but, in the case of operators, I think the language *should*
> take a stand. To some degree it has to take a stand, since,
> at least in Python's case, it applies precedence and syntactic
> rules to the interpretation of operator symbols.

Ya, & that really gets in the way too <wink>.

I don't know, Jim. OK, I do: I have no problem at all with any kind of
*language*-defined overloading, because it's handy *and* if you're ever
confused you can look it up in the manual (which is easy to find, carefully
written, & kept up to date).

Also have no problem with user-defined overloading of alphanumeric function
or method names, up until (but not including) the point "promotion &
coercion & matching rules" make figuring out which specific thing will get
called a bloody puzzle.

*User*-defined overloading of operator symbols is something I generally stay
clear of, though. It is confusing! I've been on the receiving end of too
many bad examples of it to believe it helps more than it hurts. E.g.,
recently saw a C++ library which overloaded "thing << x" to mean
thing.append(x), among other atrocities. Code *using* that gibberish is,
well, gibberish. In my own group's C++ project, we overload "[]" on our
flavor of flex array, but that's *it*. Even our flavor of dict doesn't get
to use "[]", just so we can tell the difference between dicts and arrays by
local inspection. And that helps! ("speed matters")

But the thing is, along that axis, I think of the NumPy extensions much more
as language-defined than user-defined. What they're after *has* been
spelled the way they want (with assorted variations) in any number of
preceding languages. Their specific puns are well-established, conventional
within their domain, and will be well & publicly documented (none of which
applies to using "<<" as an alias for append!). Similarly for Aaron's
kjBuckets extensions, where e.g. "*" is used to denote graph composition:
that's got nothing (non-hallucinogenic <wink>) to do with multiplication or
repeated catenation, but again it's a conventional pun in *its* domain, and
kjBuckets is much more pleasant to use because of stuff like that.
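[Editor's note: the "*"-as-composition pun Tim describes can be sketched in a few lines. The Graph class below is a hypothetical toy, not kjBuckets' actual API.]

```python
# Editor's sketch (hypothetical, not kjBuckets' actual API): the
# "conventional pun" Tim describes, spelling graph composition as "*".

class Graph:
    """A directed graph stored as a set of (source, target) edges."""

    def __init__(self, edges):
        self.edges = set(edges)

    def __mul__(self, other):
        # Composition: (a, c) is an edge of the result whenever some b
        # exists with (a, b) in self and (b, c) in other.
        return Graph((a, c)
                     for (a, b1) in self.edges
                     for (b2, c) in other.edges
                     if b1 == b2)

g = Graph([(1, 2), (2, 3)])
h = Graph([(2, 9), (3, 10)])
composed = g * h          # edges {(1, 9), (2, 10)}
```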

I don't want to stop such *good* uses for operator overloading just because
someone else *will* abuse it. What "<" etc "should" mean depends on the
domain. All our feeble colleagues won't be bothered by it unless they get
sucked into that domain and have to use that code; but then they had better
master that domain's puns long before they see any code, else they won't be
able to communicate with their new colleagues at all.

[tim]


> (shows different behavior of __builtin__.pow and math.pow)
>
> Not even consistent within the same language for two functions
> named "pow": now *that's* what I consider to be a *baffling*
> kind of overloading <0.9 wink>!

[jim]


> But these sorts of things *are* baffling to many people.

Oh, I wasn't kidding: that *is* baffling! The "wink" was really in honor
of how the long and seemingly unrelated diversion into IEEE arithmetic
suddenly tied back into the main topic. The pow vs. math.pow thing was a
*perfect* example of my previous claim that the overloading that bothers me
most is when the same name is used to mean *subtly* different things: pow
!= math.pow is the worst kind of overloading there is. It was a wink of
serendipity, not sarcasm.

OTOH, it wouldn't have baffled me at all if gun.pow(NaN) raised a
"TypeError: can't pow without a bullet" exception <0.5 wink>.

> Python is a very high-level language with clear and easy
> to understand syntax and semantics, at least for the most
> part.

Yes, it is! Wonderfully so. I have no doubt but that you want to preserve
that and even improve it. The dilemma here is that I do too, yet we seem
unable to agree on how best to do that. Too little freedom makes life
confusingly clumsy; too much, clumsily confusing. Luckily, the tension
between freedom and restraint eventually gets severed by Guido's Razor.

aka-"thou-shalt-not-multiply-syntax-without-dire-necessity-
but-adding-1000-lines-to-cmp-is-way-cool"<wink>-ly y'rs - tim

Vladimir Marangozov

Apr 29, 1998
to Guido van Rossum

Still thinking, still mixed feelings...

Guido van Rossum wrote:
>
> I've tried to keep my mouth shut in the recent debate (is that a good
> way to refer to it? :-). But the various calls for special operators
> disturb me. Please, think again. Adding extra operators to the
> language is a lot of hassle and won't happen any time soon. I promise
> :-).

Understood.

> Make do with the existing ones. Overloading '<' is okay.

No, it's not okay. While it's okay from an implementation point of
view, it isn't okay for the ordinary Python user who doesn't need
rich comparisons at all (and many of us are such "poor comparison"
users), i.e. it isn't okay if overloading is imposed.

Being a grey user in the crowd, overloading the traditional
comparison operators *implicitly* with *some* (non-boolean) richer
semantics is NOT okay for me.

However, it IS okay in a scenario where *I* overload them
*explicitly* because *I* want to deal with some richer semantics
(of my choice!).

I think that this is at the heart of the problem (and the discussion).

Regardless of the implementation (there are usually several ways of
implementing the same stuff in Python), I am against implicit
operator overloading in general, and notably against implicit
comparison operator overloading with any semantics other than
boolean. (Think of C++ and how messy things can get.)

If implicit overloading in Python becomes real, I will read Python
code, it will look the same, BUT I won't be able to trust what I see
any more, nor understand or predict how the program will work.
This is my understanding of these things, trying to be pragmatic.

The alternative is to be able to use the richer semantics explicitly
(which way is a separate problem), and this is precisely what people
found attractive (including myself) in the new operator proposals,
because they can still *control* what's going on, see and distinguish
"poor" from "rich".

> Raising an exception from it is okay.

This is okay.

> Adding a third argument to cmp() is okay.

This is okay.

> Adding a new extended comparison C API is okay.

This is okay.

> Adding a new field to the type structure is okay.

This is okay.

> But adding new syntax to the language is NOT OK. OK?

Okay. We should then find a way to provide richer semantics but
not at the cost of clarity and determinism in understanding
Python code.

All you seem to refer to in your post is implementation-related.
We should think about how people would think with this thing :-)

Hey, I just can't imagine to what degree I will trust my code
after importing a module of yours (which imports other modules)
with overloaded operators... How should I compare the objects you
created with my own objects? I have a list of objects. Can I
compare it with your list? You overloaded "==" for strings. Will
the canonical test "if __name__ == '__main__'" still work?
What comparison operators you provide with objects of type X?
What do they do? etc, etc. I'm exaggerating here, but the FAQ
will double in size for sure... :-)

a...@pythonpros.com

Apr 29, 1998

In article <000101bd732b$72ffda00$26472399@tim>,

"Tim Peters" <tim...@msn.com> wrote:
> But the thing is, along that axis, I think of the NumPy extensions much more
> as language-defined than user-defined. What they're after *has* been
> spelled the way they want (with assorted variations) in any number of
> preceding languages. Their specific puns are well-established, conventional
> within their domain, and will be well & publicly documented (none of which
> applies to using "<<" as an alias for append!). Similarly for Aaron's
> kjBuckets extensions, where e.g. "*" is used to denote graph composition:
> that's got nothing (non-hallucinogenic <wink>) to do with multiplication...

Now Tim, surely you know that matrix multiplication is a natural
extension of scalar multiplication and graph composition is just
matrix multiplication of the indicator matrices of two graphs
coerced back to a graph (generally, very sparse matrices at that).
:) Sorry, just a flashback from my lost youth.
-- Aaron Watters
===
It's amazing how sophisticated triviality has become -- Lakatos
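[Editor's note: Aaron's remark is easy to check concretely. A minimal sketch (the names are illustrative only): boolean matrix multiplication of two adjacency matrices yields exactly the composed graph.]

```python
# Editor's sketch of Aaron's point: graph composition is boolean
# matrix multiplication of the graphs' indicator (adjacency) matrices.
# Nodes are numbered 0..n-1; the names here are illustrative only.

def compose(adj_g, adj_h):
    """Boolean matrix product of two n-by-n adjacency matrices."""
    n = len(adj_g)
    return [[any(adj_g[i][k] and adj_h[k][j] for k in range(n))
             for j in range(n)]
            for i in range(n)]

# Graph g has the edge 0 -> 1; graph h has the edge 1 -> 2.
g = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
h = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
gh = compose(g, h)        # the composition has exactly the edge 0 -> 2
```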

Jim Fulton

Apr 29, 1998

Tim Peters wrote:

> > [Steven D. Majewski]
> > ...
> > I think that is a big problem and a potential hole for newbies
> > to fall into if NumPy has different semantics than the rest of
> > Python.
>
> [Tim P]
> > I don't. Everything's a disconnected jumble to newbies ...
>
> [Jim Fulton]
> > I don't agree. Newbies try to provide some organization to
> > new information. They expect things that "look" more or less
> > the same to have more or less the same meaning. This is
> > especially true for operators. Consistency is one of the
> > primary tools of good user-interface design. A programming
> > language *is* a user interface. While consistency is only
> > one of many goals, and one that must sometimes be traded off
> > against others, it is an important goal just the same.
>
> I don't argue against the principles there, but it's very muddy to me how
> you think they apply to the specific issue at hand.
>
> Perhaps you could articulate what the specific current properties of
> comparisons are that you're trying to preserve, and why those in particular
> are worth preserving while others aren't?

Comparisons are currently supposed to return values whose boolean
interpretation is meaningful.

> E.g., "<" always returns a little integer now (or raises an exception), but
> you've said you don't think *that's* important to preserve.

Right, like I said, I think it's important that the result have a
meaningful boolean value. Most Python objects can be evaluated as
boolean values, so there is no *need* to limit return values to 0 or
1. On the other hand, there's no guarantee that 0 or 1 will be
meaningful.

I don't expect the language to enforce meaningfulness. I do want the
comparison interface to call for meaningful boolean results. An
author can still define operators that don't satisfy the requirements
of the interface, but this should be viewed as a bug.
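[Editor's note: Jim's distinction can be illustrated with a toy, hypothetical comparison result that is neither 0 nor 1 yet still carries a meaningful truth value.]

```python
# Editor's illustration of Jim's point: a comparison result need not
# be 0 or 1 to have a meaningful boolean interpretation -- any object
# can be truth-tested.  (Modern Python spells the hook __bool__;
# the Python of 1998 spelled it __nonzero__.)

class Verdict:
    """A truth-valued comparison result carrying extra detail."""

    def __init__(self, holds, why):
        self.holds = holds
        self.why = why

    def __bool__(self):
        return self.holds

v = Verdict(True, "lexicographic order")
if v:                       # usable directly in a condition
    outcome = v.why
```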

> You've also
> said you don't think trichotomy (one of "a < b", "a == b", "a > b" returns
> true) is important to preserve. I believe you also at least implied that
> you don't think transitivity (a<b and b<c implies a<c) is important.

Right, although perhaps it would be better to raise exceptions in cases
where these do not hold.
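[Editor's note: trichotomy already fails for a built-in type under IEEE-754 rules, which is exactly the situation point 4 of Jim's proposal allows. This is a modern illustration, not an example from the thread.]

```python
# Editor's illustration: with an IEEE-754 NaN, none of <, >, ==
# holds, so "a < b or a > b or a == b" is false -- point 4 of Jim's
# proposal describes real arithmetic, not a hypothetical.

nan = float("nan")
assert not (nan < 1.0)
assert not (nan > 1.0)
assert not (nan == 1.0)
```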

> But
> the violation of any one of those throws consistency with Python's built-in
> types straight out the window,

Actually, you recently showed examples where transitivity doesn't
currently hold. ;-)

> as well as consistency with what comparisons
> mean in almost any other scalar language. So why is it that those don't
> matter, but "a<=b iff a<b or a==b" is important? Is that last one (with its
> mechanical siblings) the only thing we need to "be consistent"? And if so,
> consistent with *what* <0.1 wink>?

Consistent with the *names* of the operators "less than or equal",
"greater than or equal", and "not equal".

> You've also said that comparisons should return "meaningful boolean values",
> but haven't defined that phrase. It would help to know what it means to
> you. E.g., does "5 < [5]" return a meaningful boolean value today?

No. The current implementation is broken.

> If so, "meaningful" was probably the wrong adjective <0.9 wink>; but if not,
> are
> you also proposing to change what that does?

Yes.

> I don't see any consistency in your position: the criteria by which you
> discard some properties and embrace others is a mystery to me. My position
> is very consistent: throw it *all* out the window <wink>.
>
> [steven]
> > ...
> > I'm already concerned about "+" meaning addition, except
> > [when it doesn't]
>
> [tim]
> > ...
> > You routinely use "+" to mean two dozen things in four
> > languages every week. ... You're not unique in your ability
> > to grasp that the same symbol can mean very different things
> > depending on context -- no programmer can succeed without
> > that twist of mind. I can't say I had any difficulty at all
> > with the notion that [1,2,3] + [4,5,6] doesn't return [5,7,9],
> > and won't believe you if you claim that you did <wink>.
>
> [jim]
> > I don't think Steve was worried about himself so much as
> > about his less nimble colleagues and students.
>
> That's why I belabored the point that Steven isn't unique (well, he *is*,
> but not in context <smile>). The "of course, while *I* have no problem with
> this at all, it's surely too much for a lesser being" flavor of argument
> always rings hollow to me.

It should. I was not referring to lesser beings.

> Are you personally confused by the meanings for
> "+" that exist today? *Objecting* to the variations is a different story;
> I'm wondering whether you personally stumble over them in practice. I
> don't; Steven doesn't; I doubt that you do either. I'm betting that almost
> *nobody* ever does, in which case those "less nimble colleagues and
> students" must be supernaturally feeble to merit such concern.

I have quite a bit of experience teaching Python and working with
colleagues for whom Python is not an end but a means to performing
some task. For many people, inconsistencies are bewildering,
irritating, and a barrier to learning and/or accepting the language.

> > ... [various stuff about overloading] ...
>
> [tim]
> > Maybe that's a difference in perspective: when I see an
> > operator symbol, I think of it as a message to its operands,
> > and as always it's the operands that decide what to do with
> > the message. The message is just a name, and like any name
> > is entitled to be ambiguous as hell, while nevertheless
> > *suggesting* its purpose in life (catenation is a *fine*
> > thing for "+" to mean; subtraction isn't). The notion that
> > an operator symbol has some concrete meaning independent of
> > its operand types is ... foreign to me ...
>
> [jim]
> > Ideally, IMO, two messages with the same name should have
> > the same meaning but possibly different implementations.
> > Of course, "meaning" is somewhat relative, but the notion
> > that two messages with the same name should have the same
> > "meaning" is very useful.
>
> Like clothes.launder() vs money.launder(), or shape.draw() vs blood.draw(),
> or matrix.norm() vs hi.norm() <wink>? I'm afraid English thrives on puns,
> and the same word routinely means radically different things across
> application areas. Therefore, to insist that a word have "one true meaning"
> in a programming language is insisting that the language cater to one true
> application domain.

Which is why I said that the language has no stake in the meaning of the
message "foo" in:

a.foo(b)

Operators are a different story. When a language defines a name, like
"class" or "<", then the name should, IMO, have a language-defined
meaning.

> The example at issue is that when array folk write "a <
> b" in an array language, they're not *saying* "compute an elementwise bit
> matrix containing true where the scalar relation obtains" to themselves;
> they're saying "a less than b" to themselves. And they're not being
> perverse by doing this: they're being sociable -- the pun has a
> conventional meaning in their domain.

I don't think operators should be used this way. See above.

> If your objection is that it's not *your* conventional meaning, this won't
> get anywhere even faster than it isn't now <wink>.

No. I think the language should define a meaning for these operators.

I'd be happy to debate specifically *what* the meaning should be, as
long as there *is* a meaning.

> > In general, the language should not get mixed up in this. For
> > example, the language should not dictate the meaning of:
> >
> > a.foo(b)
>
> Indeed, when I design *my* killer language, the identifiers "foo" and "bar"
> will be reserved words, never used, and not even mentioned in the reference
> manual. Any program using one will simply dump core without comment.
> Multitudes will rejoice.
>
> > but, in the case of operators, I think the language *should*
> > take a stand. To some degree it has to take a stand, since,
> > at least in Python's case, it applies precedence and syntactic
> > rules to the interpretation of operator symbols.
>
> Ya, & that really gets in the way too <wink>.
>
> I don't know, Jim. OK, I do: I have no problem at all with any kind of
> *language*-defined overloading, because it's handy *and* if you're ever
> confused you can look it up in the manual (which is easy to find, carefully
> written, & kept up to date).

Are you suggesting that with David's proposal, the language will
define comparison operators to return either booleans or arrays?

Your definition of "good" and "bad" seems to be a matter of taste,
where taste is domain dependent. Presumably, there is some, possibly
small, domain where overloading "<<" to mean append is "good". This
seems to allow any overloading to be "good". I'd much rather see the
language define what the meaning for *operators* is.

Konrad Hinsen

Apr 29, 1998

Vladimir Marangozov <Vladimir....@inrialpes.fr> writes:

> If implicit overloading in Python becomes real, I will read Python
> code, it will look the same, BUT I won't be able to trust what I see
> any more, nor understand or predict how the program will work.
> This is my understanding of these things, trying to be pragmatic.

That's nothing new, most operators can already be overloaded.
In principle, you can make no assumptions at all about the meaning
of '+' without knowing the types involved. It's up to the implementors
of types and classes to ensure a "reasonable" behaviour of operators.
Until now, Python developers have been reasonable; I haven't yet
encountered types with weird operator definitions.

The current discussion is about allowing for comparisons what has
been possible for other operators for eons (well, the computer equivalent
of eons ;-)
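[Editor's note: Konrad's observation is easy to demonstrate: the meaning of "+" already depends entirely on the operand types, and any class may redefine it. A small sketch:]

```python
# Editor's sketch of Konrad's point: "+" already means different
# things for different types, and classes can redefine it freely.

assert 1 + 2 == 3                       # arithmetic
assert [1, 2] + [3, 4] == [1, 2, 3, 4]  # list catenation
assert "ab" + "cd" == "abcd"            # string catenation

class Weird:
    def __add__(self, other):
        return "anything at all"        # nothing stops this today

assert Weird() + 0 == "anything at all"
```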

> The alternative is to be able to use the richer semantics explicitly
> (which way is a separate problem), and this is precisely what people
> found attractive (including myself) in the new operator proposals,

I don't quite understand the difference between "implicit" and
"explicit". Could you explain?

> Hey, I just can't imagine to what degree I will trust my code
> after importing a module of yours (which imports other modules)
> with overloaded operators... How should I compare the objects you
> created with my own objects? I have a list of objects. Can I
> compare it with your list? You overloaded "==" for strings. Will
> the canonical test "if __name__ == '__main__'" still work?

That is not possible with any proposal I have seen. It's the
implementor of the string type (i.e. Guido) who decides how
strings are compared. No one can change this without modifying
the string type.

Vladimir Marangozov

Apr 29, 1998
to Konrad Hinsen

Konrad Hinsen wrote:
> >
> > The alternative is to be able to use the richer semantics explicitly
> > (which way is a separate problem), and this is precisely what people
> > found attractive (including myself) in the new operator proposals,
>
> I don't quite understand the difference between "implicit" and
> "explicit". Could you explain?

It's all about how the rich comparison mechanism is triggered.

I say "implicit" when it is triggered by writing "a < b", and I find
this semantically intrusive given my current understanding of
"a < b".

I prefer an "explicit" triggering of rich comparisons (or other
missing functionalities in general) by writing something different
than "a < b". I don't really care whether it will be "a [<] b",
"a __rlt__ b", "rlt(a, b)", as far as it doesn't change the present
semantics of the notation "a < b".
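[Editor's note: one way to read Vladimir's "explicit" alternative is an ordinary function that triggers the rich behaviour by name. The names rlt and __rlt__ below come from his own examples and are hypothetical, not any real API.]

```python
# Editor's sketch of "explicit" rich comparison: rlt() triggers the
# rich semantics by name, while "a < b" keeps its present meaning.
# rlt and __rlt__ are hypothetical names taken from Vladimir's post.

def rlt(a, b):
    hook = getattr(a, "__rlt__", None)
    if hook is not None:
        return hook(b)
    return a < b              # fall back to the ordinary comparison

class Elementwise:
    """A toy array whose rich "<" yields an elementwise mask."""

    def __init__(self, values):
        self.values = values

    def __rlt__(self, other):
        return [v < other for v in self.values]

assert rlt(3, 5)                                    # plain boolean
assert rlt(Elementwise([1, 9, 2]), 5) == [True, False, True]
```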

>
> > Hey, I just can't imagine to what degree I will trust my code
> > after importing a module of yours (which imports other modules)
> > with overloaded operators... How should I compare the objects you
> > created with my own objects? I have a list of objects. Can I
> > compare it with your list? You overloaded "==" for strings. Will
> > the canonical test "if __name__ == '__main__'" still work?
>
> That is not possible with any proposal I have seen. It's the
> implementor of the string type (i.e. Guido) who decides how
> strings are compared. No one can change this without modifying
> the string type.

Yes, I wrote this on purpose :-)

Andrew Kuchling

Apr 29, 1998

Jim Fulton writes:
>Your definition of "good" and "bad" seems to be a matter of taste,
>where taste is domain dependent. Presumably, there is some, possibly
>small, domain where overloading "<<" to mean append is "good". This
>seems to allow any overloading to be "good". I'd much rather see the
>language define what the meaning for *operators* is.

Part of Python's usefulness is that it can serve as glue tying
different components together, and it seems a mistake to deny people
the ability to overload the < operators if it proves natural for their
problem domain. If the ability to overload the operators proves
useless, or a source of confusing bugs, then people simply won't use
it. Witness the metaclass hook, which is there but little-understood
and little-used.

I'd have no problems with all this rich comparison stuff,
provided that 1) the core Python types don't actually use it, so that
it isn't woven into the standard libraries to such a degree that
understanding of it proves necessary to do simple things with Python,
and 2) no extreme penalty is paid for it, such as making comparisons
10 times slower. By Jim Hugunin's experiment with JPython, the
proposal seems OK on condition 2, though we'll see how the C
implementation turns out. In other words, I'm willing to see dark
corners added to the language, as long as I don't have to go into them
myself.

--
A.M. Kuchling http://starship.skyport.net/crew/amk/
Science and technology multiply around us. To an increasing extent they
dictate the languages in which we speak and think. Either we use those
languages, or we remain mute.
-- J.G. Ballard, from the introduction to _Crash_.

Jeremy Hylton

Apr 29, 1998

Vladimir Marangozov <Vladimir....@inrialpes.fr> writes:

> Hey, I just can't imagine to what degree I will trust my code
> after importing a module of yours (which imports other modules)
> with overloaded operators... How should I compare the objects you
> created with my own objects? I have a list of objects. Can I

This argument is specious. What on earth would it mean to compare an
object you created with another object from someone else's code unless
you knew exactly what each object's semantics were? Do you really
want to ask if my abstract syntax tree is less than your HTTP
connection object?

Using operators like < and == makes no sense unless you understand
the semantics of the objects you're comparing.

> compare it with your list? You overloaded "==" for strings. Will
> the canonical test "if __name__ == '__main__'" still work?

> What comparison operators you provide with objects of type X?
> What do they do? etc, etc. I'm exaggerating here, but the FAQ
> will double in size for sure... :-)
>

The documentation must describe what operations are provided on
objects of type X. It must also describe what they do. The FAQ
doesn't get any bigger unless the documentation is inadequate.

Jeremy

Tim Hochberg

Apr 29, 1998

[Tim Peters charges into battle, his favorite arguments flashing in the
sun....]

[Jim Fulton remains unconvinced...]

In the specific case of NumPy, I really don't see what the problem
would be with allowing "non-meaningful boolean values" (non-MBVs) as
the result of comparing arrays. Regardless of the outcome of this
argument, I expect that in the future any attempt to use an array as
an MBV will raise an exception, e.g.:

if (array1 < array2):    # Error
while (array1 < array2): # Error

These are the cases that the archetypical newbie is likely to try by
mistake after stumbling into NumPy-contaminated code. But they'll
quickly get the message that arrays and conditions don't mix, so I
don't see a problem. Conversely (inversely, reversely?), the usage
that would be common in NumPy under this proposal, "mask = a < b", is
rare in everyday code. When it does show up, the mask is generally
used immediately in a conditional, so again any straying newbies
would quickly be set on the right path by a little "correction".

Tim apply-the-comfy-chair Hochberg
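[Editor's note: the "mask = a < b" usage Tim Hochberg describes is the data[data > 255] = 255 idiom from David's proposal at the top of the thread. A toy sketch of the mechanics, not NumPy's implementation:]

```python
# Editor's sketch of the idiom under discussion (a toy class, not
# NumPy's implementation): comparison returns an elementwise mask,
# and assigning through the mask updates only the matching elements.

class ToyArray:
    def __init__(self, values):
        self.values = list(values)

    def __gt__(self, scalar):
        return [v > scalar for v in self.values]   # "mask = a > b"

    def __setitem__(self, mask, scalar):
        for i, hit in enumerate(mask):
            if hit:
                self.values[i] = scalar

data = ToyArray([10, 300, 42, 900])
data[data > 255] = 255      # clips every element above 255
assert data.values == [10, 255, 42, 255]
```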
