Hi Todd
Thank you for your contribution! I've got a couple of comments. The
experts, I hope, will have more to say.
You wrote:
> As to why this is useful, the overall problem is that the current logical
> operators, like and, or, and not, cannot be overloaded, which means projects
> like numpy and SQLAlchemy instead have to (ab)use bitwise operators [...]
> There was a proposal to allow overloading boolean operators in Pep-335 [2],
> but that PEP was rejected for a variety of very good reasons.
The key thing is, I think, the wish for a domain specific language. I
find this to be a wholesome wish. But I'd rather create a broad
solution, than something that works just for special cases. And if at
all possible, implement domain specific languages without extending
the syntax and semantics of the language.
There is the problem of short-circuiting evaluation, as in the 'and'
and 'or' operators (and elsewhere in Python). This has to be a syntax
and semantics feature. It can't be controlled by the objects.
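To make the short-circuiting point concrete, here is a small sketch. The `loud` helper and the `my_and` function are made up for this illustration; the only claim is that arguments to a call are evaluated before the call happens, so an object or function cannot skip them.

```python
calls = []

def loud(name, value):
    """Record that an operand was evaluated, then return its value."""
    calls.append(name)
    return value

def my_and(a, b):
    """A function-based 'and': both arguments are evaluated before the call."""
    return a and b

# Built-in 'and': the right operand is never evaluated when the left is falsy.
calls.clear()
result_op = loud("left", 0) and loud("right", 1)
calls_op = list(calls)

# Function call: Python evaluates both arguments first, so no short-circuit.
calls.clear()
result_fn = my_and(loud("left", 0), loud("right", 1))
calls_fn = list(calls)

print(calls_op)  # ['left']
print(calls_fn)  # ['left', 'right']
```

Both spellings return the same value here; only the operator form avoids evaluating the second operand.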
On Fri, Aug 3, 2018 at 1:47 PM Todd <todd...@gmail.com> wrote:
> The operators would be:
>
> bNOT - boolean "not"
> bAND - boolean "and"
> bOR - boolean "or"
> bXOR - boolean "xor"
These look pretty ugly to me. But that could just be a matter of familiarity.
For what it’s worth, the Apache Spark project offers a popular DataFrame API for querying tabular data, similar to Pandas. The project overloaded the bitwise operators &, |, and ~ since they could not override the boolean operators and, or, and not.
For example:
    non_python_rhode_islanders = (
        person
        .where(~person['is_python_programmer'])
        # Parentheses are required: & binds more tightly than == and >.
        .where((person['state'] == 'RI') & (person['age'] > 18))
        .select('first_name', 'last_name')
    )
    non_python_rhode_islanders.show(20)
This did lead to confusion among users since people (myself included) would initially try the boolean operators and wonder why they weren’t working. So the Spark devs added a warning to catch when users were making this mistake. But now it seems quite OK to me to use &, |, and ~ in the context of Spark DataFrames, even though their use doesn’t match their designed meaning. It’s unfortunate, but I think the Spark devs made a practical choice that works well enough for their users.
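As a rough illustration of the pattern (this is not Spark's implementation; the `Column` class below is a hypothetical stand-in that only builds strings), a DSL can overload `&`, `|` and `~`, and use `__bool__` as the one available hook for catching accidental use of `and`, `or` and `not`:

```python
class Column:
    """Hypothetical stand-in for a DataFrame column; it only builds expression strings."""

    def __init__(self, expr):
        self.expr = expr

    def __eq__(self, other):
        return Column(f"({self.expr} = {other!r})")

    def __gt__(self, other):
        return Column(f"({self.expr} > {other!r})")

    def __and__(self, other):
        return Column(f"({self.expr} AND {other.expr})")

    def __or__(self, other):
        return Column(f"({self.expr} OR {other.expr})")

    def __invert__(self):
        return Column(f"(NOT {self.expr})")

    def __bool__(self):
        # 'and', 'or' and 'not' call __bool__, so this is the only hook
        # available for catching their accidental use.
        raise ValueError("use '&', '|' and '~' instead of 'and', 'or' and 'not'")


state, age = Column("state"), Column("age")
query = (state == "RI") & (age > 18)
print(query.expr)  # ((state = 'RI') AND (age > 18))

try:
    (state == "RI") and (age > 18)
except ValueError as exc:
    error = str(exc)
    print(error)
```

The error message here is invented for the sketch; the point is only that the boolean keywords cannot be redirected, merely detected.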
PEP 335 would have addressed this issue by letting developers overload the common boolean operators directly, but from what I gather of Guido’s rejection, the biggest problem was that it would have had an undue performance impact on non-users of boolean operator overloading. (Not sure if I interpreted his email correctly.)
I've been re-reading PEP 335 and I think that the __and1__ method isn't
needed.
The __bool__ method is called anyway, and currently must return either
False or True, but what if it could return the special value
NeedOtherOperand mentioned in the PEP?
The disadvantage would be that if the first operand is a bool, the
operator could still short-circuit, and I'm not sure how much of an
issue that would be.
> There was a proposal to allow overloading boolean operators in Pep-335 [2], but that PEP was rejected for a variety of very good reasons. I think none of those reasons (besides the conversation fizzling out) apply to my proposal.
Maybe I am missing something, but I don't really see how this idea solves the problems that led to PEP 335 getting rejected. As far as I understand it, the main reason for the rejection was that this would decrease performance for all boolean operations, which are extremely common, whereas the need for overriding these operators is rather rare. (See the rejection email here: https://mail.python.org/pipermail/python-dev/2012-March/117510.html)
As I see it, this proposal only proposes a different syntax and doesn't solve this problem.
The only real solution for this would be a new set of operators, but I agree with Chris that overriding the bitwise operators is good enough for most cases and a new set of operators really is a bit over the top just for this. I especially dislike using || and && as they are prominently used in other programming languages and this would be extremely confusing for newcomers from those languages. Also, if the syntax isn't clear and concise I feel it doesn't really add any value, as the main point of operator overloading is to make code easy to read and understand. This really only would be the case if we could overload the boolean operators. Otherwise I think using a function or overloading the bitwise ops is the best solution.
On Fri, Aug 3, 2018 at 1:02 PM, Nicholas Chammas <nicholas...@gmail.com> wrote:
> The project overloaded the bitwise operators &, |, and ~ since they could not
> override the boolean operators and, or, and not.
I actually think that is a good solution to this problem -- the fact is that for most data types bitwise operators are useless -- and for even more not-very-useful.
numpy did not do this, because, as it happens, bitwise operators can be useful for numpy arrays of integers (though as I write this, bitwise operations really aren't that common -- maybe requiring a function call for them would be a good way to go -- too late now).
Also, in a common use-case, bitwise-and behaves the same as logical_and, e.g.
    if (arr > x) & (arr2 == y)
This "works" because both arrays being bitwise-anded are boolean arrays.
So you really don't need to call np.logical_and and friends very often.
So -1 on yet another set of operators.
-CHB
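The same coincidence can be seen with plain Python values rather than arrays (a small sketch, using built-in bool and int only): `&` and `and` agree on real booleans, and diverge as soon as the operands are general integers.

```python
# On real bools, '&' and 'and' agree for every combination of operands.
for a in (False, True):
    for b in (False, True):
        assert (a & b) == (a and b)

# On general integers they differ: '&' combines bits, 'and' picks an operand.
bitwise = 1 & 2    # 0b01 & 0b10 == 0
logical = 1 and 2  # both truthy, so 'and' returns the right operand
print(bitwise, logical)  # 0 2
```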
The proposal is for new operators. The operators would be "bNOT", "bAND", "bOR", and "bXOR". They would be completely independent of the existing "not", "and", and "or" operators, operating purely on boolean values. It would be possible to overload these operators.
> There are certainly advantages to using binary operators over named
> functions, and a shortage of good, ASCII punctuation suitable for new
> operators.
Hold that thought.
Then again, why is it 2018 (or 5778?) and we're still stuck with ASCII?
Doesn't Unicode define a metric boatload of mathematical symbols? If
Python allows Unicode names,¹ why not Unicode operators?
¹ No, I'm not going to call them variables. :-)
> I don't think much of your names bOR etc.
>
> I think that before adding more ad hoc binary operators, we ought to
> consider the possibility of custom operators [...]
>
> a ~foo b
Great. Yet another way to spell a.foo(b). Or foo(a, b). :-/
> Although possibly we might choose another pseudo-namespace, to avoid
> custom operators clashing with dunders. Trunders perhaps? (Triple
> underscores?)
>
> Under this scheme, your operators would become:
>
> ~or
> ~and
> ~xor
>
> and call trunders ___or___ etc.
And now mental gymnastics to jump from ~foo to ___foo___ or ___rfoo___.
If it's too hard to tell = from == (see endless threads on this mailing
list for proof), then it's also too hard to tell __xor__ from ___xor___.
If I want to say
    a ~foo b
then why can't I also say
    class A:
        def ~foo(self, b):
            pass # do something more useful here
Some social problems:
- allowing non-ASCII identifiers was controversial, and is still
banned for the std lib;
- according to critics of PEP 505, even ASCII operators like ?.
are virtually unreadable or unspeakably ugly and "Perlish".
If you think the uproar over PEP 572 was vicious, imagine what would
happen if we introduced new operators like ∉ ∥ ∢ ∽ ⊎ etc instead. I'm
not touching that hornet's nest with a twenty foot pole.
And some technical problems:
- keyboard support for entering the bulk of Unicode characters is
non-existent or poor;
- without keyboard support, editor support for entering Unicode
characters is at best clunky, requiring the memorization of
obscure names, hex codes, or a GUI palette;
- and font support for the more exotic code points, including most
mathematical operators, is generally rubbish.
It may be that these technical problems will *never* be solved. But let
other languages, like Julia, blaze this trail.
[...]
> > I think that before adding more ad hoc binary operators, we ought to
> > consider the possibility of custom operators [...]
> >
> > a ~foo b
>
> Great. Yet another way to spell a.foo(b). Or foo(a, b). :-/
Indeed.
Technically, we don't need *any* operators at all, possibly aside from
those that do argument short-circuiting.
But for many purposes, we much prefer infix notation to prefix
function notation. Which would you rather read and write?
    or(x, 1)
    x or 1
[...]
> And now mental gymnastics to jump from ~foo to ___foo___ or ___rfoo___.
Just as we do "mental gymnastics" to jump from existing operators like +
to __add__ or __radd__.
If you don't like operator overloading *at all*, that ship has
already sailed.
> If it's too hard to tell = from ==
> (see endless threads on this mailing list for proof)
> then it's also too hard to tell __xor__ from ___xor___.
*shrug*
I don't think it is, but I'm open to alternative suggestions.
> If I want to say
>
> a ~foo b
>
> then why can't I also say
>
> class A:
> def ~foo(self, b):
> pass # do something more useful here
Infix operators delegate to a pair of methods. What would you call
the second one? ~rfoo will clash with operator rfoo.
We already have a convention that operators delegate to dunder methods,
and I see no reason to make changes to that convention. It's a *good*
convention.
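For readers less familiar with the existing convention, here is a minimal sketch of how `+` delegates to a pair of dunder methods. The `Metres` class is hypothetical and exists only to show the dispatch order.

```python
class Metres:
    """Hypothetical value type used only to show operator dispatch."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Called for 'Metres + something'.
        if isinstance(other, Metres):
            return Metres(self.value + other.value)
        if isinstance(other, (int, float)):
            return Metres(self.value + other)
        return NotImplemented

    def __radd__(self, other):
        # Called for 'something + Metres' after the left operand's
        # __add__ has returned NotImplemented.
        if isinstance(other, (int, float)):
            return Metres(other + self.value)
        return NotImplemented


left = Metres(2) + 3    # Metres.__add__(Metres(2), 3)
right = 3 + Metres(2)   # int.__add__ gives NotImplemented, then Metres.__radd__
print(left.value, right.value)  # 5 5
```

A custom operator following the same convention would presumably need the same forward/reflected pair, which is the naming question raised above.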
The smaller the number of changes needed for a proposal, the better its
chances of being accepted. My suggestion requires:
- one new piece of syntax, ~op or equivalent, as a binary operator;
- (possibly) one slight extension to an existing naming convention;
- (possibly) one new byte-code;
- no new keywords, no new syntax for methods, no new built-in types,
no changes to the execution model of the language, and no changes
to the characters allowed in Python code.
If you want to make a counter-proposal that is more extensive, be my
guest :-)
--
Steve
On Fri, Aug 03, 2018 at 03:17:42PM -0400, Todd wrote:
> Boolean operators like the sort I am discussing have been a standard part
> of programming languages since forever. In fact, they are the basic
> operations on which modern microprocessors are built.
>
> The fact that Python, strictly speaking, doesn't have them is extremely
> unusual for a programming language.
I'm rather surprised at this claim.
Can you give a survey of such overridable boolean operators which are
available on modern microprocessors?
What programming languages already have them? When you say "forever",
are you going back to Fortran in the 1950s?
> In many cases they aren't necessary in
> Python since Python's logical operators do the job well enough, but there
> are a set of highly diverse and highly prominent cases where those logical
> operators won't work.
Can you list some of these diverse and highly prominent use-cases?
I can think of two:
- elementwise boolean operators, such as in numpy;
- SQL-like DSL languages;
plus a third rather specialised and obscure use-case:
- implementing non-binary logical operators (e.g. for ternary
or fuzzy logic).
> There are workarounds, but they are less than
> optimal for the reasons I describe, and the previous discussion I linked to
> goes into much more detail why these new operators are important.
There are certainly advantages to using binary operators over named
functions, and a shortage of good, ASCII punctuation suitable for new
operators.
I don't think much of your names bOR etc.
I think that before adding more ad hoc binary operators, we ought to
consider the possibility of custom operators.
On Sun, Aug 5, 2018 at 4:40 AM, Todd <todd...@gmail.com> wrote:
>
>
> On Sat, Aug 4, 2018 at 9:13 AM, Steven D'Aprano <st...@pearwood.info> wrote:
>>
>> On Fri, Aug 03, 2018 at 03:17:42PM -0400, Todd wrote:
>>
>> > Boolean operators like the sort I am discussing have been a standard
>> > part
>> > of programming languages since forever. In fact, they are the basic
>> > operations on which modern microprocessors are built.
>> >
>> > The fact that Python, strictly speaking, doesn't have them is extremely
>> > unusual for a programming language.
>>
>> I'm rather surprised at this claim.
>>
>> Can you give a survey of such overridable boolean operators which are
>> available on modern microprocessors?
>>
>> What programming languages already have them? When you say "forever",
>> are you going back to Fortran in the 1950s?
>
>
> Sorry I wasn't clear, I didn't mean overloadable boolean operators are
> standard, but rather boolean operators in general. I was trying to point
> out that there is nothing domain-specific about boolean operators.
You say that Python doesn't have them. What aspect of boolean
operators doesn't Python have?
> I am personally very strongly against custom operators. I just have visions
> of someone not liking how addition works for some particular class and
> deciding implementing a "＋" operator would be a great idea.
Eww. (Before anyone jumps in and says "uhh you already have __add__",
that is *not* U+002B PLUS SIGN, it is U+FF0B FULLWIDTH PLUS SIGN,
which would indeed be a custom operator.)
But ultimately, there is already nothing stopping people from doing this:
    import sys

    def Ien(obj):
        """Return object size in machine words"""
        return sys.getsizeof(obj) // (sys.maxsize.bit_length() + 1)
and mixing and matching that with the built-in len function. Give
people freedom, and some will abuse it horrifically... but others will
use it usefully and safely.
ChrisA
Right -- and Python has such common boolean operators.
It isn't clear that there's much need for xor, nand, nor, etc. (There
are a grand total of 16 distinct boolean operators which take two
operands, but few of them are useful except under very specialised
circumstances.)
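The count of 16 follows from there being four possible input pairs and two possible outputs for each. A short sketch that enumerates the truth tables and names the familiar ones:

```python
from itertools import product

# The four possible input pairs, in a fixed order.
inputs = list(product((False, True), repeat=2))

# A binary boolean operator is fully described by its four outputs,
# so there are 2**4 possible truth tables.
tables = list(product((False, True), repeat=len(inputs)))
print(len(tables))  # 16

# Name the few that correspond to familiar operators.
known = {
    tuple(a and b for a, b in inputs): "and",
    tuple(a or b for a, b in inputs): "or",
    tuple(a != b for a, b in inputs): "xor",
    tuple(not (a and b) for a, b in inputs): "nand",
    tuple(not (a or b) for a, b in inputs): "nor",
}
named = [known[t] for t in tables if t in known]
print(sorted(named))
```

The remaining tables include constants, projections of one operand, and implications in either direction.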
[I asked:]
> > Can you list some of these diverse and highly prominent use-cases?
> >
> > I can think of two:
> >
> > - elementwise boolean operators, such as in numpy;
> >
> > - SQL-like DSL languages;
> >
> > plus a third rather specialised and obscure use-case:
> >
> > - implementing non-binary logical operators (e.g. for ternary
> > or fuzzy logic).
>
> Also symbolic mathematics like in sympy. That is three.
I don't think symbolic mathematics is "highly prominent" (your words). I
would consider it in the same category as fuzzy logic: specialised and
unusual.
To my mind, this basically means there are two important use-cases:
- numpy and elementwise boolean operators;
- SQL-like queries;
and a couple of more specialised uses.
> > I think that before adding more ad hoc binary operators, we ought to
> > consider the possibility of custom operators.
>
> I am personally very strongly against custom operators. I just have
> visions of someone not liking how addition works for some particular class
> and deciding implementing a "＋" operator would be a great idea.
I have visions of someone not liking how boolean operators `or` and
`and` work for some particular class and deciding that overridable
boolean operators would be a great idea.
Under my proposal, you couldn't invent new symbolic operators like ＋.
Operators would be limited to legal identifiers, so people can do no
worse than they can already do for method names, e.g. ugly names like
"bOR" or "bAND".
Given this proposal, your overridable boolean operators are instantly
available, and using the proper names "or" and "and". There's no
ambiguity, because the custom operators will always require a prefix (I
suggested ~ but that won't work, perhaps ! or @ will work).
And the benefit is that you don't have to come back next year with
another PEP to introduce bNAND and bNOR operators.
[Chris said:]
> > You say that Python doesn't have them. What aspect of boolean
> > operators doesn't Python have?
> >
>
> Python's "and" and "or" don't return "True" or "False" per se, they return
> one of the inputs based on their respective truthiness. So although they
> are logical operators, they are not strictly boolean operators.
According to Python's rules for truthiness, they are boolean operators.
According to Python's rules, True and False aren't the only boolean
values. They're merely the canonical true and false values, but
otherwise unprivileged.
So I don't think this is a difference that makes any real difference.
You might as well complain that Python doesn't strictly have ints,
because some other languages limit their ints to 32 or 64 bits, and
Python doesn't.
But either way, this isn't a really important factor. If we add
overridable "boolean operators" like bOR and bAND, the fact that they
can be overridden means that they won't be limited to returning True and
False either:
- numpy elementwise operators will return arrays;
- sympy will return symbolic expressions;
- ternary logic will return trits (say, true/false/maybe);
etc. So the question of Python truthiness is not really relevant.
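Both halves of this argument can be checked directly. The first part of the sketch shows `and`/`or` returning one of their operands; the second uses a hypothetical `Trit` class (with the existing `&` standing in for an overridable operator) to show that an overloaded operator need not return a bool either.

```python
# 'and' / 'or' return one of their operands, not necessarily True or False.
print([] or "default")   # 'default'  (left is falsy, so the right is returned)
print("a" and "b")       # 'b'        (left is truthy, so the right is returned)
print(0 and "unused")    # 0          (left is falsy, so it is returned as-is)


class Trit:
    """Hypothetical three-valued logic value: 'true', 'false' or 'maybe'."""

    def __init__(self, state):
        self.state = state

    def __and__(self, other):
        # Overloaded '&' is free to return something that is not a bool.
        if "false" in (self.state, other.state):
            return Trit("false")
        if "maybe" in (self.state, other.state):
            return Trit("maybe")
        return Trit("true")


result = Trit("true") & Trit("maybe")
print(result.state)  # maybe
```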
[...]
> In your example, you are intentionally picking a character purely because
> it happens to look similar to a completely different character. That isn't
> the sort of thing that can happen innocently or by accident.
I see lots of newbies, and experienced coders who ought to know better,
using variables like l and sometimes even O. Don't underestimate the
power of laziness and thoughtlessness.
On the other hand, such poor choices are easily fixed with a gentle or
not-so-gentle application of the Clue Bat and a bit of minor
refactoring. Changing variable names is easy. Likewise, if somebody
chooses an ugly custom operator like O01l it isn't hard to refactor it
to something more meaningful.
> By contrast,
> using a valid mathematical symbol for the corresponding mathematical
> operation is exactly the sort of thing allowing new operators is meant to
> support.
The term "strawman fallacy" gets misused a lot on the internet, mostly
by people who use it as a short-hand for:
Dammit, you've just found the flaw in my argument I didn't
notice, so I'll try to distract attention by falsely accusing
you of a fallacy.
But your comments about symbols like ＋ truly are a strawman:
Substituting a person’s actual position or argument with a
distorted, exaggerated, or misrepresented version of the
position of the argument.
https://www.logicallyfallacious.com/tools/lp/Bo/LogicalFallacies/169/Strawman-Fallacy
I never proposed supporting arbitrary Unicode symbols like ＋ (full
width plus sign), in fact the opposite, I explicitly ruled it out. In
response to a question about supporting Unicode operators, I said
"I'm not touching that hornet's nest with a twenty foot pole."
and listed a number of social and technical reasons for not supporting
Unicode operators. I said that the operators would have to be legal
identifiers.
So no, operators like ∉ ∥ ∢ ∽ ⊎ and ＋ are not an option under my
proposal.
--
Steve
Funny that this comes back after all this time.
> Also, in a common use-case, bitwise-and behaves the same as logical_and, e.g.
>
>     if (arr > x) & (arr2 == y)
>
> This "works" because both arrays being bitwise-anded are boolean arrays.
There are a few problems with using the bitwise operators.
First, and most important in my opinion, is that the precedence is significantly off from that of the logical operators.
If you are switching back and forth between, say, array logical operations and "normal" logical operations it is easy to mess up.
Third is that it allows both boolean and bitwise operations to be carried out on the same data types. Numpy is a special case where the two basically are equivalent if you are working with boolean arrays. But that is a special case.
So any new class that doesn't already make use of the bitwise operators can do that.
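The precedence difference is easy to reproduce with plain integers (a small sketch; the values of `x` and `y` are arbitrary):

```python
x, y = 1, 3

# Logical 'and' binds more loosely than comparisons, so this reads naturally.
with_and = x == 1 and y > 2            # (x == 1) and (y > 2)

# '&' binds more tightly than comparisons, so the same shape parses as the
# chained comparison  x == (1 & y) > 2.
with_amp = x == 1 & y > 2

# Explicit parentheses restore the intended meaning.
with_parens = (x == 1) & (y > 2)

print(with_and, with_amp, with_parens)  # True False True
```

The unparenthesised `&` version is valid Python and raises no error, which is what makes the mistake easy to miss.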
Coming back to the previous discussion about a new set of overloadable boolean operators [1], I have an idea for overloadable boolean operators that I think might work. The idea would be to define four new operators that take two inputs and return a boolean result based on them. This behavior can be overridden in appropriate dunder methods. These operators would have similar precedence to existing logical operators. The operators would be:
bNOT - boolean "not"
bAND - boolean "and"
bOR - boolean "or"
bXOR - boolean "xor"
With corresponding dunder methods:
__bNOT__ and __rbNOT__ (or __r_bNOT__)
__bAND__ and __rbAND__ (or __r_bAND__)
__bOR__ and __rbOR__ (or __r_bOR__)
__bXOR__ and __rbXOR__ (or __r_bXOR__)
The basic idea is that the "b" is short for "boolean", and we change the rest of the operator to uppercase to avoid confusion with the existing operators. I think these operators would be preferable to the proposals so far (see [1] again) for a few reasons:
1. They are not easy to mistake for existing operators. They are clearly not similar to the existing bitwise operators like & or |, and although they are clearly related to "not", "and", and "or", I think they are distinct enough that it should not be easy to confuse the two or accidentally use one in place of the other.
2. They are related to the operations they carry out, which is also an advantage over the existing bitwise operators.
3. The corresponding dunder methods (such as __bAND__ and __rbAND__) are obvious and not easily confused with anything else.
4. The unusual capitalization means they are not likely to be used much in existing Python code. It doesn't fall under any standard capitalization scheme I am aware of.
5. At least for English, the capitalization means they are not easy to confuse with existing words. For example, "Band" is a word, but it is not likely to be capitalized as bAND.
As to why this is useful, the overall problem is that the current logical operators, like and, or, and not, cannot be overloaded, which means projects like numpy and SQLAlchemy instead have to (ab)use bitwise operators to define their own boolean operations (for example elementwise "and" in numpy arrays). This has a variety of problems, such as not having appropriate precedence, which makes precedence errors common, and the simple fact that this precludes them from using the bitwise operators for bitwise operations.
There was a proposal to allow overloading boolean operators in Pep-335 [2], but that PEP was rejected for a variety of very good reasons. I think none of those reasons (besides the conversation fizzling out) apply to my proposal.
So the alternative proposal that has been floating around is to instead define new operators specifically for this. Although there seemed to be some support for this in principle, the actual operators proposed so far have not met with much enthusiasm. The main candidates seem to be:
1. Double bitwise operators, such as && and ||. These have the disadvantage of looking like they should be a type of bitwise operator.
2. The existing operators, with some non-letter character at the front and back, like ".and.". These have the advantage that they are currently not valid syntax in most cases, but I think they are too similar to existing logical operators, too easy to confuse, and it is not immediately obvious in what way they should differ from existing operators. They also mean different things in other languages.
So I think my proposal addresses the main issues raised with existing proposals, but has the downside that it requires new keywords.
Thoughts?
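Since `bAND` is not valid syntax today, the proposed dispatch can only be approximated with an ordinary function. The sketch below is one guess at the intended semantics, modelled on how `+` uses `__add__`/`__radd__`: the `b_and` function, the `Flags` class, and the fallback to `bool(left) and bool(right)` are all assumptions, not part of the proposal text.

```python
def b_and(left, right):
    """Emulate the proposed 'left bAND right' with an ordinary function."""
    method = getattr(type(left), "__bAND__", None)
    if method is not None:
        result = method(left, right)
        if result is not NotImplemented:
            return result
    reflected = getattr(type(right), "__rbAND__", None)
    if reflected is not None:
        result = reflected(right, left)
        if result is not NotImplemented:
            return result
    # Assumed default: a plain boolean result for objects without the hooks.
    return bool(left) and bool(right)


class Flags:
    """Hypothetical container applying bAND element by element."""

    def __init__(self, values):
        self.values = list(values)

    def __bAND__(self, other):
        if isinstance(other, Flags):
            return Flags(a and b for a, b in zip(self.values, other.values))
        return Flags(v and bool(other) for v in self.values)

    def __rbAND__(self, other):
        return Flags(bool(other) and v for v in self.values)


plain = b_and(1, 0)
both = b_and(Flags([True, False]), Flags([True, True]))
mixed = b_and(True, Flags([True, False]))
print(plain)         # False
print(both.values)   # [True, False]
print(mixed.values)  # [True, False]
```

Note that, being a function call, this emulation always evaluates both operands, so it says nothing about whether the real operators would short-circuit.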
Also, not having xor is made more painful by this proposal (or for any proposal for new Boolean operators using variants of and/or/not)...
I have been bitten a few times writing xor in my code (not often, because xor is done less often); it already feels like it's missing from Python. With additional duplicated operators, including bXOR, the missing xor is annoying like a missing tooth: even if you don't use it so much, you think of it all the time ;-)
Greg.
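For reference, the usual ways of spelling a logical xor in current Python look roughly like this (a sketch; `logical_xor` is a made-up helper name, while `operator.xor` is the standard-library function behind `^`):

```python
from operator import xor

def logical_xor(a, b):
    """True when exactly one operand is truthy (hypothetical helper name)."""
    return bool(a) != bool(b)

print(logical_xor("text", ""))  # True
print(logical_xor(1, 2))        # False
print(xor(True, False))         # True: operator.xor is '^', fine on real bools
print(1 ^ 2)                    # 3: on ints '^' is bitwise, not logical
```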
Adding 4 operators, just for the sake of a bit of syntactic sugar for
DSL based projects is never going to fly.
And I say that as a long time SQLA user.
On 03/08/2018 at 19:46, Todd wrote:
> Coming back to the previous discussion about a new set of overloadable
> boolean operators [1], I have an idea for overloadable boolean operators
> that I think might work. The idea would be to define four new operators
> that take two inputs and return a boolean result based on them. This
> behavior can be overridden in appropriate dunder methods. These
> operators would have similar precedence to existing logical operators.
> The operators would be:
>
> bNOT - boolean "not"
> bAND - boolean "and"
> bOR - boolean "or"
> bXOR - boolean "xor"
>
> With corresponding dunder methods:
>
> __bNOT__ and __rbNOT__ (or __r_bNOT__)
> __bAND__ and __rbAND__ (or __r_bAND__)
> __bOR__ and __rbOR__ (or __r_bOR__)
> __bXOR__ and __rbXOR__ (or __r_bXOR__)
>
> The basic idea is that the "b" is short for "boolean", and we change the
> rest of the operator to uppercase to avoid confusion with the existing
> operators. I think these operators would be preferable to the proposals
> so far (see [1] again) for a few reasons: