numpy.where() and multiple comparisons

John Ladasky

Jan 17, 2014, 8:51:17 PM
Hi folks,

I am awaiting my approval to join the numpy-discussion mailing list, at scipy.org. I realize that would be the best place to ask my question. However, numpy is so widely used, I figure that someone here would be able to help.

I like to use numpy.where() to select parts of arrays. I have encountered what I would consider to be a bug when you try to use where() in conjunction with the multiple comparison syntax of Python. Here's a minimal example:

Python 3.3.2+ (default, Oct 9 2013, 14:50:09)
[GCC 4.8.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> b = np.where(a < 5)
>>> b
(array([0, 1, 2, 3, 4]),)
>>> c = np.where(2 < a < 7)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Defining b works as I want and expect. The array contains the indices (not the values) of a where a < 5.

For my definition of c, I expect (array([3, 4, 5, 6]),). As you can see, I get a ValueError instead. I have seen the error message about "the truth value of an array with more than one element" before, and generally I understand how I (accidentally) provoke it. This time, I don't see it. In defining c, I expect to be stepping through a, one element at a time, just as I did when defining b.

Does anyone understand why this happens? Is there a smart work-around? Thanks.

duncan smith

Jan 17, 2014, 9:16:28 PM
>>> a = np.arange(10)
>>> c = np.where((2 < a) & (a < 7))
>>> c
(array([3, 4, 5, 6]),)
>>>
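
The same selection can be written in a few equivalent ways. The following is a minimal sketch, assuming NumPy is imported as np; the names mask, mask_fn, indices and values are illustrative and do not come from the thread.

```python
import numpy as np

a = np.arange(10)

# Element-wise AND of two boolean masks. The parentheses are required
# because & binds more tightly than the comparison operators.
mask = (2 < a) & (a < 7)

# np.logical_and is the function form of the same element-wise operation.
mask_fn = np.logical_and(2 < a, a < 7)

indices = np.where(mask)[0]   # positions where the mask is True
values = a[mask]              # the selected elements themselves
```

np.where(mask) returns a tuple with one index array per dimension, which is why the sketch takes element [0] for this one-dimensional array.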

Duncan

John Ladasky

Jan 17, 2014, 11:00:20 PM
On Friday, January 17, 2014 6:16:28 PM UTC-8, duncan smith wrote:

> >>> a = np.arange(10)
> >>> c = np.where((2 < a) & (a < 7))
> >>> c
> (array([3, 4, 5, 6]),)

Nice! Thanks!

Now, why does the multiple comparison fail, if you happen to know?

Peter Otten

Jan 18, 2014, 3:50:00 AM
to pytho...@python.org
2 < a < 7

is equivalent to

2 < a and a < 7

Unlike `&`, `and` cannot be overridden (*), so the above implies that the
boolean value bool(2 < a) is evaluated. That triggers the error because the
numpy authors refused to guess -- and rightly so, as both implementable
options, any() and all(), would be wrong in a common case like yours.

(*) I assume overriding would collide with the short-circuiting of boolean
expressions.
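
To make the "refused to guess" point concrete, here is a small sketch, assuming NumPy is imported as np. The names left, raised, if_any and if_all are illustrative only. It shows that bool() of a multi-element array raises, and what `2 < a and a < 7` would have produced under each of the two possible guesses.

```python
import numpy as np

a = np.arange(10)
left = 2 < a            # element-wise result: a boolean array, not a single bool

try:
    bool(left)          # this is what `and` must do with its left operand
    raised = False
except ValueError:
    raised = True

# `x and y` yields y when x is truthy and x itself when x is falsy.
# If bool(left) meant left.any() (True here), the result would be `a < 7`:
if_any = np.where(a < 7)[0]
# If bool(left) meant left.all() (False here), the result would be `left`:
if_all = np.where(left)[0]
```

Neither result is the intended [3, 4, 5, 6]: the any() guess silently drops the lower bound and the all() guess silently drops the upper bound.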

Tim Roberts

Jan 18, 2014, 4:20:54 PM
Peter Otten <__pet...@web.de> wrote:

>John Ladasky wrote:
>
>> On Friday, January 17, 2014 6:16:28 PM UTC-8, duncan smith wrote:
>>
>>> >>> a = np.arange(10)
>>> >>> c = np.where((2 < a) & (a < 7))
>>> >>> c
>>> (array([3, 4, 5, 6]),)
>>
>> Nice! Thanks!
>>
>> Now, why does the multiple comparison fail, if you happen to know?
>
>2 < a < 7
>
>is equivalent to
>
>2 < a and a < 7
>
>Unlike `&` `and` cannot be overridden (*) ...

And just in case it isn't obvious to the original poster, the expression "2
< a" only works because the numpy.ndarray class overrides the comparison
operators. Python natively has no idea how to compare an integer to a
numpy.ndarray object, so it hands the comparison to the array's reflected
method.

Similarly, (2 < a) & (a < 7) works because numpy.ndarray has an override for
the "&" operator. So, that expression is evaluated roughly as

numpy.ndarray.__and__(
    numpy.ndarray.__gt__(a, 2),
    numpy.ndarray.__lt__(a, 7)
)

As Peter said, there's no way to override the "and" operator.
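
The dispatch can be spelled out by calling the special methods directly. This is only a sketch of what the operators do, assuming NumPy is imported as np, and nobody would write it this way in practice. Because the integer sits on the left of `2 < a`, int declines the comparison and Python falls back to the array's reflected method, __gt__.

```python
import numpy as np

a = np.arange(10)

# int does not know how to compare itself to an array, so it declines...
declined = (2).__lt__(a) is NotImplemented

# ...and Python falls back to the array's reflected method for `2 < a`.
left = np.ndarray.__gt__(a, 2)     # same result as 2 < a
right = np.ndarray.__lt__(a, 7)    # same result as a < 7

# `&` dispatches to __and__, which combines the masks element by element.
combined = np.ndarray.__and__(left, right)
indices = np.where(combined)[0]
```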
--
Tim Roberts, ti...@probo.com
Providenza & Boekelheide, Inc.

Terry Reedy

Jan 18, 2014, 7:12:35 PM
to pytho...@python.org
On 1/18/2014 3:50 AM, Peter Otten wrote:

> Unlike `&` `and` cannot be overridden (*),

> (*) I assume overriding would collide with short-cutting of boolean
> expressions.

Yes. 'and' could be called a 'control-flow operator', but in Python it
is not a functional operator.

A functional binary operator expression like 'a + b' abbreviates a
function call, without using (). In this case, it could be written
'operator.add(a, b)'. This function, or its internal equivalent, calls
either a.__add__(b) or b.__radd__(a) or both. It is the overloading of
the special methods that overrides the operator.

The control flow expression 'a and b' cannot abbreviate a function call
because Python calls always evaluate all arguments first. It is
equivalent* to the conditional (control flow) *expression* (also not a
function operator) 'a if not a else b'. Evaluation of either expression
calls bool(a) and hence a.__bool__ or a.__len__.

'a or b' is equivalent* to 'a if a else b'

* 'a (and/or) b' evaluates 'a' once, whereas 'a if (not/)a else b'
evaluates 'a' twice. This is not equivalent when there are side-effects.
Here is an example where this matters.
input('enter a non-0 number :') or 1
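
The difference between the two kinds of operator can be observed with a small logging class. This is a sketch only: Loud and read are hypothetical helpers invented for illustration and appear nowhere else in this thread.

```python
import operator

class Loud:
    """Hypothetical helper that records which special methods get called."""
    def __init__(self, value, log):
        self.value = value
        self.log = log
    def __bool__(self):
        self.log.append("bool")
        return bool(self.value)
    def __add__(self, other):
        self.log.append("add")
        return Loud(self.value + other.value, self.log)

log = []
x, y = Loud(0, log), Loud(5, log)

total = operator.add(x, y)   # same dispatch as x + y: calls x.__add__(y)
result = x and y             # calls x.__bool__(); x is falsy, so x is returned
fallback = x or y            # calls x.__bool__() again; x is falsy, so y is returned

# The footnote about side effects: the operand of `or` runs once,
# while the conditional expression runs it twice when it is truthy.
calls = []
def read():
    calls.append("read")
    return 3

once = read() or 1                 # one call to read()
twice = read() if read() else 1    # two calls to read()
```

After this runs, log holds one "add" entry followed by two "bool" entries, and calls holds three entries in total: one from the `or` expression and two from the conditional expression.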

--
Terry Jan Reedy
