[Numpy-discussion] numpy all unexpected result (generator)


Neal Becker

Jan 31, 2012, 8:26:34 AM
to numpy-di...@scipy.org
I was just bitten by this unexpected behavior:

In [24]: all([i > 0 for i in xrange(10)])
Out[24]: False

In [25]: all(i > 0 for i in xrange(10))
Out[25]: True

Turns out:
In [31]: all is numpy.all
Out[31]: True

So numpy.all doesn't seem to do what I would expect when given a generator.
Bug?

_______________________________________________
NumPy-Discussion mailing list
NumPy-Di...@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Robert Kern

Jan 31, 2012, 9:07:31 AM
to Discussion of Numerical Python
On Tue, Jan 31, 2012 at 13:26, Neal Becker <ndbe...@gmail.com> wrote:
> I was just bitten by this unexpected behavior:
>
> In [24]: all ([i>  0 for i in xrange (10)])
> Out[24]: False
>
> In [25]: all (i>  0 for i in xrange (10))
> Out[25]: True
>
> Turns out:
> In [31]: all is numpy.all
> Out[31]: True
>
> So numpy.all doesn't seem to do what I would expect when given a generator.
> Bug?

Expected behavior. numpy.all(), like nearly all numpy functions,
converts the input to an array using numpy.asarray(). numpy.asarray()
knows nothing special about generators and other iterables that are
not sequences, so it thinks it's a single scalar object. This scalar
object happens to have a __nonzero__() method that returns True like
most Python objects that don't override this.

In order to use generic iterators that are not sequences, you need to
explicitly use numpy.fromiter() to convert them to ndarrays. asarray()
and array() can't do it in general because they need to autodiscover
the shape and dtype all at the same time.
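
A minimal sketch of the difference described above, assuming Python 3 (so range stands in for the thread's xrange) and a reasonably recent NumPy. It only inspects what asarray() produces and makes no claim about what np.all() returns for that object:

```python
import numpy as np

# asarray() does not iterate a generator; it wraps the generator object
# itself in a 0-dimensional object array.
wrapped = np.asarray(i > 0 for i in range(10))
print(wrapped.shape, wrapped.dtype)  # () object

# fromiter() consumes an iterable of scalars, given an explicit dtype.
flags = np.fromiter((i > 0 for i in range(10)), dtype=bool)
print(flags.shape)   # (10,)
print(flags.all())   # False, because 0 > 0 is False
```

The variable names (wrapped, flags) are illustrative only.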

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

Dag Sverre Seljebotn

Jan 31, 2012, 9:14:24 AM
to numpy-di...@scipy.org
On 01/31/2012 03:07 PM, Robert Kern wrote:
> On Tue, Jan 31, 2012 at 13:26, Neal Becker<ndbe...@gmail.com> wrote:
>> I was just bitten by this unexpected behavior:
>>
>> In [24]: all ([i> 0 for i in xrange (10)])
>> Out[24]: False
>>
>> In [25]: all (i> 0 for i in xrange (10))
>> Out[25]: True
>>
>> Turns out:
>> In [31]: all is numpy.all
>> Out[31]: True
>>
>> So numpy.all doesn't seem to do what I would expect when given a generator.
>> Bug?
>
> Expected behavior. numpy.all(), like nearly all numpy functions,
> converts the input to an array using numpy.asarray(). numpy.asarray()
> knows nothing special about generators and other iterables that are
> not sequences, so it thinks it's a single scalar object. This scalar
> object happens to have a __nonzero__() method that returns True like
> most Python objects that don't override this.
>
> In order to use generic iterators that are not sequences, you need to
> explicitly use numpy.fromiter() to convert them to ndarrays. asarray()
> and array() can't do it in general because they need to autodiscover
> the shape and dtype all at the same time.

Perhaps np.asarray could specifically check for a generator argument and
raise an exception? I imagine that would save people some time when
running into this...

If you really want

In [7]: x = np.asarray(None)

In [8]: x[()] = (i for i in range(10))

In [9]: x
Out[9]: array(<generator object <genexpr> at 0x4553fa0>, dtype=object)

...then one can type it out?
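
The check proposed above could look roughly like this. It is a sketch only: strict_asarray is a hypothetical wrapper name, not a NumPy function, and it assumes that rejecting only true generator objects (types.GeneratorType) is enough.

```python
import types

import numpy as np


def strict_asarray(obj, dtype=None):
    """Hypothetical wrapper (not part of NumPy): refuse generators."""
    if isinstance(obj, types.GeneratorType):
        # Fail loudly instead of silently building a 0-d object array.
        raise TypeError(
            "generator passed where an array_like was expected; "
            "use numpy.fromiter() or list() first"
        )
    return np.asarray(obj, dtype=dtype)


print(strict_asarray([1, 2, 3]).shape)  # (3,)
try:
    strict_asarray(i for i in range(10))
except TypeError as exc:
    print("rejected:", exc)
```

Other non-sequence iterables (map objects, file objects and so on) would still slip through this check.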

Dag

Neal Becker

Jan 31, 2012, 9:33:55 AM
to numpy-di...@scipy.org
Dag Sverre Seljebotn wrote:

The reason it surprised me, is that python 'all' doesn't behave as numpy 'all'
in this respect - and using ipython, I didn't even notice that 'all' was
numpy.all rather than standard python all. All in all, rather unfortunate :)
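
For anyone checking whether their session has the same shadowing, a small sketch, assuming Python 3. The assignment below only mimics what a pylab-style star import does to the namespace:

```python
import builtins

import numpy as np

# Mimic the effect of a star import on the interactive namespace.
all = np.all

shadowed = all is not builtins.all
print(shadowed)  # True

# The builtin stays reachable through the builtins module and
# actually consumes the generator.
result = builtins.all(i > 0 for i in range(10))
print(result)  # False, because 0 > 0 is False

# Undo the shadowing so later code sees the builtin again.
del all
```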

Alan G Isaac

Jan 31, 2012, 10:03:55 AM
to Discussion of Numerical Python
On 1/31/2012 8:26 AM, Neal Becker wrote:
> I was just bitten by this unexpected behavior:
>
> In [24]: all ([i> 0 for i in xrange (10)])
> Out[24]: False
>
> In [25]: all (i> 0 for i in xrange (10))
> Out[25]: True
>
> Turns out:
> In [31]: all is numpy.all
> Out[31]: True


>>> np.array([i> 0 for i in xrange (10)])
array([False, True, True, True, True, True, True, True, True, True], dtype=bool)
>>> np.array(i> 0 for i in xrange (10))
array(<generator object <genexpr> at 0x0267A210>, dtype=object)
>>> import this


Cheers,
Alan

Benjamin Root

Jan 31, 2012, 10:13:54 AM
to Discussion of Numerical Python


On Tuesday, January 31, 2012, Alan G Isaac <alan....@gmail.com> wrote:
> On 1/31/2012 8:26 AM, Neal Becker wrote:
>> I was just bitten by this unexpected behavior:
>>
>> In [24]: all ([i>   0 for i in xrange (10)])
>> Out[24]: False
>>
>> In [25]: all (i>   0 for i in xrange (10))
>> Out[25]: True
>>
>> Turns out:
>> In [31]: all is numpy.all
>> Out[31]: True
>
>
>>>> np.array([i>  0 for i in xrange (10)])
> array([False,  True,  True,  True,  True,  True,  True,  True,  True,  True], dtype=bool)
>>>> np.array(i>  0 for i in xrange (10))
> array(<generator object <genexpr> at 0x0267A210>, dtype=object)
>>>> import this
>
>
> Cheers,
> Alan
>

Is np.all() using np.array() or np.asanyarray()?  If the latter, I would expect it to return a numpy array from a generator.  If the former, why isn't it using asanyarray()?

Ben Root

Robert Kern

Jan 31, 2012, 10:18:26 AM
to Discussion of Numerical Python
On Tue, Jan 31, 2012 at 15:13, Benjamin Root <ben....@ou.edu> wrote:

> Is np.all() using np.array() or np.asanyarray()?  If the latter, I would
> expect it to return a numpy array from a generator.

Why would you expect that?

[~/scratch]
|37> np.asanyarray(i>5 for i in range(10))
array(<generator object <genexpr> at 0xdc24a08>, dtype=object)

--
Robert Kern

Dag Sverre Seljebotn

Jan 31, 2012, 10:19:29 AM
to numpy-di...@scipy.org
On 01/31/2012 04:13 PM, Benjamin Root wrote:
>
>
> On Tuesday, January 31, 2012, Alan G Isaac <alan....@gmail.com

Your expectation is probably wrong:

In [12]: np.asanyarray(i for i in range(10))
Out[12]: array(<generator object <genexpr> at 0x455d9b0>, dtype=object)

Dag Sverre

Benjamin Root

Jan 31, 2012, 10:35:38 AM
to Discussion of Numerical Python
On Tue, Jan 31, 2012 at 9:18 AM, Robert Kern <rober...@gmail.com> wrote:
> On Tue, Jan 31, 2012 at 15:13, Benjamin Root <ben....@ou.edu> wrote:
>
>> Is np.all() using np.array() or np.asanyarray()?  If the latter, I would
>> expect it to return a numpy array from a generator.
>
> Why would you expect that?
>
> [~/scratch]
> |37> np.asanyarray(i>5 for i in range(10))
> array(<generator object <genexpr> at 0xdc24a08>, dtype=object)
>
> --
> Robert Kern

What possible use-case could there be for a numpy array of generators?  Furthermore, from the documentation:

numpy.asanyarray = asanyarray(a, dtype=None, order=None, maskna=None, ownmaskna=False)
     Convert the input to an ndarray, but pass ndarray subclasses through.
   
     Parameters
     ----------
     a : array_like
         *Input data, in any form that can be converted to an array*.  This
         includes scalars, lists, lists of tuples, tuples, tuples of tuples,
         tuples of lists, and ndarrays.
 
Emphasis mine.  A generator is an input that could be converted into an array.  (Setting aside the issue of non-terminating generators such as those from cycle()).

Ben Root

Dag Sverre Seljebotn

Jan 31, 2012, 10:46:01 AM
to numpy-di...@scipy.org
On 01/31/2012 04:35 PM, Benjamin Root wrote:
>
>
> On Tue, Jan 31, 2012 at 9:18 AM, Robert Kern <rober...@gmail.com
> <mailto:rober...@gmail.com>> wrote:
>
> On Tue, Jan 31, 2012 at 15:13, Benjamin Root <ben....@ou.edu
> <mailto:ben....@ou.edu>> wrote:
>
> > Is np.all() using np.array() or np.asanyarray()? If the latter,
> I would
> > expect it to return a numpy array from a generator.
>
> Why would you expect that?
>
> [~/scratch]
> |37> np.asanyarray(i>5 for i in range(10))
> array(<generator object <genexpr> at 0xdc24a08>, dtype=object)
>
> --
> Robert Kern
>
>
> What possible use-case could there be for a numpy array of generators?
> Furthermore, from the documentation:
>
> numpy.asanyarray = asanyarray(a, dtype=None, order=None, maskna=None,
> ownmaskna=False)
> Convert the input to an ndarray, but pass ndarray subclasses through.
>
> Parameters
> ----------
> a : array_like
> *Input data, in any form that can be converted to an array*. This

> includes scalars, lists, lists of tuples, tuples, tuples of
> tuples,
> tuples of lists, and ndarrays.
>
> Emphasis mine. A generator is an input that could be converted into an
> array. (Setting aside the issue of non-terminating generators such as
> those from cycle()).

Splitting semantic hairs doesn't help here -- it *does* return an array,
it just happens to be a completely useless 0-dimensional one.

The question is, is the current behavior confusing and less than useful?
(I vote for "yes"). list and tuple are special-cased, so why not generators
(at least to raise an exception)?

Going OT, look at this gem:

????

In [3]: a
Out[3]: array([1, 2, 3], dtype=object)

In [4]: a.shape
Out[4]: ()

???

In [9]: b
Out[9]: array([1, 2, 3], dtype=object)

In [10]: b.shape
Out[10]: (3,)

Figuring out the "???" is left as an exercise to the reader :-)

Dag Sverre

Alan G Isaac

Jan 31, 2012, 10:48:05 AM
to Discussion of Numerical Python
On 1/31/2012 10:35 AM, Benjamin Root wrote:
> A generator is an input that could be converted into an array.


def mygen():
    i = 0
    while True:
        yield i
        i += 1
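
A generator like this can still be turned into an array if the caller bounds it explicitly. A sketch, assuming Python 3 and using fromiter() with either itertools.islice or the count argument:

```python
import itertools

import numpy as np


def mygen():
    # Same non-terminating generator as above.
    i = 0
    while True:
        yield i
        i += 1


# Bound the stream before handing it to NumPy...
first_five = np.fromiter(itertools.islice(mygen(), 5), dtype=np.int64)

# ...or let fromiter() stop after a fixed number of items.
also_five = np.fromiter(mygen(), dtype=np.int64, count=5)

print(first_five)  # [0 1 2 3 4]
print(also_five)   # [0 1 2 3 4]
```

Without such a bound, any attempt to exhaust mygen() never returns.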

Alan Isaac

Robert Kern

Jan 31, 2012, 10:50:15 AM
to Discussion of Numerical Python
On Tue, Jan 31, 2012 at 15:35, Benjamin Root <ben....@ou.edu> wrote:
>
>
> On Tue, Jan 31, 2012 at 9:18 AM, Robert Kern <rober...@gmail.com> wrote:
>>
>> On Tue, Jan 31, 2012 at 15:13, Benjamin Root <ben....@ou.edu> wrote:
>>
>> > Is np.all() using np.array() or np.asanyarray()?  If the latter, I would
>> > expect it to return a numpy array from a generator.
>>
>> Why would you expect that?
>>
>> [~/scratch]
>> |37> np.asanyarray(i>5 for i in range(10))
>> array(<generator object <genexpr> at 0xdc24a08>, dtype=object)
>>
>> --
>> Robert Kern
>
>
> What possible use-case could there be for a numpy array of generators?

Not many. This isn't an intentional feature, just a logical
consequence of all of the other intentional features being applied
consistently.

> Furthermore, from the documentation:
>
> numpy.asanyarray = asanyarray(a, dtype=None, order=None, maskna=None,
> ownmaskna=False)
>      Convert the input to an ndarray, but pass ndarray subclasses through.
>
>      Parameters
>      ----------
>      a : array_like
>          Input data, in any form that can be converted to an array.  This
>          includes scalars, lists, lists of tuples, tuples, tuples of tuples,
>          tuples of lists, and ndarrays.
>
> Emphasis mine.  A generator is an input that could be converted into an
> array.  (Setting aside the issue of non-terminating generators such as those
> from cycle()).

I'm sorry, but this is not true. In general, it's too hard to do all
of the magic autodetermination that asarray() and array() do when
faced with an indeterminate-length iterable. We tried. That's why we
have fromiter(). By restricting the domain to an iterable yielding
scalars and requiring that the user specify the desired dtype,
fromiter() can figure out the rest.

Like it or not, "array_like" is practically defined by the behavior of
np.asarray(), not vice-versa.
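
To illustrate the restricted domain, a short sketch assuming Python 3: fromiter() takes an iterable of scalars plus an explicit dtype, while anything with nested structure has to be materialized into a sequence before array() can discover its shape.

```python
import numpy as np

# An iterable of scalars with a user-supplied dtype: fromiter() can
# work out the (1-d) shape on its own.
squares = np.fromiter((i * i for i in range(5)), dtype=float)
print(squares.dtype, squares.shape)  # float64 (5,)

# For rows of values, build a real sequence first so that array()
# can autodiscover shape and dtype from it.
rows = np.array(list((i, i * i) for i in range(3)))
print(rows.shape)  # (3, 2)
```

The names squares and rows are illustrative only.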

Olivier Delalleau

Jan 31, 2012, 11:05:34 AM
to Discussion of Numerical Python

In that case I agree with whoever said earlier that it would be best to detect this case and throw an exception, as it'll probably save some headaches.

-=- Olivier

Robert Kern

Jan 31, 2012, 11:11:19 AM
to Discussion of Numerical Python
On Tue, Jan 31, 2012 at 15:35, Benjamin Root <ben....@ou.edu> wrote:

> Furthermore, from the documentation:
>
> numpy.asanyarray = asanyarray(a, dtype=None, order=None, maskna=None,
> ownmaskna=False)
>      Convert the input to an ndarray, but pass ndarray subclasses through.
>
>      Parameters
>      ----------
>      a : array_like
>          Input data, in any form that can be converted to an array.  This
>          includes scalars, lists, lists of tuples, tuples, tuples of tuples,
>          tuples of lists, and ndarrays.

I should also add that this verbiage is also in np.asarray(). The only
additional feature of np.asanyarray() is that it does not convert
ndarray subclasses like matrix to ndarray objects. np.asanyarray()
does not accept more types of objects than np.asarray().
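
A small sketch of that difference, assuming Python 3. It uses a masked array as the ndarray subclass instead of matrix, purely as an illustrative choice:

```python
import numpy as np

masked = np.ma.masked_array([1, 2, 3], mask=[False, True, False])

base = np.asarray(masked)     # subclass is dropped, plain ndarray
kept = np.asanyarray(masked)  # subclass is passed through
print(type(base).__name__, type(kept).__name__)  # ndarray MaskedArray

# Neither function iterates a generator; both wrap it as a 0-d object array.
gen_a = np.asarray(i for i in range(3))
gen_b = np.asanyarray(i for i in range(3))
print(gen_a.shape, gen_b.shape)  # () ()
```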

Benjamin Root

Jan 31, 2012, 11:46:56 AM
to Discussion of Numerical Python

I'll agree with this statement.  This bug has popped up a few times in the mpl bug tracker due to the pylab mode.  While I would prefer if it were possible to evaluate the generator into an array, silently returning True incorrectly for all() and any() is probably far worse.

That said, is it still impossible to special-case np.all() and np.any() so that they behave like the built-in all() and any()?  Maybe they could catch the above exception and then return the result from Python's built-ins?
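
Roughly what such a fallback might look like, as a sketch only: all_compat is a hypothetical name, not a NumPy function, and it checks for generators up front rather than catching an exception (no such exception exists in the behavior discussed in this thread).

```python
import builtins
import types

import numpy as np


def all_compat(a, *args, **kwargs):
    """Hypothetical helper: builtin semantics for generators, np.all otherwise."""
    if isinstance(a, types.GeneratorType):
        if args or kwargs:
            # axis/out/keepdims have no meaning for a plain generator.
            raise TypeError("extra arguments are not supported for generators")
        return builtins.all(a)
    return np.all(a, *args, **kwargs)


print(all_compat(i > 0 for i in range(10)))                     # False
print(all_compat(np.array([[True, False], [True, True]]), axis=0))  # [ True False]
```

One design question this leaves open is whether other lazy iterables should get the same treatment.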

Cheers!
Ben Root

Chris Barker

Jan 31, 2012, 12:07:20 PM
to Discussion of Numerical Python
On Tue, Jan 31, 2012 at 6:33 AM, Neal Becker <ndbe...@gmail.com> wrote:
> The reason it surprised me, is that python 'all' doesn't behave as numpy 'all'
> in this respect - and using ipython, I didn't even notice that 'all' was
> numpy.all rather than standard python all.

"namespaces are one honking great idea"

-- sorry, I couldn't help myself....


--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris....@noaa.gov

Travis Oliphant

Jan 31, 2012, 5:17:28 PM
to Discussion of Numerical Python
I also agree that an exception should be raised at the very least.

It might also be possible to make the NumPy any, all, and sum functions behave like the builtins when given a generator.  It seems worth exploring at least.

Travis 

--
Travis Oliphant
(on a mobile)

Robert Kern

Jan 31, 2012, 5:22:17 PM
to Discussion of Numerical Python
On Tue, Jan 31, 2012 at 22:17, Travis Oliphant <tra...@continuum.io> wrote:
> I also agree that an exception should be raised at the very least.
>
> It might also be possible to make the NumPy any, all, and sum functions
> behave like the builtins when given a generator.  It seems worth exploring
> at least.

I would rather we deprecate the all() and any() functions in favor of
the alltrue() and sometrue() aliases that date back to Numeric.
Renaming them to match the builtin names was a mistake.

--
Robert Kern

Warren Weckesser

Jan 31, 2012, 5:25:33 PM
to Discussion of Numerical Python
On Tue, Jan 31, 2012 at 4:22 PM, Robert Kern <rober...@gmail.com> wrote:
> On Tue, Jan 31, 2012 at 22:17, Travis Oliphant <tra...@continuum.io> wrote:
>> I also agree that an exception should be raised at the very least.
>>
>> It might also be possible to make the NumPy any, all, and sum functions
>> behave like the builtins when given a generator.  It seems worth exploring
>> at least.
>
> I would rather we deprecate the all() and any() functions in favor of
> the alltrue() and sometrue() aliases that date back to Numeric.


+1  (Maybe 'anytrue' for consistency?  (And a royal blue bike shed?))

Warren

Travis Oliphant

Jan 31, 2012, 5:35:18 PM
to Discussion of Numerical Python
Actually I believe the NumPy 'any' and 'all' names pre-date the Python usage, which first appeared in Python 2.5.

I agree with Chris that namespaces are a great idea. I don't agree with deprecating 'any' and 'all'.

It also seems useful to revisit under what conditions 'array' could correctly interpret a generator expression, but in the context of streaming or deferred arrays.

Travis


--
Travis Oliphant
(on a mobile)
512-826-7480

josef...@gmail.com

Jan 31, 2012, 8:45:52 PM
to Discussion of Numerical Python
On Tue, Jan 31, 2012 at 5:35 PM, Travis Oliphant <tra...@continuum.io> wrote:
> Actually I believe the NumPy 'any' and 'all' names pre-date the Python usage, which first appeared in Python 2.5.
>
> I agree with Chris that namespaces are a great idea.  I don't agree with deprecating 'any' and 'all'

I completely agree here.

I also like to keep np.all, np.any, np.max, ...

>>> np.max((i> 0 for i in xrange (10)))
<generator object <genexpr> at 0x046493F0>
>>> max((i> 0 for i in xrange (10)))
True

I used an old-style matplotlib example as a recipe yesterday, and the
first thing I did was add the missing namespaces, and I had
to think twice about what amax and amin are.

aall, aany ??? ;)

Josef
