Is there a better way?
--
Neil Cerutti
class Iterable(object):
    def __iter__(self):
        pass

class NotIterable(object):
    pass

def is_iterable(thing):
    return '__iter__' in dir(thing)

print 'list is iterable = ', is_iterable(list())
print 'int is iterable = ', is_iterable(10)
print 'float is iterable = ', is_iterable(1.2)
print 'dict is iterable = ', is_iterable(dict())
print 'Iterable is iterable = ', is_iterable(Iterable())
print 'NotIterable is iterable = ', is_iterable(NotIterable())
Results:
list is iterable = True
int is iterable = False
float is iterable = False
dict is iterable = True
Iterable is iterable = True
NotIterable is iterable = False
> You can use the built-in dir() function to determine whether or not
> the __iter__ method exists:
Doesn't work:
In [58]: is_iterable('hello')
Out[58]: False
But strings *are* iterable.
And just calling `iter()` doesn't work either:
In [72]: class A:
   ....:     def __getitem__(self, key):
   ....:         if key == 42:
   ....:             return 'answer'
   ....:         raise KeyError
   ....:
In [73]: iter(A())
Out[73]: <iterator object at 0xb7829b2c>
In [74]: a = iter(A())
In [75]: a.next()
---------------------------------------------------------------------------
<type 'exceptions.KeyError'> Traceback (most recent call last)
/home/bj/<ipython console> in <module>()
/home/bj/<ipython console> in __getitem__(self, key)
<type 'exceptions.KeyError'>:
So there's no reliable way to test for "iterables" other than actually
iterating over the object.
Ciao,
Marc 'BlackJack' Rintsch
Testing for __iter__ alone is not enough:
>>> class X(object):
...     def __getitem__(self, i):
...         if i < 10: return i
...         else: raise IndexError, i
...
>>> x = X()
>>> is_iterable(x)
False
>>> iter(x)
<iterator object at 0xb7f0182c>
>>> for i in x: print i
...
0
1
2
3
4
5
6
7
8
9
No __iter__ in sight, but the object is iterable.
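
The same point can be sketched in Python 3 syntax (the thread itself uses Python 2). The class name `Countdown` is hypothetical. The sketch suggests that any check keyed on `__iter__`, including `isinstance(obj, collections.abc.Iterable)`, reports False for an object that `iter()` still accepts through the sequence protocol:

```python
from collections.abc import Iterable

class Countdown:
    """Hypothetical class: defines only __getitem__, no __iter__."""
    def __getitem__(self, i):
        if i < 3:
            return 3 - i
        raise IndexError(i)

c = Countdown()
has_dunder_iter = hasattr(c, '__iter__')     # attribute-based check
passes_abc_check = isinstance(c, Iterable)   # the ABC also looks for __iter__
items = list(c)                              # iter() falls back to __getitem__
```

Both checks come out False here, while `list(c)` still walks the indices 0, 1, 2 until `IndexError` ends the iteration.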
--
Carsten Haese
http://informixdb.sourceforge.net
I think you might have to check for __getitem__, also, which may
support the sequence protocol.
> def is_iterable(thing):
>     return '__iter__' in dir(thing)
So then:
def is_iterable(thing):
    return '__iter__' in dir(thing) or '__getitem__' in dir(thing)
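
A caveat with the combined check, sketched in Python 3 syntax: `__getitem__` is also what mappings define, so the check can answer True for an object that cannot be looped over. The class `Lookup` is hypothetical and stands in for any object that accepts only non-integer keys.

```python
def is_iterable(thing):
    # The attribute-based check from the post above.
    return '__iter__' in dir(thing) or '__getitem__' in dir(thing)

class Lookup:
    """Hypothetical mapping-like class: accepts only the key 'answer'."""
    def __getitem__(self, key):
        if key == 'answer':
            return 42
        raise KeyError(key)

lookup = Lookup()
check_result = is_iterable(lookup)   # the attribute check says yes

try:
    list(lookup)                     # iteration asks for lookup[0] and fails
    iteration_worked = True
except KeyError:
    iteration_worked = False
```

Only `IndexError` (or `StopIteration`) ends sequence-protocol iteration cleanly; the `KeyError` raised for index 0 propagates to the caller.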
Speaking of the iter builtin function, is there an example of the
use of the optional sentinel object somewhere I could see?
--
Neil Cerutti
-Jeff
> Speaking of the iter builtin function, is there an example of the
> use of the optional sentinel object somewhere I could see?
for line in iter(open('somefile.txt', 'r').readline, ''):
    print line
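
For readers who want to try this without a file on disk, here is a minimal sketch in Python 3 syntax. The `io.StringIO` object and its contents are stand-ins chosen for illustration, not part of the original example.

```python
import io

# An in-memory stand-in for a text file (hypothetical contents).
f = io.StringIO("first\nsecond\nthird\n")

# f.readline is called repeatedly until it returns '', which marks end of file.
lines = [line.rstrip('\n') for line in iter(f.readline, '')]
```

The sentinel works because `readline()` returns an empty string only at end of file; a blank line inside the file comes back as `'\n'` and does not stop the loop.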
A TypeError exception is perhaps too generic for comfort in this
use case:
def deeply_mapped(func, iterable):
    for item in iterable:
        try:
            for it in flattened(item):
                yield func(it)
        except TypeError:
            yield func(item)
I'd be more comfortable excepting some sort of IterationError (or
using an is_iterable function, of course). I guess there's always
itertools. ;)
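
One possible way to reduce the risk described here is to guard only the `iter()` call, so that a `TypeError` raised inside `func` is never mistaken for "not iterable". This is a minimal sketch in Python 3 syntax, not the approach settled on in the thread. The names `map_nested` and `picky` are hypothetical, and strings get no special treatment (a one-character string would recurse without end).

```python
def map_nested(func, iterable):
    """Apply func to every leaf of a nested iterable (hypothetical helper)."""
    for item in iterable:
        try:
            sub = iter(item)      # only this call is guarded
        except TypeError:
            sub = None            # not iterable: treat item as a leaf
        if sub is None:
            yield func(item)      # a TypeError raised by func propagates
        else:
            yield from map_nested(func, sub)

def picky(x):
    # Hypothetical function that raises TypeError for one particular value.
    if x == 3:
        raise TypeError('Three is a magic number')
    return x * 10

result = list(map_nested(picky, [1, [2, [4]]]))

try:
    list(map_nested(picky, [1, [[2, 3]], 4]))
    error_propagated = False
except TypeError:
    error_propagated = True
```

Because `func` is called outside the `try` block, its exceptions reach the caller unchanged.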
--
Neil Cerutti
You seem to say that your 'a' object is not iterable. I disagree. While
it's true that it raises an exception upon retrieval of the zeroth
iteration, that situation is quite different from attempting to iterate
over the number 10, where you can't even ask for a zeroth iteration.
To illustrate this point further, imagine you write an object that
iterates over the lines of text read from a socket. If the connection is
faulty and closes the socket before you read the first line, the zeroth
iteration raises an exception. Does that mean the object is iterable or
not depending on the reliability of the socket connection? I find that
notion hard to swallow.
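
The distinction can be sketched in Python 3 syntax. The class `FaultyLines` is hypothetical and stands in for the socket reader described above; `ConnectionError` is simply one plausible exception for it to raise.

```python
class FaultyLines:
    """Hypothetical iterable whose first iteration step always fails."""
    def __iter__(self):
        return self

    def __next__(self):
        raise ConnectionError('socket closed before the first line')

# Asking for an iterator succeeds; only fetching an item fails.
it = iter(FaultyLines())
try:
    next(it)
    first_item_error = None
except ConnectionError as exc:
    first_item_error = type(exc).__name__

# For an int, even asking for an iterator fails.
try:
    iter(10)
    int_error = None
except TypeError as exc:
    int_error = type(exc).__name__
```

In the first case the failure comes from the data source during iteration; in the second, `iter()` itself refuses the object.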
Example 1: If you use a DB-API module that doesn't support direct cursor
iteration with "for row in cursor", you can simulate it this way:
for row in iter(cursor.fetchone, None):
    # do something
Example 2: Reading a web page in chunks of 8kB:
f = urllib.urlopen(url)
for chunk in iter(lambda: f.read(8192), ""):
    # do something
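
The chunked-read pattern can be tried offline with an in-memory stream. This is a minimal sketch in Python 3 syntax; the `io.BytesIO` object and its 20000 bytes of data are assumptions made for illustration, and `functools.partial` is used in place of the lambda.

```python
import io
from functools import partial

# In-memory stand-in for a network response (hypothetical data).
f = io.BytesIO(b"x" * 20000)

# partial(f.read, 8192) plays the role of lambda: f.read(8192).
# A binary stream returns b'' at end of data, so b'' is the sentinel here.
chunk_sizes = [len(chunk) for chunk in iter(partial(f.read, 8192), b'')]
```

Note that in Python 3 a binary stream never returns the text string `""`, so the sentinel has to be `b''` for the loop to terminate.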
HTH,
Ah! Thanks for the examples. That's much simpler than I was
imagining. It's also somewhat evil, but I suppose it conserves a
global name to do it that way.
--
Neil Cerutti
> Speaking of the iter builtin function, is there an example of the
> use of the optional sentinel object somewhere I could see?
# iterate over random numbers from 0 to 9; use 0 as a sentinel to
# stop the iteration
import random
for n in iter(lambda: random.randrange(10), 0):
    print n
More generally, iter(callable, sentinel) is just a convenience
function for the following generator:
def iter(callable, sentinel):
    while True:
        c = callable()
        if c == sentinel: break
        yield c
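
The equivalence can be checked directly. This is a minimal sketch in Python 3 syntax; the generator is renamed `iter_until` (a hypothetical name) so that it does not shadow the builtin, and `make_source` is a hypothetical helper that produces a deterministic callable in place of the random one.

```python
def iter_until(func, sentinel):
    """Generator version of iter(func, sentinel) (hypothetical name)."""
    while True:
        value = func()
        if value == sentinel:
            break
        yield value

def make_source():
    # Returns a zero-argument callable that produces 3, 1, 4, 0, 5 in turn.
    values = iter([3, 1, 4, 0, 5])
    return lambda: next(values)

from_generator = list(iter_until(make_source(), 0))
from_builtin = list(iter(make_source(), 0))
```

Both stop at the first 0, so the trailing 5 is never requested.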
George
regards
Steve
The only other alternative I see is worse:
def iterable(obj):
    # strings are iterable and don't have an __iter__ method...
    for name in ('__iter__', '__getitem__'):
        try:
            getattr(obj, name)
            return True
        except AttributeError:
            pass
    else:
        return False
Based on the discussions in this thread (thanks all for your
thoughts), I'm settling for:
def is_iterable(obj):
    try:
        iter(obj).next()
        return True
    except TypeError:
        return False
    except KeyError:
        return False
The call to iter will fail for objects that don't support the
iterator protocol, and the call to next will fail for a
(hopefully large) subset of the objects that don't support the
sequence protocol.
This seems preferable to cluttering code with exception handling
and inspecting tracebacks. But it's still basically wrong, I
guess.
To repost my use case:
def deeply_mapped(func, iterable):
    """ Recursively apply a function to every item in an iterable object,
    recursively descending into items that are iterable. The result is an
    iterator over the mapped values. Similar to the builtin map function, func
    may be None, causing the items to be returned unchanged.

    >>> import functools
    >>> flattened = functools.partial(deeply_mapped, None)
    >>> list(flattened([[1], [2, 3, []], 4]))
    [1, 2, 3, 4]
    >>> list(flattened(((1), (2, 3, ()), 4)))
    [1, 2, 3, 4]
    >>> list(flattened([[[[]]], 1, 2, 3, 4]))
    [1, 2, 3, 4]
    >>> list(flattened([1, [[[2, 3]], 4]]))
    [1, 2, 3, 4]
    """
    for item in iterable:
        if is_iterable(item):
            for it in deeply_mapped(func, item):
                if func is None:
                    yield it
                else:
                    yield func(it)
        else:
            if func is None:
                yield item
            else:
                yield func(item)
--
Neil Cerutti
> Based on the discussions in this thread (thanks all for your
> thoughts), I'm settling for:
>
> def is_iterable(obj):
>     try:
>         iter(obj).next()
>         return True
>     except TypeError:
>         return False
>     except KeyError:
>         return False
>
> The call to iter will fail for objects that don't support the
> iterator protocol, and the call to next will fail for a
> (hopefully large) subset of the objects that don't support the
> sequence protocol.
And the `next()` consumes an element if `obj` is not "re-iterable".
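
A minimal sketch of that side effect in Python 3 syntax, where `next(it)` replaces `it.next()`. The generator `gen` is a hypothetical one-shot iterator used for illustration.

```python
def is_iterable(obj):
    # Python 3 spelling of the check quoted above.
    try:
        next(iter(obj))
        return True
    except (TypeError, KeyError):
        return False

gen = (n * n for n in range(4))   # one-shot iterator over 0, 1, 4, 9
probe = is_iterable(gen)          # the probe pulls the first value out of gen
remaining = list(gen)             # what is left for the real consumer

# An empty container has no first element, so the probe itself raises.
try:
    is_iterable([])
    empty_raises = False
except StopIteration:
    empty_raises = True
```

The first value is gone by the time the caller loops over `gen`, and the probe lets `StopIteration` escape for an empty list because neither `except` clause covers it.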
Ciao,
Marc 'BlackJack' Rintsch
Crap.
So how best to implement deeply_mapped?
The following still breaks for objects that don't provide
__iter__, do provide __getitem__, but don't support the sequence
protocol.
import inspect

def deeply_mapped(func, iterable):
    """ Recursively apply a function to every item in a nested container,
    recursively descending into items that are iterable. The result is an
    iterator over the mapped values. Similar to the builtin map function, func
    may be None, causing the items to be returned unchanged.

    >>> import functools
    >>> flattened = functools.partial(deeply_mapped, None)
    >>> list(flattened([[1], [2, 3, []], 4]))
    [1, 2, 3, 4]
    >>> list(flattened(((1), (2, 3, ()), 4)))
    [1, 2, 3, 4]
    >>> list(flattened([[[[]]], 1, 2, 3, 4]))
    [1, 2, 3, 4]
    >>> list(flattened([1, [[[2, 3]], 4]]))
    [1, 2, 3, 4]
    >>> def magic(o):
    ...     if o == 3:
    ...         raise TypeError('Three is a magic number')
    ...     return o
    >>> list(deeply_mapped(magic, [1, [[[2, 3]], 4]]))
    Traceback (most recent call last):
        ...
    TypeError: Three is a magic number
    """
    # Remember where this function lives so TypeErrors raised elsewhere
    # can be told apart from "item is not iterable".
    frame = inspect.currentframe()
    info = inspect.getframeinfo(frame)
    filename = info[0]
    funcname = info[2]
    for item in iterable:
        try:
            for it in deeply_mapped(func, item):
                if func is None:
                    yield it
                else:
                    yield func(it)
        except TypeError:
            # Re-raise unless the innermost traceback frame is this function.
            eframe = inspect.trace()[-1]
            efilename = eframe[1]
            efuncname = eframe[3]
            if efilename != filename or efuncname != funcname:
                raise
That's not the only problem; try a string element to see it break too.
More importantly, do you *always* want to handle strings as
iterables?
The best general way to do what you're trying to do is to pass
is_iterable() as an optional argument with a sensible default, while
allowing the user to pass a different one that is more appropriate for
the task at hand:
def is_iterable(obj):
    try:
        iter(obj)
    except TypeError:
        return False
    else:
        return True

def flatten(obj, is_iterable=is_iterable):
    if is_iterable(obj):
        for item in obj:
            for flattened in flatten(item, is_iterable):
                yield flattened
    else:
        yield obj

from functools import partial
flatten_nostr = partial(
    flatten,
    is_iterable=lambda obj: not isinstance(obj, basestring)
                            and is_iterable(obj))
print list(flatten_nostr([1, [[[2, 'hello']], (4, u'world')]]))
By the way, it's bad design to couple two distinct tasks: flattening a
(possibly nested) iterable and applying a function to its elements.
Once you have a flatten() function, deeply_mapped is reduced down to
itertools.imap.
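
A minimal sketch of that decomposition in Python 3 syntax, where the builtin `map` is lazy and takes the place of `itertools.imap`. The helper `not_str_iterable` is a hypothetical name, and making it the default predicate is an assumption made here for brevity rather than the default used above.

```python
def is_iterable(obj):
    try:
        iter(obj)
    except TypeError:
        return False
    return True

def not_str_iterable(obj):
    # Treat strings as leaves; otherwise 'a' would keep yielding 'a'.
    return not isinstance(obj, (str, bytes)) and is_iterable(obj)

def flatten(obj, is_iterable=not_str_iterable):
    if is_iterable(obj):
        for item in obj:
            yield from flatten(item, is_iterable)
    else:
        yield obj

nested = [1, [[[2, 'hello']], (4, 'world')]]
flat = list(flatten(nested))
mapped = list(map(str, flatten(nested)))   # "deeply mapped" = map over flatten
```

Keeping the two steps separate means a `TypeError` raised by the mapped function can never be confused with the iterability test inside `flatten`.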
HTH,
George
> def flatten(obj, is_iterable=is_iterable):
That makes good sense.
Plus the subtly different way you composed is_iterable is clearer
than what I originally wrote. I haven't ever used a try with an
else.
>     if is_iterable(obj):
>         for item in obj:
>             for flattened in flatten(item, is_iterable):
>                 yield flattened
>     else:
>         yield obj
>
> By the way, it's bad design to couple two distinct tasks:
> flattening a (possibly nested) iterable and applying a function
> to its elements. Once you have a flatten() function,
> deeply_mapped is reduced down to itertools.imap.
I chose to implement deeply_mapped because it illustrated the
problem of trying to catch a TypeError exception when one might
be thrown by some other code. I agree with your opinion that it's
a design flaw, and most of my problems with the code were caused
by that flaw.
--
Neil Cerutti