I have created a class that allows ANY and ALL predicates on iterables;
it was easy to also allow vector operations (eg add something to every
item on the list).
However, to define class methods 'en masse', the only way I finally
found to do it was using the exec statement; 'setattr(klass,
attribute_name, result)' didn't do its magic.
Can you suggest a better method? Look at the 'exec' statement in the
class All definition to see what I mean. Example usage is at the end of
the code.
BTW, would you consider these as a worthy candidate for inclusion in the
standard library? Do you often need to check that some condition
applies to all items of a sequence? If yes, please suggest any changes
to the class name, module name or even to the source. I am using these
two classes quite often, and I think they add to the legibility of my
algorithms.
Thanks in advance for any replies.
*** predicates.py ***
import __builtin__, operator

class All(object):
    """Array operations on every element of its argument."""
    __slots__ = ("_iterable",)

    def __init__(self, iterable):
        self._iterable = iterable

    def _new_result(self):
        """Initialise a new list of results (internal)."""
        return [None]*len(self._iterable)

    def __nonzero__(self):
        """Return nonzero if all results are nonzero."""
        return len(filter(None, self._iterable)) == len(self._iterable)

    def _multi_(method, operate=False):
        try:
            default_operation = getattr(operator, method)
        except AttributeError:
            default_operation = None
        def operate(self, *values):
            """Call a method on every item of self._iterable."""
            result = self._new_result()
            for ix, item in enumerate(self._iterable):
                if default_operation:
                    result[ix] = default_operation(item, *values)
                else:
                    result[ix] = getattr(item, method)(*values)
            return self.__class__(result)
        return operate

    for method in \
            "add,sub,mul,div,truediv,mod,floordiv,abs,neg,pos,pow,repeat,"\
            "lshift,rshift," \
            "cmp,eq,le,lt,gt,ge,ne," \
            "call," \
            "inv,invert,not,and,or,xor," \
            "getitem,getslice," \
            "contains,concat".split(","):
        exec "%s = _multi_('%s')" % ( ("__%s__" % method,)*2 )
    del _multi_, method

    def result(self):
        """Return a list of results."""
        return self._iterable

    def __iter__(self):
        """Iterate through self._iterable ."""
        return iter(self._iterable)

    def __getattr__(self, attribute):
        """Return a list of every item's attribute."""
        result = self._new_result()
        for ix, item in enumerate(self._iterable):
            result[ix] = getattr(item, attribute)
        return self.__class__(result)

class Any(All):
    __slots__ = ("_iterable",)

    def __nonzero__(self):
        """Return nonzero if any result is non zero."""
        return bool(filter(None, self._iterable))

if __name__ == "__main__":
    li = [0, 2, 4, 6, 8]
    ls = ["hello", "hell", "help"]
    print "Integer list:", li
    print "String list:", ls
    # call a method on all items
    if All(ls).startswith("hel"):
        print "All strings start with 'hel'"
    else:
        print "[ERROR] Some strings do not start with 'hel'!!!"
    # do math operations on all items
    print "Add 5 to integer list:", (All(li)+5).result()
    if "el" in All(ls):
        print "All strings contain 'el'."
    else:
        print "[ERROR] Some strings do not contain 'el'!!!"
    if All(li)%2 == 0:
        print "All integers are even."
    else:
        print "[ERROR] Some integers are not even!!!"
    if "elp" in Any(ls):
        print "Some strings contain 'elp'."
    else:
        print "[ERROR] No strings contain 'elp'!!!"
--
TZOTZIOY, I speak England very best,
Real email address: 'dHpvdEBzaWwtdGVjLmdy\n'.decode('base64')
> Hello, people
>
> I have created a class that allows ANY and ALL predicates on iterables;
> it was easy to also allow vector operations (eg add something to every
> item on the list).
>
> However, to define class methods 'en masse', the only way I finally
> found to do it was using the exec statement; 'setattr(klass,
> attribute_name, result)' didn't do its magic.
That's because the object you're calling "klass" here doesn't exist
yet at the time you're trying to "set its attributes" -- it's created
(by its metaclass) when the class body is _finished_.
> Can you suggest a better method? Look at the 'exec' statement in the
This is exactly the kind of job that custom metaclasses do best. In
this case, though, if you only want to define one special class with
these wrapped special methods, an alternative that's just as good would
be to have the wrapping-loop just AFTER the class body.
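
A minimal sketch of the "loop just after the class body" idea, written
for current Python 3 rather than the 2.3-era Python of this thread (so
__bool__ stands in for __nonzero__). The helper name _make_elementwise
and the short operator list are illustrative assumptions, not the
original module's code:

```python
import operator


class All(object):
    """Apply operations to every item; truthy when all results are truthy."""

    def __init__(self, iterable):
        self._items = list(iterable)

    def __bool__(self):
        return all(self._items)

    def result(self):
        return self._items


def _make_elementwise(name):
    """Build a method that applies operator.<name> to each item (hypothetical helper)."""
    operation = getattr(operator, name)

    def method(self, *values):
        return self.__class__([operation(item, *values) for item in self._items])

    method.__name__ = name
    return method


# The class body has finished, so the class object All exists and
# setattr can attach the wrapped special methods to it.
for _name in ("__add__", "__mod__", "__eq__", "__gt__"):
    setattr(All, _name, _make_elementwise(_name))
```

With this in place, (All([0, 2, 4]) + 5).result() gives the shifted
list, and "All(li) % 2 == 0" can be used directly in an if statement,
with no exec involved.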
> class All definition to see what I mean. Example usage is at the end of
> the code.
>
> BTW, would you consider these as a worthy candidate for inclusion in the
> standard library? Do you often need to check that some condition
> applies to all items of a sequence? If yes, please suggest any changes
No to the first, yes to the second. In the frequent case in which I
want to test if some property holds on every item of an iterable (and
your code only works on sequences, plus a few other special iterables
such as dicts, NOT on many other iterables such as files and generators,
because for example it relies on len(), and most iterables don't and
can't support that), almost invariably I much prefer to bail out as soon
as a non-compliant item is found, rather than having to operate on all
items willy-nilly. I find such semantics generally more important than
the "neat-o" syntax afforded by operator overloading. So, for example,
to check if all items in an iterable of numbers are evenly divisible by N:

    def alldivisible(it, N):
        for x in it:
            if x%N != 0: return False
        else:
            return True

is almost invariably much more to my liking. Matter of taste, no doubt,
but I wouldn't want the standard library to encourage different semantics.
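
To make the "bail out early" behaviour concrete, here is a small
sketch in current Python 3. The generator numbers() and the list
consumed are hypothetical names added only to record how many items
the loop actually pulls; the same loop works on generators because it
never calls len():

```python
def alldivisible(it, n):
    """Return True if every item of `it` is evenly divisible by n."""
    for x in it:
        if x % n != 0:
            return False  # stop at the first non-compliant item
    return True


consumed = []  # records which items were actually pulled (illustrative)


def numbers():
    """A generator: no len(), and it can be traversed only once."""
    for x in [2, 4, 5, 6, 8]:
        consumed.append(x)
        yield x


result = alldivisible(numbers(), 2)
# After the call, `consumed` holds only the items up to and including 5;
# 6 and 8 were never requested from the generator.
```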
Alex
>> BTW, would you consider these as a worthy candidate for inclusion in the
>> standard library? Do you often need to check that some condition
>> applies to all items of a sequence? If yes, please suggest any changes
>
>No to the first, yes to the second. In the frequent case in which I
>want to test if some property holds on every item of an iterable (and
>your code only works on sequences, plus a few other special iterables
>such as dicts, NOT on many other iterables such as files and generators,
>because for example it relies on len(), and most iterables don't and
>can't support that),
Yes, you are correct about sequences and not iterables. So to be
precise, I should either:
- replace _iterable with _sequence
- change __nonzero__ to not use len, and perhaps avoid _new_result by
  building the result list with .append
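
A rough sketch of what the second option might look like, assuming
current Python 3 (where the truth-value hook is called __bool__; the
alias keeps the Python 2 name as well). Only __add__ is shown, and this
is an illustration of the idea rather than the revised module itself:

```python
class All(object):
    """Variant that never calls len(), so any iterable is acceptable."""

    def __init__(self, iterable):
        self._iterable = iterable

    def __bool__(self):
        # Walk the items and stop at the first false one; no len() needed.
        for item in self._iterable:
            if not item:
                return False
        return True

    __nonzero__ = __bool__  # Python 2 spelling of the same hook

    def __add__(self, value):
        # Build the result with append instead of preallocating by length.
        result = []
        for item in self._iterable:
            result.append(item + value)
        return self.__class__(result)

    def result(self):
        return list(self._iterable)
```

Note that a generator passed in this way is still consumed by the
first operation applied to it.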
>almost invariably I much prefer to bail out as soon
>as a non-compliant item is found, rather than having to operate on all
>items willy-nilly. I find such semantics generally more important than
>the "neat-o" syntax afforded by operator overloading.
I care about speed too; this code was originally written before
generators existed, and while I have introduced some 2.3 syntax in it,
I have not yet attempted to turn this into a class using generators...
perhaps adding a .next method would not be such a bad start :)
Thanks for replying, Alex.
[setting class attributes with setattr in a class definition]
>In
>this case, though, if you only want to define one special class with
>these wrapped special methods, an alternative that's just as good would
>be to have the wrapping-loop just AFTER the class body.
Silly me! And this is what I had used when I created a class-factory
for tuples with named access to its members... Thanks.
You're welcome! But what you need here is a simple loop -- no generators
needed, really. Speed issues are uncertain -- benchmark things both ways --
but a simple loop is probably going to be *clearer*, and I think that
matters. There are occasions to use generators, of course, but when some
task can be handled just as well without such powerful tools then I believe
that relying on simpler tools can be preferable.
Alex
>But what you need here is a simple loop -- no generators
>needed, really.
Using generators and the new itertools module gave the fastest version
I could come up with. Search for "predicates" in the Vaults (when they
get updated); at the end of the tests I included some benchmarking code.
Using All starts off at about 40% more time than doing everything
in-place, and it gets worse as expressions get complex; however, it's
fun, and when you don't care much for the utmost speed, I believe it can
be didactic (at least for newbies :)
Cheers, Python lovers!
Interesting. If I understand correctly, these let one write
    (exists x in S)(x > 2)   as   Any(S) > 2
and
    (all x in S)(x > 2)      as   All(S) > 2
This is indeed a neat-o syntax, as Martelli says.
But I'm not sure I understand how these predicates can be
composed. Suppose I have a list of lists of integers, and wish to
verify that each list of integers contains something greater than
2. That is, I wish to check that
    (all x in L)(exists y in x)(y > 2)
So, I write
    Any(All(L)) > 2
Right?
I was not able to get this to work with the code as written
(though admittedly I did not spend much time on it), and am not
sure it could ever work sensibly. For example, what should
All(L).__iter__ return? An iterator over the elements of L? An
iterator of L's elements' iterators? A "flattened" iterator over
those elements' elements?
I think I'd prefer a scheme under which I'd write
    all(L, lambda x: exists(x, lambda y: y > 2))
which is more verbose, but (to me at least) easier to parse. Also
the implementation is trivial, short-circuiting is easy, and no
difficult semantic questions arise. (It's really just a
generalization of map, etc., to generic iterables.)
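
One possible sketch of such a pair of functions, in current Python 3.
The name forall is an assumption chosen here to avoid shadowing the
all() built-in that later Python versions provide; exists follows the
spelling used above:

```python
def forall(iterable, predicate):
    """True if predicate(item) holds for every item; stops at the first failure."""
    for item in iterable:
        if not predicate(item):
            return False
    return True


def exists(iterable, predicate):
    """True if predicate(item) holds for at least one item; stops at the first hit."""
    for item in iterable:
        if predicate(item):
            return True
    return False


# (all x in L)(exists y in x)(y > 2)
L = [[1, 2, 3], [0, 5], [9]]
ok = forall(L, lambda x: exists(x, lambda y: y > 2))
```

Because both functions only iterate, they accept any iterable, and the
nesting order of the quantifiers is explicit in the call.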
--
Steven Taschuk stas...@telusplanet.net
"Its force is immeasurable. Even Computer cannot determine it."
-- _Space: 1999_ episode "Black Sun"
>Quoth Christos TZOTZIOY Georgiou:
>> I have created a class that allows ANY and ALL predicates on iterables;
>> it was easy to also allow vector operations (eg add something to every
>> item on the list).
...
>But I'm not sure I understand how these predicates can be
>composed. Suppose I have a list of lists of integers, and wish to
>verify that each list of integers contains something greater than
>2. That is, I wish to check that
> (all x in L)(exists y in x)(y > 2)
>So, I write
> Any(All(L)) > 2
>Right?
No, unfortunately not. The only way I did this was with a list
comprehension for the inner lists:
>>> from predicates import All
>>> l1=[ [1,2,3], [2,3,4], [5,6,7] ]
>>> l2=[ [1,2,3], [2,11,3], [6,4,2] ]
>>> bool(All([All(x)<10 for x in l1]))
True
>>> bool(All([All(x)<10 for x in l2]))
False
If lists had a max method (now there is only a builtin max function),
you would be able to say:
    if All(L).max > 2: # ...
But in this specific example (I checked for > 10 instead of > 2), this
might be preferred:
>>> len([x for x in L if max(x)>10])==len(L)
There is a better (faster) version of this module in
<URL:http://www.sil-tec.gr/~tzot/python/predicates.py>, for which I
created a link in the Vaults yesterday, but it hasn't shown up yet. The
new version uses generators and the new itertools module, so it applies
only for post-2.3a1 Pythons (ie current CVS).
In another message which I thought I posted but quite probably sent only
through email to Alex (I must have pressed 'r' (eply) instead of 'f'
(ollow-up)), I mentioned including some benchmarking code.
>I was not able to get this to work with the code as written
>(though admittedly I did not spend much time on it), and am not
>sure it could ever work sensibly. For example, what should
>All(L).__iter__ return? An iterator over the elements of L? An
>iterator of L's elements' iterators? A "flattened" iterator over
>those elements' elements?
The current All(...) is an iterator itself, so All(...).__iter__ returns
self (ie none of the above).
>I think I'd prefer a scheme under which I'd write
> all(L, lambda x: exists(x, lambda y: y > 2))
>which is more verbose, but (to me at least) easier to parse. Also
>the implementation is trivial, short-circuiting is easy, and no
>difficult semantic questions arise. (It's really just a
>generalization of map, etc., to generic iterables.)
Yes, although I think the syntax my module allows is much more pleasant
to the (my :) eye.
Thanks for your time!
Absolutely; your syntax is very appealing...
...at least in simple cases. Even if these quantifiers could be
composed directly, it's not immediately obvious what
    Any(All(L)) > 2
is supposed to mean. Also, quantifying at the variable rather
than around the whole statement (as is usual in logic) has
disadvantages when the variable occurs more than once:
    All(L).startswith('a') and All(L).endswith('z')
will spin L twice, which precludes the use of this kind of
expression with generic iterables.
Imho such issues outweigh the impressively sexy syntax, and
motivate something like all(it, func). (It would be natural to
base such functions on map and filter, if those returned iterators
when their arguments were iterators... but separate, iterator-
returning versions of map and filter might be best.)
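
The two-pass problem can be illustrated with a one-shot generator. The
sketch below uses the all() built-in and generator expressions of
current Python, which postdate this thread; the generator words() is a
made-up example:

```python
def words():
    """A one-shot iterable: it can be traversed only once."""
    yield "az"
    yield "ab"  # starts with 'a' but does not end with 'z'


it = words()
# Two separate quantifiers over the same generator.
starts = all(w.startswith("a") for w in it)  # consumes every item
ends = all(w.endswith("z") for w in it)      # nothing left, so vacuously True
two_pass = starts and ends                   # True, which is the wrong answer

# Quantifying once around the whole condition needs a single pass.
one_pass = all(w.startswith("a") and w.endswith("z") for w in words())
```

Here two_pass comes out True even though "ab" does not end with 'z',
while one_pass is False.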
>...at least in simple cases. Even if these quantifiers could be
>composed directly, it's not immediately obvious what
> Any(All(L)) > 2
>is supposed to mean. Also, quantifying at the variable rather
>than around the whole statement (as is usual in logic) has
>disadvantages when the variable occurs more than once:
> All(L).startswith('a') and All(L).endswith('z')
>will spin L twice, which precludes the use of this kind of
>expression with generic iterables.
Yes, this is a flaw of the module, thanks for the pointer. A special
method that allows All(it, func) can always be added.
>Imho such issues outweigh the impressively sexy syntax, and
>motivate something like all(it, func). (It would be natural to
>base such functions on map and filter, if those returned iterators
>when their arguments were iterators... but separate, iterator-
>returning versions of map and filter might be best.)
This is what the itertools module is for (post-2.3a1 --thanks,
Raymond!). I used its functions in the latest version of the predicates
module, which still hasn't shown up in the Vaults...
If you wait for the next alpha release (or if you compile the current
CVS tree) you will have imap and ifilter to use :)
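
As a rough illustration of quantifiers built on lazy map and filter:
in current Python 3 the built-in map and filter are themselves lazy, so
the sketch below uses them in the role that imap and ifilter play in
2.3. The function names forall and exists, and the helper is_even, are
assumptions made for the example:

```python
def forall(iterable, predicate):
    """True if predicate holds for every item; stops at the first failure."""
    # map is lazy here: predicate runs only as results are pulled.
    for ok in map(predicate, iterable):
        if not ok:
            return False
    return True


def exists(iterable, predicate):
    """True if predicate holds for some item; stops at the first match."""
    # filter is lazy too: the first item it yields is enough.
    for _ in filter(predicate, iterable):
        return True
    return False


seen = []  # records which items the predicate was called on (illustrative)


def is_even(x):
    seen.append(x)
    return x % 2 == 0


all_even = forall(iter([2, 4, 5, 6]), is_even)
# The predicate was called for 2, 4 and 5 only; 6 was never examined.
```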