k_, a_, and m_


Robert Kern

Jun 7, 2011, 1:01:10 PM
to asq-d...@googlegroups.com
Hello!

I was just looking at asq's documentation. Very nice work! I just
wanted to mention that there are functions in the operator module that
do the same things as the k_, a_, and m_ functions, only they are
implemented in C and will avoid some of the Python function call
overhead of the lambda implementation. Specifically, k_ == itemgetter,
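a_ == attrgetter, m_ == methodcaller. itemgetter and attrgetter have
the additional feature of accepting several keys/attributes and
returning a tuple of the results, and attrgetter can also chain
attributes through a dotted name; e.g. attrgetter('x', 'y')(z) ==
(z.x, z.y) and attrgetter('x.y')(z) == z.x.y.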
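
A quick check of the correspondence, as a sketch rather than asq's actual code: the lambda helpers below are hypothetical stand-ins for k_, a_ and m_, and Point, Segment and record are made-up test data. It also shows how the operator versions behave with several names versus a dotted name.

```python
from operator import attrgetter, itemgetter, methodcaller

# Hypothetical lambda-based helpers, standing in for asq's originals.
def k_(key):
    return lambda obj: obj[key]

def a_(name):
    return lambda obj: getattr(obj, name)

def m_(name, *args, **kwargs):
    return lambda obj: getattr(obj, name)(*args, **kwargs)

# Made-up classes, only here to have nested attributes to fetch.
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Segment:
    def __init__(self, start, end):
        self.start = start
        self.end = end

seg = Segment(Point(1, 2), Point(3, 4))
record = {"name": "asq", "tags": ["linq", "python"]}

# Single key / attribute / method: same results as the lambda versions.
assert itemgetter("name")(record) == k_("name")(record)
assert attrgetter("start")(seg) is a_("start")(seg)
assert methodcaller("upper")("asq") == m_("upper")("asq")

# Several names give a tuple of values, not a chained lookup.
start_and_end = attrgetter("start", "end")(seg)

# A dotted name is what chains attribute access, like seg.end.y.
end_y = attrgetter("end.y")(seg)
```

So for a nested lookup the dotted form is the one to reach for; the multi-argument form is handy for building sort keys out of several fields.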

Similarly, most of the comparison functions in predicate.py could
be replaced with mildly creative use of functools.partial() and the
comparison functions in operator. E.g.

def lt_(rhs):
    return functools.partial(operator.gt, rhs)
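
The operand order is the creative part: partial() binds rhs as the first argument, so the operator has to be the mirror image of the predicate's name. A small sketch of the pattern follows; ge_ and eq_ are names assumed here for illustration and may not match what predicate.py actually defines.

```python
import functools
import operator

def lt_(rhs):
    # operator.gt(rhs, x) evaluates rhs > x, which is x < rhs.
    return functools.partial(operator.gt, rhs)

def ge_(rhs):
    # Assumed name. operator.le(rhs, x) evaluates rhs <= x, which is x >= rhs.
    return functools.partial(operator.le, rhs)

def eq_(rhs):
    # Assumed name. Equality is symmetric for ordinary values, so no mirroring.
    return functools.partial(operator.eq, rhs)

numbers = [1, 5, 10, 15]
below_ten = list(filter(lt_(10), numbers))
at_least_ten = list(filter(ge_(10), numbers))
exactly_ten = list(filter(eq_(10), numbers))
```

The returned partial objects are plain callables, so they can be passed anywhere a one-argument predicate is expected.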

http://docs.python.org/library/operator
http://docs.python.org/library/functools#functools.partial

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

Robert Smallshire

Jun 7, 2011, 1:27:58 PM
to asq-d...@googlegroups.com
I didn't know about attrgetter and friends. That's very useful information!

Performance is not something I've focussed on at all so far; I preferred to get correct code first, with good test coverage. I'll start doing some profiling/benchmarking to see how much difference these and your other recommendations could make.

Many thanks,

Rob

Rob Smallshire

Jun 8, 2011, 5:41:17 PM
to asq-discuss
After a little benchmarking I'm happy to report that attrgetter is
almost twice as fast as my lambda-based implementation of a_ on Python
3.2.

Extracting an attribute from one million elements takes 0.72 seconds
with the lambda implementation and 0.38 seconds using attrgetter.

In the next release of asq, k_, a_ and m_ will simply be short aliases
for itemgetter, attrgetter and methodcaller respectively.

The benchmark code is below.

Thanks again,

Rob


import sys
from timeit import Timer

from asq.initiators import query
from asq.selectors import a_

class Foo(object):
    def __init__(self):
        self.value = 42

items = []

def setup():
    # Build one million objects to select from.
    for i in range(1000000):
        items.append(Foo())

def bench():
    # Select the "value" attribute from every element.
    results = query(items).select(a_("value")).to_list()

def main():
    setup()
    t = Timer("bench()", "from __main__ import bench")
    # Report the best of five single runs.
    print(min(t.repeat(5, 1)))
    return 0

if __name__ == '__main__':
    sys.exit(main())
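
For anyone who wants to try the comparison without asq installed, here is a rough standalone variant. It is only a sketch: lambda_getter is a hypothetical stand-in for a lambda-based a_, map() replaces the asq query, and the element count is reduced from the one million used above to keep the run short, so the timings will not match the figures quoted.

```python
from operator import attrgetter
from timeit import Timer

class Foo:
    def __init__(self):
        self.value = 42

# Fewer elements than the original benchmark, to keep the run short.
items = [Foo() for _ in range(100_000)]

def lambda_getter(name):
    # Stand-in for a lambda-based a_; not asq's actual code.
    return lambda obj: getattr(obj, name)

def bench(selector):
    # Apply the selector to every element, as select(...).to_list() would.
    return list(map(selector, items))

def best_time(selector, repeats=5):
    # Timer accepts a callable; take the minimum of several single runs.
    return min(Timer(lambda: bench(selector)).repeat(repeats, 1))

if __name__ == "__main__":
    print("lambda:    ", best_time(lambda_getter("value")))
    print("attrgetter:", best_time(attrgetter("value")))
```

Taking the minimum of the repeats, as the original does, reduces the influence of other activity on the machine.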