Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Need an identity operator because lambda is too slow

24 views
Skip to first unread message

Deron Meranda

unread,
Feb 18, 2007, 12:59:18 AM2/18/07
to
This is an optimization problem, one that occurs deep inside a loop
that gets executed thousands of times. I'm looking for something in
Python which would act like an identity operator, or I-combinator: a
do-nothing function. The best I know of is: (lambda x: x), but it is
not ideal.

Consider a much-simplified example of such an iteration where
sometimes you need to transform an item by some function, but most of
the time you don't:

if some_rare_condition:
func = some_transform_function
else:
func = lambda x: x
for item in some_sequence:
item2 = func(item)
..... # more stuff

Now the "lambda x:x" acts suitably like an identity operator. But it
is very slow, when compared to using more complex-looking code:

do_transform = some_rare_condition
for item in some_sequence:
if do_transform:
item2 = transform_function(item)
else:
item2 = item
..... # more stuff

What python needs is something like a built-in "operator.identity"
function, which acts like "lambda x:x", but which the byte compiler
could recognize and completely optimize away so there is no function
call overhead.

Does this exist someplace in the corners of the language, or are there
any ideas about how best to do something like this without having to
push a bunch of ugly if-then-else constructs inside all the loops
rather than a more elegant function object? (The need for this is
more obvious in my real-world code versus the simplified examples
above).

Thanks
--
Deron Meranda

Paul Rubin

unread,
Feb 18, 2007, 1:16:59 AM2/18/07
to
"Deron Meranda" <deron....@gmail.com> writes:
> do_transform = some_rare_condition
> for item in some_sequence:
> if do_transform:
> item2 = transform_function(item)
> else:
> item2 = item
> ..... # more stuff

This might be a little more direct:

from itertools import imap
if some_rare_condition:
mapped_sequence = imap(transform_function, some_sequence)
else:
mapped_sequence = some_sequence

for item in mapped_sequence:
# more stuff

I agree that an identity function would be useful. I think I
suggested it a long time ago and didn't get much interest.

Raymond Hettinger

unread,
Feb 18, 2007, 1:19:41 AM2/18/07
to
[Deron Meranda
>] I'm looking for something in

> Python which would act like an identity operator, or I-combinator: a
> do-nothing function. The best I know of is: (lambda x: x), but it is
> not ideal.

File a feature request on SourceForge and assign to me. This has come
up a couple of times and would be trivial to implement in the operator
or functools modules.

Raymond

Steven D'Aprano

unread,
Feb 18, 2007, 2:49:43 AM2/18/07
to
On Sat, 17 Feb 2007 21:59:18 -0800, Deron Meranda wrote:

> Consider a much-simplified example of such an iteration where
> sometimes you need to transform an item by some function, but most of
> the time you don't:
>
> if some_rare_condition:
> func = some_transform_function
> else:
> func = lambda x: x
> for item in some_sequence:
> item2 = func(item)
> ..... # more stuff

This does the test once, but does a function look-up every loop.


> Now the "lambda x:x" acts suitably like an identity operator. But it
> is very slow, when compared to using more complex-looking code:
>
> do_transform = some_rare_condition
> for item in some_sequence:
> if do_transform:
> item2 = transform_function(item)
> else:
> item2 = item
> ..... # more stuff

Despite doing the if..else comparison every loop, this is a little faster
for the case where some_rare_condition is false.

But I have to question whether this is a worthwhile optimization,
especially if some_rare_condition really is rare. Calling the identity
function "lambda x: x" one million times takes about half a second; or to
put it another way, each call to the identity function costs you about
half a microsecond. How much time does the rest of the loop processing
take? Are you sure you're optimizing something which needs to be optimized?

(E.g. if your main loop takes fifteen minutes to run, and you're trying to
shave off half a second, just don't bother.)

Compare the following, where I use an arbitrary small function as a
placeholder for whatever work you really want to do:

setup = """import time
identity = lambda x: x
do_work = lambda x: time.time() + x
x = 1
"""

Now use timeit to compare doing the work on its own with doing the work
together with an identity function:

>>> timeit.Timer("do_work(x)", setup).repeat()
[3.1834621429443359, 3.1083459854125977, 3.1382210254669189]
>>> timeit.Timer("do_work(identity(x))", setup).repeat()
[3.5951459407806396, 3.6067559719085693, 3.5801000595092773]

Is your "do_work" function really so fast that one second per two million
calls to identity() is a significant load?

If some_rare_condition really is rare, then a possible solution might
be:

if some_rare_condition:
some_sequence = [transform(item) for item in some_sequence]
# or modify in place if needed... see enumerate
for item in some_sequence:
do_something_with(item)

This should run as fast as possible in the common case, and slow down only
in the rare condition.

Another solution would be:

if some_rare_condition:
for item in some_sequence:
item = transform(item)
do_something_with(item)
else:
for item in some_sequence:
do_something_with(item)

This is probably going to be the fastest of all, but has the disadvantage
that you are writing the same code twice.


--
Steven.

Steven D'Aprano

unread,
Feb 18, 2007, 2:57:46 AM2/18/07
to

I'm surprised. The Original Poster specified

[quote]


What python needs is something like a built-in "operator.identity"
function, which acts like "lambda x:x", but which the byte compiler
could recognize and completely optimize away so there is no function
call overhead.

[end quote]

I would have guessed that optimizing the call away would require support
from the compiler.

--
Steven.

David Wahler

unread,
Feb 18, 2007, 7:10:03 AM2/18/07
to
On Feb 18, 5:59 am, "Deron Meranda" <deron.mera...@gmail.com> wrote:
> Consider a much-simplified example of such an iteration where
> sometimes you need to transform an item by some function, but most of
> the time you don't:
>
> if some_rare_condition:
> func = some_transform_function
> else:
> func = lambda x: x
> for item in some_sequence:
> item2 = func(item)
> ..... # more stuff
>
> Now the "lambda x:x" acts suitably like an identity operator. But it
> is very slow, when compared to using more complex-looking code:
>
> do_transform = some_rare_condition
> for item in some_sequence:
> if do_transform:
> item2 = transform_function(item)
> else:
> item2 = item
> ..... # more stuff
>
> What python needs is something like a built-in "operator.identity"
> function, which acts like "lambda x:x", but which the byte compiler
> could recognize and completely optimize away so there is no function
> call overhead.

Unless I'm misunderstanding you, you seem to be proposing that the
compiler should avoid generating the call to func2() if it has a
certain value. How could that information possibly be available at
compile time?

-- David

Sean McIlroy

unread,
Feb 18, 2007, 9:31:53 PM2/18/07
to
On Feb 17, 9:59 pm, "Deron Meranda" <deron.mera...@gmail.com> wrote:
[snip]

this may be really dense, but i'm curious what's wrong with the
"multiplexer" idiom:

for item in some_sequence:
item2 = (not some_rare_condition and item) or \
(some_rare_condition and
some_transform_function(item))
..... # more stuff

peace
stm

Gabriel Genellina

unread,
Feb 19, 2007, 12:02:55 AM2/19/07
to pytho...@python.org
En Sun, 18 Feb 2007 23:31:53 -0300, Sean McIlroy <sean_m...@yahoo.com>
escribió:

The main concern of the OP was that lambda x:x is slow, so it's better to
move the condition out of the loop.
Also, this and/or trick does not work when the Boolean value of item is
false (like 0, (), []...)

--
Gabriel Genellina

0 new messages