For example, I would like to make a copy of a function so I can change
the default values:
>>> from copy import copy
>>> f = lambda x: x
>>> f.func_defaults = (1,)
>>> g = copy(f)
>>> g.func_defaults = (2,)
>>> f(),g()
(2, 2)
I would like the following behaviour:
>>> f(),g()
(1,2)
I know I could use a 'functor' defining __call__ and using member
variables, but this is more complicated and quite a bit slower. (I
also know that I can use new.function to create a new copy, but I
would like to know the rationale behind the decision to make functions
atomic before I shoot myself in the foot;-)
Thanks,
Michael.
This doesn't make functions copyable, but it may solve your default
arguments problem: under Python 2.5, see functools.partial().
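For the original example, that would look something like the following
sketch (written with Python 3 syntax; functools.partial works the same
way on 2.5):

```python
from functools import partial

def f(x=1):
    return x

# g calls f with x pre-bound to 2; f itself is untouched,
# so the two callables have independent "defaults".
g = partial(f, x=2)

f()  # -> 1
g()  # -> 2
```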
Does deepcopy work?
> Does deepcopy work?
It doesn't copy a function.
The easiest way to make a modified copy of a function is to use the 'new'
module:
>>> import new
>>> def f(x=2): print "x=", x
...
>>> g = new.function(f.func_code, f.func_globals, 'g', (3,),
...                  f.func_closure)
>>> g()
x= 3
>>> f()
x= 2
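(Aside: in Python 3 the 'new' module is gone; the same construction is
spelled with types.FunctionType and the renamed double-underscore
attributes. A sketch:)

```python
import types

def f(x=2):
    return x

# Rebuild f under a new name with a different default for x.
# Signature: FunctionType(code, globals, name, argdefs, closure)
g = types.FunctionType(f.__code__, f.__globals__, 'g',
                       (3,), f.__closure__)

f()  # -> 2
g()  # -> 3
```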
The copy module considers functions to be immutable and just returns
the object. This seems pretty clearly wrong to me - functions are
clearly not immutable and it's easy to copy a function using new, as
shown above.
From copy.py:
    def _copy_immutable(x):
        return x
    for t in (type(None), int, long, float, bool, str, tuple,
              frozenset, type, xrange, types.ClassType,
              types.BuiltinFunctionType,
              types.FunctionType):
        d[t] = _copy_immutable
Because Python has objects for when you need to associate
state with a function.
John Nagle
Function objects were not copyable because they were immutable. For
an immutable object, a copy cannot reasonably be distinguished from
the original object. See copy.py, around line 105.
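The pass-through is easy to observe, and it still holds on modern
Pythons:

```python
import copy

def f():
    pass

# Both copy and deepcopy treat functions as atomic and
# hand back the very same object.
g = copy.copy(f)
h = copy.deepcopy(f)

g is f  # -> True
h is f  # -> True
```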
Regards,
Martin
Then why are functions mutable?
I can understand to some extent why functions are not picklable,
because the bytecode may not be the same across python implementations
(is that true?), but I do not understand why copying functions is a
problem. The patch that allows copy to pass-through functions just
emulates pickle, but I can find no discussion or justification for not
allowing functions to be copied:
http://thread.gmane.org/gmane.comp.python.devel/76636
Michael.
"Function objects also support getting and setting arbitrary
attributes, which can be used, for example, to attach metadata to
functions. Regular attribute dot-notation is used to get and set such
attributes. Note that the current implementation only supports
function attributes on user-defined functions. Function attributes on
built-in functions may be supported in the future."
http://docs.python.org/ref/types.html
Again, rather inconsistent with the copy semantics.
One could make a case that this is a bug, a leftover from when functions
were mostly immutable. However, one can also make a case that correcting
this bug is not worth the effort. Your use case appears to be that you
want to make multiple copies of the same function, and those copies
should be almost, but not quite, the same.
The Pythonic solution is to produce the copies by a factory function
along these lines:
>>> def powerfactory(exponent):
...     def inner(x):
...         return x**exponent
...     return inner
...
>>> square = powerfactory(2)
>>> cube = powerfactory(3)
>>> square(2)
4
>>> square(3)
9
>>> cube(2)
8
>>> cube(3)
27
This approach makes copying functions unnecessary, and as you have
pointed out yourself, if you find yourself needing to make a copy of an
existing function you can work around the unexpected copy semantics with
new.function.
-Carsten
The answer is really really simple. The implementation of copy predates
mutability. When the copy code was written, functions *were* immutable.
When functions became mutable, the copy code was not changed, and
nobody noticed or complained.
Regards,
Martin
Is there a reason for using the closure here? Using function defaults
seems to give better performance:
>>> def powerfactory(exponent):
...     def inner(x,exponent=exponent):
...         return x**exponent
...     return inner
This is definitely one viable solution and is essentially what I had
in mind, but I did not want to have to carry the generator around
with me: instead, I wanted to use it once as a decorator and then
carry only the function around.
>>> @declare_options(first_option='opt1')
... def f(x,opt1,opt2,opt3):
...     return x*(opt1+opt2*opt3)
>>> f.set_options(opt1=1,opt2=2,opt3=3)
>>> f(1)
7
>>> from copy import copy
>>> g = copy(f)
>>> g.set_options(opt1=4,opt2=5,opt3=6)
>>> f(1)
7
>>> g(1)
34
The decorator declare_options behaves like the generator above, but
adds some methods (set_options) etc. to allow me to manipulate the
options without generating a new function each time.
I have functions with many options that may be called in the core of
loops, and found that the most efficient solution was to provide all
of the options through func_defaults.
>>> def f(x,opt1,opt2,opt3):
...     return x*(opt1 + opt2*opt3)
The cleanest (and fastest) solution I found was to set the options in
the defaults:
>>> f.func_defaults = (1,2,3)
Then f can be passed to the inner loops and f(x) is very quick.
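The same trick in Python 3 spelling, where func_defaults became
__defaults__ (a sketch):

```python
def f(x, opt1, opt2, opt3):
    return x * (opt1 + opt2 * opt3)

# Bake the current option values into the defaults;
# callers then just write f(x) in the inner loop.
f.__defaults__ = (1, 2, 3)

f(1)           # -> 1 * (1 + 2*3) = 7
f(1, 4, 5, 6)  # explicit arguments still override the defaults
```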
Other options include using tuples and dicts:
>>> opt = (1,2,3)
>>> f(1,*opt)
7
but then I have to pass f and opt around. This also appears to be
somewhat slower than the defaults method. Dictionaries have the
advantage of associating the names with the values
>>> opt = {'opt1':1, 'opt2':2, 'opt3':3}
>>> f(1,**opt)
7
but this is much slower. Wrapping the function with a generator as
you suggest also works and packages everything together, but again
suffers in performance. It also complicates my code.
My declare_options decorator yields a regular function, complete with
docstring etc. but with added annotations that allow the options to be
set. In addition, the performance is optimal. I thought this was a
very clean solution until I realized that I could not make copies of
the functions to allow for different option values with the usual
python copy semantics (for example, a __copy__ method is ignored). I
can easily get around this by adding a custom copy() method, but
wondered if there was anything inherently dangerous with this approach
that would justify the added complications of more complicated
wrappings and the performance hit.
Pickling is an obvious issue, but it seems like there is nothing wrong
with the copy semantics and that the limitations are artificial and
out of place. (It is also easily fixed: if the object has a __copy__
method, use it. Truly immutable objects will never have one. There
may be subtle issues here, but I don't know what they are.)
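Such a custom copy can be built on the function constructor itself (a
sketch in Python 3 spelling; copyfunc is an invented name, not a
standard API):

```python
import copy
import types

def copyfunc(f):
    """Return an independent copy of a user-defined function."""
    g = types.FunctionType(f.__code__, f.__globals__, f.__name__,
                           f.__defaults__, f.__closure__)
    g.__dict__.update(copy.deepcopy(f.__dict__))  # copy attributes too
    return g

def f(x, opt=1):
    return x * opt

g = copyfunc(f)
g.__defaults__ = (10,)  # change g's default; f keeps its own

f(2)  # -> 2
g(2)  # -> 20
```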
Thanks for all of the suggestions,
Michael.
That's probably an indication that mutable functions don't
get used all that much. Are there any instances of them in the
standard Python libraries?
John Nagle
Likely scenario, but not true. Support for copying user functions was
added in 2.5 (see
http://svn.python.org/view/python/trunk/Lib/copy.py?rev=42573&r1=38995&r2=42573)
and functions have been mutable for a long time. In previous versions,
functions could be pickled but not copied.
The same thing happens with classes: they are mutable too, but copy
considers them immutable and returns the same object. This is clearly
stated in the documentation (but the module docstring is still outdated).
--
Gabriel Genellina
It does? Not as far as I can measure it to any significant degree on my
computer.
> This is definitely one viable solution and is essentially what I had
> in mind, but I did not want to have to carry the generator arround
> with me:
I don't know what you mean by "carry it around." Just put it in a module
and import it where you need it.
An overriding theme in this thread is that you are greatly concerned
with the speed of your solution rather than the structure and
readability of your code. How often is your function going to get called
and how much of a performance benefit are you expecting?
-Carsten
> Is there a reason for using the closure here? Using function defaults
> seems to give better performance:
What measurements show you that...?
brain:~ alex$ cat powi.py
def powerfactory1(exponent):
    def inner(x):
        return x**exponent
    return inner

def powerfactory2(exponent):
    def inner(x, exponent=exponent):
        return x**exponent
    return inner
brain:~ alex$ python -mtimeit -s'import powi; p=powi.powerfactory1(3)'
'p(27)'
1000000 loops, best of 3: 0.485 usec per loop
brain:~ alex$ python -mtimeit -s'import powi; p=powi.powerfactory2(3)'
'p(27)'
1000000 loops, best of 3: 0.482 usec per loop
Alex
All user-defined functions are mutable. To me, this issue is rather that
copying of functions isn't used all that much.
In addition, it is certainly true that functions are rarely modified,
even though all of them are mutable. See this for an example
py> def foo():pass
...
py> foo
<function foo at 0xb7db5f44>
py> foo.__name__='bar'
py> foo
<function bar at 0xb7db5f44>
py> foo.value
Traceback (most recent call last):
File "<stdin>", line 1, in ?
AttributeError: 'function' object has no attribute 'value'
py> foo.value=100
py> foo.value
100
Regards,
Martin
Interesting. I (clearly) missed that part of the defaultdict discussion
(that was a long thread).
Regards,
Martin
>> Is there a reason for using the closure here? Using function
>> defaults seems to give better performance:
>
> What measurements show you that...?
>
...
>
> brain:~ alex$ python -mtimeit -s'import powi; p=powi.powerfactory1(3)'
> 'p(27)'
> 1000000 loops, best of 3: 0.485 usec per loop
>
> brain:~ alex$ python -mtimeit -s'import powi; p=powi.powerfactory2(3)'
> 'p(27)'
> 1000000 loops, best of 3: 0.482 usec per loop
Your own benchmark seems to support Michael's assertion, although the
difference in performance is so slight that it is unlikely ever to
outweigh the loss in readability.
Modifying powi.py to reduce the weight of the function call overhead and
the exponent operation indicates that using default arguments is faster,
but you have to push it to quite an extreme case before it becomes
significant:
def powerfactory1(exponent, plus):
    def inner(x):
        for i in range(1000):
            res = x+exponent+plus
        return res
    return inner

def powerfactory2(exponent, plus):
    def inner(x, exponent=exponent, plus=plus):
        for i in range(1000):
            res = x+exponent+plus
        return res
    return inner
C:\Temp>\python25\python -mtimeit -s "import powi; p=powi.powerfactory1
(3,999)" "p(27)"
10000 loops, best of 3: 159 usec per loop
C:\Temp>\python25\python -mtimeit -s "import powi; p=powi.powerfactory2
(3,999)" "p(27)"
10000 loops, best of 3: 129 usec per loop
I agree the performance gains are minimal. Using function defaults
rather than closures, however, seemed much cleaner and more explicit to
me. For example, I have been bitten by the following before:
>>> def f(x):
...     def g():
...         x = x + 1
...         return x
...     return g
>>> g = f(3)
>>> g()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 3, in g
UnboundLocalError: local variable 'x' referenced before assignment
If you use default arguments, this works as expected:
>>> def f(x):
...     def g(x=x):
...         x = x + 1
...         return x
...     return g
>>> g = f(3)
>>> g()
4
The fact that there also seems to be a performance gain (granted, it
is extremely slight here) led me to ask if there was any advantage to
using closures. It seems not.
> An overriding theme in this thread is that you are greatly concerned
> with the speed of your solution rather than the structure and
> readability of your code.
Yes, it probably does seem that way, because I am burying this code
deeply and do not want to revisit it when profiling later, but my
overriding concern is reliability and ease of use. Using function
attributes seemed the best way to achieve both goals until I found out
that the pythonic way of copying functions failed. Here was how I
wanted my code to work:
@define_options(first_option='abs_tol')
def step(f,x,J,abs_tol=1e-12,rel_tol=1e-8,**kwargs):
    """Take a step to minimize f(x) using the jacobian J.

    Return (new_x,converged) where converged is true if the tolerance
    has been met.
    """
    <compute dx and check convergence>
    return (x + dx, converged)

@define_options(first_option='min_h')
def jacobian(f,x,min_h=1e-6,max_h=0.1):
    """Compute jacobian using a step min_h < h < max_h."""
    <compute J>
    return J
class Minimizer(object):
    """Object to minimize a function."""
    def __init__(self,step,jacobian,**kwargs):
        self._options = step.options + jacobian.options
        self.step = step
        self.jacobian = jacobian
    def minimize(self,f,x0,**kwargs):
        """Minimize the function f(x) starting at x0."""
        step = self.step
        jacobian = self.jacobian
        step.set_options(**kwargs)
        jacobian.set_options(**kwargs)
        x = x0
        converged = False
        while not converged:
            J = jacobian(f,x)
            (x,converged) = step(f,x,J)
        return x
    @property
    def options(self):
        """List of supported options."""
        return self._options
The idea is that one can define different functions for computing the
jacobian, step etc. that take various parameters, and then make a
custom minimizer class that can provide the user with information
about the supported options etc.
The question is how to define the decorator define_options?
1) I thought the cleanest solution was to add a method f.set_options()
which would set f.func_defaults, and a list f.options for
documentation purposes. The docstring remains unmodified without any
special "wrapping", step and jacobian are still "functions" and
performance is optimal.
2) One could return an instance f of a class with f.__call__,
f.options and f.set_options defined. This would probably be the most
appropriate OO solution, but it makes the decorator much more messy,
or requires the user to define classes rather than simply define the
functions as above. In addition, this is at least a factor of 2.5
times slower on my machine than option 1) because of the class
instance overhead. (This is my only real performance concern because
this is quite a large factor. Otherwise I would just use this
method.)
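For comparison, a minimal sketch of option 2 (the class name, the
scale option, and the methods here are invented for illustration).
Note that copy.copy does honor __copy__ on class instances, unlike on
functions:

```python
import copy

class Step(object):
    """Callable object carrying its own options (option 2 style)."""
    def __init__(self, **options):
        self.options = dict(options)
    def set_options(self, **kw):
        self.options.update(kw)
    def __copy__(self):
        # copy.copy consults __copy__ for instances,
        # so copies get independent option dicts.
        return Step(**self.options)
    def __call__(self, x):
        return x * self.options['scale']

s = Step(scale=2)
t = copy.copy(s)
t.set_options(scale=5)  # s is unaffected

s(3)  # -> 6
t(3)  # -> 15
```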
3) I could pass generators to Minimize and construct the functions
dynamically. This would have the same performance, but would require
the user to define generators, or require the decorator to return a
generator when the user appears to be defining a function. This just
seems much less elegant.
...
@define_options_generator(first_option='min_h')
def jacobian_gen(f,x,min_h=1e-6,max_h=0.1):
    """Compute jacobian using a step min_h < h < max_h."""
    <compute J>
    return J
class Minimizer(object):
    """Object to minimize a function."""
    def __init__(self,step_gen,jacobian_gen,**kwargs):
        self.options = step_gen.options + jacobian_gen.options
        self.step_gen = step_gen
        self.jacobian_gen = jacobian_gen
    def minimize(self,f,x0,**kwargs):
        """Minimize the function f(x) starting at x0."""
        step = self.step_gen(**kwargs)
        jacobian = self.jacobian_gen(**kwargs)
        x = x0
        converged = False
        while not converged:
            J = jacobian(f,x)
            (x,converged) = step(f,x,J)
        return x
...
4) Maybe there is a better, cleaner way to do this, but I thought that
my option 1) was the most clear, readable and fast. I would
appreciate any suggestions. The only problem is that it does use
mutable functions, and so the user might be tempted to try:
new_step = copy(step)
which would fail (because modifying new_step would also modify step).
I guess that this is a pretty big problem (I could provide a custom
copy function so that
new_step = step.copy()
would work) and I wondered if there was a better solution (or if maybe
copy.py should be fixed. Checking for a defined __copy__ method
*before* checking for pre-defined mutable types does not seem to break
anything.)
Thanks again everyone for your suggestions, it is really helping me
learn about python idioms.
Michael.
>>> g()
4
>>> g()
4
>>> g() # what is going on here????
You aren't getting "bit" by any problem with closures - this is a
scoping issue.
Assigning to x anywhere within g() makes x a local variable, so
"x + 1" fails because the local x is still unbound at that point.
If you just used "return x+1" or if you named the local variable in g
something other than "x", the closure approach would also work as
expected.
> I agree the performance gains are minimal. Using function defaults
> rather than closures, however, seemed much cleaner and more explicit to
> me. For example, I have been bitten by the following before:
>
> >>> def f(x):
> ...     def g():
> ...         x = x + 1
Too cute. Don't nest functions in Python; the scoping model
isn't really designed for it.
>>An overriding theme in this thread is that you are greatly concerned
>>with the speed of your solution rather than the structure and
>>readability of your code.
...
>
> @define_options(first_option='abs_tol')
> def step(f,x,J,abs_tol=1e-12,rel_tol=1e-8,**kwargs):
> """Take a step to minimize f(x) using the jacobian J.
> Return (new_x,converged) where converged is true if the tolerance
> has been met.
> """
Python probably isn't the right language for N-dimensional optimization
if performance is a major concern. That's a very compute-intensive operation.
I've done it in C++, with heavy use of inlines, and had to work hard to
get the performance up. (I was one of the first to do physics engines for
games and animation, which is a rather compute-intensive problem.)
If you're doing number-crunching in Python, it's essential to use
NumPy or some other C library for matrix operations, or it's going to
take way too long.
John Nagle
I understand that it is not closures that are specifically biting me.
However, I got bit, it was unpleasant, and I don't want to be bitten
again;-)
Thus, whenever I need to pass information to a function, I use default
arguments now. Is there any reason not to do this other than the fact
that it is a bit more typing?
Michael
How can you make generators then if you don't nest?
> Python probably isn't the right language for N-dimensional optimization
> if performance is a major concern. That's a very compute-intensive operation.
> I've done it in C++, with heavy use of inlines, and had to work hard to
> get the performance up. (I was one of the first to do physics engines for
> games and animation, which is a rather compute-intensive problem.)
>
> If you're doing number-crunching in Python, it's essential to use
> NumPy or some other C library for matrix operations, or it's going to
> take way too long.
I know. I am trying to flesh out a modular optimization proposal for
SciPy. Using C++ would defeat the purpose of making it easy to extend
the optimizers. I was just trying to make things as clean and
efficient as possible when I stumbled on this python copy problem.
Michael.
There are different semantics when the thing you're passing is
mutable. There's also different semantics when it's rebound within the
calling scope, but then the default argument technique is probably
what you want.
> Michael
There's all kinds of good reasons to nest functions, and the "scoping
model isn't really designed for it" somewhat overstates the case -
it's not relevant to many of the reasons you might nest functions, and
it's not (much) of a problem for the rest of them. What you can't do
is rebind values in the enclosing scope, unless the enclosing scope is
global. That's a real, but fairly minor, limitation and you'll be able
to explicitly address your enclosing scope in 3k (or perhaps sooner).
Okay, so it is a bad design, but it illustrates the point. What is
happening is that in the body of f, a new function is defined using
the value of x passed as an argument to f. Thus, after the call
g = f(3), the returned function is equivalent to
def g(x=3):
    x = x + 1
    return x
This function is returned, so the call g() uses the default argument
x=3, then computes x = x+1 = 3+1 = 4 and returns 4. Every call is
equivalent to g() == g(3) = 4. Inside g, x is a local variable: it
does not maintain state between function calls. (You might think that
the first example would allow you to mutate the x in the closure, but
this is dangerous and exactly what python is trying to prevent by
making x a local variable when you make assignments in g. This is why
the interpreter complains.)
If you actually want to maintain state, you have to use a mutable
object like a list. The following would do what you seem to expect.
>>> def f(x0):
...     def g(x=[x0]):
...         x[0] = x[0] + 1
...         return x[0]
...     return g
...
>>> g = f(3)
>>> g()
4
>>> g()
5
>>> h = f(0)
>>> h()
1
>>> h()
2
>>> g()
6
> Thus, whenever I need to pass information to a function, I use default
> arguments now. Is there any reason not to do this other than the fact
> that it is a bit more typing?
You're giving your functions a signature that's different from the one
you expect it to be called with, and so making it impossible for the
Python runtime to diagnose certain errors on the caller's part.
For example, consider:
def makecounter_good():
    counts = {}
    def count(item):
        result = counts[item] = 1 + counts.get(item, 0)
        return result
    return count

c = makecounter_good()
for i in range(3): print c(23)

def makecounter_hmmm():
    counts = {}
    def count(item, counts=counts):
        result = counts[item] = 1 + counts.get(item, 0)
        return result
    return count

cc = makecounter_hmmm()
for i in range(3): print cc(23)
print cc(23, {})
print c(23, {})
Counters made by makecounter_good take exactly one argument, and
properly raise exceptions if incorrectly called with two; counters made
by makecounter_hmmm take two arguments (of which one is optional), and
thus hide some runtime call errors.
From "import this":
"""
Errors should never pass silently.
Unless explicitly silenced.
"""
The minuscule "optimization" of giving a function an argument it's not
_meant_ to have somewhat breaks this part of the "Zen of Python", and
thus I consider it somewhat unclean.
Alex
That is a pretty good reason in some contexts. Usually, the arguments
I pass are values that the user might like to change, so the kwarg
method often serves an explicit purpose allowing parameters to be
modified, but I can easily imagine cases where the extra arguments
should really not be there. I still like explicitly stating the
dependencies of a function, but I suppose I could do that with
decorators.
Thanks,
Michael.