def main():
    a = ['a list','with','three elements']
    print a
    print fnc1(a)
    print a

def fnc1(b):
    return fnc2(b)

def fnc2(c):
    c[1] = 'having'
    return c
This is the output:
['a list', 'with', 'three elements']
['a list', 'having', 'three elements']
['a list', 'having', 'three elements']
I had expected the third print statement to give the same output as the first, but variable a had been changed by changing variable c in fnc2.
It seems that in Python, a variable inside a function is global unless it's assigned. This rule has apparently been adopted in order to reduce clutter by not having to have global declarations all over the place.
I would have thought that a function parameter would automatically be considered local to the function. It doesn't make sense to me to pass a global to a function as a parameter.
One workaround is to call a function with a copy of the list, eg in fnc1 I would have the statement "return fnc2(b[:])". But this seems ugly.
Are there others who feel as I do that a function parameter should always be local to the function? Or am I missing something here?
Henry
no, they are local
> I would have thought that a function parameter would
> automatically be considered local to the function. It doesn't
> make sense to me to pass a global to a function as a
> parameter.
it is local. But consider what you actually passed:
You did not pass a copy of the list but the list itself.
You could also say you passed a reference to the list.
All python variables only hold a pointer (the id) to
an object. This object has a reference count and is
automatically deleted when there are no more references
to it.
If you want a local copy of the list you can either
do what you called being ugly or do just that within
the function itself - which I think is cleaner and
only required once.
def fnc2(c):
    c = c[:]
    c[1] = 'having'
    return c
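A quick check of that approach (a sketch in Python 3 syntax, using the names from the snippet above) shows the caller's list really is left untouched:

```python
def fnc2(c):
    c = c[:]           # rebind c to a shallow copy of the argument
    c[1] = 'having'    # mutate only the copy
    return c

a = ['a list', 'with', 'three elements']
result = fnc2(a)
print(result)  # ['a list', 'having', 'three elements']
print(a)       # ['a list', 'with', 'three elements'] -- unchanged
```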
--
Wolfgang
To be more accurate, the list object referred to by `a` was modified
through c, due to the fact that a, b, and c all refer to the very same
object in this case.
> It seems that in Python, a variable inside a function is global unless it's assigned. This rule has apparently been adopted in order to reduce clutter by not having to have global declarations all over the place.
>
> I would have thought that a function parameter would automatically be considered local to the function.
<snip>
> Are there others who feel as I do that a function parameter should always be local to the function? Or am I missing something here?
Function parameters *are* local variables. Function parameters are
indeed local in that *rebinding* them has no effect outside of the
function:
def foo(a):
    a = 42

def bar():
    b = 1
    foo(b)
    print b

bar()  #=> outputs 1
As you've observed, *mutating* the object a variable refers to is
another matter entirely. Python does not use call-by-value and does
not copy objects unless explicitly requested to, as you've
encountered. But it does not use call-by-reference either, as my
example demonstrates. Like several other popular contemporary
languages, Python uses call-by-object for parameter passing; a good
explanation of this model can be found at
http://effbot.org/zone/call-by-object.htm It's well worth reading.
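The rebinding/mutation distinction can be sketched in a few lines (Python 3 syntax; `rebind` and `mutate` are hypothetical names, not from the thread):

```python
def rebind(x):
    x = [99]           # rebinds the local name x; the caller is unaffected

def mutate(x):
    x.append(99)       # mutates the shared object; the caller sees the change

a = [1, 2, 3]
rebind(a)
print(a)  # [1, 2, 3]
mutate(a)
print(a)  # [1, 2, 3, 99]
```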
Cheers,
Chris
It doesn't look like a question of local or global. fnc2 is passed a
container object and replaces item 1 in that container. You see the results
when fnc2 prints the object it knows as `c`, and you see again when main
prints the object it knows as `a`. Python doesn't pass parameters by
handing around copies that can be thought of as local or global. Python
passes parameters by binding objects to names in the callee's namespace. In
your program the list known as `a` in main is identically the same list as
the one known as `c` in fnc2, and what happens happens.
Mel.
> On Sonntag 29 Mai 2011, Henry Olders wrote:
>> It seems that in Python, a variable inside a function is global unless
>> it's assigned.
>
> no, they are local
I'm afraid you are incorrect. Names inside a function are global unless
assigned to somewhere.
>>> a = 1
>>> def f():
...     print a  # Not local, global.
...
>>> f()
1
By default, names inside a function must be treated as global, otherwise
you couldn't easily refer to global functions:
def f(x):
    print len(x)
because len would be a local name, which doesn't exist. In Python, built-
in names are "variables" just like any other.
Python's scoping rule is something like this:
If a name is assigned to anywhere in the function, treat it as a local,
and look it up in the local namespace. If not found, raise
UnboundLocalError.
If a name is never assigned to, or if it is declared global, then look it
up in the global namespace. If not found, look for it in the built-ins.
If still not found, raise NameError.
Nested scopes (functions inside functions) make the scoping rules a
little more complex.
If a name is a function parameter, that is equivalent to being assigned
to inside the function.
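Those rules can be condensed into one example (Python 3 syntax; the function names are made up for illustration):

```python
x = 'global'

def reads():
    return x          # x is never assigned here, so it is looked up globally

def assigns():
    x = 'local'       # assigned somewhere in the body, so x is local
    return x

def param(x):
    return x          # a parameter counts as an assignment: x is local

print(reads())        # 'global'
print(assigns())      # 'local'
print(param('arg'))   # 'arg'
print(x)              # 'global' -- untouched by the functions above
```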
--
Steven
For what it's worth, I've noticed that use of the word "variable"
is correlated with a misunderstanding of Python's way of doing
things.
"Variable" seems to connote a box that has something in it,
so when fnc1 passes b to fnc2 which calls it c, you think
you have a box named b and a box named c, and you wonder
whether the contents of those boxes are the same or
different.
Python works in terms of objects having names, and one
object can have many names. In your example, fnc1 works
with an object that it calls b, and which it passes to fnc2,
but fnc2 chooses to call that same object c. The names b
and c aren't boxes that hold things, they are -- in the
words of one of this group's old hands -- sticky-note labels
that have been slapped on the same object.
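In code, the sticky-note picture looks something like this (a sketch in Python 3 syntax):

```python
a = ['a list', 'with', 'three elements']
b = a            # a second sticky-note slapped on the same list
print(b is a)    # True: one object, two names
b[1] = 'having'  # mutate the object through one label...
print(a[1])      # 'having' -- ...and it shows through the other
```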
--
To email me, substitute nowhere->spamcop, invalid->net.
Wait wha? I've never seen this... wouldn't it just create it in the
local namespace?
Can you give example code that will trigger this error? I'm curious, now...
Chris Angelico
def foo():
    print bar
    bar = 42

foo()
===>
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in foo
UnboundLocalError: local variable 'bar' referenced before assignment
Cheers,
Chris
def f():
    print a  # a is not yet defined, i.e. unbound
    a = 1    # this makes a local
--
Steven
Wow
I thought it basically functioned top-down. You get a different error
on the print line if there's a "bar = 42" *after* it. This could make
debugging quite confusing.
Guess it's just one of the consequences of eschewing variable
declarations. Sure it's easier, but there's complications down the
road.
Chris Angelico
> On Mon, May 30, 2011 at 4:01 AM, Chris Rebert <cl...@rebertia.com> wrote:
>> def foo():
>>     print bar
>>     bar = 42
>>
>> foo()
>>
>> ===>
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>>   File "<stdin>", line 2, in foo
>> UnboundLocalError: local variable 'bar' referenced before assignment
>
> Wow
>
> I thought it basically functioned top-down. You get a different error on
> the print line if there's a "bar = 42" *after* it. This could make
> debugging quite confusing.
UnboundLocalError is a subclass of NameError, so it will still be caught
by try...except NameError.
If you're crazy enough to be catching NameError :)
Go back to Python1.5, and there was no UnboundLocalError. It was
introduced because people were confused when they got a NameError after
forgetting to declare something global:
>>> def f():
...     print a
...     a = a + 1
...
>>> a = 42
>>> f()
Traceback (innermost last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in f
NameError: a
While UnboundLocalError is jargon, and not the easiest error message to
comprehend, at least it confuses in a different way :)
--
Steven
It's also a consequence of local variable access being optimized with
different bytecode: the type of storage has to be determined at
compile time. The compiler could in principle figure out that "bar"
cannot be bound at that point and make it a global reference, but it
is easy to concoct situations involving loops or conditionals where
the storage cannot be determined at compile time, and so the compiler
follows the simple rule: a name is local if it is assigned anywhere in
the function.
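The two kinds of storage are visible in the bytecode (a sketch using the standard dis module; f and g are hypothetical):

```python
import dis

def f():
    print(bar)     # bar is never assigned in f: compiled as a global load

def g():
    bar = 42       # bar is assigned in g: every use is a fast local load
    print(bar)

ops_f = {ins.opname for ins in dis.get_instructions(f)}
ops_g = {ins.opname for ins in dis.get_instructions(g)}
print('LOAD_GLOBAL' in ops_f)  # True
print('LOAD_FAST' in ops_g)    # True
```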
Cheers,
Ian
Ah okay. So it is still NameError, it just doesn't look like one.
> While UnboundLocalError is jargon, and not the easiest error message to
> comprehend, at least it confuses in a different way :)
I have nothing against jargon, and specific errors are better than
generic ones (imagine if every error were thrown as Exception with a
string parameter... oh wait, that's what string exceptions are).
It still seems a little odd that a subsequent line can affect this
one. But Python's mostly doing what would be expected of it; the worst
I can come up with is this:
def f():
    print(foo)  # reference a global
    ...
    for foo in bar:  # variable only used in loop
        pass
If you're used to C++ and declaring variables inside a for loop eg
"for (int i=0;i<10;++i)", you might not concern yourself with the fact
that 'foo' is masking a global; it's not an issue, because you don't
need that global inside that loop, right? And it would be fine, except
that that global IS used somewhere else in the function. It'd be a bit
confusing to get the UnboundLocalError up on the print(foo) line (the
one that's meant to be using the global), since that line isn't wrong;
and the "obvious fix", adding an explicit "global foo" to the top of
the function, would be worse (because it would mean that the for loop
overwrites the global).
This is why I would prefer to declare variables. The Zen of Python
says that explicit is better than implicit, but in this instance,
Python goes for DWIM, guessing whether you meant global or local. It
guesses fairly well, though.
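One way out of that trap, sketched with the hypothetical names foo and bar from above: give the loop variable its own name, so nothing in the function shadows the global read:

```python
foo = 'global value'
bar = [1, 2, 3]

def f():
    print(foo)        # reads the global, as intended
    for item in bar:  # 'item', not 'foo': no masking, no UnboundLocalError
        pass

f()  # prints 'global value'
```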
Chris Angelico
On 2011-05-29, at 5:47 , Wolfgang Rohdewald wrote:
> On Sonntag 29 Mai 2011, Henry Olders wrote:
>> It seems that in Python, a variable inside a function is
>> global unless it's assigned.
>
> no, they are local
>
>> I would have thought that a function parameter would
>> automatically be considered local to the function. It doesn't
>> make sense to me to pass a global to a function as a
>> parameter.
>
> it is local. But consider what you actually passed:
> You did not pass a copy of the list but the list itself.
> You could also say you passed a reference to the list.
> All python variables only hold a pointer (the id) to
> an object. This object has a reference count and is
> automatically deleted when there are no more references
> to it.
>
> If you want a local copy of the list you can either
> do what you called being ugly or do just that within
> the function itself - which I think is cleaner and
> only required once.
>
> def fnc2(c):
>     c = c[:]
>     c[1] = 'having'
>     return c
Thank you, Wolfgang. That certainly works, but to me it is still a workaround to deal with the consequence of a particular decision. From my perspective, a function parameter should be considered as having been assigned (although the exact assignment will not be known until runtime), and as an assigned variable, it should be considered local.
Henry
Function *parameters* are names, the first *local names* of the function.
>> It doesn't make sense to me to pass a global to a function as a parameter.
You are right, in a way;-). Global *names* are just names. When you call
a function, you pass *objects* as *arguments*. Of course, you may refer
to the object by a global name to pass it, or you can pass a string
object that contains a global name.
>
> It doesn't look like a question of local or global. fnc2 is passed a
> container object and replaces item 1 in that container. You see the results
> when fnc2 prints the object it knows as `c`, and you see again when main
> prints the object it knows as `a`. Python doesn't pass parameters by
> handing around copies that can be thought of as local or global. Python
> passes parameters by binding objects to names in the callee's namespace. In
> your program the list known as `a` in main is identically the same list as
> the one known as `c` in fnc2, and what happens happens.
Right. Python has one unnamed 'objectspace'. It has many, many
namespaces: builtins, globals for each module, locals for each function
and class, and attributes for some instances. Each name and each
collection slot is associated with one object. Each object can have
multiple associations, as in the example above.
--
Terry Jan Reedy
> Python works in terms of objects having names, and one
> object can have many names.
Or no names. So it's less accurate (though better than talking of
“variables”) to speak of Python objects “having names”.
> The names b and c aren't boxes that hold things, they are -- in the
> words of one of this group's old hands -- sticky-note labels that have
> been slapped on the same object.
Right. And in that analogy, the object *still* doesn't “have a name”
(since that implies the false conclusion that the object knows its own
name); rather, the name is bound to the object, and the object is
oblivious of this.
I prefer to talk not of sticky notes, but paper tags with string; the
string leading from tag to object is an important part, and the paper
tag might not even have a name written on it, allowing the same analogy
to work for other non-name references like list indices etc.
--
\ “Pinky, are you pondering what I'm pondering?” “I think so, |
`\ Brain, but where are we going to find a duck and a hose at this |
_o__) hour?” —_Pinky and The Brain_ |
Ben Finney
> > def fnc2(c):
> >     c = c[:]
> >     c[1] = 'having'
> >     return c
>
> Thank you, Wolfgang. That certainly works, but to me it is still a
> workaround to deal with the consequence of a particular decision.
> From my perspective, a function parameter should be considered as
> having been assigned (although the exact assignment will not be known
> until runtime), and as an assigned variable, it should be considered
> local.
>
> Henry
This has nothing to do with function parameters and everything to do
with what a "variable name" actually means. You can get the same effect
with only globals:
>>> x=[1,2,3]
>>> y=x
>>> x.append(7)
>>> y
[1, 2, 3, 7]
Why in the world does "y" end with 7, even though it was appended to
"x"? Simple: because "x" and "y" are two names for the same list, as
others have explained. No functions involved, no locals, no parameters, no
scoping. Again, if "y=x" were instead "y=x[:]" then the output would be
"[1,2,3]" because "y" would refer to a copy of the list rather than the
same list.
Chris
> From my perspective, a function parameter should be considered as
> having been assigned (although the exact assignment will not be known
> until runtime), and as an assigned variable, it should be considered
> local.
That is exactly the case for Python functions.
>>> def f(a,b):
...     c,d = 3,4
...     print(locals())
...
>>> f.__code__.co_varnames # local names
('a', 'b', 'c', 'd')
>>> f(1,2)
{'a': 1, 'c': 3, 'b': 2, 'd': 4}
The requirement for a function call is that all parameters get an
assignment and that all args are used in assignments (either directly by
position or keyword, or as part of a *args or **kwds assignment).
--
Terry Jan Reedy
> I just spent a considerable amount of time and effort debugging a
> program. The made-up code snippet below illustrates the problem I
> encountered:
[...]
> Are there others who feel as I do that a function parameter should
> always be local to the function? Or am I missing something here?
The nature of Henry's misunderstanding is a disguised version of the very
common "is Python call by reference or call by value?" question that
periodically crops up. I wrote a long, but I hope useful, explanation for
the tu...@python.org mailing list, which I'd like to share here:
http://mail.python.org/pipermail/tutor/2010-December/080505.html
Constructive criticism welcome.
--
Steven
> http://mail.python.org/pipermail/tutor/2010-December/080505.html
>
>
> Constructive criticism welcome.
Informative, but it “buries the lead” as our friends in the press corps
would say.
Instead you should write as though you have no idea where the reader
will stop reading, and still give them the most useful part. Write the
most important information first, and don't bury it at the end.
In this case, I'd judge the most important information to be “what is
the Python passing model?” Give it a short name; the effbot's “pass by
object” sounds good to me.
Then explain what that means.
Then, only after giving the actual information you want the reader to go
away with, you can spend the rest of the essay giving a history behind
the craziness.
<URL:http://www.computerworld.com/s/article/print/93903/I_m_OK_The_Bull_Is_Dead>
--
\ Moriarty: “Forty thousand million billion dollars? That money |
`\ must be worth a fortune!” —The Goon Show, _The Sale of |
_o__) Manhattan_ |
Ben Finney
I agree with the gist of that. My take on this is: When I'm talking to
my boss, I always assume that the phone will ring ten seconds into my
explanation. Ten seconds is enough for "Dad, I'm OK; the bull is
dead", it's enough for "I've solved Problem X, we can move on now";
it's enough for "Foo is crashing, can't ship till I debug it". If
fortune is smiling on me and the phone isn't ringing, I can explain
that Problem X was the reason Joe was unable to demo his new module,
or that the crash in Foo is something that I know I'll be able to pin
down in a one-day debugging session, but even if I don't, my boss
knows enough to keep going with.
Of course, there's a significant difference between a mailing list
post and a detailed and well copyedited article. Quite frequently I'll
ramble on list, in a way quite inappropriate to a publication that
would be linked to as a "hey guys, here's how it is" page. Different
media, different standards.
Chris Angelico
"Forty thousand million billion THEGS quotes? That must be worth a fortune!"
-- definitely a fan of THEGS --
> Of course, there's a significant difference between a mailing list
> post and a detailed and well copyedited article. Quite frequently I'll
> ramble on list, in a way quite inappropriate to a publication that
> would be linked to as a "hey guys, here's how it is" page. Different
> media, different standards.
Right. But Steven specifically asked for constructive criticism, which I
took as permission to treat the referenced post as an article in need of
copy editing :-)
--
\ “The truth is the most valuable thing we have. Let us economize |
`\ it.” —Mark Twain, _Following the Equator_ |
_o__) |
Ben Finney
Indeed. Was just saying that there are times when you need to get the
slug out first, and times when it's okay to be a little less impactual
(if that's a word). Although it's still important to deliver your
message promptly.
Of course, there are other contexts where you specifically do NOT want
to give everything away at the beginning. Certain styles of rhetoric
demand that you set the scene, build your situation, and only at the
climax reveal what it is you are trying to say.
Ahh! wordsmithing, how we love thee.
Chris Angelico
> Steven D'Aprano <steve+comp....@pearwood.info> writes:
>
>> http://mail.python.org/pipermail/tutor/2010-December/080505.html
>>
>>
>> Constructive criticism welcome.
>
> Informative, but it “buries the lead” as our friends in the press corps
> would say.
Thank you, that's a good point.
[...]
> More on this style:
>
> <URL:http://www.computerworld.com/s/article/print/93903/I_m_OK_The_Bull_Is_Dead>
Or as they say in the fiction-writing trade, "shoot the sheriff on the
first page".
--
Steven
Could you give an example of an object that has no name ? I've missed
something ...
Laurent
def foo():
    return 5

print(foo())
The int object 5 has no name here.
Cheers,
Chris
Cool. I was thinking that "5" was the name, but
>>> 5.__add__(6)
  File "<stdin>", line 1
    5.__add__(6)
            ^
SyntaxError: invalid syntax
while
>>> a=5
>>> a.__add__(6)
11
Very well. I learned something today.
Thanks
Laurent
>>> object()
<object object at 0xb73d04d8>
--
With best regards,
Daniel Kluev
> Cool. I was thinking that "5" was the name, but
> >>> 5.__add__(6)
> File "<stdin>", line 1
> 5.__add__(6)
Try 5 .__add__(6)
Modules, classes, and functions have a .__name__ attribute (I call it
their 'definition name') used to print a representation. As best I can
remember, other builtin objects do not.
--
Terry Jan Reedy
What is the rationale for adding a space between "5" and ".__add__"?
Why does it work?
Laurent
Because . is an operator just like + * & etc.
>>> s = "hello world"
>>> s . upper ( )
'HELLO WORLD'
In the case of integer literals, you need the space, otherwise Python
will parse 5. as a float:
>>> 5.
5.0
>>> 5.__add__
  File "<stdin>", line 1
    5.__add__
            ^
SyntaxError: invalid syntax
>>> 5 .__add__
<method-wrapper '__add__' of int object at 0x8ce3d60>
--
Steven
> Could you give an example of an object that has no name ? I've missed
> something ...
>>> mylist = [None, 42, "something"]
The list object has a name, mylist.
The three objects inside the list have no names.
--
Steven
Try asking it the other way around. Why doesn't ‘5.__add__(6)’, without
the space, work?
--
\ “Telling pious lies to trusting children is a form of abuse, |
`\ plain and simple.” —Daniel Dennett, 2010-01-12 |
_o__) |
Ben Finney
It's a hint for the tokenizer.
$ cat show_tokens.py
import sys
from tokenize import generate_tokens
from cStringIO import StringIO
from token import tok_name
_name_width = max(len(name) for name in tok_name.itervalues())
def show_tokens(s):
    for token in generate_tokens(StringIO(s).readline):
        name = tok_name[token[0]]
        value = token[1]
        print "%-*s %r" % (_name_width, name, value)

if __name__ == "__main__":
    show_tokens(" ".join(sys.argv[1:]))
$ python show_tokens.py 5.__add__
NUMBER '5.'
NAME '__add__'
ENDMARKER ''
$ python show_tokens.py 5 .__add__
NUMBER '5'
OP '.'
NAME '__add__'
ENDMARKER ''
I didn't know the tokenizer. Now I understand.
Thanks
Laurent
Oh joy.
>>> [5][0].__add__([6][-1])
11
The parser just needs help to detect the intended token boundary
instead of another, unintended one. As the others have already said.
Others have given you specific answers, here is the bigger picture.
For decades, text interpreter/compilers have generally run in two phases:
1. a lexer/tokenizer that breaks the stream of characters into tokens;
2. a parser that recognizes higher-level syntax and takes appropriate
action.
Lexers are typically based on regular grammars and implemented as very
simple and fast deterministic finite-state automata. In outline (leaving
out error handling and end-of-stream handling), something like:
def lexer(stream, lookup, initial_state):
    state = initial_state
    buffer = []
    for char in stream:
        state, out = lookup[state, char]
        if out:
            # convert the list of chars to the token expected by the
            # parser, then clear the buffer
            yield output(buffer)
            buffer = []
        buffer += char
There is no backup and no lookahead (except for the fact that output
excludes the current char). For python, lookup[start,'5'] ==
in_number,False, and lookup[in_number,'.'] == in_float,False.
>>> 5..__add__(6)
11.0
works because lookup[in_float,'.'] == start,True, because buffer now
contains a completed float ready to output and '.' signals the start of
a new token.
I believe we read natural language text similarly, breaking it into
words and punctuation. I believe the ability to read programs depends on
being able to adjust the internal lexer a bit. Python is easier to read
than some other algorithm languages because it tends to have at most one
punctuation-like symbol unit between words, as is the case in the code
above.
> I just spent a considerable amount of time and effort debugging a program. The made-up code snippet below illustrates the problem I encountered:
>
> def main():
>     a = ['a list','with','three elements']
>     print a
>     print fnc1(a)
>     print a
>
> def fnc1(b):
>     return fnc2(b)
>
> def fnc2(c):
>     c[1] = 'having'
>     return c
>
> This is the output:
> ['a list', 'with', 'three elements']
> ['a list', 'having', 'three elements']
> ['a list', 'having', 'three elements']
>
> I had expected the third print statement to give the same output as the first, but variable a had been changed by changing variable c in fnc2.
>
> It seems that in Python, a variable inside a function is global unless it's assigned. This rule has apparently been adopted in order to reduce clutter by not having to have global declarations all over the place.
>
> I would have thought that a function parameter would automatically be considered local to the function. It doesn't make sense to me to pass a global to a function as a parameter.
>
> One workaround is to call a function with a copy of the list, eg in fnc1 I would have the statement "return fnc2(b[:])". But this seems ugly.
>
> Are there others who feel as I do that a function parameter should always be local to the function? Or am I missing something here?
>
My thanks to all the people who responded - I've learned a lot. Sadly, I feel that the main issue that I was trying to address, has not been dealt with. Perhaps I didn't explain myself properly; if so, my apologies.
I am trying to write python programs in a more-or-less functional programming mode, ie functions without side effects (except for print statements, which are very helpful for debugging). This is easiest when all variables declared in functions are local in scope (it would also be nice if variables assigned within certain constructs such as for loops and list comprehensions were local to that construct, but I can live without it).
It appears, from my reading of the python documentation, that a deliberate decision was made to have variables that are referenced but not assigned in a function, have a global scope. I quote from the python FAQs: "In Python, variables that are only referenced inside a function are implicitly global. If a variable is assigned a new value anywhere within the function’s body, it’s assumed to be a local. If a variable is ever assigned a new value inside the function, the variable is implicitly local, and you need to explicitly declare it as ‘global’.
Though a bit surprising at first, a moment’s consideration explains this. On one hand, requiring global for assigned variables provides a bar against unintended side-effects. On the other hand, if global was required for all global references, you’d be using global all the time. You’d have to declare as global every reference to a built-in function or to a component of an imported module. This clutter would defeat the usefulness of the global declaration for identifying side-effects." (http://docs.python.org/faq/programming.html)
This suggests that the decision to make unassigned (ie "free" variables) have a global scope, was made somewhat arbitrarily to prevent clutter. But I don't believe that the feared clutter would materialize. My understanding is that when a variable is referenced, python first looks for it in the function's namespace, then the module's, and finally the built-ins. So why would it be necessary to declare references to built-ins as globals?
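That lookup chain (local namespace, then module, then built-ins) is easy to observe directly; in this sketch, shadowing the built-in len is for illustration only:

```python
len = lambda seq: 'module-level len'  # shadows the built-in in this module

def uses_len(seq):
    return len(seq)   # found in the module namespace before the built-ins

print(uses_len([1, 2, 3]))  # 'module-level len'

del len                     # remove the shadow...
print(uses_len([1, 2, 3]))  # 3 -- lookup now falls through to the built-in
```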
What I would like is that the variables which are included in the function definition's parameter list, would be always treated as local to that function (and of course, accessible to nested functions) but NOT global unless explicitly defined as global. This would prevent the sort of problems that I encountered as described in my original post. I may be wrong here, but it seems that the interpreter/compiler should be able to deal with this, whether the parameter passing is by value, by reference, by object reference, or something else. If variables are not assigned (or bound) at compile time, but are included in the parameter list, then the binding can be made at runtime.
And I am NOT talking about variables that are only referenced in the body of a function definition; I am talking about parameters (or arguments) in the function's parameter list. As I stated before, there is no need to include a global variable in a parameter list, and if you want to have an effect outside of the function, that's what the return statement is for.
I don't believe I'm the only person who thinks this way. Here is a quote from wikipedia: "It is considered good programming practice to make the scope of variables as narrow as feasible so that different parts of a program do not accidentally interact with each other by modifying each other's variables. Doing so also prevents action at a distance. Common techniques for doing so are to have different sections of a program use different namespaces, or to make individual variables "private" through either dynamic variable scoping or lexical variable scoping." (http://en.wikipedia.org/wiki/Variable_(programming)#Scope_and_extent).
It also seems that other languages suitable for functional programming take the approach I think python should use. Here is another quote from the wikipedia entry for Common Lisp: "the use of lexical scope isolates program modules from unwanted interactions. Due to their restricted visibility, lexical variables are private. If one module A binds a lexical variable X, and calls another module B, references to X in B will not accidentally resolve to the X bound in A. B simply has no access to X. For situations in which disciplined interactions through a variable are desirable, Common Lisp provides special variables. Special variables allow for a module A to set up a binding for a variable X which is visible to another module B, called from A. Being able to do this is an advantage, and being able to prevent it from happening is also an advantage; consequently, Common Lisp supports both lexical and dynamic scope. (http://en.wikipedia.org/wiki/Common_Lisp#Determiners_of_scope).
If making python behave this way is impossible, then I will just have to live with it. But if it's a question of "we've always done it this way", or, " why change? I'm not bothered by it", then I will repeat my original question: Are there others who feel as I do that a function parameter should always be local to the function?
And again, thank you all for taking the time to respond.
Henry
Python doesn't have true globals. When we say "global" what we mean is
"module or built-in". Also, consider this code
from math import sin
def redundant_sin(x):
    return sin(x)
In Python, everything is an object. That includes functions. By your
definition, that function would either have to be written as
def redundant_sin(sin, x):
and you would have to pass the function in every time you wanted to
call it or have a "global sin" declaration in your function. And you
would need to do that for every single function that you call in your
function body.
You are still misreading the docs and the explanations you received from
the list. Let me try again.
First, there are objects and names. Calling either of them 'variables'
leads to this misunderstanding. A name refers to some object. An object
may be referenced by several names, or by none.
Second, when you declare the function `def somefunc(a, b='c')`, a and b
are both local to this function. Even if there are some global a and b,
they are 'masked' in somefunc's scope.
Docs portion you cited refer to other situation, when there is no
clear indicator of 'locality', like this:
def somefunc():
    print a
In this - and only in this - case a is considered global.
Third, when you make a function call like somefunc(obj1, obj2), Python
uses the call-by-sharing model. It assigns exactly the same object that
was referenced by obj1 to the name a, so both obj1 and the _local_ a
reference the same object. Therefore, when you modify this object, you
can see that obj1 and a both changed (because, in fact, obj1 and a are
just PyObject* pointers to exactly the same thing). However, if you
re-assign the local a or the global obj1 to another object, the other
name will keep referencing the old object:
obj1 = []
def somefunc(a):
a.append(1) # 'a' references to the list, which is referenced by
obj1, and calls append method of this list, which modifies itself in
place
global obj1
obj1 = [] # 'a' still references to original list, which is [1]
now, it have no relation to obj1 at all
somefunc(obj1)
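The same contrast as a runnable sketch (the helper names mutate/rebind are mine):

```python
lst = [1, 2, 3]

def mutate(c):
    c.append(4)   # mutates the one shared list object

def rebind(c):
    c = [99]      # rebinds only the local name; the caller is unaffected

mutate(lst)
rebind(lst)
print(lst)  # -> [1, 2, 3, 4]
```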
> Sadly, I feel that the main issue that I was trying to address, has
> not been dealt with.
False. Please go back and read what I and others wrote before.
...
> What I would like is that the variables which are included in the
> function definition's parameter list, would be always treated as
> local to that function (and of course, accessible to nested
> functions)
You would like Python to be the way it already is. Fine. For the Nth time,
PARAMETER NAMES ARE LOCAL NAMES. Period.
> but NOT global unless explicitly defined as global.
PARAMETER NAMES **CANNOT** BE DEFINED AS GLOBAL.
>>> def f(a):
...     global a
SyntaxError: name 'a' is parameter and global
Again, go back and reread what I and others wrote. I believe that you
are, in part, hypnotized by the word 'variable'. Can you define the
word? There are 10 to 20 possible variations, and yours is probably
wrong for Python.
> quote from wikipedia: "It is considered good programming practice to
> make the scope of variables as narrow as feasible so that different
> parts of a program do not accidentally interact with each other by
> modifying each other's variables.
From 'import this':
"Namespaces are one honking great idea -- let's do more of those!"
Python is loaded with namespaces.
> Doing so also prevents action at a
> distance. Common techniques for doing so are to have different
> sections of a program use different namespaces, or to make individual
> variables "private" through either dynamic variable scoping or
> lexical variable scoping."
> (http://en.wikipedia.org/wiki/Variable_(programming)#Scope_and_extent).
Python is lexically scoped.
> another quote from the wikipedia entry for Common Lisp: "the use of
> lexical scope isolates program modules from unwanted interactions.
Python is lexically scoped.
> If making python behave this way is impossible,
How do you expect us to respond when you say "Please make Python the way
it is."? Or if you say "If making Python the way it is is impossible..."?
Now, if you actually want Python to drastically change its mode of
operation and break most existing programs, do not bother asking.
> Are there others who feel as I do that a
> function parameter should always be local to the function?
Yes, we all do, and they are.
--
Terry Jan Reedy
On a sidenote, I wonder what the reason is to keep the word 'variable' in
the python documentation at all. I believe the word 'name' represents the
concept better, and those who come from other languages would be less likely
to associate wrong definitions with it.
Side point, on variable scope.
There is a philosophical split between declaring variables and not
declaring them. Python is in the latter camp; you are not required to
put "int a; char *b; float c[4];" before you use the integer, string,
and list/array variables a, b, and c. This simplifies code
significantly, but forces the language to have an unambiguous scoping
that doesn't care where you assign to a variable.
Example:
def f():
    x = 1        # x is function-scope
    if cond:     # cond is global
        x = 2    # same function-scope x as above
    print(x)     # function-scope; will be 2 if cond is true
This is fine, and is going to be what you want. Another example:
def f():
    print(x)  # intended to read the global
    # .... way down, big function, you've forgotten what you're doing
    for x in list_of_x:
        x.frob()
By using x as a loop index, you've suddenly made it take function
scope, which stops it from referencing the global. Granted, you
shouldn't be using globals with names like 'x', but it's not hard to
have duplication of variable names. As a C programmer, I'm accustomed
to being able to declare a variable in an inner block and have that
variable cease to exist once execution leaves that block; but that
works ONLY if you declare the variables where you want them.
Infinitely-nested scoping is simply one of the casualties of a
non-declarative language.
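The block-scope point is easy to demonstrate; a loop variable simply survives the loop (the function name f is illustrative):

```python
def f():
    for x in [1, 2, 3]:
        pass
    # x is still bound after the loop ends: Python has function
    # scope, not the block scope a C programmer might expect.
    return x

print(f())  # -> 3
```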
Chris Angelico
Well, this is not accurate, as you can have 'infinitely-nested
scoping' in python, in the form of nested functions. For example, you can
use map(lambda x: <expressions with x, including other
map/filter/reduce/lambdas>, list_of_x), and you will have your
isolated scopes. Although, because lambdas support only expressions,
following this style leads to awkward and complicated code (and/or
instead of if, map instead of for, and so on).
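A small sketch of that isolation: the lambda's parameter is local to the lambda and never collides with an outer name of the same spelling.

```python
nums = [1, 2, 3]
doubled = list(map(lambda x: x * 2, nums))  # this x is local to the lambda
x = "outer"   # a separate name; the lambda's parameter never touches it
print(doubled, x)  # -> [2, 4, 6] outer
```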
That's an incredibly messy workaround, and would get ridiculous if you
tried going many levels in. It's like saying that a C-style 'switch'
statement can be implemented in Python using a dictionary of
lambdas... and then trying to implement fall-through. But you're
right; a lambda does technically create something of a nested scope -
albeit one in which the only internal variables are its parameters.
Chris Angelico
> I am trying to write python programs in a more-or-less functional
> programming mode, ie functions without side effects (except for print
> statements, which are very helpful for debugging). This is easiest when
> all variables declared in functions are local in scope
They are.
> (it would also be
> nice if variables assigned within certain constructs such as for loops
> and list comprehensions were local to that construct, but I can live
> without it).
for loop variables are local to the function, by design.
List comprehension variables leak outside the comprehension. That is an
accident that is fixed in Python 3. Generator expression variables never
leaked.
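A quick check of the Python 3 behaviour described above:

```python
squares = [i * i for i in range(3)]
try:
    i  # in Python 3 the comprehension variable does not leak out
except NameError:
    print("i is local to the comprehension")
print(squares)  # -> [0, 1, 4]
```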
> It appears, from my reading of the python documentation, that a
> deliberate decision was made to have variables that are referenced but
> not assigned in a function, have a global scope.
[...]
> This suggests that the decision to make unassigned (ie "free" variables)
> have a global scope, was made somewhat arbitrarily to prevent clutter.
> But I don't believe that the feared clutter would materialize.
Then you haven't understood the problem at all.
> My
> understanding is that when a variable is referenced, python first looks
> for it in the function's namespace, then the module's, and finally the
> built-ins. So why would it be necessary to declare references to
> built-ins as globals?
How else would the functions be found if they were ONLY treated as local?
Consider this code:
def spam():
    print a_name
    print len(a_name)
This includes two names: "a_name" and "len". They both follow the same
lookup rules. Functions in Python are first-class values, just like ints,
strings, or any other type.
You don't want spam() to see any global "a_name" without it being
declared. But how then can it see "len"?
Don't think that Python could change the rules depending on whether
you're calling a name or not. Consider this:
def spam():
    print a_name
    print a_name(len)
Do you expect the first lookup to fail, and the second to succeed?
> What I would like is that the variables which are included in the
> function definition's parameter list, would be always treated as local
> to that function (and of course, accessible to nested functions) but NOT
> global unless explicitly defined as global.
They are. Always have been, always will be.
> This would prevent the sort
> of problems that I encountered as described in my original post.
No it wouldn't. You are confused by what you are seeing, and interpreting
it wrongly.
I believe that what you want is pass-by-value semantics, in the old-
school Pascal sense, where passing a variable to a function makes a copy
of it. This is not possible in Python. As a design principle, Python
never copies objects unless you ask it to.
> I may
> be wrong here, but it seems that the interpreter/compiler should be able
> to deal with this, whether the parameter passing is by value, by
> reference, by object reference, or something else.
Of course. You could create a language that supports any passing models
you want. Pascal and VB support pass-by-value ("copy on pass") and pass-
by-reference. Algol supports pass-by-value and (I think) pass-by-name.
There's no reason why you couldn't create a language to support pass-by-
value and pass-by-object. Whether this is a good idea is another story.
However, that language is not Python.
> If variables are not
> assigned (or bound) at compile time, but are included in the parameter
> list, then the binding can be made at runtime.
Er, yes? That already happens.
> And I am NOT talking
> about variables that are only referenced in the body of a function
> definition; I am talking about parameters (or arguments) in the
> function's parameter list.
You keep bringing up the function parameter list as if that made a
difference. It doesn't.
Perhaps this will enlighten you:
>>> alist = [1, 2, 3, 4]
>>> blist = alist
>>> blist[0] = -999
>>> alist
[-999, 2, 3, 4]
Passing alist to a function is no different to any other name binding. It
doesn't make a copy of the list, it doesn't pass a reference to the name
"alist", it passes the same object to a new scope.
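One way to see that the callee receives the very same object, not a copy (the helper name receives is mine):

```python
def receives(b, original):
    return b is original  # identity test: the parameter is bound
                          # to the exact same object the caller passed

alist = [1, 2, 3, 4]
print(receives(alist, alist))  # -> True
```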
> As I stated before, there is no need to
> include a global variable in a parameter list, and if you want to have
> an effect outside of the function, that's what the return statement is
> for.
Function parameters are never global. You are misinterpreting what you
see if you think they are.
[...]
> If making python behave this way is impossible, then I will just have to
> live with it. But if it's a question of "we've always done it this way",
> or, " why change? I'm not bothered by it", then I will repeat my
> original question: Are there others who feel as I do that a function
> parameter should always be local to the function?
They are. You are misinterpreting what you see.
--
Steven
> This suggests that the decision to make unassigned (ie "free"
> variables) have a global scope, was made somewhat arbitrarily to
> prevent clutter. But I don't believe that the feared clutter would
> materialize. My understanding is that when a variable is referenced,
> python first looks for it in the function's namespace, then the
> module's, and finally the built-ins. So why would it be necessary to
> declare references to built-ins as globals?
Not for the builtins, but for the global ones.
Suppose you have a module
def f(x):
    return 42

def g(x, y):
    return f(x) + f(y)
Would you really want to need a "global f" inside g?
Besides, this doesn't have to do with your original problem at all. Even
then, a
def h(x):
    x.append(5)
    return x
would clobber the given object, because x is just a name for the same
object which the caller has.
> What I would like is that the variables which are included in the
> function definition's parameter list, would be always treated as
> local to that function (and of course, accessible to nested
> functions)
They are - in terms of name binding. In Python, you always have objects
which can be referred to from a variety of places under different names.
Maybe what you want are immutable objects (tuples) instead of mutable
ones (lists)...
> I don't believe I'm the only person who thinks this way. Here is a
> quote from wikipedia: "It is considered good programming practice to
> make the scope of variables as narrow as feasible
Again: It is the way here.
Think of C: there you can have a
int f(int *x) {
    *x = 42;
    return *x;
}
The scope of the variable x is local here. Same in Python.
The object referred to by *x is "somewhere else", by design. Same in Python.
> If making python behave this way is impossible, then I will just have
> to live with it.
Even if I still think that you are confusing "scope of a name binding"
and "scope of an object", it is a conceptual thing.
Surely it would have been possible to do otherwise, but then it would be
a different language. Objects can be mutable, period.
In MATLAB, e.g., you have what you desire here: you always have to pass
your object around and get another one back, even if you just add or
remove a field of a struct or change the value of a field.
HTH,
Thomas
the parameter is local but it points to an object from an outer
scope - that could be the scope of the calling function or maybe
the global scope. So if you change the value of this parameter,
you change that object from outer scope. But the parameter
itself is still local. If you do
def fnc2(c):
    c = 5

the passed object will not be changed; c now points to another
object. This is different from other languages, where the "global"
c would change (when passing c by reference)
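Spelled out as a runnable sketch, using the original poster's list:

```python
def fnc2(c):
    c = 5  # rebinds the local name c only; the caller's list is untouched

lst = ['a list', 'with', 'three elements']
fnc2(lst)
print(lst)  # -> ['a list', 'with', 'three elements']
```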
what you really seem to want is that a function by default
cannot have any side effects (you have a side effect if a
function changes things outside of its local scope). But
that would be a very different language than python
did you read the link Steven gave you?
http://mail.python.org/pipermail/tutor/2010-December/080505.html
--
Wolfgang
> On a sidenote, I wonder what is the reason to keep word 'variable' in
> python documentation at all. I believe word 'name' represents concept
> better, and those, who come from other languages, would be less likely
> to associate wrong definitions with it.
I agree, but the term “variable” is used freely within the Python
development team to refer to Python's name-to-object bindings, and that
usage isn't likely to stop through our efforts.
So the burden is unfortunately on us to teach each newbie that this
often-used term means something other than what they might expect.
--
\ “In case you haven't noticed, [the USA] are now almost as |
`\ feared and hated all over the world as the Nazis were.” —Kurt |
_o__) Vonnegut, 2004 |
Ben Finney
This can be done in Python (to some degree), like this
@copy_args
def somefunc(a, b, c):
...
where copy_args would explicitly call deepcopy() on all args passed to
the function.
Or, to save some performance, wrap them in some CopyOnEdit proxy
(although this is tricky, as getattr/getitem can modify the object too
if the class overrides them).
Obviously it would not save you from functions which use
global/globals() or some other ways to change state outside their
scope.
>
> what you really seem to want is that a function by default
> cannot have any side effects (you have a side effect if a
> function changes things outside of its local scope). But
> that would be a very different language than python
You're partially right - what I want is a function that is free of side effects back through the parameters passed in the function call. Side effects via globals or print statements are fine by me.
python seems to be undergoing changes all the time. List comprehensions were added in python 2.0, according to wikipedia. I like list comprehensions and use them all the time because they are powerful and concise.
>
> did you read the link Steven gave you?
> http://mail.python.org/pipermail/tutor/2010-December/080505.html
Yes, I did, thanks.
Henry
So, you have no problem with *global* side effects, but side effects
with a /more constrained/ scope bother you?
That's kinda odd, IMO.
Cheers,
Chris
As I've pointed, you can make decorator to do that. Adding @copy_args
to each function you intend to be pure is not that hard.
import decorator
import copy

@decorator.decorator
def copy_args(f, *args, **kw):
    nargs = []
    for arg in args:
        nargs.append(copy.deepcopy(arg))
    nkw = {}
    for k, v in kw.iteritems():
        nkw[k] = copy.deepcopy(v)
    return f(*nargs, **nkw)

@copy_args
def test(a):
    a.append(1)
    return a

>>> l = [0]
>>> test(l)
[0, 1]
>>> l
[0]
>>> inspect.getargspec(test)
ArgSpec(args=['a'], varargs=None, keywords=None, defaults=None)
So this decorator achieves needed result and preserves function signatures.
Yes, it would make Python quite different. If suddenly you couldn't
pass a mutable object to a function to get it muted (that sounds
wrong), then code will break. Also, there's a fairly serious
performance penalty to copying everything when it's not necessary. As
has been suggested, you can specifically and deliberately cause this
effect for any function(s) you wish to "protect" in this way; there's
no need to change the language's fundamentals to do it.
Chris Angelico
I don't know of any object-oriented language where it is not
possible to change objects passed in as parameters. It
is up to the passed object (a list in your case) to allow
or disallow manipulations, no matter how they are invoked,
and the object is the same in the calling code and in the
called function.
--
Wolfgang
> what I want is a function that is free of side effects back through
> the parameters passed in the function call.
You can get that by refraining from mutating parameter objects.
Simple as that.
Just do not expect Python to enforce that discipline on everyone else.
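For the original example, that non-mutating discipline looks like this (the helper name with_having is mine):

```python
def with_having(c):
    # pure: build and return a new list instead of mutating the argument
    return c[:1] + ['having'] + c[2:]

a = ['a list', 'with', 'three elements']
print(with_having(a))  # -> ['a list', 'having', 'three elements']
print(a)               # -> ['a list', 'with', 'three elements'] (unchanged)
```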
To be really functional, and never mutate objects, do not use Python
lists, which are arrays. Use linked-list trees, like Lisp languages and
perhaps others do. One can easily do this with tuples, or a subclass of
tuples, or a class wrapping tuples.
Linked-lists and functional programming go together because prepending
to a linked list creates a new object while appending to a Python list
mutates an existing list. Similarly, popping from a linked list
retrieves an item and an existing sublist, while popping from a Python
list retrieves an item and mutates the list.
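A minimal tuple-based linked list along those lines (the helper names cons/pop are mine; None marks the empty list):

```python
def cons(item, lst):
    return (item, lst)  # prepending builds a NEW pair; nothing is mutated

def pop(lst):
    return lst[0], lst[1]  # head plus the existing tail, again no mutation

lst = cons(1, cons(2, None))
head, tail = pop(lst)
print(head, tail)  # -> 1 (2, None)
```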
--
Terry Jan Reedy
> Clearly, making a copy within the function eliminates the possibility of
> the side effects caused by passing in mutable objects.
Mutable objects and mutating methods and functions are a *feature* of
Python. If you do not like them, do not use them.
> Would having the compiler/interpreter do this automatically
> make python so much different?
Yes. How would you then write a function like list.sort or list.pop?
It is fundamental that parameters are simply local names that must be
bound as part of the calling process. After that, they are nothing special.
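list.sort illustrates the point: it only makes sense because the method can mutate the one shared object its caller also holds.

```python
lst = [3, 1, 2]
result = lst.sort()  # sorts in place; returns None by convention,
                     # precisely to signal that the list was mutated
print(lst, result)   # -> [1, 2, 3] None
```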
Python is a language for adults that take responsibility for what they
do. If you do not like argument-mutating functions, then do not write
them and do not use them (without making a copy yourself).
Python was not designed to model timeless immutable mathematics. It is
an information-object manipulation language and in real life, we mutate
collections and associations all the time.
--
Terry Jan Reedy
There is no "decorator" module in the standard library. This must be
some third-party module. The usual way to do this would be:
import functools
import copy

def copy_args(f):
    @functools.wraps(f)
    def wrapper(*args, **kw):
        nargs = map(copy.deepcopy, args)
        nkw = dict(zip(kw.keys(), map(copy.deepcopy, kw.values())))
        return f(*nargs, **nkw)
    return wrapper
Note that this will always work, whereas the "decorator.decorator"
version will break if the decorated function happens to take a keyword
argument named "f".
Cheers,
Ian
You want a functional language.
You can simulate that in python by using tuples in place of lists.
def fnc2(c):
    c[1] = 'having'
    return c
will of course then give you an error that tuples are not assignable
(which seems to be what you want?)
So you then use (something like)
def fnc2(c): return c[0:1] + c[2:]
No reason, good call.
> It means you will copy the keys as well, however they will (almost)
> certainly be strings which is effectively a no-op.
I think the keys will certainly be strings. Is there any scenario
where they might not be?
> So you then use (something like)
>
> def fnc2(c): return c[0:1] + c[2:]
Er sorry -- that should have been
def fnc2(c): return c[0:1] + ('having',) + c[2:]
Shoot, that's easy! Just write your function to not have any!
~Ethan~
It would be a different language.
~Ethan~
Yes, but its very useful for decorators and provides some
not-readily-available functionality.
http://pypi.python.org/pypi/decorator/3.3.1
> Note that this will always work, whereas the "decorator.decorator"
> version will break if the decorated function happens to take a keyword
> argument named "f".
No, it will not. It's the magic of the decorator library: it is
signature-preserving, while your variant breaks the function signature and
causes problems for any code that relies on signatures (this happens with
Pylons, for example).
>>> @copy_args
... def test(a, f=None):
... print f
...
>>> test([], f=123)
123
Basically decorator.decorator uses exec to create a new function with the
signature of the function you pass to your decorator, so it does not
matter what names you used for args in the decorator itself.
Ah, I see. I assumed it was much simpler than it is. I found a way
to break it with Python 3, though:
>>> @copy_args
... def test(*, f):
... return f
...
>>> test(f=1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<string>", line 2, in test
File "<stdin>", line 6, in copy_args
TypeError: test() needs keyword-only argument f
The interesting thing here is that the decorated function has exactly
the correct function signature; it just doesn't work.
Cheers,
Ian