I thought I thoroughly understood eval, exec, globals, and locals, but I
encountered something bewildering today. I have some short files I want to
exec. (Users of my application write them, and the application gives them a
command that opens a file dialog box and execs the chosen file. Users are
expected to be able to write simple Python scripts, including function
definitions. Neither security nor errors are relevant for the purposes of
this discussion, though I do deal with them in my actual code.)

Here is a short piece of code to exec a file and report its result. (The
file being exec'd must assign 'result'.)

def dofile(filename):
    ldict = {'result': None}
    with open(filename) as file:
        exec(file.read(), globals(), ldict)
    print('Result for {}: {}'.format(filename, ldict['result']))

First I call dofile() on a file containing the following:
################################
def fn(arg):
    return sum(range(arg))

result = fn(5)
################################
The results are as expected.
Next I call dofile() on a slightly more complex file, in which one function
calls another function defined earlier in the same file.
################################
def fn1(val):
    return sum(range(val))

def fn2(arg):
    return fn1(arg)

result = fn2(5)
################################
This produces a surprise:
NameError: global name 'fn1' is not defined
[1] How is it that fn2 can be called from the top-level of the script but
fn1 cannot be called from fn2?
[2] Is this correct behavior or is there something wrong with Python here?
[3] How should I write a file to be exec'd that defines several functions
that call each other, as in the trivial fn1-fn2 example above?
I can get both your examples to work using the 'imp' module.
http://docs.python.org/3.1/library/imp.html#module-imp
I used Python 2.6.4. Note that 3.1 also has the 'importlib' module.

import imp

# the name of the Python file written by a user (without the .py extension)
name = 'test1'
fp, pathname, description = imp.find_module(name)
try:
    test1 = imp.load_module(name, fp, pathname, description)
finally:
    # remember to close the file, even if the load fails (see docs)
    if fp:
        fp.close()
print test1.result
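
For readers on a current Python 3 (the imp module was removed in 3.12),
roughly the same idea can be sketched with importlib. This is only an
illustration, not what the poster above ran: the helper name load_script,
the module name "user_script", and the temporary file are all made up for
the demo.

```python
import importlib.util
import os
import tempfile

def load_script(path, name="user_script"):
    """Load the file at `path` as a module object (hypothetical helper).

    The module is not registered in sys.modules, so nothing is added to
    the application's import namespace.
    """
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    # Runs the file with the module's own dict as its global namespace.
    spec.loader.exec_module(module)
    return module

# Stand-in for a user-written script, saved to a temporary file.
source = (
    "def fn1(val):\n"
    "    return sum(range(val))\n"
    "\n"
    "def fn2(arg):\n"
    "    return fn1(arg)\n"
    "\n"
    "result = fn2(5)\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test1.py")
    with open(path, "w") as f:
        f.write(source)
    mod = load_script(path)

print(mod.result)
```

Because the file runs as a real module here, fn2 finds fn1 through the
module's globals and the fn1-fn2 example works unchanged.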
> Next I call dofile() on a slightly more complex file, in which one
> function calls another function defined earlier in the same file.
>
> ################################
> def fn1(val):
>     return sum(range(val))
>
> def fn2(arg):
>     return fn1(arg)
>
> result = fn2(5)
> ################################
>
> This produces a surprise:
>
> NameError: global name 'fn1' is not defined
>
> [1] How is it that fn2 can be called from the top-level of the script
> but fn1 cannot be called from fn2?
This might help you to see what's going on. Define your own cut-down
version of the global namespace, and a local namespace, and a string to
execute:
myglobals = {'__builtins__': None, 'globals': globals, 'locals': locals,
             'print': print}
mylocals = {'result': None}
s = """def f():
    print("Globals inside f:", globals())
    print("Locals inside f:", locals())
print("Globals at the top level:", globals())
print("Locals at the top level:", locals())
f()
"""
exec(s, myglobals, mylocals)
And this is what you should see:
Globals at the top level: {'__builtins__': None, 'print': <built-in
function print>, 'globals': <built-in function globals>, 'locals':
<built-in function locals>}
Locals at the top level: {'result': None, 'f': <function f at 0xb7ddeeac>}
Globals inside f: {'__builtins__': None, 'print': <built-in function
print>, 'globals': <built-in function globals>, 'locals': <built-in
function locals>}
Locals inside f: {}
Does that clarify what's going on?
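
The same point can be checked directly against the fn1-fn2 script. The
following is a minimal sketch, assuming Python 3; the names gdict, ldict,
and error are chosen only for this illustration. It execs the script with
two separate dictionaries and then looks at where the names ended up.

```python
source = """
def fn1(val):
    return sum(range(val))

def fn2(arg):
    return fn1(arg)

result = fn2(5)
"""

gdict = {}                  # passed as globals
ldict = {"result": None}    # passed as locals

try:
    exec(source, gdict, ldict)
except NameError as exc:
    error = exc
else:
    error = None

print("error:", error)
# Both defs were bound in the locals dict before the failing call.
print("fn1 in locals dict: ", "fn1" in ldict)
print("fn1 in globals dict:", "fn1" in gdict)
# fn2 resolves free names through the globals dict, where fn1 is absent.
print("fn2.__globals__ is gdict:", ldict["fn2"].__globals__ is gdict)
```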
> [2] Is this correct behavior or is there something wrong with Python
> here?
This certainly surprised me too. I don't know if it is correct or not,
but it goes back to at least Python 2.5.
> [3] How should I write a file to be exec'd that defines several
> functions that call each other, as in the trivial fn1-fn2 example above?
My preference would be to say, don't use exec, just import the module.
Put responsibility on the user to ensure that they set a global "result",
and then just do this:
mod = __import__('user_supplied_file_name')
result = mod.result
But if that's unworkable for you, then try simulating the namespace setup
at the top level of a module. The thing to remember is that in the top
level of a module:
>>> globals() is locals()
True
so let's simulate that:
myglobals = {'result': None}  # You probably also want __builtins__
s = """def f():
    return g() + 1

def g():
    return 2

result = f()
"""
exec(s, myglobals, myglobals)
print(myglobals['result'])
This works for me.
--
Steven
> def dofile(filename):
>     ldict = {'result': None}
>     with open(filename) as file:
>         exec(file.read(), globals(), ldict)
>     print('Result for {}: {}'.format(filename, ldict['result']))
>
> Next I call dofile() on a slightly more complex file, in which one
> function calls another function defined earlier in the same file.
>
> ################################
> def fn1(val):
>     return sum(range(val))
>
> def fn2(arg):
>     return fn1(arg)
>
> result = fn2(5)
> ################################
>
> This produces a surprise:
>
> NameError: global name 'fn1' is not defined
Ok - short answer or long answer?
Short answer: Emulate how modules work. Make globals() the same as locals().
(BTW, are you sure you want the file to run with the *same* globals as the
caller? It sees the dofile() function and everything you have
defined/imported there...). Simply use: exec(..., ldict, ldict)
> [1] How is it that fn2 can be called from the top-level of the script
> but fn1 cannot be called from fn2?
Long answer: First, add these lines before result=fn2(5):
print("globals=", globals().keys())
print("locals=", locals().keys())
import dis
dis.dis(fn2)
and you'll get:
globals= dict_keys(['dofile', '__builtins__', '__file__', '__package__',
'__name__', '__doc__'])
locals= dict_keys(['result', 'fn1', 'fn2'])
So fn1 and fn2 are defined in the *local* namespace (as always happens in
Python, unless you use the global statement). Now look at the code of fn2:
  6           0 LOAD_GLOBAL              0 (fn1)
              3 LOAD_FAST                0 (arg)
              6 CALL_FUNCTION            1
              9 RETURN_VALUE
Again, the compiler knows that fn1 is not local to fn2, so it must be
global (because there is no other enclosing scope) and emits a LOAD_GLOBAL
instruction. But when the code is executed, 'fn1' is not in the global
scope...
Solution: make 'fn1' exist in the global scope. Since assignments (implied
by the def statement) are always in the local scope, the only alternative
is to make both scopes (global and local) the very same one.
This shows that the identity "globals() is locals()" is essential for the
module system to work.
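
The difference between the two kinds of lookup can also be checked
programmatically rather than by reading a disassembly. This is a rough
sketch, assuming CPython 3; opcode names are an implementation detail and
the exact instruction listing varies between versions, so only the opcode
names are inspected here.

```python
import dis

source = """
def fn1(val):
    return sum(range(val))

def fn2(arg):
    return fn1(arg)

result = fn2(5)
"""

code = compile(source, "<script>", "exec")
ns = {}
exec(code, ns, ns)  # one shared namespace, so the script runs cleanly

# Opcodes used by the top-level code of the script (nested function
# bodies are separate code objects and are not included here).
top_level_ops = {ins.opname for ins in dis.get_instructions(code)}
# Opcodes used inside the body of fn2.
fn2_ops = {ins.opname for ins in dis.get_instructions(ns["fn2"])}

# Top-level code binds and reads names via the local namespace first...
print("top level uses STORE_NAME:", "STORE_NAME" in top_level_ops)
print("top level uses LOAD_NAME: ", "LOAD_NAME" in top_level_ops)
# ...while fn2 looks up fn1 in the globals only.
print("fn2 uses LOAD_GLOBAL:     ", "LOAD_GLOBAL" in fn2_ops)
```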
> [2] Is this correct behavior or is there something wrong with Python
> here?
It's perfectly logical once you get it... :)
> [3] How should I write a file to be exec'd that defines several
> functions that call each other, as in the trivial fn1-fn2 example above?
Use the same namespace for both locals and globals: exec(file.read(),
ldict, ldict)
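
Applied to the original dofile(), that suggestion might look like the
sketch below. It assumes Python 3 and returns the result instead of
printing it so the value can be checked; the temporary file and the names
namespace, script, and value are only there to make the example
self-contained.

```python
import os
import tempfile

def dofile(filename):
    """Exec a user script in one fresh namespace and return its 'result'."""
    namespace = {'result': None}
    with open(filename) as file:
        # The same dict serves as globals and locals, as in a module.
        exec(file.read(), namespace, namespace)
    return namespace['result']

# Stand-in for the second user script from the question.
script = (
    "def fn1(val):\n"
    "    return sum(range(val))\n"
    "\n"
    "def fn2(arg):\n"
    "    return fn1(arg)\n"
    "\n"
    "result = fn2(5)\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "script.py")
    with open(path, "w") as f:
        f.write(script)
    value = dofile(path)

print('Result: {}'.format(value))
```

Using a fresh dict rather than the caller's globals() also means the
script neither sees nor modifies the application's own names.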
--
Gabriel Genellina
Statements that bind new names -- assignment, def, and class -- do so
in the local scope. While exec'ing a file the local scope is
determined by the arguments passed to exec; in my case, I passed an
explicit local scope. It was particularly obtuse of me not to notice
the effects of this because I was intentionally using it so that an
assignment to 'result' in the exec'd script would enable the exec'ing
code to retrieve the value of result. However, although the purity of
Python with respect to the binding actions of def and class statements
is wonderful and powerful, it is very difficult cognitively to view a
def on a page and think "aha! that's just like an assignment of a
newly created function to a name", even though that is precisely the
documented behavior of def. So mentally I was making an incorrect
distinction between what was getting bound locally and what was
getting bound globally in the exec'd script.
Moreover, the normal behavior of imported code, in which any function
in the module can refer to any other function in the module, seduced
me into this inappropriate distinction. To my eye I was just defining
and using function definitions the way they are in modules. There is a
key difference between module import and exec: as Steven pointed out,
inside a module locals() is globals(). On further reflection, I will
add that what appears to be happening is that during import both the
global and local dictionaries are set to a copy of the globals() from
the importing scope and that copy becomes the value of the module's
__dict__ once import has completed successfully. Top-level statements
bind names in locals(), as always, but because locals() and globals()
are the same dictionary, they are also binding them in globals(), so
that every function defined in the module uses the modified copy of
globals -- the value of the module's __dict__ -- as its globals() when
it executes. Because exec leaves locals() and globals() distinct,
functions defined at the top level of a string being exec'd don't see
other assignments and definitions that are also in the string.
Another misleading detail is that top-level expressions in the exec
can use other top-level names (assigned, def'd, etc.), which they will
find in the exec string's local scope, but function bodies do not see
the string's local scope. The problem I encountered arises because the
function definitions need to access each other through the global
scope, not the local scope. In fact, the problem would arise if one of
the functions tried to call itself recursively, since its own name
would not be in the global scope. So we have a combination of two
distinctions: the different ways module import and exec use globals
and locals and the difference between top-level statements finding
other top-level names in locals but functions looking for them in
globals.
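
The recursion case mentioned above is easy to try. A minimal sketch,
assuming Python 3; countdown and run are illustrative names, not anything
from my application.

```python
source = """
def countdown(n):
    return 0 if n == 0 else countdown(n - 1)

result = countdown(3)
"""

def run(globals_dict, locals_dict):
    """Exec the script and return its result, or a description of the NameError."""
    try:
        exec(source, globals_dict, locals_dict)
    except NameError as exc:
        return "NameError: {}".format(exc)
    return locals_dict["result"]

# Separate dicts: the top-level call finds countdown in the locals dict,
# but the recursive call inside the function looks in the globals dict.
split = run({}, {})

# One shared dict: the recursive lookup succeeds.
shared_ns = {}
shared = run(shared_ns, shared_ns)

print("separate dicts:", split)
print("shared dict:   ", shared)
```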
Sorry for the long post. These distinctions go deep into the semantics
of Python namespaces, which though they are lean, pure, and beautiful,
have some consequences that can be surprising -- more so the more
familiar you are with other languages that do things differently.
Oh, and as far as using import instead of exec for my scripts, I don't
think that's appropriate, if only because I don't want my
application's namespace polluted by what could be many of these pseudo-
modules users might load during a session. (Yes, I could remove the
name once the import is finished, but importing solely for side-
effects rather than to use the imported module is offensive. Well, I
would be using one module name -- result -- but that doesn't seem to
justify the complexities of setting up the import and accessing the
module when exec does in principle just what I need.)
Finally, once all of this is really understood, there is a simple way
to change an exec string's def's to bind globally instead of locally:
simply begin the exec with a global declaration for any function
called by one of the others. In my example, adding a "global fn1" at
the beginning of the file fixes it so exec works.
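################################
global fn1  # enable fn1 to be called from fn2!

def fn1(val):
    return sum(range(val))

def fn2(arg):
    return fn1(arg)

result = fn2(5)
################################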
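
A self-contained sketch of the global-declaration approach, assuming
Python 3 and using two explicit dictionaries (gdict and ldict, names picked
for this illustration) so that it is visible where each name is bound:

```python
source = """
global fn1  # bind fn1 in the globals dict instead of the locals dict

def fn1(val):
    return sum(range(val))

def fn2(arg):
    return fn1(arg)

result = fn2(5)
"""

gdict = {}
ldict = {}
exec(source, gdict, ldict)

print("result:", ldict["result"])
# Only the declared name moves to the globals dict.
print("fn1 in globals:", "fn1" in gdict, "| fn1 in locals:", "fn1" in ldict)
print("fn2 in globals:", "fn2" in gdict, "| fn2 in locals:", "fn2" in ldict)
```

One consequence worth noting: if the dictionary passed as globals is the
application's own globals(), as in dofile(), each declared name is written
into the application's namespace and stays there after the script ends.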
This is very helpful additional information and clarification! Thanks.
>
> This shows that the identity "globals() is locals()" is essential
> for the module system to work.
Yes, though I doubt more than a few Python programmers would guess
that identity.
>
>> [2] Is this correct behavior or is there something wrong with
>> Python here?
>
> It's perfectly logical once you get it... :)
I think I'm convinced.
>
>> [3] How should I write a file to be exec'd that defines several
>> functions that call each other, as in the trivial fn1-fn2 example above?
>
> Use the same namespace for both locals and globals:
> exec(file.read(), ldict, ldict)
>
I was going to say that this wouldn't work because the script couldn't
use any built-in names, but the way exec works if the value passed for
the globals argument doesn't contain an entry for '__builtins__' it
adds one. I would have a further problem in that there are some names
I want users to be able to use in their scripts, in particular classes
that have been imported into the scope of the code doing the exec, but
come to think of it I don't want to expose the entire globals()
anyway. The solution is to use the same dictionary for both globals
and locals, as you suggest, to emulate the behavior of module import,
and explicitly add to it the names I want to make available (and since
they are primarily classes, there are relatively few of those, as
opposed to an API of hundreds of functions). Thanks for the help.
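
Here is a sketch of what that might look like, assuming Python 3. Widget
stands in for one of the application classes and run_script is a made-up
helper name; neither is from the real application.

```python
class Widget:
    """Stand-in for an application class that scripts may use (hypothetical)."""
    def __init__(self, size):
        self.size = size

def run_script(source, exposed):
    """Exec `source` in one fresh namespace seeded with the `exposed` names."""
    namespace = dict(exposed)   # copy, so scripts cannot alter the original
    namespace['result'] = None
    exec(source, namespace, namespace)
    return namespace

script = """
def make(size):
    return Widget(size)

def total(sizes):
    return sum(make(s).size for s in sizes)

result = total([1, 2, 3])
"""

ns = run_script(script, {'Widget': Widget})

print("result:", ns['result'])
# exec added the builtins entry itself, so sum() was available.
print("__builtins__ added:", '__builtins__' in ns)
# Names from the exec'ing code that were not exposed stay invisible.
print("run_script visible:", 'run_script' in ns)
```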
> On further reflection, I will add that
> what appears to be happening is that during import both the global and
> local dictionaries are set to a copy of the globals() from the importing
> scope and that copy becomes the value of the module's __dict__ once
> import has completed successfully.
I have no idea why you think that. The module dict starts empty except
for __name__, __file__, and perhaps a couple of other 'hidden' items. It
is not a copy and has nothing to do with importing scopes.
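
A quick way to see this, as a rough sketch assuming Python 3:
types.ModuleType builds a bare module object, which only approximates what
the import machinery creates, and marker_in_importer is a name invented
for the demonstration.

```python
import types

marker_in_importer = "only defined in the importing scope"

mod = types.ModuleType("scratch")

# A new module dict holds just a few dunder entries.
print(sorted(vars(mod)))
# Nothing from the creating scope has been copied into it.
print("marker copied:", "marker_in_importer" in vars(mod))

# Executing code with the module dict as its namespace then fills it in,
# and functions defined there can see each other.
exec("def f():\n    return g() + 1\n\ndef g():\n    return 2\n\nresult = f()\n",
     vars(mod))
print("result:", mod.result)
```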
> and that copy becomes the value of the module's __dict__ once
> import has completed successfully.
That new dict becomes .... .
> Because exec leaves locals() and globals() distinct,
Not necessarily.
In 3.x, at least,
exec(s)
executes s in the current scope. If this is top level, where locals is
globals, then the same should be true within exec.
d = {}
exec(s, d)
In 3.x, at least, d will also be used as locals.
exec(s, d, d)
Again, globals and locals are not distinct.
It would seem that in 3.x, the only way for exec to have distinct
globals and locals is to call exec(s) where they are distinct or to pass
distinct globals and locals.
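
The one-dictionary and two-dictionary cases can be compared side by side.
A minimal sketch, assuming Python 3; the variable names are chosen for
this illustration only.

```python
s = """
def f():
    return g() + 1

def g():
    return 2

result = f()
"""

# Only globals given: d is used for locals as well, so f can find g.
d = {}
exec(s, d)
one_dict = d["result"]

# Two different dicts: f and g are bound in l2, but f looks for g in g2.
g2, l2 = {}, {}
try:
    exec(s, g2, l2)
    two_dicts = l2["result"]
except NameError as exc:
    two_dicts = "NameError: {}".format(exc)

print("exec(s, d):     ", one_dict)
print("exec(s, g2, l2):", two_dicts)
```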
Some of the issues of this thread are discussed in Language Reference
4.1, Naming and Binding. I suppose it could be clearer than it is, but
the addition of nonlocal scope complicated things.
Terry Jan Reedy