
List comprehension - NameError: name '_[1]' is not defined ?


mario ruggier

Jan 15, 2009, 6:29:59 AM
Hello,

I would like to evaluate list comprehension expressions, from within
which I'd like to call a function. For a first level it works fine but
for second level it seems to lose the "_[1]" variable it uses
internally to accumulate the results. Some sample code is:

class GetItemEvaluator(object):
    def __init__(self):
        self.globals = globals() # some dict (never changes)
        self.globals["ts"] = self.ts
        self.globals["join"] = "".join
        self.locals = {} # changes on each evaluation
    def __getitem__(self, expr):
        return eval(expr, self.globals, self.locals)
    def ts(self, ts, name, value):
        self.locals[name] = value
        #print ts, name, value, "::::", self.locals, "::::", ts % self
        return ts % self

gie = GetItemEvaluator()
gie.locals["inner"] = ("a","b","c","d")
print """
pre %(join([ts("%s."%(j)+'%(k)s ', 'k', k) for j,k in enumerate
(inner)]))s post
""" % gie
# OK, outputs: pre 0.a 1.b 2.c 3.d post

gie = GetItemEvaluator()
gie.locals["outer"] = [ ("m","n","o","p"), ("q","r","s","t")]
print """
pre %(join([ts(
'''inner pre
%(join([ts("%s.%s."%(i, j)+'%(k)s ', 'k', k) for j,k in enumerate
(inner)]))s
inner post''',
"inner", inner) # END CALL outer ts()
for i,inner in enumerate(outer)])
)s post
""" % gie

The second 2-level comprehension gives:

  File "scratch/eval_test.py", line 8, in __getitem__
    return eval(expr, self.globals, self.locals)
  File "<string>", line 4, in <module>
NameError: name '_[1]' is not defined

If the print were enabled, the last line printed would be:

0.3.%(k)s k p :::: {'outer': [('m', 'n', 'o', 'p'), ('q', 'r', 's',
't')], 'i': 0, 'k': 'p', 'j': 3, '_[1]': ['0.0.m ', '0.1.n ', '0.2.o
'], 'inner': ('m', 'n', 'o', 'p')} :::: 0.3.p

i.e. it has correctly processed the first inner sequence, until the
(last) "p" element. But on exit of the last inner ts() call, it seems
to lose the '_[1]' on self.locals.

Any ideas why?

Note, I'd like the first parameter to ts() to be as independent as
possible from the context of the enclosing expression, a sort of
independent mini-template. Thus, the i,j enumerate counters would
normally not be substituted *within* the comprehension itself, but in a
similar way to how k is evaluated, within the call to ts() -- I added
them this way here to make the execution trail easier to follow. Anyhow,
within that mini-template, I'd like to embed other expressions for the %
operator, and those may of course also be list comprehensions.

Thanks!

bearoph...@lycos.com

Jan 15, 2009, 7:48:23 AM
mario ruggier, that's a hack that you can forget. Your code can't be
read. Don't use list comps for that purpose. Write code that can be
read.

Bye,
bearophile

mario ruggier

Jan 15, 2009, 7:52:02 AM
On Jan 15, 12:29 pm, mario ruggier <mario.rugg...@gmail.com> wrote:
> Any ideas why?
>
> Note, i'd like that the first parameter to ts() is as independent as
> possible from the context in expression context, a sort of independent

> mini-template. Thus, the i,j enumerate counters would normally not be
> subbed *within* the comprehension itself, but in a similar way to how
> k is evaluated, within the call to ts() -- I added them this way here
> to help follow easier what the execution trail is. Anyhow, within that
> mini-template, i'd like to embed other expressions for the % operator,
> and that may of course also be list comprehensions.

OK, here's the same sample code, somewhat simplified,
and maybe easier to follow:


class GetItemEvaluator(object):
    def __init__(self):
        self.globals = globals() # some dict (never changes)
        self.globals["ts"] = self.ts
        self.globals["join"] = " ".join
        self.locals = {} # changes on each evaluation
    def __getitem__(self, expr):
        return eval(expr, self.globals, self.locals)

    def ts(self, ts):
        print "ts:", ts, "::::", self.locals
        return ts % self

# one level


gie = GetItemEvaluator()
gie.locals["inner"] = ("a","b","c","d")

TS1 = """
pre %(join([
ts('%(j)s.%(k)s')
for j,k in enumerate(inner)]))s post
"""
OUT1 = TS1 % gie
print "Output 1:", OUT1

# two level


gie = GetItemEvaluator()
gie.locals["outer"] = [ ("m","n","o","p"), ("q","r","s","t")]

TS2 = """
leading %(join([
ts(
'''
pre %(join([
ts('%(i)s.%(j)s.%(k)s')
for j,k in enumerate(inner)]))s post
''' # identical to TS1, except for additional '%(i)s.'


)
for i,inner in enumerate(outer)])

)s trailing
"""
OUT2 = TS2 % gie
print "Output 2:", OUT2


As the gie.locals dict is being automagically
updated from within the list comprehension
expression, I simplified the previous call to ts().
Thus, executing this with the prints enabled as
shown will produce the following output:


$ python2.6 scratch/eval_test_4.py
ts: %(j)s.%(k)s :::: {'_[1]': [], 'k': 'a', 'j': 0, 'inner': ('a',
'b', 'c', 'd')}
ts: %(j)s.%(k)s :::: {'_[1]': ['0.a'], 'k': 'b', 'j': 1, 'inner':
('a', 'b', 'c', 'd')}
ts: %(j)s.%(k)s :::: {'_[1]': ['0.a', '1.b'], 'k': 'c', 'j': 2,
'inner': ('a', 'b', 'c', 'd')}
ts: %(j)s.%(k)s :::: {'_[1]': ['0.a', '1.b', '2.c'], 'k': 'd', 'j': 3,
'inner': ('a', 'b', 'c', 'd')}
Output 1:
pre 0.a 1.b 2.c 3.d post

ts:
pre %(join([
ts('%(i)s.%(j)s.%(k)s')
for j,k in enumerate(inner)]))s post
:::: {'_[1]': [], 'i': 0, 'outer': [('m', 'n', 'o', 'p'), ('q', 'r',
's', 't')], 'inner': ('m', 'n', 'o', 'p')}
ts: %(i)s.%(j)s.%(k)s :::: {'outer': [('m', 'n', 'o', 'p'), ('q', 'r',
's', 't')], 'i': 0, 'k': 'm', 'j': 0, '_[1]': [], 'inner': ('m', 'n',
'o', 'p')}
ts: %(i)s.%(j)s.%(k)s :::: {'outer': [('m', 'n', 'o', 'p'), ('q', 'r',
's', 't')], 'i': 0, 'k': 'n', 'j': 1, '_[1]': ['0.0.m'], 'inner':
('m', 'n', 'o', 'p')}
ts: %(i)s.%(j)s.%(k)s :::: {'outer': [('m', 'n', 'o', 'p'), ('q', 'r',
's', 't')], 'i': 0, 'k': 'o', 'j': 2, '_[1]': ['0.0.m', '0.1.n'],
'inner': ('m', 'n', 'o', 'p')}
ts: %(i)s.%(j)s.%(k)s :::: {'outer': [('m', 'n', 'o', 'p'), ('q', 'r',
's', 't')], 'i': 0, 'k': 'p', 'j': 3, '_[1]': ['0.0.m', '0.1.n',
'0.2.o'], 'inner': ('m', 'n', 'o', 'p')}

Traceback (most recent call last):
  File "scratch/eval_test.py", line 40, in <module>
    OUT2 = TS2 % gie
  File "scratch/eval_test_4.py", line 8, in __getitem__
    return eval(expr, self.globals, self.locals)
  File "<string>", line 9, in <module>
NameError: name '_[1]' is not defined


Anyone can help clarify what may be going on?

m.

mario ruggier

Jan 15, 2009, 7:58:01 AM

Ya, agree with you wholeheartedly, but then so are most
optimizations ;-) It is just an idea I am exploring, and that code
would never be humanly written (that's why it seems more convoluted
than necessary). I hope the simplified, boiled-down sample gets the
intention across better... I'd still like to understand why the
'_[1]' variable is disappearing after the first inner loop!

> Bye,
> bearophile

Peter Otten

Jan 15, 2009, 8:02:54 AM
mario ruggier wrote:

I have no idea what you are trying to do. Please reread the Zen of Python ;)

What happens is:

List comprehensions delete the helper variable after completion:

>>> import dis
>>> def f(): [i for i in [1]]
...
>>> dis.dis(f)
  1           0 BUILD_LIST               0
              3 DUP_TOP
              4 STORE_FAST               0 (_[1])
              7 LOAD_CONST               1 (1)
             10 BUILD_LIST               1
             13 GET_ITER
        >>   14 FOR_ITER                13 (to 30)
             17 STORE_FAST               1 (i)
             20 LOAD_FAST                0 (_[1])
             23 LOAD_FAST                1 (i)
             26 LIST_APPEND
             27 JUMP_ABSOLUTE           14
        >>   30 DELETE_FAST              0 (_[1])
             33 POP_TOP
             34 LOAD_CONST               0 (None)
             37 RETURN_VALUE

If you manage to run two nested listcomps in the same namespace you get a
name clash and the inner helper variable overwrites/deletes the outer:

>>> def xeval(x): return eval(x, ns)
...
>>> ns = dict(xeval=xeval)
>>> xeval("[xeval('[k for k in ()]') for i in (1,)]")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in xeval
  File "<string>", line 1, in <module>
NameError: name '_[1]' is not defined

Peter

Peter Otten

Jan 15, 2009, 8:19:10 AM
Peter Otten wrote:

I'd like to add: this can only happen because the code snippets are compiled
independently. Otherwise Python uses different names for each listcomp:

>>> def f():
...     [i for i in ()]
...     [i for i in ()]
...
>>> f.func_code.co_varnames
('_[1]', 'i', '_[2]')

Peter

mario ruggier

Jan 15, 2009, 8:22:26 AM

Ah, brilliant, thanks for the clarification!

To verify if I understood you correctly, I have modified
the ts() method above to:

    def ts(self, ts):
        _ns = self.locals
        self.locals = self.locals.copy()
        print "ts:", ts, "::::", self.locals
        try:
            return ts % self
        finally:
            self.locals = _ns

And, it executes correctly, thus the 2nd output is:

Output 2:
leading
pre 0.0.m 0.1.n 0.2.o 0.3.p post

pre 1.0.q 1.1.r 1.2.s 1.3.t post
trailing

But the need to do a copy() will likely kill any potential
optimization gains... so I will simply be forced to write more readable
code ;-)
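
If the cost of copy() is the worry, one possible alternative is to layer a fresh scope in front of the existing namespace instead of copying it. Below is a minimal Python 3 sketch of that idea using collections.ChainMap. The ScopedEvaluator class and its render() method are made up for illustration and are not part of evoque. Lookups have to walk the chain, so whether this is actually faster than copy() would need measuring.

```python
from collections import ChainMap


class ScopedEvaluator:
    """Hypothetical evaluator: each render() call gets its own top-level scope."""

    def __init__(self, names):
        self.globals = {}                     # eval() requires a real dict here
        self.locals = ChainMap(dict(names))   # any mapping is accepted as locals

    def __getitem__(self, expr):
        # Called by the % operator for every %(expr)s key in a template.
        return eval(expr, self.globals, self.locals)

    def render(self, template, **bindings):
        outer = self.locals
        # Link a new scope in front of the outer one instead of copying it.
        self.locals = outer.new_child(bindings)
        try:
            return template % self
        finally:
            self.locals = outer               # drop the scope, even on errors


ev = ScopedEvaluator({"greeting": "hi"})
rendered = ev.render("%(greeting)s %(name.upper())s", name="bob")
print(rendered)
```

Names bound inside render() live only in the child scope, so nothing written during one evaluation can clobber the namespace of the caller.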

Thanks!

Steven D'Aprano

Jan 15, 2009, 10:06:41 AM
On Thu, 15 Jan 2009 03:29:59 -0800, mario ruggier wrote:

> Hello,
>
> I would like to evaluate list comprehension expressions, from within
> which I'd like to call a function. For a first level it works fine but
> for second level it seems to lose the "_[1]" variable it uses internally
> to accumulate the results. Some sample code is:
>
> class GetItemEvaluator(object):
>     def __init__(self):
>         self.globals = globals() # some dict (never changes)

Would you like to put a small wager on that?

>>> len(gie.globals)
64
>>> something_new = 0
>>> len(gie.globals)
65

>         self.globals["ts"] = self.ts
>         self.globals["join"] = "".join
>         self.locals = {} # changes on each evaluation
>     def __getitem__(self, expr):
>         return eval(expr, self.globals, self.locals)

Can you say "Great Big Security Hole"?

>>> gie = GetItemEvaluator()
>>> gie['__import__("os").system("ls")']
dicttest dumb.py rank.py sorting
startup.py
0


http://cwe.mitre.org/data/definitions/95.html
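
For strings that only ever need to be plain literals (numbers, strings, tuples, lists, dicts), the standard library's ast.literal_eval is a commonly suggested alternative to eval(). It is of no use to a template engine that must call functions, but it illustrates the difference. A small Python 3 sketch follows; the helper name try_literal is made up for illustration.

```python
import ast


def try_literal(text):
    """Return the literal value of text, or None if it is not a plain literal."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return None


ok = try_literal("('a', 'b', 'c', 'd')")
blocked = try_literal('__import__("os").system("ls")')
print(ok, blocked)
```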

--
Steven

mario ruggier

Jan 15, 2009, 10:56:02 AM
On Jan 15, 4:06 pm, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:

Hi Steve!

> > class GetItemEvaluator(object):
> >     def __init__(self):
> >         self.globals = globals() # some dict (never changes)

Ya, this is just a boiled-down sample, and for simplicity I set it to
the real globals(), so of course it will change when that changes...
but in the application this is a distinct dict, that is entirely
managed by the application, and it never changes as a result of an
*evaluation*.

> Would you like to put a small wager on that?
>
> >>> len(gie.globals)
> 64
> >>> something_new = 0
> >>> len(gie.globals)
>
> 65


> >         self.globals["ts"] = self.ts
> >         self.globals["join"] = "".join
> >         self.locals = {} # changes on each evaluation
> >     def __getitem__(self, expr):
> >         return eval(expr, self.globals, self.locals)
>
> Can you say "Great Big Security Hole"?

With about the same difficulty as "Rabbit-Proof Fence" ;-)
Again, it is just a boiled down sample, for communication purposes. As
I mentioned in another thread, the real application behind all this is
one of the *few* secure templating systems around. Some info on its
security is at: http://evoque.gizmojo.org/usage/restricted/
Tell you what, if you find a security hole there (via exposed template
source on a Domain(restricted=True) setup) I'll offer you a nice
dinner (including the beer!) somewhere, maybe at some py conference,
but even remotely if that is not feasible... ;-) The upcoming 0.4
release will run on 2.4 thru to 3.0 -- you can have some fun with that
one (the current 0.3 runs on 2.5 and 2.6).

> --
> Steven

Cheers, mario

mario ruggier

Jan 15, 2009, 11:09:53 AM
The listcomps exploration above was primarily an attempt
(unsuccessful) to make this piece of code go a little faster:

s = " text %(item)s text "
acc = []
for value in iterator:
    some_dict["item"] = value
    acc.append(s % evaluator)
"".join(acc)

The item=value pair is essentially a loop variable, and the evaluator
(something like the gie instance above) uses it via the updated
some_dict.

Is there any way to express the above as a list comp or so? Any ideas
how it might be made to go faster?
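
For reference, here is a self-contained Python 3 sketch of the loop above, written once as an explicit loop and once as a list comprehension. The Evaluator class and its bind() helper are assumptions standing in for the real evaluator; only the shape of the loop comes from the snippet above.

```python
class Evaluator:
    """Stand-in for the real evaluator: looks up %(expr)s keys with eval()."""

    def __init__(self, namespace):
        self.globals = {}
        self.namespace = namespace

    def __getitem__(self, expr):
        return eval(expr, self.globals, self.namespace)

    def bind(self, name, value):
        # Set the loop variable and return self, so the call fits in an expression.
        self.namespace[name] = value
        return self


def render_loop(s, values, evaluator):
    acc = []
    for value in values:
        evaluator.namespace["item"] = value
        acc.append(s % evaluator)
    return "".join(acc)


def render_listcomp(s, values, evaluator):
    return "".join([s % evaluator.bind("item", value) for value in values])


s = " text %(item)s text "
out_loop = render_loop(s, ["a", "b"], Evaluator({}))
out_comp = render_listcomp(s, ["a", "b"], Evaluator({}))
print(out_loop == out_comp)
```

Both forms do the same work per item (one dict update, one % evaluation), so any speed difference is likely to be small and would need to be measured, for example with timeit.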

m.

Mark Wooding

Jan 15, 2009, 3:36:36 PM
mario ruggier <mario....@gmail.com> writes:

> Some info on its security is at:
> http://evoque.gizmojo.org/usage/restricted/

> Tell you what, if you find a security hole there (via exposed template
> source on a Domain(restricted=True) setup) I'll offer you a nice
> dinner (including the beer!) somewhere, maybe at some py conference,
> but even remotely if that is not feasible... ;-) The upcoming 0.4
> release will run on 2.4 thru to 3.0 -- you can have some fun with that
> one (the current 0.3 runs on 2.5 and 2.6).

I'm pretty sure I can break this on 3.0, because the f_restricted frame
flag has gone. Here's how:

>>> import template, domain
>>> dom = domain.Domain('/tmp/mdw/', restricted = True, quoting = 'str')
>>> t = template.Template(dom, 'evil', from_string = True, src =
... "${inspect.func_globals['_'*2+'builtins'+'_'*2].open('/tmp/mdw/target').read()}")
2009-01-15 20:30:29,177 ERROR [evoque] RuntimeError: restricted
attribute: File "<string>", line 1, in <module>
: EvalError(inspect.func_globals['_'*2+'builtins'+'_'*2].open('/tmp/mdw/target').read())
u'[RuntimeError: restricted attribute: File "<string>", line 1, in
<module>\n:
EvalError(inspect.func_globals[\'_\'*2+\'builtins\'+\'_\'*2].open(\'/tmp/mdw/target\').read())]'

which means that it's depending on the func_globals attribute being
rejected by the interpreter -- which it won't be because 3.0 doesn't
have restricted evaluation any more.

Python is very leaky. I don't think trying to restrict Python execution
is a game that's worth playing.

-- [mdw]

ajaksu

Jan 15, 2009, 4:35:05 PM
On Jan 15, 1:56 pm, mario ruggier <mario.rugg...@gmail.com> wrote:
> As
> I mentioned in another thread, the real application behind all this is
> one of the *few* secure templating systems around. Some info on its
> security is at:http://evoque.gizmojo.org/usage/restricted/
> Tell you what, if you find a security hole there (via exposed template
> source on a Domain(restricted=True) setup) I'll offer you a nice
> dinner (including the beer!) somewhere, maybe at some py conference,
> but even remotely if that is not feasible... ;-)

If you could provide a bare-bones instance of your evaluator to test
against, without using the whole evoque (I get DUMMY MODE ON from
'self.template.collection.domain.globals'), it'd be more interesting
to try :)

mario ruggier

Jan 15, 2009, 4:52:46 PM
On Jan 15, 9:36 pm, Mark Wooding <m...@distorted.org.uk> wrote:


$ touch /tmp/mdw.test
mr:evoque mario$ python3.0
Python 3.0 (r30:67503, Dec 8 2008, 18:45:31)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from evoque import domain, template
>>> d = domain.Domain("/", restricted=True, quoting="str")
>>> t = template.Template(d, "mdw1", from_string=True, src="${inspect.func_globals['_'*2+'builtins'+'_'*2].open('/tmp/mdw.test').read()}")
>>> t.evoque()
2009-01-15 22:26:18,704 ERROR [evoque] AttributeError: 'function'
object has no attribute 'func_globals': File "<string>", line 1, in
<module>
: EvalError(inspect.func_globals['_'*2+'builtins'+'_'*2].open('/tmp/mdw.test').read())
'[AttributeError: \'function\' object has no attribute \'func_globals\': File "<string>", line 1, in <module>\n: EvalError(inspect.func_globals[\'_\'*2+\'builtins\'+\'_\'*2].open(\'/tmp/mdw.test\').read())]'

But even if inspect did have the func_globals attribute, the "open"
builtin will not be found on __builtins__ (that is cleaned out when
restricted=True).

But I guess it is necessary to keep an eye on what is available or
allowed by the different Python versions, and adjust as needed,
probably to the lowest common denominator. In addition to what is
mentioned in the doc on evoque's restricted mode at the above URL, do
you have specific suggestions for what else may be a good idea to block
out?

> Python is very leaky.  I don't think trying to restrict Python execution
> is a game that's worth playing.

It may not be worth risking your life on it, but it is certainly worth
playing ;-)

Thanks... with your permission, may I add your evil expression to the
restricted tests?

Cheers, mario

> -- [mdw]

mario ruggier

Jan 15, 2009, 5:21:02 PM

OK! Here's a small script to make it easier...
Just accumulate any expression you can dream of,
and pass it to get_expr_template() to get the template,
and on that then call evoque()... i guess you'd have to
test with 0.3, but 0.4 (also runs on py3) is just
around the corner....

Let it rip... the beer'd be on me ;-!


# evoque_restricted_test.py

from os.path import abspath, join, dirname


from evoque import domain, template

import logging
# uncomment to hide the plentiful ERROR logs:
#logging_level = logging.CRITICAL

# set the base for the default collection
DEFAULT_DIR = abspath("/")

# 3 -> renders, 4 -> raises any evaluation errors,
# see: http://evoque.gizmojo.org/usage/errors/
ERRORS=2

# a restricted domain instance
d = domain.Domain(DEFAULT_DIR, restricted=True, errors=ERRORS,
quoting='str')
count = 0

# utility to easily init a template from any expression
def get_expr_template(expr):
    global count
    count += 1
    name = "test%s"%(count)
    src = "${%s}" % (expr)
    d.set_template(name, src=src, from_string=True)
    return d.get_template(name)

# some test expressions
exprs = [
    "open('test.txt', 'w')",
    "getattr(int, '_' + '_abs_' + '_')",
    "().__class__.mro()[1].__subclasses__()",
    "inspect.func_globals['_'*2+'builtins'+'_'*2]",
]

# execute
for expr in exprs:
    print
    print expr
    print get_expr_template(expr).evoque()

Terry Reedy

Jan 15, 2009, 5:35:16 PM
to pytho...@python.org
Peter Otten wrote:

> List comprehensions delete the helper variable after completion:

I do not believe they did in 2.4. Not sure of 2.5. There is certainly
a very different implementation in 3.0 and, I think, 2.6. OP
neglected to mention Python version he tested on. Code meant to run on
2.4 to 3.0 cannot depend on subtle listcomp details.

>>>> def f(): [i for i in [1]]
> ...
>>>> dis.dis(f)
> 1 0 BUILD_LIST 0
> 3 DUP_TOP
> 4 STORE_FAST 0 (_[1])
> 7 LOAD_CONST 1 (1)
> 10 BUILD_LIST 1
> 13 GET_ITER
> >> 14 FOR_ITER 13 (to 30)
> 17 STORE_FAST 1 (i)
> 20 LOAD_FAST 0 (_[1])
> 23 LOAD_FAST 1 (i)
> 26 LIST_APPEND
> 27 JUMP_ABSOLUTE 14
> >> 30 DELETE_FAST 0 (_[1])
> 33 POP_TOP
> 34 LOAD_CONST 0 (None)
> 37 RETURN_VALUE
>

In 3.0


>>> def f(): [i for i in [1]]

>>> import dis
>>> dis.dis(f)
  1           0 LOAD_CONST               1 (<code object <listcomp> at 0x01349BF0, file "<pyshell#12>", line 1>)
              3 MAKE_FUNCTION            0
              6 LOAD_CONST               2 (1)
              9 BUILD_LIST               1
             12 GET_ITER
             13 CALL_FUNCTION            1
             16 POP_TOP
             17 LOAD_CONST               0 (None)
             20 RETURN_VALUE

Running OP code in 3.0 with print ()s added gives

pre 0.a 1.b 2.c 3.d post

Traceback (most recent call last):
  File "C:\Programs\Python30\misc\temp7.py", line 32, in <module>
    """ % gie)
  File "C:\Programs\Python30\misc\temp7.py", line 8, in __getitem__
    return eval(expr, self.globals, self.locals)
  File "<string>", line 7, in <module>
  File "<string>", line 7, in <listcomp>
  File "C:\Programs\Python30\misc\temp7.py", line 12, in ts
    return ts % self
  File "C:\Programs\Python30\misc\temp7.py", line 8, in __getitem__
    return eval(expr, self.globals, self.locals)
  File "<string>", line 2, in <module>
  File "<string>", line 1, in <listcomp>
NameError: global name 'i' is not defined

> If you manage to run two nested listcomps in the same namespace you get a
> name clash and the inner helper variable overwrites/deletes the outer:
>
>>>> def xeval(x): return eval(x, ns)
> ...
>>>> ns = dict(xeval=xeval)
>>>> xeval("[xeval('[k for k in ()]') for i in (1,)]")
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> File "<stdin>", line 1, in xeval
> File "<string>", line 1, in <module>
> NameError: name '_[1]' is not defined

Which Python? 3.0 prints "[[]]"! But I think the nested listcomp *is*
in a separate namespace here. I will leave it to you or the OP to dissect
how his code and yours essentially differ from the viewpoint of the 3.0
(and maybe 2.6) implementation.
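
The 3.x result is easy to check. Below is a minimal Python 3 sketch of the same nested xeval call. The check on the namespace keys reflects what is expected on a current Python 3, where a comprehension keeps its accumulator and loop variable out of the dict passed to eval().

```python
def xeval(x):
    # Both the outer and the inner expression are evaluated in the same dict.
    return eval(x, ns)


ns = dict(xeval=xeval)
result = xeval("[xeval('[k for k in ()]') for i in (1,)]")
print(result)

# Neither a helper such as '_[1]' nor the loop variables end up in ns.
leaked = [name for name in ("_[1]", "i", "k") if name in ns]
print(leaked)
```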

Terry Jan Reedy

Terry Reedy

Jan 15, 2009, 5:41:37 PM
to pytho...@python.org
mario ruggier wrote:
> On Jan 15, 4:06 pm, Steven D'Aprano <st...@REMOVE-THIS-
> cybersource.com.au> wrote:
>
> Hi Steve!
>
>>> class GetItemEvaluator(object):
>>> def __init__(self):
>>> self.globals = globals() # some dict (never changes)
>
> Ya, this is just a boiled down sample, and for simplicity I set to to
> the real globals(), so of course it will change when that changes...
> but in the application this is a distinct dict, that is entirely
> managed by the application, and it never changes as a result of an
> *evaluation*.

It would have been less confusing if you had written

self.globals = {} # a constant dict
or even
self.constants = {} # empty here only for simplicity

This might also make the 3.0 error message clearer (see other post).

tjr

Steven D'Aprano

Jan 15, 2009, 5:44:14 PM
On Thu, 15 Jan 2009 07:56:02 -0800, mario ruggier wrote:

> On Jan 15, 4:06 pm, Steven D'Aprano <st...@REMOVE-THIS-
> cybersource.com.au> wrote:
>
> Hi Steve!
>
>> > class GetItemEvaluator(object):
>> >     def __init__(self):
>> >         self.globals = globals() # some dict (never changes)
>
> Ya, this is just a boiled down sample, and for simplicity I set to to
> the real globals(),


You should make that more clear when posting, in the code snippet as well
as the descriptive text.

And if you *did* make it clear, then *I* should read your post more
carefully.

Regards,


--
Steven

mario ruggier

Jan 15, 2009, 6:05:18 PM

I was testing on 2.6, but running it thru 2.4 and 2.5 it seems
behaviour is the same there. For 3.0 it does change... and there seems
not to be the "_[1]" key defined, and, what's more, it gives a:
NameError: name 'j' is not defined.

In any case, that was an exploration to get a feeling for how the
listcomps behave (performance) if evaluated directly as opposed to
doing the equivalent from within a function. It turned out to be
slower, so I moved on... but, should it have been faster, then
differences between how the different python versions handle list
comps internally would have been a next issue to address.

I think the globals dict is not touched by eval'ing a list comp... it
is not "constant" as such, just that it is not affected by
evaluations (unless the python application decides to affect it in
some way or another). But, evaluating a template by definition does
not change the globals dict.

> Terry Jan Reedy

ajaksu

Jan 15, 2009, 8:30:25 PM
On Jan 15, 8:21 pm, mario ruggier <mario.rugg...@gmail.com> wrote:
> OK! Here's a small script to make it easier...

Thanks! I think I found a quick way around the restrictions (correct
me if I borked it), but I think you can block this example by
resetting your globals/builtins:

exprs = [
    '(x for x in range(1)).gi_frame.f_globals.clear()',
    'open("where_is_ma_beer.txt", "w").write("Thanks for the fun ")'
]
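
The reason clearing the globals helps the second expression: when the globals dict handed to eval() has no '__builtins__' key, eval() inserts a reference to the real builtins. Below is a minimal Python 3 sketch of that mechanism on a bare eval(), not on evoque itself. The env dict is a made-up stand-in for a restricted namespace, and the iterable is () rather than range(1) because range is not available once the builtins are stripped.

```python
# A stand-in "restricted" namespace: no builtins at all.
env = {"__builtins__": {}}

try:
    eval("open", env)
    blocked_before = False
except NameError:
    blocked_before = True

# The generator's frame exposes the very dict it is being evaluated in.
eval("(x for x in ()).gi_frame.f_globals.clear()", env)

# env is now empty, so eval() re-inserts the real builtins on the next call.
reachable_after = eval("open", env) is open
print(blocked_before, reachable_after)
```

This is consistent with the suggestion that resetting the globals/builtins before every evaluation would block this particular pair of expressions.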

Regards,
Daniel

Mark Wooding

Jan 15, 2009, 9:50:23 PM
mario ruggier <mario....@gmail.com> writes:

> 2009-01-15 22:26:18,704 ERROR [evoque] AttributeError: 'function'
> object has no attribute 'func_globals': File "<string>", line 1, in
> <module>

Damn. So that doesn't work. :-(

> But even if inspect did have the func_globals attribute, the "open"
> builtin will not be found on __builtins__ (that is cleaned out when
> restricted=True).

Irrelevant. I wasn't trying to get at my __builtins__ but the one
attached to a function I was passed in, which has a different
environment.

You define a function (a method, actually, but it matters little). The
function's globals dictionary is attached as an attribute. You didn't
do anything special here, so the globals have the standard __builtins__
binding. It contains open.

You now run my code in a funny environment with a stripped-down
__builtins__. But that doesn't matter, because my environment contains
your function. And that function has your __builtins__ hanging off the
side of it.

... but I can't actually get there because of f_restricted. You don't
mention the fact that Python magically limits access to these attributes
if __builtins__ doesn't match the usual one, so I think you got lucky.
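
The point about a function dragging its own __builtins__ along can be shown on a bare eval(). This is a Python 3 sketch only: in 3.x the attribute is spelled __globals__ rather than func_globals, and there is no restricted mode to reject the lookup. The names helper and env are hypothetical.

```python
def helper():
    return "hi"


# Stripped-down environment that still exposes one ordinary function.
env = {"__builtins__": {}, "helper": helper}

try:
    eval("open", env)
    direct = True
except NameError:
    direct = False          # open is not reachable by name

# ...but the function's own globals still carry the standard __builtins__.
found = eval("helper.__globals__['__builtins__']", env)
# Depending on the module, this is either the builtins module or its dict.
found_open = found["open"] if isinstance(found, dict) else found.open
print(direct, found_open is open)
```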

-- [mdw]

mario ruggier

Jan 16, 2009, 2:09:51 AM

Cool, the beer that is ;) Under 2.6... why does python allow the
f_globals lookup in this case, but for the previous example for
func_globals it does not?

If you look at the top of the file test/test_restricted.py, there is:

# Attempt at accessing these attrs under restricted execution on an object
# that has them should raise a RuntimeError
RESTRICTED_ATTRS = [
    'im_class', 'im_func', 'im_self', 'func_code', 'func_defaults',
    'func_globals', #'func_name',
    #'tb_frame', 'tb_next',
    #'f_back', 'f_builtins', 'f_code', 'f_exc_traceback', 'f_exc_type',
    #'f_exc_value', 'f_globals', 'f_locals'
]

I have not yet finished working this list off to ensure that any
lookup of these attrs wherever they occur will be refused, but I guess
that would block this kind of lookup out. I should also block any
attempt to access any "gi_*" attribute... Laboriously doing all these
checks on each expr eval will be very performance heavy, so I hope to
be able to limit access to all these more efficiently. Suggestions?

Cheers, Mario

> Regards,
> Daniel

Peter Otten

Jan 16, 2009, 4:31:37 AM
Terry Reedy wrote:

> Peter Otten wrote:
>
>> List comprehensions delete the helper variable after completion:
>
> I do not believe they did in 2.4. Not sure of 2.5.

As Mario said, 2.4, 2.5, and 2.6 all show the same behaviour.

> There is certainly
> a very different implementation in 3.0 and, I think, 2.6. OP
> neglected to mention Python version he tested on. Code meant to run on
> 2.4 to 3.0 cannot depend on subtle listcomp details.

3.0 behaves differently. Like generator expressions, listcomps no longer leak
the loop variable, and this is implemented by having each listcomp execute
as a nested function:

> In 3.0
> >>> def f(): [i for i in [1]]
>
> >>> import dis
> >>> dis.dis(f)
> 1 0 LOAD_CONST 1 (<code object <listcomp> at
> 0x01349BF0, file "<pyshell#12>", line 1>)
> 3 MAKE_FUNCTION 0
> 6 LOAD_CONST 2 (1)
> 9 BUILD_LIST 1
> 12 GET_ITER
> 13 CALL_FUNCTION 1
> 16 POP_TOP
> 17 LOAD_CONST 0 (None)
> 20 RETURN_VALUE

This is more robust (at least I can't think of a way to break it like the
2.x approach) but probably slower due to the function call overhead. The
helper variable is still there, but the possibility of a clash with another
helper is gone (disclaimer: I didn't check this in the Python source) so
instead of

# 2.5 and 2.6 (2.4 has the names in a different order)

>>> def f():
...     [[i for i in ()] for k in ()]
...
>>> f.func_code.co_varnames
('_[1]', 'k', '_[2]', 'i')

we get

# 3.0

>>> def f():
...     [[i for i in ()] for k in ()]
...
>>> f.__code__.co_varnames
()

The variables are gone from f's scope, as 3.x listcomps no longer leak their
loop variables.

>>> f.__code__.co_consts
(None, <code object <listcomp> at 0x2b8d7f6d7530, file "<stdin>", line 2>,
())
>>> outer = f.__code__.co_consts[1]
>>> outer.co_varnames
('.0', '_[1]', 'k')

Again the inner listcomp is separated from the outer.

>>> outer.co_consts
(<code object <listcomp> at 0x2b8d7f6d26b0, file "<stdin>", line 2>, ())
>>> inner = outer.co_consts[0]
>>> inner.co_varnames
('.0', '_[1]', 'i')


Peter

ajaksu

Jan 16, 2009, 7:35:23 AM
On Jan 16, 5:09 am, mario ruggier <mario.rugg...@gmail.com> wrote:
> Laboriously doing all these
> checks on each expr eval will be very performance heavy, so I hope to
> be able to limit access to all these more efficiently. Suggestions?

None regarding the general issue, just a try/except to handle this one:

'(x for x in ()).throw("bork")'

mario ruggier

Jan 16, 2009, 12:45:34 PM

What is the potential security risk with this one?

To handle this and situations like the ones pointed out above on this
thread, I will probably make the following change to the
evoque.evaluator.RestrictedEvaluator class, and that is to replace the
'if name.find("__")!=-1:' with an re.search... where the re is defined
as:

restricted = re.compile(r"|\.".join([
"__", "func_", "f_", "im_", "tb_", "gi_", "throw"]))

and the test becomes simply:

if restricted.search(name):

All the above attempts will be blocked this way. Any other disallowed
substrings to add to the list above?
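
For reference, here is a Python 3 sketch of what that pattern expands to and which of the expressions from this thread it catches, assuming the whole expression string is what gets searched. The last check is an assumption about how such a filter could be sidestepped if getattr (or anything like it) were reachable in the restricted namespace, since the attribute name then never appears after a dot.

```python
import re

restricted = re.compile(r"|\.".join([
    "__", "func_", "f_", "im_", "tb_", "gi_", "throw"]))

# The join puts "\." in front of every entry except the first.
print(restricted.pattern)

caught = [
    "().__class__.mro()[1].__subclasses__()",
    "(x for x in range(1)).gi_frame.f_globals.clear()",
    '(x for x in ()).throw("bork")',
]
caught_flags = [bool(restricted.search(e)) for e in caught]

# Built from pieces, the name contains no "__" and no ".gi_".
sneaky = "getattr(g, 'gi_' + 'frame')"
sneaky_flag = bool(restricted.search(sneaky))
print(caught_flags, sneaky_flag)
```

A substring blacklist only sees the source text, so anything that assembles an attribute name at run time is invisible to it.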

And thanks a lot Daniel, need to find a way to get some beer over to
ya... ;-)

mario

Paul Rubin

Jan 16, 2009, 1:17:59 PM
mario ruggier <mario....@gmail.com> writes:
> All the above attempts will be blocked this way. Any other disallow-
> sub-strings to add to the list above?

I think what you are trying to do is fundamentally hopeless. You
might look at web.py (http://webpy.org) for another approach, that
puts a complete interpreter for a Python-like language into the
template engine.

ajaksu

Jan 16, 2009, 6:04:33 PM
On Jan 16, 3:45 pm, mario ruggier <mario.rugg...@gmail.com> wrote:
> > '(x for x in ()).throw("bork")'
>
> What is the potential security risk with this one?

I don't see a concrete issue, just found it tempting... raising hand-
crafted objects :)

> All the above attempts will be blocked this way. Any other disallow-
> sub-strings to add to the list above?

None that I know of, but I suggest testing with dir, globals, locals
and '__' enabled (which I haven't done yet), as spotting possible
flaws should be easier. If you can get BOM+encoded garbage tested (see
http://tinyurl.com/72d98y ), it might be worth it too.

This one fails in lots of interesting ways when you juggle keyword-
args around:
exprs = [
    'evoque("hmm", filters=[unicode.upper ] ,src="/etc/python2.5/site.py")',
]

> And thanks a lot Daniel, need to find a way to get somebeer over to
> ya... ;-)

You're welcome! Don't worry about the beer, I'd only consider a real
promise if it involved chocolate :D

Regards,
Daniel

mario ruggier

Jan 17, 2009, 8:09:13 AM
On Jan 17, 12:04 am, ajaksu <aja...@gmail.com> wrote:
> On Jan 16, 3:45 pm, mario ruggier <mario.rugg...@gmail.com> wrote:
>
> > > '(x for x in ()).throw("bork")'
>
> > What is the potential security risk with this one?
>
> I don't see a concrete issue, just found it tempting... raising hand-
> crafted objects :)

OK, I can think of no good reason why anyone would want to do that from
within a template, so I'd be fine with blocking out any attribute whose
name starts with "throw".

> > All the above attempts will be blocked this way. Any other disallow-
> > sub-strings to add to the list above?
>
> None that I know of, but I suggest testing with dir, globals, locals
> and '__' enabled (which I haven't done yet), as spotting possible
> flaws should be easier. If you can get BOM+encoded garbage tested (see
> http://tinyurl.com/72d98y ), it might be worth it too.

The BOM stuff is interesting... from that discussion, I think it would
be also a good idea to blacklist "object" out of the restricted
builtins. I played with this, and prepared a file template as well as
a little script to run it... see below.
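
To see what the encoded payload used in the test template actually says, it can be decoded with the utf-7 codec. This is a small Python 3 sketch; it only decodes the bytes and does not evaluate anything.

```python
payload = b"+AG8AYgBqAGUAYwB0AC4AXwBfAHMAdQBiAGMAbABhAHMAcwBlAHMAXwBf-"

decoded = payload.decode("utf-7")
print(decoded)

# The encoded form contains no double underscore, so a substring check
# applied before decoding would not notice it.
print("__" in payload.decode("ascii"), "__" in decoded)
```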

To tweak any disallowed builtins back into the restricted namespace
for testing, you can just do something like:

d.set_on_globals("dir", dir)

for each name you'd like to add, when setting up the domain (see
script below).

To re-enable "__" lookups, you'd need to tweak the regexp above, in
the RestrictedEvaluator class.

> This one fails in lots of interesting ways when you juggle keyword-
> args around:
> exprs = [
>     'evoque("hmm", filters=[unicode.upper ] ,src="/etc/python2.5/site.py")',
> ]

Not sure what you mean... it just renders that source code file
uppercased (if it finds it as per the domain setup) ?!?


Here's (a) a mini testing py2-py3 script, similar to previous one
above but to read a template from a file (there may be additional
tricks possible that way), and (b) a sample companion test template.

evoque_restricted_file_test.py
----
# in lieu of print, py2/py3
import sys
def pr(*args):
    sys.stdout.write(" ".join([str(arg) for arg in args])+'\n')
#

from os.path import abspath, join, dirname
from evoque import domain, template

# set the base for the default collection
DEFAULT_DIR = abspath(dirname(__file__))

# a restricted domain instance
d = domain.Domain(DEFAULT_DIR, restricted=True, errors=3,
                  quoting='str')
# errors: 3 -> renders, 4 -> raises any evaluation errors,
# see: http://evoque.gizmojo.org/usage/errors/

# Tweak domain.globals to add specific callables for testing:
d.set_on_globals("dir", dir)
d.set_on_globals("globals", globals)
d.set_on_globals("locals", locals)

pr("domain", d.default_collection.dir,
    d.restricted and "RESTRICTED" or "*** unrestricted ***")

# the template name must be passed as a string
t = d.get_template("restricted_exprs.txt")
pr(t.evoque())
----

restricted_exprs.txt
----
#[
BOM + encoded trickery

Note:
when evaluated in python interpreter:
>>> eval("# coding: utf7\n+AG8AYgBqAGUAYwB0AC4AXwBfAHMAdQBiAGMAbABhAHMAcwBlAHMAXwBf-")
<built-in method __subclasses__ of type object at 0x1f1860>

but when specified within a template here as:
${# coding: utf7\n+AG8AYgBqAGUAYwB0AC4AXwBfAHMAdQBiAGMAbABhAHMAcwBlAHMAXwBf-}
gives **pre-evaluation**:
SyntaxError: unknown encoding: utf7
]#
${"# coding: utf7\n+AG8AYgBqAGUAYwB0AC4AXwBfAHMAdQBiAGMAbABhAHMAcwBlAHMAXwBf-"},


#[
Attempt to subversively build string expressions
]#
Explicitly written target expression: ().__class__.mro()[1].__subclasses__()
evaluates: ${().__class__.mro()[1].__subclasses__()}

Subversive variation: "()."+"_"*2+"class"+"_"*2+".mro()[1]."+"_"*2+"subclasses"+"_"*2+"()"
evaluates (to just the str!): ${"()."+"_"*2+"class"+"_"*2+".mro()[1]."+"_"*2+"subclasses"+"_"*2+"()"}

Attempt to "set" same subversively built expr to a loop variable
and then "evaluate" that variable:
$for{
expr in [
str("()."+"_"*2+"class"+"_"*2+".mro()[1]."+"_"*2+"subclasses"+"_"*2+"()")
] }
evaluates (to just the str!): ${expr}
attempt eval(...): ${eval(expr)}
$rof
(Note: evoque does not explicitly allow arbitrary setting of
variables, except within for loops.)
----
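
The point of the template above can be reproduced in plain Python. This is only a sketch under stated assumptions: `restricted_globals` is a made-up stand-in for a restricted namespace that exposes `str` but not `eval`, not evoque's real one. Building the string succeeds but yields just a str, and the follow-up `eval(expr)` fails because the name `eval` is simply not available.

```python
# Hypothetical restricted namespace: str is exposed, eval is not.
restricted_globals = {"__builtins__": {"str": str}}

source = '"()."+"_"*2+"class"+"_"*2+".mro()[1]."+"_"*2+"subclasses"+"_"*2+"()"'

# Evaluating the concatenation only produces a string, nothing is executed.
built = eval(source, restricted_globals, {})
print(type(built).__name__, built)

# Trying to evaluate that string from inside the restricted namespace
# fails, because "eval" cannot be looked up there.
blocked = False
try:
    eval("eval(expr)", restricted_globals, {"expr": built})
except NameError as exc:
    blocked = True
    print("blocked:", exc)
```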


mario

mario ruggier

Jan 17, 2009, 12:11:50 PM
Just to add that a further potential subversion would have been to
build the expr in some way from within a template, and then
dynamically set that string as the source of a new template with
from_string=True. This is precisely why, **from within a template**,
evoque has never supported the from_string parameter, i.e. you
*cannot* "load/create" a template from a string source from within
another template -- you may *only* do that from within the python app.
To illustrate, an extension of the previous example template could
thus become:

$for{expr in [
str("()."+"_"*2+"class"+"_"*2+".mro()[1]."+"_"*2+"subclasses"+"_"*2+"()")
]}

${% evoque("test", src="$${"+expr+"}", from_string=True) %}
$rof

That would fail, as it would simply take the value of src, i.e.
"${().__class__.mro()[1].__subclasses__()}", to mean the sub-path to the
template file within the collection (from_string would be simply
interpreted as an evaluation data parameter). See:
http://evoque.gizmojo.org/directives/evoque/
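
As a toy illustration of that design choice (not evoque's code: `COLLECTION` and `toy_evoque_directive` are made-up names), a directive that only ever treats src as a lookup key leaves no path for a string built inside a template to become template source, and from_string just ends up among the evaluation data:

```python
# Toy stand-in for a template collection: location -> source text.
COLLECTION = {"footer.html": "-- ${site} --"}

def toy_evoque_directive(name, src=None, **data):
    """Toy directive: src is always a lookup key, never template source.

    Any other keyword, from_string included, is plain evaluation data.
    """
    key = src if src is not None else name
    if key not in COLLECTION:
        raise LookupError("no template at %r" % key)
    return COLLECTION[key], data

# from_string is not special here, it is just passed along as data.
print(toy_evoque_directive("footer.html", from_string=True))

# A crafted src is only ever used as a (non-existent) location.
try:
    toy_evoque_directive(
        "test", src="${().__class__.mro()[1].__subclasses__()}",
        from_string=True)
except LookupError as exc:
    print("lookup failed:", exc)
```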

mario ruggier

Jan 22, 2009, 5:30:51 AM
On Jan 16, 7:17 pm, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:

Well, that is a bold statement... but maybe it is explained by what
you refer to, so I took a cursory look. But I failed to find any
reference to an embedded "python-like language" -- is there some sort
of overview of how web.py implements this, e.g. something like the
equivalent of the doc describing how evoque implements its sandbox:
http://evoque.gizmojo.org/usage/restricted/ ?

I get the feeling you may also be ignoring contextual factors...
restricting the full python interpreter is not what we are talking
about here, but templating systems (such as web.py?) that just allow
embedding of any and all python code will require exactly that. And
*that* may well seem fundamentally hopeless.

Evoque chooses to allow only expressions, and those under a *managed*
context. To make that secure is a whole different (smaller) task.
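
For what it's worth, the "only expressions" half is something Python gives you almost for free. A rough sketch (the helper name `is_expression` is made up, and this says nothing yet about what an expression may reach, which is the job of the managed context):

```python
def is_expression(source):
    """Return True if source compiles as a single Python expression."""
    try:
        # mode "eval" accepts expressions only; statements such as
        # import, assignment or def are rejected with SyntaxError.
        compile(source, "<template>", "eval")
    except SyntaxError:
        return False
    return True

for candidate in ("a + b", "[k for k in inner]", "import os", "x = 1"):
    print(candidate, "->", is_expression(candidate))
```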
