I would like to evaluate list comprehension expressions, from within
which I'd like to call a function. For a first level it works fine but
for second level it seems to lose the "_[1]" variable it uses
internally to accumulate the results. Some sample code is:
class GetItemEvaluator(object):
    def __init__(self):
        self.globals = globals() # some dict (never changes)
        self.globals["ts"] = self.ts
        self.globals["join"] = "".join
        self.locals = {} # changes on each evaluation
    def __getitem__(self, expr):
        return eval(expr, self.globals, self.locals)
    def ts(self, ts, name, value):
        self.locals[name] = value
        #print ts, name, value, "::::", self.locals, "::::", ts % self
        return ts % self
gie = GetItemEvaluator()
gie.locals["inner"] = ("a","b","c","d")
print """
pre %(join([ts("%s."%(j)+'%(k)s ', 'k', k) for j,k in enumerate
(inner)]))s post
""" % gie
# OK, outputs: pre 0.a 1.b 2.c 3.d post
gie = GetItemEvaluator()
gie.locals["outer"] = [ ("m","n","o","p"), ("q","r","s","t")]
print """
pre %(join([ts(
'''inner pre
%(join([ts("%s.%s."%(i, j)+'%(k)s ', 'k', k) for j,k in enumerate
(inner)]))s
inner post''',
"inner", inner) # END CALL outer ts()
for i,inner in enumerate(outer)])
)s post
""" % gie
The second 2-level comprehension gives:
File "scratch/eval_test.py", line 8, in __getitem__
return eval(expr, self.globals, self.locals)
File "<string>", line 4, in <module>
NameError: name '_[1]' is not defined
If the print were enabled, the last line printed would be:
0.3.%(k)s k p :::: {'outer': [('m', 'n', 'o', 'p'), ('q', 'r', 's',
't')], 'i': 0, 'k': 'p', 'j': 3, '_[1]': ['0.0.m ', '0.1.n ', '0.2.o
'], 'inner': ('m', 'n', 'o', 'p')} :::: 0.3.p
i.e. it has correctly processed the first inner sequence, until the
(last) "p" element. But on exit of the last inner ts() call, it seems
to lose the '_[1]' on self.locals.
Any ideas why?
Note, I'd like the first parameter to ts() to be as independent as
possible from the context of the enclosing expression, a sort of
independent mini-template. Thus, the i,j enumerate counters would
normally not be subbed *within* the comprehension itself, but in a
similar way to how k is evaluated, within the call to ts() -- I added
them this way here to make the execution trail easier to follow.
Anyhow, within that mini-template, I'd like to embed other expressions
for the % operator, and those may of course also be list comprehensions.
Thanks!
Bye,
bearophile
OK, here's the same sample code, somewhat simplified,
which may make it easier to follow what is going on:
class GetItemEvaluator(object):
    def __init__(self):
        self.globals = globals() # some dict (never changes)
        self.globals["ts"] = self.ts
        self.globals["join"] = " ".join
        self.locals = {} # changes on each evaluation
    def __getitem__(self, expr):
        return eval(expr, self.globals, self.locals)
    def ts(self, ts):
        print "ts:", ts, "::::", self.locals
        return ts % self
# one level
gie = GetItemEvaluator()
gie.locals["inner"] = ("a","b","c","d")
TS1 = """
pre %(join([
ts('%(j)s.%(k)s')
for j,k in enumerate(inner)]))s post
"""
OUT1 = TS1 % gie
print "Output 1:", OUT1
# two level
gie = GetItemEvaluator()
gie.locals["outer"] = [ ("m","n","o","p"), ("q","r","s","t")]
TS2 = """
leading %(join([
ts(
'''
pre %(join([
ts('%(i)s.%(j)s.%(k)s')
for j,k in enumerate(inner)]))s post
''' # identical to TS1, except for additional '%(i)s.'
)
for i,inner in enumerate(outer)])
)s trailing
"""
OUT2 = TS2 % gie
print "Output 2:", OUT2
As the gie.locals dict is being automagically
updated from within the list comprehension
expression, I simplified the previous call to ts().
Thus, executing this with the prints enabled as
shown will produce the following output:
$ python2.6 scratch/eval_test_4.py
ts: %(j)s.%(k)s :::: {'_[1]': [], 'k': 'a', 'j': 0, 'inner': ('a',
'b', 'c', 'd')}
ts: %(j)s.%(k)s :::: {'_[1]': ['0.a'], 'k': 'b', 'j': 1, 'inner':
('a', 'b', 'c', 'd')}
ts: %(j)s.%(k)s :::: {'_[1]': ['0.a', '1.b'], 'k': 'c', 'j': 2,
'inner': ('a', 'b', 'c', 'd')}
ts: %(j)s.%(k)s :::: {'_[1]': ['0.a', '1.b', '2.c'], 'k': 'd', 'j': 3,
'inner': ('a', 'b', 'c', 'd')}
Output 1:
pre 0.a 1.b 2.c 3.d post
ts:
pre %(join([
ts('%(i)s.%(j)s.%(k)s')
for j,k in enumerate(inner)]))s post
:::: {'_[1]': [], 'i': 0, 'outer': [('m', 'n', 'o', 'p'), ('q', 'r',
's', 't')], 'inner': ('m', 'n', 'o', 'p')}
ts: %(i)s.%(j)s.%(k)s :::: {'outer': [('m', 'n', 'o', 'p'), ('q', 'r',
's', 't')], 'i': 0, 'k': 'm', 'j': 0, '_[1]': [], 'inner': ('m', 'n',
'o', 'p')}
ts: %(i)s.%(j)s.%(k)s :::: {'outer': [('m', 'n', 'o', 'p'), ('q', 'r',
's', 't')], 'i': 0, 'k': 'n', 'j': 1, '_[1]': ['0.0.m'], 'inner':
('m', 'n', 'o', 'p')}
ts: %(i)s.%(j)s.%(k)s :::: {'outer': [('m', 'n', 'o', 'p'), ('q', 'r',
's', 't')], 'i': 0, 'k': 'o', 'j': 2, '_[1]': ['0.0.m', '0.1.n'],
'inner': ('m', 'n', 'o', 'p')}
ts: %(i)s.%(j)s.%(k)s :::: {'outer': [('m', 'n', 'o', 'p'), ('q', 'r',
's', 't')], 'i': 0, 'k': 'p', 'j': 3, '_[1]': ['0.0.m', '0.1.n',
'0.2.o'], 'inner': ('m', 'n', 'o', 'p')}
Traceback (most recent call last):
File "scratch/eval_test.py", line 40, in <module>
OUT2 = TS2 % gie
File "scratch/eval_test_4.py", line 8, in __getitem__
return eval(expr, self.globals, self.locals)
File "<string>", line 9, in <module>
NameError: name '_[1]' is not defined
Anyone can help clarify what may be going on?
m.
Ya, I agree with you wholeheartedly, but then so are most
optimizations ;-) It is just an idea I am exploring, and that code
would never be written by a human (that's why it seems more convoluted
than necessary). I hope the simplified, boiled-down sample gets the
intention across better... I'd still like to understand why the
'_[1]' variable is disappearing after the first inner loop!
> Bye,
> bearophile
I have no idea what you are trying to do. Please reread the Zen of Python ;)
What happens is:
List comprehensions delete the helper variable after completion:
>>> def f(): [i for i in [1]]
...
>>> dis.dis(f)
1 0 BUILD_LIST 0
3 DUP_TOP
4 STORE_FAST 0 (_[1])
7 LOAD_CONST 1 (1)
10 BUILD_LIST 1
13 GET_ITER
>> 14 FOR_ITER 13 (to 30)
17 STORE_FAST 1 (i)
20 LOAD_FAST 0 (_[1])
23 LOAD_FAST 1 (i)
26 LIST_APPEND
27 JUMP_ABSOLUTE 14
>> 30 DELETE_FAST 0 (_[1])
33 POP_TOP
34 LOAD_CONST 0 (None)
37 RETURN_VALUE
If you manage to run two nested listcomps in the same namespace you get a
name clash and the inner helper variable overwrites/deletes the outer:
>>> def xeval(x): return eval(x, ns)
...
>>> ns = dict(xeval=xeval)
>>> xeval("[xeval('[k for k in ()]') for i in (1,)]")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 1, in xeval
File "<string>", line 1, in <module>
NameError: name '_[1]' is not defined
Peter
I'd like to add: this can only happen because the code snippets are compiled
independently. Otherwise Python uses different names for each listcomp:
>>> def f():
... [i for i in ()]
... [i for i in ()]
...
>>> f.func_code.co_varnames
('_[1]', 'i', '_[2]')
Peter
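The clash described above can be mimicked without any 2.x bytecode. What follows is only a sketch: run_listcomp and nested are hypothetical stand-ins (not anything from the interpreter) that store, load and delete a helper entry named '_[1]' in a namespace dict in the order the disassembly shows, written so that it also runs on Python 3.

```python
def run_listcomp(ns, items, element):
    """Mimic one independently compiled 2.x listcomp running in namespace ns."""
    result = []
    ns["_[1]"] = result              # STORE: the helper always gets the name _[1]
    for item in items:
        helper = ns["_[1]"]          # LOAD: fails if a nested listcomp deleted it
        helper.append(element(item))
    del ns["_[1]"]                   # DELETE: fails if the name is already gone
    return result


def nested(ns, fresh_namespace):
    """Run an outer listcomp whose element expression runs an inner one."""
    def inner_ns():
        # Either share the caller's namespace or work on a shallow copy of it.
        return dict(ns) if fresh_namespace else ns
    return run_listcomp(
        ns, [0, 1],
        lambda i: run_listcomp(inner_ns(), "ab", lambda k: "%s.%s" % (i, k)))


try:
    nested({}, fresh_namespace=False)
    shared_outcome = "ok"
except KeyError as exc:
    shared_outcome = "KeyError: %s" % exc

copied_outcome = nested({}, fresh_namespace=True)

print(shared_outcome)
print(copied_outcome)
```

With a shared namespace the inner run deletes the one and only '_[1]' entry, so the outer loop fails on its next lookup; giving the inner run its own copy of the namespace keeps the two helpers apart.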
Ah, brilliant, thanks for the clarification!
To verify if I understood you correctly, I have modified
the ts() method above to:
def ts(self, ts):
    _ns = self.locals
    self.locals = self.locals.copy()
    print "ts:", ts, "::::", self.locals
    try:
        return ts % self
    finally:
        self.locals = _ns
And, it executes correctly, thus the 2nd output is:
Output 2:
leading
pre 0.0.m 0.1.n 0.2.o 0.3.p post
pre 1.0.q 1.1.r 1.2.s 1.3.t post
trailing
But the need to do a copy() will likely kill any potential
optimization gains... so I will just be forced to write more readable
code ;-)
Thanks!
> Hello,
>
> I would like to evaluate list comprehension expressions, from within
> which I'd like to call a function. For a first level it works fine but
> for second level it seems to lose the "_[1]" variable it uses internally
> to accumulate the results. Some sample code is:
>
> class GetItemEvaluator(object):
> def __init__(self):
> self.globals = globals() # some dict (never changes)
Would you like to put a small wager on that?
>>> len(gie.globals)
64
>>> something_new = 0
>>> len(gie.globals)
65
> self.globals["ts"] = self.ts
> self.globals["join"] = "".join
> self.locals = {} # changes on each evaluation
> def __getitem__(self, expr):
> return eval(expr, self.globals, self.locals)
Can you say "Great Big Security Hole"?
>>> gie = GetItemEvaluator()
>>> gie['__import__("os").system("ls")']
dicttest dumb.py rank.py sorting
startup.py
0
http://cwe.mitre.org/data/definitions/95.html
--
Steven
Hi Steve!
> > class GetItemEvaluator(object):
> > def __init__(self):
> > self.globals = globals() # some dict (never changes)
Ya, this is just a boiled down sample, and for simplicity I set it to
the real globals(), so of course it will change when that changes...
but in the application this is a distinct dict, that is entirely
managed by the application, and it never changes as a result of an
*evaluation*.
> Would you like to put a small wager on that?
>
> >>> len(gie.globals)
> 64
> >>> something_new = 0
> >>> len(gie.globals)
>
> 65
> > self.globals["ts"] = self.ts
> > self.globals["join"] = "".join
> > self.locals = {} # changes on each evaluation
> > def __getitem__(self, expr):
> > return eval(expr, self.globals, self.locals)
>
> Can you say "Great Big Security Hole"?
With about the same difficulty as "Rabbit-Proof Fence" ;-)
Again, it is just a boiled down sample, for communication purposes. As
I mentioned in another thread, the real application behind all this is
one of the *few* secure templating systems around. Some info on its
security is at: http://evoque.gizmojo.org/usage/restricted/
Tell you what, if you find a security hole there (via exposed template
source on a Domain(restricted=True) setup) I'll offer you a nice
dinner (including the beer!) somewhere, maybe at some py conference,
but even remotely if that is not feasible... ;-) The upcoming 0.4
release will run on 2.4 thru to 3.0 -- you can have some fun with that
one (the current 0.3 runs on 2.5 and 2.6).
> --
> Steven
Cheers, mario
s = " text %(item)s text "
acc = []
for value in iterator:
some_dict["item"] = value
acc.append(s % evaluator)
"".join(acc)
The item=value pair is essentially a loop variable, and the evaluator
(something like the gie instance above) uses it via the updated
some_dict.
Is there any way to express the above as a list comp or so? Any ideas
how it might be made to go faster?
m.
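One way to fold the loop above into a single join is a small generator. This is only a sketch and makes no claim about speed: Evaluator and render_each are hypothetical names standing in for the gie-like object and the loop, and the empty __builtins__ dict is an illustrative choice, not a security measure.

```python
class Evaluator(object):
    """Minimal stand-in for the gie instance: evaluates %(...)s keys."""

    def __init__(self, namespace):
        self.namespace = namespace

    def __getitem__(self, expr):
        return eval(expr, {"__builtins__": {}}, self.namespace)


def render_each(template, name, values, evaluator):
    """Bind each value to `name`, then render the template once per value."""
    for value in values:
        evaluator.namespace[name] = value
        yield template % evaluator


some_dict = {}
evaluator = Evaluator(some_dict)
s = " text %(item)s text "
out = "".join(render_each(s, "item", ["a", "b"], evaluator))
print(repr(out))
```

The generator keeps the "update the dict, then render" order explicit, which a plain list comprehension cannot express without a side-effecting helper call.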
> Some info on its security is at:
> http://evoque.gizmojo.org/usage/restricted/
> Tell you what, if you find a security hole there (via exposed template
> source on a Domain(restricted=True) setup) I'll offer you a nice
> dinner (including the beer!) somewhere, maybe at some py conference,
> but even remotely if that is not feasible... ;-) The upcoming 0.4
> release will run on 2.4 thru to 3.0 -- you can have some fun with that
> one (the current 0.3 runs on 2.5 and 2.6).
I'm pretty sure I can break this on 3.0, because the f_restricted frame
flag has gone. Here's how:
>>> import template, domain
>>> dom = domain.Domain('/tmp/mdw/', restricted = True, quoting = 'str')
>>> t = template.Template(dom, 'evil', from_string = True, src =
>>> "${inspect.func_globals['_'*2+'builtins'+'_'*2].open('/tmp/mdw/target').read()}")
2009-01-15 20:30:29,177 ERROR [evoque] RuntimeError: restricted
attribute: File "<string>", line 1, in <module>
: EvalError(inspect.func_globals['_'*2+'builtins'+'_'*2].open('/tmp/mdw/target').read())
u'[RuntimeError: restricted attribute: File "<string>", line 1, in
<module>\n:
EvalError(inspect.func_globals[\'_\'*2+\'builtins\'+\'_\'*2].open(\'/tmp/mdw/target\').read())]'
which means that it's depending on the func_globals attribute being
rejected by the interpreter -- which it won't be because 3.0 doesn't
have restricted evaluation any more.
Python is very leaky. I don't think trying to restrict Python execution
is a game that's worth playing.
-- [mdw]
If you could provide a bare-bones instance of your evaluator to test
against, without using the whole evoque (I get DUMMY MODE ON from
'self.template.collection.domain.globals'), it'd be more interesting
to try :)
$ touch /tmp/mdw.test
mr:evoque mario$ python3.0
Python 3.0 (r30:67503, Dec 8 2008, 18:45:31)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from evoque import domain, template
>>> d = domain.Domain("/", restricted=True, quoting="str")
>>> t = template.Template(d, "mdw1", from_string=True, src="${inspect.func_globals['_'*2+'builtins'+'_'*2].open('/tmp/mdw.test').read()}")
>>> t.evoque()
2009-01-15 22:26:18,704 ERROR [evoque] AttributeError: 'function'
object has no attribute 'func_globals': File "<string>", line 1, in
<module>
: EvalError(inspect.func_globals['_'*2+'builtins'+'_'*2].open('/tmp/
mdw.test').read())
'[AttributeError: \'function\' object has no attribute \'func_globals
\': File "<string>", line 1, in <module>\n: EvalError
(inspect.func_globals[\'_\'*2+\'builtins\'+\'_\'*2].open(\'/tmp/
mdw.test\').read())]'
But even if inspect did have the func_globals attribute, the "open"
builtin will not be found on __builtins__ (that is cleaned out when
restricted=True).
But I guess it is necessary to keep an eye on what is available/
allowed by the different Python versions, and adjust as needed,
probably to the lowest common denominator. In addition to what is
mentioned in the doc on evoque's restricted mode at the above URL, do
you have specific suggestions for what else may be a good idea to
block out?
> Python is very leaky. I don't think trying to restrict Python execution
> is a game that's worth playing.
It may not be worth risking your life on it, but it is certainly worth
playing ;-)
Thanks... with your permission, may I add your evil expression to the
restricted tests?
Cheers, mario
> -- [mdw]
OK! Here's a small script to make it easier...
Just accumulate any expression you can dream of,
and pass it to get_expr_template() to get the template,
and on that then call evoque()... i guess you'd have to
test with 0.3, but 0.4 (also runs on py3) is just
around the corner....
Let it rip... the beer'd be on me ;-!
# evoque_restricted_test.py
from os.path import abspath, join, dirname
from evoque import domain, template
import logging
# uncomment to hide the plentiful ERROR logs:
#logging_level = logging.CRITICAL
# set the base for the default collection
DEFAULT_DIR = abspath("/")
# 3 -> renders, 4 -> raises any evaluation errors,
# see: http://evoque.gizmojo.org/usage/errors/
ERRORS=2
# a restricted domain instance
d = domain.Domain(DEFAULT_DIR, restricted=True, errors=ERRORS,
quoting='str')
count = 0
# utility to easily init a template from any expression
def get_expr_template(expr):
    global count
    count += 1
    name = "test%s"%(count)
    src = "${%s}" % (expr)
    d.set_template(name, src=src, from_string=True)
    return d.get_template(name)
# some test expressions
exprs = [
"open('test.txt', 'w')",
"getattr(int, '_' + '_abs_' + '_')",
"().__class__.mro()[1].__subclasses__()",
"inspect.func_globals['_'*2+'builtins'+'_'*2]",
]
# execute
for expr in exprs:
    print
    print expr
    print get_expr_template(expr).evoque()
> List comprehensions delete the helper variable after completion:
I do not believe they did in 2.4. Not sure of 2.5. There is certainly
a very different implementation in 3.0 and, I think, 2.6. OP
neglected to mention Python version he tested on. Code meant to run on
2.4 to 3.0 cannot depend on subtle listcomp details.
>>>> def f(): [i for i in [1]]
> ...
>>>> dis.dis(f)
> 1 0 BUILD_LIST 0
> 3 DUP_TOP
> 4 STORE_FAST 0 (_[1])
> 7 LOAD_CONST 1 (1)
> 10 BUILD_LIST 1
> 13 GET_ITER
> >> 14 FOR_ITER 13 (to 30)
> 17 STORE_FAST 1 (i)
> 20 LOAD_FAST 0 (_[1])
> 23 LOAD_FAST 1 (i)
> 26 LIST_APPEND
> 27 JUMP_ABSOLUTE 14
> >> 30 DELETE_FAST 0 (_[1])
> 33 POP_TOP
> 34 LOAD_CONST 0 (None)
> 37 RETURN_VALUE
>
In 3.0
>>> def f(): [i for i in [1]]
>>> import dis
>>> dis.dis(f)
1 0 LOAD_CONST 1 (<code object <listcomp> at
0x01349BF0, file "<pyshell#12>", line 1>)
3 MAKE_FUNCTION 0
6 LOAD_CONST 2 (1)
9 BUILD_LIST 1
12 GET_ITER
13 CALL_FUNCTION 1
16 POP_TOP
17 LOAD_CONST 0 (None)
20 RETURN_VALUE
Running OP code in 3.0 with print ()s added gives
pre 0.a 1.b 2.c 3.d post
Traceback (most recent call last):
File "C:\Programs\Python30\misc\temp7.py", line 32, in <module>
""" % gie)
File "C:\Programs\Python30\misc\temp7.py", line 8, in __getitem__
return eval(expr, self.globals, self.locals)
File "<string>", line 7, in <module>
File "<string>", line 7, in <listcomp>
File "C:\Programs\Python30\misc\temp7.py", line 12, in ts
return ts % self
File "C:\Programs\Python30\misc\temp7.py", line 8, in __getitem__
return eval(expr, self.globals, self.locals)
File "<string>", line 2, in <module>
File "<string>", line 1, in <listcomp>
NameError: global name 'i' is not defined
> If you manage to run two nested listcomps in the same namespace you get a
> name clash and the inner helper variable overwrites/deletes the outer:
>
>>>> def xeval(x): return eval(x, ns)
> ...
>>>> ns = dict(xeval=xeval)
>>>> xeval("[xeval('[k for k in ()]') for i in (1,)]")
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> File "<stdin>", line 1, in xeval
> File "<string>", line 1, in <module>
> NameError: name '_[1]' is not defined
Which Python? 3.0 prints "[[]]"! But I think the nested listcomp *is*
in a separate namespace here. I will leave it to you or OP to dissect
how his code and yours essentially differ from the viewpoint of the 3.0
(and maybe 2.6) implementation.
Terry Jan Reedy
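For what it is worth, the nested xeval call can be tried directly on Python 3, where a listcomp no longer stores a helper name in the shared namespace. A sketch reusing the names from the quoted session (the only change is that ns is created before xeval is registered in it):

```python
ns = {}


def xeval(x):
    # Every snippet is compiled independently and evaluated with ns as globals.
    return eval(x, ns)


ns["xeval"] = xeval

result = xeval("[xeval('[k for k in ()]') for i in (1,)]")
print(result)
print("_[1]" in ns)
```

The outer and inner listcomps still share ns, but neither leaves a '_[1]' entry in it, so there is nothing left to clash.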
It would have been less confusing if you had written
self.globals = {} # a constant dict
or even
self.constants = {} # empty here only for simplicity
This might also make the 3.0 error message clearer (see other post).
tjr
> On Jan 15, 4:06 pm, Steven D'Aprano <st...@REMOVE-THIS-
> cybersource.com.au> wrote:
>
> Hi Steve!
>
>> > class GetItemEvaluator(object):
>> > def __init__(self):
>> > self.globals = globals() # some dict (never changes)
>
> Ya, this is just a boiled down sample, and for simplicity I set to to
> the real globals(),
You should make that more clear when posting, in the code snippet as well
as the descriptive text.
And if you *did* make it clear, then *I* should read your post more
carefully.
Regards,
--
Steven
I was testing on 2.6, but running it thru 2.4 and 2.5 it seems
behaviour is the same there. For 3.0 it does change... and there seems
not to be the "_[1]" key defined, and, what's more, it gives a:
NameError: name 'j' is not defined.
In any case, that was an exploration to get a feeling for how the
listcomps behave (performance) if evaluated directly as opposed to
doing the equivalent from within a function. It turned out to be
slower, so I moved on... but, should it have been faster, then
differences between how the different python versions handle list
comps internally would have been a next issue to address.
I think the globals dict is not touched by eval'ing a list comp... it
is not "constant" as such, just that it is not affected by
evaluations (unless the python application decides to affect it in
some way or another). But, evaluating a template by definition does
not change the globals dict.
> Terry Jan Reedy
Thanks! I think I found a quick way around the restrictions (correct
me if I borked it), but I think you can block this example by
resetting your globals/builtins:
exprs = [
'(x for x in range(1)).gi_frame.f_globals.clear()',
'open("where_is_ma_beer.txt", "w").write("Thanks for the fun ")'
]
Regards,
Daniel
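The mechanism behind these two expressions can be reproduced with plain eval on Python 3. This is a sketch, not evoque: sandbox_globals and attempt are hypothetical names, and the emptied __builtins__ only imitates a restricted namespace. It relies on the documented eval behaviour of inserting the real builtins into a globals dict that has no __builtins__ key.

```python
import builtins

# Imitation of a restricted namespace: no builtins except an exposed range.
sandbox_globals = {"__builtins__": {}, "range": range}


def attempt(expr):
    """Evaluate expr against the sandbox; return the result or the exception."""
    try:
        return eval(expr, sandbox_globals, {})
    except Exception as exc:
        return exc


before = attempt("open")                  # open is not reachable yet
reached = attempt("(x for x in range(1)).gi_frame.f_globals")
attempt("(x for x in range(1)).gi_frame.f_globals.clear()")
after = attempt("open")                   # eval re-inserted the real builtins

print(type(before).__name__)
print(reached is sandbox_globals)
print(after is builtins.open)
```

Clearing the dict removes the stripped-down __builtins__ entry, and the next evaluation silently gets the full set back, which is why resetting globals/builtins before each evaluation would block this particular route.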
> 2009-01-15 22:26:18,704 ERROR [evoque] AttributeError: 'function'
> object has no attribute 'func_globals': File "<string>", line 1, in
> <module>
Damn. So that doesn't work. :-(
> But even if inspect did have the func_globals attribute, the "open"
> builtin will not be found on __builtins__ (that is cleaned out when
> restricted=True).
Irrelevant. I wasn't trying to get at my __builtins__ but the one
attached to a function I was passed in, which has a different
environment.
You define a function (a method, actually, but it matters little). The
function's globals dictionary is attached as an attribute. You didn't
do anything special here, so the globals have the standard __builtins__
binding. It contains open.
You now run my code in a funny environment with a stripped-down
__builtins__. But that doesn't matter, because my environment contains
your function. And that function has your __builtins__ hanging off the
side of it.
... but I can't actually get there because of f_restricted. You don't
mention the fact that Python magically limits access to these attributes
if __builtins__ doesn't match the usual one, so I think you got lucky.
-- [mdw]
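The point about builtins "hanging off the side" of a function can be sketched on Python 3, where the attribute is spelled __globals__ instead of func_globals and no restricted mode intervenes. The names helper, restricted_globals and attempt are hypothetical; this is plain eval, not evoque.

```python
import builtins


def helper():
    """Stand-in for any function the host application exposes to templates."""
    return "hello"


# The evaluated code sees a stripped-down __builtins__ plus the helper.
restricted_globals = {"__builtins__": {}, "helper": helper}


def attempt(expr):
    """Evaluate expr in the restricted namespace; return result or exception."""
    try:
        return eval(expr, restricted_globals, {})
    except Exception as exc:
        return exc


direct = attempt("open")
leaked = attempt("helper.__globals__['__builtins__']")
# __builtins__ is a module in __main__ and a dict elsewhere; handle both.
leaked_open = leaked["open"] if isinstance(leaked, dict) else leaked.open

print(type(direct).__name__)
print(leaked_open is builtins.open)
```

The direct lookup fails, but the function carries the globals of the module that defined it, and those still reference the ordinary builtins.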
Cool, the beer that is ;) Under 2.6... why does python allow the
f_globals lookup in this case, but for the previous example for
func_globals it does not?
If you look at the top of the file test/test_restricted.py, there is:
# Attempt at accessing these attrs under restricted execution on an object
# that has them should raise a RuntimeError
RESTRICTED_ATTRS = [
    'im_class', 'im_func', 'im_self', 'func_code', 'func_defaults',
    'func_globals', #'func_name',
    #'tb_frame', 'tb_next',
    #'f_back', 'f_builtins', 'f_code', 'f_exc_traceback', 'f_exc_type',
    #'f_exc_value', 'f_globals', 'f_locals'
]
I have not yet finished working this list off to ensure that any
lookup of these attrs wherever they occur will be refused, but I guess
that would block this kind of lookup out. I should also block any
attempt to access any "gi_*" attribute... Laboriously doing all these
checks on each expr eval will be very performance heavy, so I hope to
be able to limit access to all these more efficiently. Suggestions?
Cheers, Mario
> Regards,
> Daniel
> Peter Otten wrote:
>
>> List comprehensions delete the helper variable after completion:
>
> I do not believe they did in 2.4. Not sure of 2.5.
As Mario said, 2.4, 2.5, and 2.6 all show the same behaviour.
> There is certainly
> a very different implementation in 3.0 and, I think, 2.6. OP
> neglected to mention Python version he tested on. Code meant to run on
> 2.4 to 3.0 cannot depend on subtle listcomp details.
3.0 behaves differently. Like generator expressions, listcomps no longer
leak the loop variable, and this is implemented by having each listcomp
execute as a nested function:
> In 3.0
> >>> def f(): [i for i in [1]]
>
> >>> import dis
> >>> dis.dis(f)
> 1 0 LOAD_CONST 1 (<code object <listcomp> at
> 0x01349BF0, file "<pyshell#12>", line 1>)
> 3 MAKE_FUNCTION 0
> 6 LOAD_CONST 2 (1)
> 9 BUILD_LIST 1
> 12 GET_ITER
> 13 CALL_FUNCTION 1
> 16 POP_TOP
> 17 LOAD_CONST 0 (None)
> 20 RETURN_VALUE
This is more robust (at least I can't think of a way to break it like the
2.x approach) but probably slower due to the function call overhead. The
helper variable is still there, but the possibility of a clash with another
helper is gone (disclaimer: I didn't check this in the Python source) so
instead of
# 2.5 and 2.6 (2.4 has the names in a different order)
>>> def f():
... [[i for i in ()] for k in ()]
...
>>> f.func_code.co_varnames
('_[1]', 'k', '_[2]', 'i')
we get
# 3.0
>>> def f():
... [[i for i in ()] for k in ()]
...
>>> f.__code__.co_varnames
()
The variables are gone from f's scope, as 3.x listcomps no longer leak their
loop variables.
>>> f.__code__.co_consts
(None, <code object <listcomp> at 0x2b8d7f6d7530, file "<stdin>", line 2>,
())
>>> outer = f.__code__.co_consts[1]
>>> outer.co_varnames
('.0', '_[1]', 'k')
Again the inner listcomp is separated from the outer.
>>> outer.co_consts
(<code object <listcomp> at 0x2b8d7f6d26b0, file "<stdin>", line 2>, ())
>>> inner = outer.co_consts[0]
>>> inner.co_varnames
('.0', '_[1]', 'i')
Peter
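The visible effect of the 3.x change can be checked without looking at code objects. A small sketch (the function f and the dict ns are just illustrative names): the loop variable of a listcomp does not show up in the enclosing function's locals, nor in the locals dict handed to eval.

```python
def f():
    squares = [i * i for i in range(3)]
    # On Python 3 the loop variable i is not visible here any more.
    return squares, "i" in locals()


ns = {}
evaluated = eval("[i * i for i in range(3)]", {}, ns)

print(f())
print(evaluated, ns)
```

This is also why an evaluator that looks up loop variables through the dict passed to eval, as the original sample does, finds neither the loop variables nor a '_[1]' key on 3.x.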
None regarding the general issue, a try:except to handle this one:
'(x for x in ()).throw("bork")'
What is the potential security risk with this one?
To handle this and situations like the ones pointed out above in this
thread, I will probably make the following change to the
evoque.evaluator.RestrictedEvaluator class: replace the
'if name.find("__")!=-1:' test with an re.search... where the re is
defined as:
restricted = re.compile(r"|\.".join([
"__", "func_", "f_", "im_", "tb_", "gi_", "throw"]))
and the test becomes simply:
if restricted.search(name):
All the above attempts will be blocked this way. Any other disallow-
sub-strings to add to the list above?
And thanks a lot Daniel, need to find a way to get some beer over to
ya... ;-)
mario
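The proposed pattern can be tried against the expressions posted in this thread. A sketch only: it assumes the test is applied to the whole expression string (the posts do not say what exactly `name` holds), and the attempts list is just a collection of examples quoted above plus one harmless expression.

```python
import re

# Same construction as proposed above; the joined pattern is
# __|\.func_|\.f_|\.im_|\.tb_|\.gi_|\.throw
restricted = re.compile(r"|\.".join([
    "__", "func_", "f_", "im_", "tb_", "gi_", "throw"]))

attempts = [
    "().__class__.mro()[1].__subclasses__()",
    "inspect.func_globals['_'*2+'builtins'+'_'*2]",
    "(x for x in range(1)).gi_frame.f_globals.clear()",
    '(x for x in ()).throw("bork")',
    "getattr(int, '_' + '_abs_' + '_')",
    "join(['a', 'b'])",
]

# Map each expression to whether the pattern would reject it.
blocked = dict((expr, bool(restricted.search(expr))) for expr in attempts)

for expr in attempts:
    print(blocked[expr], expr)
```

Note that the getattr expression is not caught by the pattern, since the double underscores only appear after string concatenation; rejecting it would have to rely on something else, for example getattr not being exposed in the restricted namespace (an assumption here).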
I think what you are trying to do is fundamentally hopeless. You
might look at web.py (http://webpy.org) for another approach, that
puts a complete interpreter for a Python-like language into the
template engine.
I don't see a concrete issue, just found it tempting... raising hand-
crafted objects :)
> All the above attempts will be blocked this way. Any other disallow-
> sub-strings to add to the list above?
None that I know of, but I suggest testing with dir, globals, locals
and '__' enabled (which I haven't done yet), as spotting possible
flaws should be easier. If you can get BOM+encoded garbage tested (see
http://tinyurl.com/72d98y ), it might be worth it too.
This one fails in lots of interesting ways when you juggle keyword-
args around:
exprs = [
'evoque("hmm", filters=[unicode.upper ] ,src="/etc/python2.5/
site.py")',
]
> And thanks a lot Daniel, need to find a way to get somebeer over to
> ya... ;-)
You're welcome! Don't worry about the beer, I'd only consider a real
promise if it involved chocolate :D
Regards,
Daniel
OK, I can think of no good reason why anyone would want to do that from
within a template, so I'd be fine with blocking out any attribute whose
name starts with "throw" to block this out.
> > All the above attempts will be blocked this way. Any other disallow-
> > sub-strings to add to the list above?
>
> None that I know of, but I suggest testing with dir, globals, locals
> and '__' enabled (which I haven't done yet), as spotting possible
> flaws should be easier. If you can get BOM+encoded garbage tested (see
> http://tinyurl.com/72d98y ), it might be worth it too.
The BOM stuff is interesting... from that discussion, I think it would
be also a good idea to blacklist "object" out of the restricted
builtins. I played with this, and prepared a file template as well as
a little script to run it... see below.
To tweak any disallowed builtins back into the restricted namespace
for testing, you can just do something like:
d.set_on_globals("dir", dir)
for each name you'd like to add, when setting up the domain (see
script below).
To re-enable "__" lookups, you'd need to tweak the regexp above, in
the RestrictedEvaluator class.
> This one fails in lots of interesting ways when you juggle keyword-
> args around:
> exprs = [
> 'evoque("hmm", filters=[unicode.upper ] ,src="/etc/python2.5/
> site.py")',
> ]
Not sure what you mean... it just renders that source code file
uppercased (if it finds it as per the domain setup) ?!?
Here's (a) a mini testing py2-py3 script, similar to previous one
above but to read a template from a file (there may be additional
tricks possible that way), and (b) a sample companion test template.
evoque_restricted_file_test.py
----
# in lieu of print, py2/py3
import sys
def pr(*args):
    sys.stdout.write(" ".join([str(arg) for arg in args])+'\n')
#
from os.path import abspath, join, dirname
from evoque import domain, template
# set the base for the default collection
DEFAULT_DIR = abspath((dirname(__file__)))
# a restricted domain instance
d = domain.Domain(DEFAULT_DIR, restricted=True, errors=3,
quoting='str')
# errors: 3 -> renders, 4 -> raises any evaluation errors,
# see: http://evoque.gizmojo.org/usage/errors/
# Tweak domain.globals to add specific callables for testing:
d.set_on_globals("dir", dir)
d.set_on_globals("globals", globals)
d.set_on_globals("locals", locals)
pr("domain", d.default_collection.dir,
d.restricted and "RESTRICTED" or "*** unrestricted ***")
t = d.get_template("restricted_exprs.txt")
pr(t.evoque())
----
restricted_exprs.txt
----
#[
BOM + encoded trickery
Note:
when evaluated in python interpreter:
>>> eval("# coding: utf7\n
+AG8AYgBqAGUAYwB0AC4AXwBfAHMAdQBiAGMAbABhAHMAcwBlAHMAXwBf-")
<built-in method __subclasses__ of type object at 0x1f1860>
but when specified within a template here as:
${# coding: utf7\n
+AG8AYgBqAGUAYwB0AC4AXwBfAHMAdQBiAGMAbABhAHMAcwBlAHMAXwBf-}
gives **pre-evaluation**:
SyntaxError: unknown encoding: utf7
]#
${"# coding: utf7\n
+AG8AYgBqAGUAYwB0AC4AXwBfAHMAdQBiAGMAbABhAHMAcwBlAHMAXwBf-"},
#[
Attempt to subversively build string expressions
]#
Explicitly written target expression: ().__class__.mro()
[1].__subclasses__()
evaluates: ${().__class__.mro()[1].__subclasses__()}
Subversive variation: "()."+"_"*2+"class"+"_"*2+".mro()
[1]."+"_"*2+"subclasses"+"_"*2+"()"
evaluates (to just the str!): ${"()."+"_"*2+"class"+"_"*2+".mro()
[1]."+"_"*2+"subclasses"+"_"*2+"()"}
Attempt to "set" same subversively built expr to a loop variable
and then "evaluate" that variable:
$for{
expr in [
str("()."+"_"*2+"class"+"_"*2+".mro()
[1]."+"_"*2+"subclasses"+"_"*2+"()")
] }
evaluates (to just the str!): ${expr}
attempt eval(...): ${eval(expr)}
$rof
(Note: evoque does not explicitly allow arbitrary setting of
variables, except within for loops.)
----
mario
$for{expr in [
str("()."+"_"*2+"class"+"_"*2+".mro()
[1]."+"_"*2+"subclasses"+"_"*2+"()")
]}
${% evoque("test", src="$${"+expr+"}", from_string=True) %}
$rof
That would fail, as it would simply take the value of src i.e. "$
{().__class__.mro()[1].__subclasses__()}" to mean the sub-path to the
template file within the collection (from_string would be simply
interpreted as an evaluation data parameter). See:
http://evoque.gizmojo.org/directives/evoque/
Well, that is a bold statement... but maybe it is explained by what
you refer to, so I took a cursory look. But I failed to find any
reference to an embedded "python-like language" -- is there some sort
of overview of how web.py implements this, e.g. something like the
equivalent of the doc describing how evoque implements its sandbox:
http://evoque.gizmojo.org/usage/restricted/
I get the feeling you may also be ignoring contextual factors...
restricting the full python interpreter is not what we are talking
about here, but templating systems (such as web.py?) that just allow
embedding of any and all python code will require exactly that. And
*that* may well seem fundamentally hopeless.
Evoque chooses to allow only expressions, and those under a *managed*
context. To make that secure is a whole different (smaller) task.