
safe eval of moderately simple math expressions


Joel Hedlund

Apr 9, 2009, 11:56:49 AM

Hi all!

I'm writing a program that presents a lot of numbers to the user, and I
want to let the user apply moderately simple arithmetic to these
numbers. One possibility that comes to mind is to use the eval function,
but since that sends up all kinds of warning flags in my head, I thought
I'd put my idea out here first so you guys can tell me if I'm insane. :-)

This is the gist of it:
----------------------------------------------------------
import math

globals = dict((s, getattr(math, s)) for s in dir(math) if '_' not in s)
globals.update(__builtins__=None, divmod=divmod, round=round)

def calc(expr, x):
    if '_' in expr:
        raise ValueError("expr must not contain '_' characters")
    try:
        return eval(expr, globals, dict(x=x))
    except:
        raise ValueError("bad expr or x")

print calc('cos(x*pi)', 1.33)
----------------------------------------------------------

This lets the user do stuff like "exp(-0.01*x)" or "round(100*x)" but
prevents malevolent stuff like "__import__('os').system('del *.*')" or
"(t for t in (42).__class__.__base__.__subclasses__() if t.__name__ ==
'file').next()" from messing things up.

I assume there's lots of nasty and absolutely lethal stuff that I've
missed, and I kindly request you show me the error of my ways.

Thank you for your time!
/Joel Hedlund

Matt Nordhoff

Apr 9, 2009, 12:25:51 PM
to pytho...@python.org

I'm way too dumb and lazy to provide a working example, but someone
could work around the _ restriction by obfuscating them a bit, like this:

>>> '\x5f'
'_'
>>> getattr(42, '\x5f\x5fclass\x5f\x5f') # __class__
<type 'int'>

Is that enough to show you the error of your ways? :-D Cuz seriously,
it's a bad idea.

I'm sorry, but I don't know a good solution. The simplicity of eval is
definitely very attractive, but it's just not safe.

(BTW: What if a user tries to do some ridiculously large calculation to
DoS the app? Is that a problem?)
--

Aaron Brady

Apr 9, 2009, 12:55:02 PM

On Apr 9, 10:56 am, Joel Hedlund <joel.hedl...@gmail.com> wrote:
> Hi all!
>
> I'm writing a program that presents a lot of numbers to the user, and I
> want to let the user apply moderately simple arithmetic to these
> numbers. One possibility that comes to mind is to use the eval function,
> but since that sends up all kinds of warning flags in my head, I thought
> I'd put my idea out here first so you guys can tell me if I'm insane. :-)
>
> This is the gist of it:
snip

> def calc(expr, x):
>      if '_' in expr:
>          raise ValueError("expr must not contain '_' characters")
snip

> I assume there's lots of nasty and absolutely lethal stuff that I've
> missed, and I kindly request you show me the error of my ways.
>
> Thank you for your time!
> /Joel Hedlund

Would you be willing to examine a syntax tree to determine if there
are any class accesses? Would it work?
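
One way to read that suggestion, as a minimal Python 3 sketch: parse the expression with the standard `ast` module and look for `ast.Attribute` nodes. The helper name `has_attribute_access` is made up for illustration, and a check like this is probably not sufficient on its own, since attribute lookups can also be spelled as `getattr` calls.

```python
import ast


def has_attribute_access(source):
    """Return True if the expression contains any `obj.attr` syntax."""
    tree = ast.parse(source, mode="eval")
    return any(isinstance(node, ast.Attribute) for node in ast.walk(tree))


print(has_attribute_access("(42).__class__"))   # dotted access is detected
print(has_attribute_access("cos(x*pi)"))        # plain call, no attribute node
# A getattr() call has no ast.Attribute node, so this check alone misses it.
print(has_attribute_access(r"getattr(42, '\x5f\x5fclass\x5f\x5f')"))
```

Rejecting call nodes as well, or whitelisting node types rather than blacklisting them, would close that particular gap.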

Terry Reedy

Apr 9, 2009, 1:13:50 PM
to pytho...@python.org
Joel Hedlund wrote:
> Hi all!
>
> I'm writing a program that presents a lot of numbers to the user, and I
> want to let the user apply moderately simple arithmetic to these
> numbers. One possibility that comes to mind is to use the eval function,
> but since that sends up all kinds of warning flags in my head,

Where does the program execute? If on the user's own machine, no
problem. Eval is no more dangerous than Python itself.

Paul McGuire

Apr 9, 2009, 2:28:35 PM

On Apr 9, 10:56 am, Joel Hedlund <joel.hedl...@gmail.com> wrote:
> Hi all!
>
> I'm writing a program that presents a lot of numbers to the user, and I
> want to let the user apply moderately simple arithmetic to these
> numbers.

Joel -

Take a look at the examples page on the pyparsing wiki
(http://pyparsing.wikispaces.com/Examples). Look at the examples fourFn.py
and simpleArith.py for some expression parsers that you could extend
to support whatever math builtins you wish. Since you would be doing
your own parsing and eval code, you could be sure that no dangerous
code was being run, just simple arithmetic.

-- Paul

CTO

Apr 9, 2009, 3:16:57 PM

Steven D'Aprano

Apr 10, 2009, 8:54:39 PM

On Thu, 09 Apr 2009 13:13:50 -0400, Terry Reedy wrote:

> Joel Hedlund wrote:
>> Hi all!
>>
>> I'm writing a program that presents a lot of numbers to the user, and I
>> want to let the user apply moderately simple arithmetic to these
>> numbers. One possibility that comes to mind is to use the eval
>> function, but since that sends up all kinds of warning flags in my
>> head,
>
> Where does the program execute? If on the user's own machine, no
> problem.

Until the user naively executes a code sample he downloaded from the
Internet, and discovers to his horror that his *calculator* is able to
upload his banking details to an IRC server hosted in Bulgaria.

How quickly we forget... for twenty or thirty years all malware
infections were via programs executed on the user's own machine.


> Eval is no more dangerous than Python itself.

But users know Python is a Turing-complete programming language that can
do anything their computer can do. It would come as an unpleasant
surprise to discover that (say) your icon editor was also a Turing-
complete programming language capable of doing anything your C-compiler
could do. The same holds for applications written in Python.

--
Steven

Aaron Brady

Apr 11, 2009, 3:41:21 AM

On Apr 10, 7:54 pm, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:
> On Thu, 09 Apr 2009 13:13:50 -0400, Terry Reedy wrote:
> > Joel Hedlund wrote:
> >> Hi all!
>
> >> I'm writing a program that presents a lot of numbers to the user, and I
> >> want to let the user apply moderately simple arithmetic to these
> >> numbers. One possibility that comes to mind is to use the eval
> >> function, but since that sends up all kinds of warning flags in my
> >> head,
>
> > Where does the program execute?  If on the user's own machine, no
> > problem.
>
> Until the user naively executes a code sample he downloaded from the
> Internet, and discovers to his horror that his *calculator* is able to
> upload his banking details to an IRC server hosted in Bulgaria.

Mine does that anyway! ..Often without telling anyone.

>
> How quickly we forget... for twenty or thirty years all malware
> infections were via programs executed on the user's own machine.
>
> > Eval is no more dangerous than Python itself.
>
> But users know Python is a Turing-complete programming language that can
> do anything their computer can do. It would come as an unpleasant
> surprise to discover that (say) your icon editor was also a Turing-
> complete programming language capable of doing anything your C-compiler
> could do. The same holds for applications written in Python.

Don't they know that his calculator is written in Python? Do many
applications include a programming language?

Why do I get the feeling that the authors of 'pyparsing' are out of
breath?

I wonder if you could do something like copy and paste a "fork" of the
'ast' module, and just remove non-arithmetic classes; then do a normal
walk and transform of the foreign code...

Joel Hedlund

Apr 11, 2009, 4:18:09 AM

Aaron Brady wrote:
> Would you be willing to examine a syntax tree to determine if there
> are any class accesses?

Sure? How do I do that? I've never done that type of thing before so I
can't really say if it would work or not.

/Joel

Joel Hedlund

Apr 11, 2009, 4:22:05 AM

Matt Nordhoff wrote:
>>>> '\x5f'
> '_'
>>>> getattr(42, '\x5f\x5fclass\x5f\x5f') # __class__
> <type 'int'>
>
> Is that enough to show you the error of your ways?

No, because

>>> print '_' in '\x5f\x5fclass\x5f\x5f'
True

> :-D Cuz seriously, it's a bad idea.

Yes probably, but that's not why. :-)

> (BTW: What if a user tries to do some ridiculously large calculation to
> DoS the app? Is that a problem?)

Nope. If the user wants to hang her own app that's fine with me.

/Joel

Peter Otten

Apr 11, 2009, 4:37:35 AM

Joel Hedlund wrote:

> Matt Nordhoff wrote:
>>>>> '\x5f'
>> '_'
>>>>> getattr(42, '\x5f\x5fclass\x5f\x5f') # __class__
>> <type 'int'>
>>
>> Is that enough to show you the error of your ways?
>
> No, because
>
> >>> print '_' in '\x5f\x5fclass\x5f\x5f'
> True

But what you're planning to do seems more like

>>> def is_it_safe(source):
...     return "_" not in source
...
>>> source = "getattr(42, '\\x5f\\x5fclass\\x5f\\x5f')"
>>> if is_it_safe(source):
...     print eval(source)
...
<type 'int'>

Peter

Joel Hedlund

Apr 11, 2009, 5:03:16 AM

Peter Otten wrote:
> But what you're planning to do seems more like
>
>>>> def is_it_safe(source):
> ...     return "_" not in source
> ...
>>>> source = "getattr(42, '\\x5f\\x5fclass\\x5f\\x5f')"
>>>> if is_it_safe(source):
> ...     print eval(source)
> ...
> <type 'int'>

Bah. You are completely right of course.

Just as a thought experiment, would this do the trick?

def is_it_safe(source):
    return "_" not in source and r'\' not in source

I'm not asking because I'm hellbent on having eval in my app, but
because it's always useful to see what hazards you don't know about.

/Joel

Peter Otten

Apr 11, 2009, 5:19:41 AM

Joel Hedlund wrote:

> Peter Otten wrote:
>> But what you're planning to do seems more like
>>
>>>>> def is_it_safe(source):
>> ...     return "_" not in source
>> ...
>>>>> source = "getattr(42, '\\x5f\\x5fclass\\x5f\\x5f')"
>>>>> if is_it_safe(source):
>> ...     print eval(source)
>> ...
>> <type 'int'>
>
> Bah. You are completely right of course.
>
> Just as a thought experiment, would this do the trick?
>
> def is_it_safe(source):
>     return "_" not in source and r'\' not in source

>>> "".join(map(chr, [95, 95, 110, 111, 95, 95]))
'__no__'

By the way, a raw string may not end with a backslash:

>>> r'\'
  File "<stdin>", line 1
    r'\'
       ^
SyntaxError: EOL while scanning single-quoted string

Peter

Joel Hedlund

Apr 11, 2009, 5:27:06 AM

Peter Otten wrote:
>> def is_it_safe(source):
>>     return "_" not in source and r'\' not in source
>
>>>> "".join(map(chr, [95, 95, 110, 111, 95, 95]))
> '__no__'

But you don't have access to either map or chr?

/Joel

Peter Otten

Apr 11, 2009, 5:31:06 AM

Joel Hedlund wrote:

>>> '5f5f7374696c6c5f6e6f745f736166655f5f'.decode("hex")
'__still_not_safe__'
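
A rough Python 3 rendering of the same trick, since `str.decode("hex")` exists only in Python 2. This is a sketch, not a reconstruction of any real attack: it uses `bytes.fromhex` called on a bytes literal, so it needs neither `map` nor `chr` nor any other builtin name. The empty `__builtins__` dict stands in for the `__builtins__=None` setup from the first post.

```python
# The source text contains no underscore and no backslash.
source = 'b"".fromhex("5f5f7374696c6c5f6e6f745f736166655f5f").decode("ascii")'
assert "_" not in source and "\\" not in source

# No builtins are needed: both calls are methods on a literal.
decoded = eval(source, {"__builtins__": {}}, {})
print(decoded)
```

Any string an attacker can assemble this way can then be passed to whatever attribute-lookup mechanism is still reachable, which is why filtering characters in the source text does not help much.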


Joel Hedlund

Apr 11, 2009, 5:38:50 AM

Now *that's* a thing of beauty. A horrible, horrible kind of beauty.

Thanks for blowing holes in my inflated sense of security!
/Joel

Aaron Brady

Apr 11, 2009, 6:22:19 AM

NO PROMISES. No warranty is made, express or implied.

Of course, something this devious, a "white" list, may just make it so
your enemy finds out its weakness before you do.

It's ostensibly for Python 3, but IIRC there's a way to do it in 2.

'ast.literal_eval' appears to evaluate a literal, but won't do
expressions, which is what you are looking for. We should refer
people to it more often.
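
To make the distinction concrete, here is a small sketch of what `ast.literal_eval` accepts and rejects. It is written for Python 3, and the sample strings are only illustrations.

```python
import ast

# Literals (numbers, strings, lists, tuples, dicts, ...) are evaluated.
print(ast.literal_eval("[1, 2.5, 'three']"))

# Arithmetic is not a literal, so it is refused with ValueError.
try:
    ast.literal_eval("2 * 3")
    accepted = True
except ValueError:
    accepted = False
print(accepted)
```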

+1 ast.walk, btw.

If you want subtraction and division, you'll have to add them
yourself. You could probably compress the 'is_it_safe' function to
one line, provided that it's sound to start with: if all( x.__class__ in
safe_node_classes for x in ast.walk( ast.parse( exp ) ) ), or better
yet, if set( x.__class__ for x in ast.walk( ast.parse( exp ) ) )<=
safe_node_classes. +1!

/Source:
import ast
safe_exp= '( 2+ 4 )* 7'
unsafe_exp= '( 2+ 4 ).__class__'
unsafe_exp2= '__import__( "os" )'

safe_node_classes= set( [
    ast.Module,
    ast.Expr,
    ast.BinOp,
    ast.Mult,
    ast.Add,
    ast.Num
] )

def is_it_safe( exp ):
    print( 'trying %s'% exp )
    top= ast.parse( exp )
    for node in ast.walk( top ):
        print( node )
        if node.__class__ not in safe_node_classes:
            return False
    print( 'ok!' )
    return True

print( safe_exp, is_it_safe( safe_exp ) )
print( )
print( unsafe_exp, is_it_safe( unsafe_exp ) )
print( )
print( unsafe_exp2, is_it_safe( unsafe_exp2 ) )
print( )

/Output:

trying ( 2+ 4 )* 7
<_ast.Module object at 0x00BB5DF0>
<_ast.Expr object at 0x00BB5E10>
<_ast.BinOp object at 0x00BB5E30>
<_ast.BinOp object at 0x00BB5E50>
<_ast.Mult object at 0x00BAF590>
<_ast.Num object at 0x00BB5EB0>
<_ast.Num object at 0x00BB5E70>
<_ast.Add object at 0x00BAF410>
<_ast.Num object at 0x00BB5E90>
ok!
( 2+ 4 )* 7 True

trying ( 2+ 4 ).__class__
<_ast.Module object at 0x00BB5E90>
<_ast.Expr object at 0x00BB5DF0>
<_ast.Attribute object at 0x00BB5E10>
( 2+ 4 ).__class__ False

trying __import__( "os" )
<_ast.Module object at 0x00BB5E10>
<_ast.Expr object at 0x00BB5E30>
<_ast.Call object at 0x00BB5E50>
__import__( "os" ) False
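
The whitelist idea can be taken one step further by evaluating the tree directly, so that `eval` is never called at all. The following is only a sketch under assumptions: it targets Python 3.8 or later (where numbers parse as `ast.Constant` rather than `ast.Num`), the name `safe_calc` is invented here, and only the four basic operators, unary sign and a single variable `x` are supported. No warranty on this one either.

```python
import ast
import operator

# Whitelisted operators; any node type not handled below is rejected.
BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calc(expr, x):
    """Evaluate arithmetic on numbers and the variable x without eval()."""
    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.Name) and node.id == "x":
            return x
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            return BINARY_OPS[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in UNARY_OPS:
            return UNARY_OPS[type(node.op)](visit(node.operand))
        raise ValueError("disallowed syntax: %s" % type(node).__name__)

    return visit(ast.parse(expr, mode="eval"))


print(safe_calc("(2 + 4) * 7", x=0))
print(safe_calc("-x / 4", x=2.0))
```

Attribute access and calls such as `( 2+ 4 ).__class__` or `__import__( "os" )` fall through to the final `raise`. Text that is not valid Python at all raises `SyntaxError` from `ast.parse` instead, so a caller would probably want to catch both.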

Steven D'Aprano

Apr 11, 2009, 6:46:45 AM

Can we pass your test and still write to a file? Too easy.


>>> file('spam.txt', 'r') # prove that the file doesn't exist
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 2] No such file or directory: 'spam.txt'
>>>
>>> source = "4+(file('spam.txt', 'w').write('spam spam spam') or 0)+5"
>>> if is_it_safe(source):
...     print eval(source)
...
9
>>> file('spam.txt', 'r').read()
'spam spam spam'


Can we pass your test and import a module and grab its docstring?

>>> source = "getattr(eval(chr(90+5)*2+'im'+'por'+'t'+chr(None or 95)*2+'('+chr(39)+'os'+chr(39)+')'), chr(95)*2+'doc'+chr(99-4)*2)"
>>> if is_it_safe(source):
...     eval(source)
...
"OS routines for Mac, NT, or Posix depending ... "

Restricting Python is hard. No, not hard. It's *REALLY HARD*. Experts
have tried and failed. A good example is Tav's recent attempt to secure
Python code from *one* threat: writing a file on the local disk. Should
be simple, right?

If only.

http://tav.espians.com/a-challenge-to-break-python-security.html

The first exploit came an hour after Tav went public.

You can read the discussion on the Python-Dev list starting here:
http://mail.python.org/pipermail/python-dev/2009-February/086401.html


More here:
http://tav.espians.com/paving-the-way-to-securing-the-python-interpreter.html

http://tav.espians.com/update-on-securing-the-python-interpreter.html


My recommendation is that you do one of these:

(1) Give up on making your code "safe". Recognise that the threat is
relatively small, but real, and put a warning in your documentation about
the risk to users' own systems if they evaluate arbitrary code, and then
just use eval and hope for the best.

(2) Decide that you don't want your calculator to be a full-fledged
programming language, and give up on making eval safe. Write your own
mini-parser to do arithmetic expressions. It's really not that difficult:
really easy with PyParsing, and not that hard without.
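
To give a feel for option (2) without PyParsing, here is a minimal recursive-descent sketch in Python 3. Everything in it is an assumption made for illustration: the names `tokenize` and `calc`, the grammar (numbers, the variable `x`, `+ - * /`, unary minus and parentheses), and the choice to return floats. Functions such as `cos` would have to be added as further grammar rules.

```python
import re

# A token is either a number or one of the permitted single characters.
TOKEN = re.compile(r"\s*(?:(\d+\.?\d*|\.\d+)|([-+*/()x]))")


def tokenize(text):
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError("unexpected character at position %d" % pos)
        tokens.append(match.group(1) or match.group(2))
        pos = match.end()
    return tokens


def calc(text, x):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expression():  # expression := term (('+' | '-') term)*
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():  # term := factor (('*' | '/') factor)*
        value = factor()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= factor()
            else:
                value /= factor()
        return value

    def factor():  # factor := '-' factor | '(' expression ')' | 'x' | number
        token = advance()
        if token == "-":
            return -factor()
        if token == "(":
            value = expression()
            if advance() != ")":
                raise ValueError("expected ')'")
            return value
        if token == "x":
            return x
        if token is None or token in "+*/)":
            raise ValueError("unexpected token: %r" % token)
        return float(token)

    result = expression()
    if peek() is not None:
        raise ValueError("unexpected token: %r" % peek())
    return result


print(calc("2 + 3 * 4", x=0))
print(calc("-(x - 1) / 2", x=5))
```

Because the tokenizer only recognises digits, `x`, operators and parentheses, input such as `__import__('os')` is rejected before any evaluation happens.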

--
Steven

Paul McGuire

Apr 11, 2009, 9:09:14 AM

On Apr 11, 2:41 am, Aaron Brady <castiro...@gmail.com> wrote:
>
> Why do I get the feeling that the authors of 'pyparsing' are out of
> breath?
>

What kind of breathlessness do you mean? I'm still breathing, last
time I checked.

The-rumors-of-my-demise-have-been-greatly-exaggerated'ly yours,
-- Paul


Aaron Brady

Apr 11, 2009, 12:29:15 PM

Gasping, not panting.
