limited python virtual machine (WAS: Another scripting language implemented into Python itself?)

Steven Bethard

Jan 25, 2005, 2:22:13 PM
to
Fuzzyman wrote:
> Cameron Laird wrote:
> [snip..]
>
>>This is a serious issue.
>>
>>It's also one that brings Tcl, mentioned several
>>times in this thread, back into focus. Tcl presents
>>the notion of "safe interpreter", that is, a sub-
>>ordinate virtual machine which can interpret only
>>specific commands. It's a thrillingly powerful and
>>correct solution to the main problem Jeff and others
>>have described.
>
> A better (and of course *vastly* more powerful but unfortunately only
> a dream ;-) is a similarly limited python virtual machine.....

Yeah, I think there are a lot of people out there who would like
something like this, but it's not quite clear how to go about it. If
you search Google Groups, there are a lot of examples of how you can use
Python's object introspection to retrieve "unsafe" functions.

I wish there was a way to, say, exec something with no builtins and with
import disabled, so you would have to specify all the available
bindings, e.g.:

exec user_code in dict(ClassA=ClassA, ClassB=ClassB)

but I suspect that even this wouldn't really solve the problem, because
you can do things like:

py> class ClassA(object):
...     pass
...
py> object, = ClassA.__bases__
py> object
<type 'object'>
py> int = object.__subclasses__()[2]
py> int
<type 'int'>

so you can retrieve a lot of the builtins. I don't know how to retrieve
__import__ this way, but as soon as you figure that out, you can then
do pretty much anything you want to.
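
A minimal sketch of that escape route, written for Python 3 rather than the Python 2 of the transcript above (so exec is a function, not a statement). The names user_code and namespace are illustrative only, and the snippet assumes nothing beyond the standard interpreter:

```python
# Python 3 sketch: exec user code with no builtins and one whitelisted class.
class ClassA(object):
    pass

user_code = """
base, = ClassA.__bases__                      # reaches `object` without naming it
names = [cls.__name__ for cls in base.__subclasses__()]
found_int = 'int' in names                    # a builtin type, recovered anyway
"""

# An empty __builtins__ dict means no open, no eval, no __import__ by name.
namespace = {"__builtins__": {}, "ClassA": ClassA}
exec(user_code, namespace)
print(namespace["found_int"])
```

Even with the builtins emptied, the executed code walks from ClassA to object and from there to every loaded subclass, which is the leak described above.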

Steve

Michael Spencer

Jan 25, 2005, 2:46:42 PM
to pytho...@python.org
Steven Bethard wrote:

>
> I wish there was a way to, say, exec something with no builtins and
> with import disabled, so you would have to specify all the available
> bindings, e.g.:
>
> exec user_code in dict(ClassA=ClassA, ClassB=ClassB)
>
> but I suspect that even this wouldn't really solve the problem,
> because you can do things like:
>
> py> class ClassA(object):
> ...     pass
> ...
> py> object, = ClassA.__bases__
> py> object
> <type 'object'>
> py> int = object.__subclasses__()[2]
> py> int
> <type 'int'>
>
> so you can retrieve a lot of the builtins. I don't know how to
> retrieve __import__ this way, but as soon as you figure that out, you
> can then do pretty much anything you want to.
>
> Steve

Steve

Safe eval recipe posted to cookbook:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/364469

Couldn't safe exec be programmed similarly?

'import' and 'from' are syntax, so trivially avoided

Likewise, function calls are easily intercepted
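
As a rough illustration of those two points, here is a sketch using the ast module of Python 3 (the thread itself refers to the older compiler package). The function name check_source is made up for this example:

```python
import ast

def check_source(source):
    """Return (has_import, called_names) for a piece of Python source."""
    tree = ast.parse(source)
    has_import = False
    called = []
    for node in ast.walk(tree):
        # 'import x' and 'from x import y' are distinct statement nodes.
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            has_import = True
        # Every call site is a Call node; record calls made through a bare name.
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            called.append(node.func.id)
    return has_import, called

print(check_source("import os\nprint(len('abc'))"))
print(check_source("__import__('os')"))
```

The second call shows why catching the syntax is not enough on its own: __import__ is an ordinary function call, so it has to be handled on the call side.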

As you say, attribute access to core functions appears to present the
challenge. It is easy to intercept attribute access, harder to know
what's safe. If there were a known set of 'dangerous' objects e.g.,
sys, file, os etc... then these could be checked by identity against any
attribute returned

Of course, execution would be painfully slow, due to
double-interpretation.

Michael

Michael Spencer

Jan 25, 2005, 3:05:53 PM
to pytho...@python.org
Steven Bethard wrote:

>
> I wish there was a way to, say, exec something with no builtins and
> with import disabled, so you would have to specify all the available
> bindings, e.g.:
>
> exec user_code in dict(ClassA=ClassA, ClassB=ClassB)
>
> but I suspect that even this wouldn't really solve the problem,
> because you can do things like:
>
> py> class ClassA(object):
> ... pass
> ...
> py> object, = ClassA.__bases__
> py> object
> <type 'object'>
> py> int = object.__subclasses__()[2]
> py> int
> <type 'int'>
>
> so you can retrieve a lot of the builtins. I don't know how to
> retrieve __import__ this way, but as soon as you figure that out, you
> can then do pretty much anything you want to.
>
> Steve

Steve

Steven Bethard

Jan 25, 2005, 3:24:03 PM
to
Michael Spencer wrote:
> Safe eval recipe posted to cookbook:
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/364469

This recipe only evaluates constant expressions:

"Description:
Evaluate constant expressions, including list, dict and tuple using the
abstract syntax tree created by compiler.parse"

It means you can't eval arbitrary Python code -- it's basically just a
data parser. Handy in some situations, but not the equivalent of a
limited Python virtual machine.
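
For comparison, later Python versions ship a data-only evaluator of the same kind in the standard library, ast.literal_eval. This sketch is not the recipe's code; it only shows the "data parser" behaviour with the stdlib function:

```python
import ast

# Literals (dicts, lists, tuples, numbers, strings) are accepted...
data = ast.literal_eval("{'name': 'spam', 'sizes': [1, 2, (3, 4)]}")
print(data["sizes"])

# ...but anything that would have to be executed is refused.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```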

> Likewise, function calls are easily intercepted

I'm not sure I follow this... How do you intend to intercept all
function calls?

> As you say, attribute access to core functions appears to present the
> challenge. It is easy to intercept attribute access, harder to know
> what's safe. If there were a known set of 'dangerous' objects e.g.,
> sys, file, os etc... then these could be checked by identity against any
> attribute returned

It sounds like you're suggesting overriding the global attribute access
mechanism. Is that right? So that every time Python encountered an
attribute access, you would verify that the attribute being accessed is
not on the 'dangerous' list? I don't know how to do that without
basically rewriting some of Python's C code, though certainly I'm no
expert in the area...

Also, I'm not sure identity is sufficient:

py> import sys
py> import new
py> new.module('newsys')
py> newsys = new.module('newsys')
py> newsys.__dict__.update(sys.__dict__)
py> newsys is sys
False
py> newsys == sys
False
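
The same experiment in Python 3, where the new module no longer exists and types.ModuleType takes its place. This is only a sketch of the point being made, namely that the copy fails an identity check while still carrying the real functions:

```python
import sys
import types

newsys = types.ModuleType("newsys")
newsys.__dict__.update(sys.__dict__)    # copy every attribute of sys

print(newsys is sys)                    # the module object itself is different
print(newsys.exit is sys.exit)          # the functions inside are the originals
```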

Steve

Steven Bethard

Jan 25, 2005, 4:26:13 PM
to
Michael Spencer wrote:

> Steven Bethard wrote:
>
>> Michael Spencer wrote:
>>
>>> Safe eval recipe posted to cookbook:
>>> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/364469
>>
>> This recipe only evaluates constant expressions
>
[snip]
>
> Indeed. But it's easy to extend this to arbitrary constructs. You just
> need to decide what code to emit for the other 50 or so ast node types.
> Many of those are boiler-plate binops.

Ahh, gotcha. Thanks for the clarification.

I haven't ever spent much time dealing with Python's ASTs, but my guess
is doing anything here is probably worth putting off until the AST
branch is merged into main CVS for Python 2.5. (I understand there are
supposed to be some substantial changes, but I don't know exactly what
they are or what they affect.)

> Right - the crux of the problem is how to identify dangerous objects.
> My point is that if such a test is possible, then safe exec is very
> easily implemented within current Python. If it is not, then it is
> essentially impossible.
>
[snip]
>
> It might still be possible to have a reliable test within a
> problem-specific domain i.e., white-listing.

Yeah, that was basically my intent -- provide a white-list of the usable
objects. I wonder how complicated this would be... You also probably
have to white-list the types of all the attributes of the objects you
provide...

Steve

Alexander Schremmer

Jan 25, 2005, 4:08:01 PM
to
On Tue, 25 Jan 2005 12:22:13 -0700, Steven Bethard wrote:

> >>This is a serious issue.
> >>
> >>It's also one that brings Tcl, mentioned several
> >>times in this thread, back into focus. Tcl presents
> >>the notion of "safe interpreter", that is, a sub-
> >>ordinate virtual machine which can interpret only
> >>specific commands. It's a thrillingly powerful and
> >>correct solution to the main problem Jeff and others
> >>have described.
> >
> > A better (and of course *vastly* more powerful but unfortunately only
> > a dream ;-) is a similarly limited python virtual machine.....
>
> Yeah, I think there are a lot of people out there who would like
> something like this, but it's not quite clear how to go about it. If
> you search Google Groups, there are a lot of examples of how you can use
> Python's object introspection to retrieve "unsafe" functions.

IMHO a safe Python would consist of a special mode that disallows all
system calls that could spy on or harm data (IO etc.) and imports of
non-whitelisted modules. Additionally, a loop counter in the interpreter
loop would ensure that the code does not stall the process/machine.

>>> sys.safecall(func, maxcycles=1000)
could enter the safe mode and call the func.
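
There is no sys.safecall in Python; the name is a proposal. A pure-Python approximation of just the cycle-counting part can be sketched with sys.settrace, counting executed lines instead of interpreter cycles. The function safecall and the exception CycleLimitExceeded below are hypothetical names for this sketch, and it does nothing about IO or imports:

```python
import sys

class CycleLimitExceeded(Exception):
    """Raised when the traced function executes too many lines."""

def safecall(func, maxcycles=1000):
    """Call func(), aborting once more than maxcycles lines have run."""
    count = 0

    def tracer(frame, event, arg):
        nonlocal count
        if event == "line":
            count += 1
            if count > maxcycles:
                raise CycleLimitExceeded("more than %d lines executed" % maxcycles)
        return tracer                    # keep tracing nested frames too

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        return func()
    finally:
        sys.settrace(previous)           # always restore the old tracer

def spin():
    n = 0
    while True:                          # would never finish on its own
        n += 1

try:
    safecall(spin, maxcycles=100)
except CycleLimitExceeded as exc:
    print("stopped:", exc)
```

Time spent inside a single C-level call is invisible to the tracer, so this only bounds Python-level loops.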

I am not sure how big the patch would be. It is mainly a C macro at the
beginning of every relevant function that checks the current "mode" and
raises an exception if it is not correct. The import handler would need to
check whether the module is whitelisted (based on the path etc.).

Python is too dynamic to get this working while just using tricks that
manipulate some builtins/globals etc.

Kind regards,
Alexander

Michael Spencer

Jan 25, 2005, 3:51:25 PM
to pytho...@python.org
Steven Bethard wrote:
> Michael Spencer wrote:
>
>> Safe eval recipe posted to cookbook:
>> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/364469
>
>
> This recipe only evaluates constant expressions:
>
> "Description:
> Evaluate constant expressions, including list, dict and tuple using the
> abstract syntax tree created by compiler.parse"
>
> It means you can't eval arbitrary Python code -- it's basically just a
> data parser. Handy in some situations, but not the equivalent of a
> limited Python virtual machine.

Indeed. But it's easy to extend this to arbitrary constructs. You just need to
decide what code to emit for the other 50 or so ast node types. Many of those
are boiler-plate binops.
>

>> Likewise, function calls are easily intercepted
>
> I'm not sure I follow this... How do you intend to intercept all
> function calls?

Sorry, should have been more precise. In the AST, function calls have their own
node type, so it is easy to 'intercept' them and execute them conditionally.
>
[snip]


>
> It sounds like you're suggesting overriding the global attribute access
> mechanism. Is that right? So that every time Python encountered an
> attribute access, you would verify that the attribute being accessed is
> not on the 'dangerous' list?

Just in the context of the AST-walker, yes


> I don't know how to do that without
> basically rewriting some of Python's C code, though certainly I'm no
> expert in the area...

Not messing with the CPython interpreter


>
> Also, I'm not sure identity is sufficient:
>
> py> import sys
> py> import new
> py> new.module('newsys')
> py> newsys = new.module('newsys')
> py> newsys.__dict__.update(sys.__dict__)
> py> newsys is sys
> False
> py> newsys == sys
> False

Right - the crux of the problem is how to identify dangerous objects. My point
is that if such a test is possible, then safe exec is very easily implemented
within current Python. If it is not, then it is essentially impossible.

Let's assume that it is indeed not possible to know in general whether an object
is safe, either by inspecting its attributes, or by matching its identity
against a black list.

It might still be possible to have a reliable test within a problem-specific
domain, i.e., white-listing. This, I think, is what you meant when you said:

> I wish there was a way to, say, exec something with no builtins and with import disabled, so you would have to specify all the available bindings, e.g.:
>
> exec user_code in dict(ClassA=ClassA, ClassB=ClassB)

I believe that if you can come up with a white-list, then the rest of the
problem is easy.
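
A sketch of what such a white-listing AST walker might look like in Python 3. Everything here (restricted_eval, ALLOWED_NAMES, the choice of permitted node types) is an assumption made for illustration, not the cookbook recipe:

```python
import ast
import operator

ALLOWED_NAMES = {"abs": abs, "max": max}          # hypothetical white-list
BINOPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

def restricted_eval(source, names=ALLOWED_NAMES):
    """Evaluate an expression using only white-listed names and node types."""
    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            if node.id not in names:
                raise NameError("%r is not white-listed" % node.id)
            return names[node.id]
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -ev(node.operand)
        if isinstance(node, ast.BinOp) and type(node.op) in BINOPS:
            return BINOPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.Call) and not node.keywords:
            return ev(node.func)(*[ev(arg) for arg in node.args])
        # Attribute access, subscripts, lambdas and so on all end up here.
        raise ValueError("%s is not allowed" % type(node).__name__)
    return ev(ast.parse(source, mode="eval").body)

print(restricted_eval("max(abs(-3), 2) + 1"))
```

Because attribute access is simply not among the handled node types, an expression such as ().__class__ is rejected before anything is evaluated. The safety of the result still rests entirely on what is placed in the white-list.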

Michael

Cameron Laird

Jan 25, 2005, 6:08:13 PM
to
In article <mailman.1301.1106686...@python.org>,
Michael Spencer <ma...@telcopartners.com> wrote:
.
.
.

>Right - the crux of the problem is how to identify dangerous objects. My point
>is that if such a test is possible, then safe exec is very easily implemented
>within current Python. If it is not, then it is essentially impossible.
>
>Let's assume that it is indeed not possible to know in general whether an object
>is safe, either by inspecting its attributes, or by matching its identity
>against a black list.
>
>It might still be possible to have a reliable test within a problem-specific
>domain i.e., white-listing. This, I think, is what you meant when you said:
>
>> I wish there was a way to, say, exec something with no builtins and
>> with import disabled, so you would have to specify all the available
>> bindings, e.g.:
>>
>> exec user_code in dict(ClassA=ClassA, ClassB=ClassB)
>
>I believe that if you can come up with a white-list, then the rest of the
>problem is easy.
>
>Michael
>

I'll suggest yet another perspective: add another indirection.
As the virtual machine becomes more available to introspection,
it might become natural to define a *very* restricted interpreter
which we can all agree is safe, PLUS a means to extend that
specific instance of the VM with, say, new definitions of bindings
for particular AST nodes. Then the developer has the means to
"build out" his own VM in a way he can judge useful and safe for
his own situation. Rather than the Java there-is-one-"safe"-for-
all approach, Pythoneers would have the tools to create safety.

Michael Spencer

Jan 25, 2005, 7:03:11 PM
to pytho...@python.org
Cameron Laird wrote:
> In article <mailman.1301.1106686...@python.org>,
> Michael Spencer <ma...@telcopartners.com> wrote:
> .
> .
> .
>
>>Right - the crux of the problem is how to identify dangerous objects. My point
>>is that if such a test is possible, then safe exec is very easily implemented
>>within current Python. If it is not, then it is essentially impossible.
>>
>
>
>
> I'll suggest yet another perspective: add another indirection.
> As the virtual machine becomes more available to introspection,
> it might become natural to define a *very* restricted interpreter
> which we can all agree is safe, PLUS a means to extend that
> specific instance of the VM with, say, new definitions of bindings
> for particular AST nodes. Then the developer has the means to
> "build out" his own VM in a way he can judge useful and safe for
> his own situation. Rather than the Java there-is-one-"safe"-for-
> all approach, Pythoneers would have the tools to create safety.

That does sound good. And evolutionary, because the very restricted VM could be
implemented today (in Python), and subsequently PyPy (or whatever) could
optimize it.

The safe eval recipe I referred to earlier in the thread is IMO a trivial
example of this approach. Of course, its restrictions are extreme - only
constant expressions, but it is straightforwardly extensible to any subset of
the language.

The limitation that I see with this approach is that it is not, in general,
syntax that is safe or unsafe (with the notable exception of 'import' and its
relatives). Rather, it is the library objects, especially the built-ins, that
present the main source of risk.

So, if I understand your suggestion, it would require assessing the safety of
the built-in objects, as well as providing an interpreter that could control
access to them, possibly with fine-grain control at the attribute level.

M


Alexander Schremmer

Jan 26, 2005, 11:18:59 AM
to
On Tue, 25 Jan 2005 22:08:01 +0100, I wrote:

>>>> sys.safecall(func, maxcycles=1000)
> could enter the safe mode and call the func.

This might even be enhanced like this:

>>> import sys
>>> sys.safecall(func, maxcycles=1000,
...              allowed_domains=['file-IO', 'net-IO', 'devices', 'gui'],
...              allowed_modules=['_sre'])

Every access to objects that are not in the specified domains is
restricted by the interpreter. Additionally, external modules (which are
expected not to be "decorated" by those security checks) have to be in the
modules whitelist to work flawlessly (i.e. not generate exceptions).

Any comments about this from someone who already hacked CPython?

Kind regards,
Alexander

Jack Diederich

Jan 26, 2005, 12:03:01 PM
to pytho...@python.org
On Wed, Jan 26, 2005 at 05:18:59PM +0100, Alexander Schremmer wrote:
> On Tue, 25 Jan 2005 22:08:01 +0100, I wrote:
>
> >>>> sys.safecall(func, maxcycles=1000)
> > could enter the safe mode and call the func.
>
> This might be even enhanced like this:
>
> >>> import sys
> >>> sys.safecall(func, maxcycles=1000,
> allowed_domains=['file-IO', 'net-IO', 'devices', 'gui'],
> allowed_modules=['_sre'])
>
> Any comments about this from someone who already hacked CPython?

Yes, this comes up every couple months and there is only one answer:
This is the job of the OS.
Java largely succeeds at doing sandboxy things because it was written that
way from the ground up (to behave both like a program interpreter and an OS).
Python the language was not, and the CPython interpreter definitely was not.

Search groups.google.com for previous discussions of this on c.l.py

-Jack

Steven Bethard

Jan 26, 2005, 12:23:03 PM
to
Jack Diederich wrote:
> Yes, this comes up every couple months and there is only one answer:
> This is the job of the OS.
> Java largely succeeds at doing sandboxy things because it was written that
> way from the ground up (to behave both like a program interpreter and an OS).
> Python the language was not, and the CPython interpreter definitely was not.
>
> Search groups.google.com for previous discussions of this on c.l.py

Could you give some useful queries? Every time I do this search, I get
a few results, but never anything that really goes into the security
holes in any depth. (They're usually something like "look, given
object, I can get int", not "look, given object, I can get eval,
__import__, etc.")

Steve

aurora

Jan 26, 2005, 1:39:18 PM
to
Is it really necessary to build a VM from the ground up that includes OS
ability? What about JavaScript?

Jack Diederich

Jan 26, 2005, 2:17:20 PM
to pytho...@python.org

A search on "rexec bastion" will give you most of the threads,
search on "rexec bastion diederich" to see the other times I tried to
stop the threads by recommending reading the older ones *wink*.

Thread subjects:
Replacement for rexec/Bastion?
Creating a capabilities-based restricted execution system
Embedding Python in Python
killing thread ?

-Jack

Jack Diederich

Jan 26, 2005, 2:21:32 PM
to pytho...@python.org
On Wed, Jan 26, 2005 at 10:39:18AM -0800, aurora wrote:
> >On Wed, Jan 26, 2005 at 05:18:59PM +0100, Alexander Schremmer wrote:
> >>On Tue, 25 Jan 2005 22:08:01 +0100, I wrote:
> >>
> >>>>>> sys.safecall(func, maxcycles=1000)
> >>> could enter the safe mode and call the func.
> >>
> >>This might be even enhanced like this:
> >>
> >>>>> import sys
> >>>>> sys.safecall(func, maxcycles=1000,
> >> allowed_domains=['file-IO', 'net-IO', 'devices', 'gui'],
> >> allowed_modules=['_sre'])
> >>
> >>Any comments about this from someone who already hacked CPython?
> >
> >Yes, this comes up every couple months and there is only one answer:
> >This is the job of the OS.
> >Java largely succeeds at doing sandboxy things because it was written that
> >way from the ground up (to behave both like a program interpreter and an OS).
> >Python the language was not, and the CPython interpreter definitely was not.
> >
> >Search groups.google.com for previous discussions of this on c.l.py
> >
> Is it really necessary to build a VM from the ground up that includes OS
> ability? What about JavaScript?
>

See the past threads I recommended in another just-posted reply.

Common browser implementations of Javascript have almost no features, can't
import C-based libraries, and can easily enter endless loops or eat all
available memory. You could make a fork of python that matches that feature
set, but I don't know why you would want to.

-Jack

Steven Bethard

Jan 26, 2005, 3:23:17 PM
to

Thanks for the keywords -- I hadn't tried anything like any of these.
Unfortunately, they leave me with the same feeling as before... The
closest example that I saw that actually showed a security hole made use
of __builtins__. As you'll note from the beginning of this thread, I
was considering the case where no builtins are provided and imports are
disabled.

I also read a number of messages that had the same problems I do -- too
many threads just say "look at google groups", without saying what to
search for. They also often spend most of their time talking about
abstract problems, without showing code that illustrates how to break
the "security". For example, I never found anything close to describing
how to retrieve, say, 'eval' or '__import__' given only 'object'.

What would be really nice is a wiki that had examples of how to derive
"unsafe" functions from 'object'. I'd be glad to put one together, but
so far, I can't find many examples... If you want to consider reading
and writing of files as "unsafe", then I guess this might be one:
file = object.__subclasses__()[16]
If I could see how to go from 'object' (or 'int', 'str', 'file', etc.)
to 'eval' or '__import__', that would help out a lot...

Steve

Dieter Maurer

Jan 27, 2005, 2:02:59 PM
to
Steven Bethard <steven....@gmail.com> writes on Tue, 25 Jan 2005 12:22:13 -0700:
> Fuzzyman wrote:
> ...

> > A better (and of course *vastly* more powerful but unfortunately only
> > a dream ;-) is a similarly limited python virtual machine.....

I already wrote about the "RestrictedPython" which is part of Zope,
didn't I?

Please search the archive to find a description...


Dieter

Fuzzyman

Jan 28, 2005, 5:23:27 AM
to

Dieter Maurer wrote:
> Steven Bethard <steven....@gmail.com> writes on Tue, 25 Jan 2005 12:22:13 -0700:
> > Fuzzyman wrote:
> > ...
> > > A better (and of course *vastly* more powerful but unfortunately only
> > > a dream ;-) is a similarly limited python virtual machine.....
>
> I already wrote about the "RestrictedPython" which is part of Zope,
> didn't I?
>

Not in this thread.

> Please search the archive to find a description...
>

Interesting. I'd be interested in whether it requires a full Zope
install and how easy (or otherwise) it is to set up. I'll investigate.

Regards,
Fuzzyman
http://www.voidspace.org.uk/python/index.shtml

>
> Dieter

Alex Martelli

Jan 28, 2005, 6:40:17 AM
to
Steven Bethard <steven....@gmail.com> wrote:
...

> If I could see how to go from 'object' (or 'int', 'str', 'file', etc.)
> to 'eval' or '__import__', that would help out a lot...

>>> object.__subclasses__()
[<type 'type'>, <type 'weakref'>, <type 'int'>, <type 'basestring'>,
<type 'list'>, <type 'NoneType'>, <type 'NotImplementedType'>, <type
'module'>, <type 'zipimport.zipimporter'>, <type 'posix.stat_result'>,
<type 'posix.statvfs_result'>, <type 'dict'>, <type 'function'>, <class
'site._Printer'>, <class 'site._Helper'>, <type 'set'>, <type 'file'>]

Traipse through these, find one class that has an unbound method, get
that unbound method's func_globals, bingo.


Alex

Nick Coghlan

Jan 28, 2005, 8:31:28 PM
to Python List

So long as any Python modules are imported using the same restricted environment
their func_globals won't contain eval() or __import__ either.

And C methods don't have func_globals at all.

However, we're talking about building a custom interpreter here, so there's no
reason not to simply find the dangerous functions at the C-level and replace
their bodies with "PyErr_SetString(PyExc_Exception, "Access to this operation
not allowed in restricted build"); return NULL;".

Then it doesn't matter *how* you get hold of file(), it still won't work. (I can
hear the capabilities folks screaming already. . .)

Combine that with a pre-populated read-only sys.modules and a restricted custom
interpreter would be quite doable. Execute it in a separate process and things
should be fairly solid.

Cheers,
Nick.

--
Nick Coghlan | ncog...@email.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.skystorm.net

Steven Bethard

Jan 28, 2005, 8:37:49 PM
to

Thanks for the help! I'd played around with object.__subclasses__ for
a while, but I hadn't realized that func_globals was what I should be
looking for.

Here's one route to __builtins__:

py> string_Template = object.__subclasses__()[17]
py> builtins = string_Template.substitute.func_globals['__builtins__']
py> builtins['eval']
<built-in function eval>
py> builtins['__import__']
<built-in function __import__>
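
The fixed index 17 depends on which classes happen to be loaded. A sketch of the same route in Python 3 searches instead; there, func_globals is spelled __globals__ and unbound methods are plain functions. The Helper class and the function name builtins_via_object are invented for this example, with Helper standing in for any Python-coded class the host has imported:

```python
import types

class Helper(object):
    """Stand-in for any Python-coded class the host happens to have loaded."""
    def method(self):
        return None

def builtins_via_object():
    """Walk object's subclasses until a Python function exposes its globals."""
    for cls in object.__subclasses__():
        for attr in list(vars(cls).values()):
            if isinstance(attr, types.FunctionType):
                found = attr.__globals__.get("__builtins__")
                if found is not None:
                    # __builtins__ may be the module or its dict.
                    return found if isinstance(found, dict) else vars(found)
    return None

recovered = builtins_via_object()
print("eval" in recovered, "__import__" in recovered)
```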

Steve

Alex Martelli

Jan 29, 2005, 3:44:27 AM
to
Nick Coghlan <ncog...@iinet.net.au> wrote:

> Alex Martelli wrote:
> > Steven Bethard <steven....@gmail.com> wrote:
> > ...
> >
> >>If I could see how to go from 'object' (or 'int', 'str', 'file', etc.)
> >>to 'eval' or '__import__', that would help out a lot...
> >
> >>>>object.__subclasses__()

...


> > Traipse through these, find one class that has an unbound method, get
> > that unbound method's func_globals, bingo.
>
> So long as any Python modules are imported using the same restricted
> environment their func_globals won't contain eval() or __import__ either.

Sure, as long as you don't need any standard library module using eval
from Python (or can suitably restrict them or the eval they use), etc,
you can patch up this specific vulnerability.

> And C methods don't have func_globals at all.

Right, I used "unbound method" in the specific sense of "instance of
types.UnboundMethodType" (bound ones or any Python-coded function you
can get your paws on work just as well).

> However, we're talking about building a custom interpreter here, so there's no

It didn't seem to me that Steven's question was so restricted; and since
he thanked me for my answer (which of course is probably inapplicable to
some custom interpreter that's not written yet) it appears to me that my
interpretation of his question was correct, and my answer useful to him.

> reason not to simply find the dangerous functions at the C-level and replace
> their bodies with "PyErr_SetString(PyExc_Exception, "Access to this operation
> not allowed in restricted build"); return NULL;".
>
> Then it doesn't matter *how* you get hold of file(), it still won't work.
> (I can hear the capabilities folks screaming already. . .)

Completely removing Python-level access to anything dangerous might be a
safer approach than trying to patch one access route after another, yes.


> Combine that with a pre-populated read-only sys.modules and a restricted
> custom interpreter would be quite doable. Execute it in a separate process
> and things should be fairly solid.

If you _can_ execute (whatever) in a separate process, then an approach
based on BSD's "jail" or equivalent features of other OS's may be able
to give you all you need, without needing other restrictions to be coded
in the interpreter (or whatever else you run in that process).
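
The separate-process half of that idea can be sketched with the Python 3 standard library. The helper name run_isolated is made up here, and a child process on its own is not a sandbox: the confinement itself (a jail or an equivalent OS feature) still has to be applied to the child by the operating system.

```python
import subprocess
import sys

def run_isolated(source, timeout=5.0):
    """Run source in a fresh interpreter process; return (returncode, stdout)."""
    proc = subprocess.run(
        [sys.executable, "-I", "-c", source],  # -I: isolated mode, ignores env vars and user site
        capture_output=True,
        text=True,
        timeout=timeout,                       # the child is killed if it runs longer than this
    )
    return proc.returncode, proc.stdout

print(run_isolated("print(6 * 7)"))
```

A runaway loop in the child surfaces in the parent as subprocess.TimeoutExpired rather than hanging the host.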


Alex

Aahz

Jan 29, 2005, 6:31:45 AM
to
In article <1gr3mwj.1mhbjao122j7fxN%ale...@yahoo.com>,

Alex Martelli <ale...@yahoo.com> wrote:
>Steven Bethard <steven....@gmail.com> wrote:
>>

One thing my company has done is written a ``safe_eval()`` that uses a
regex to disable double-underscore access.
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

"19. A language that doesn't affect the way you think about programming,
is not worth knowing." --Alan Perlis

Alex Martelli

Jan 29, 2005, 9:34:27 AM
to
Aahz <aa...@pythoncraft.com> wrote:
...
> >>>> object.__subclasses__()
...

> One thing my company has done is written a ``safe_eval()`` that uses a
> regex to disable double-underscore access.

will the regex catch getattr(object, 'subclasses'.join(['_'*2]*2)...?-)
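
To make the question concrete, here is a sketch in Python 3. The real safe_eval is not shown in the thread, so naive_safe_eval below is a hypothetical stand-in that rejects any source text containing a double underscore:

```python
import re

DUNDER = re.compile(r"__")

def naive_safe_eval(expr, env):
    """Hypothetical filter: refuse any source text containing '__'."""
    if DUNDER.search(expr):
        raise ValueError("double-underscore access is not allowed")
    return eval(expr, {"__builtins__": {}}, env)

env = {"getattr": getattr, "object": object}

# The attribute name is assembled at run time, so the filter never sees it.
bypass = "getattr(object, 'subclasses'.join(['_' * 2] * 2))()"
subclasses = naive_safe_eval(bypass, env)
print(int in subclasses)
```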


Alex

Skip Montanaro

Jan 29, 2005, 9:53:45 AM
to Alex Martelli, pytho...@python.org

>> One thing my company has done is written a ``safe_eval()`` that uses
>> a regex to disable double-underscore access.

Alex> will the regex catch getattr(object,
Alex> 'subclasses'.join(['_'*2]*2)...?-)

Now he has two problems. ;-)

Skip

Stephen Thorne

Jan 29, 2005, 10:11:32 AM
to sk...@pobox.com, pytho...@python.org, Alex Martelli
On Sat, 29 Jan 2005 08:53:45 -0600, Skip Montanaro <sk...@pobox.com> wrote:
>
> >> One thing my company has done is written a ``safe_eval()`` that uses
> >> a regex to disable double-underscore access.
>
> Alex> will the regex catch getattr(object,
> Alex> 'subclasses'.join(['_'*2]*2)...?-)
>
> Now he has two problems. ;-)

I nearly asked that question, then I realised that 'getattr' is quite
easy to remove from the global namespace for the code in question, and
assumed that they had already thought of that.

Stephen.

Alex Martelli

Jan 29, 2005, 11:04:29 AM
to
Stephen Thorne <stephen...@gmail.com> wrote:

OK then -- vars(type(object)) is a dict which has [[the unbound-method
equivalent of]] object.__subclasses__ at its entry for key
'__subclasses__'. Scratch 'vars' in addition to 'getattr'. And 'eval'
of course, or else building up the string 'object.__subclasses__' (in a
way the regex won't catch) then eval'ing it is easy. I dunno, maybe I'm
just being pessimistic, I guess...
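
The vars route can be sketched in Python 3 as follows, assuming an environment that offers vars, type and object but no getattr. The expression contains no double underscore in its source text, so a textual filter would let it through:

```python
env = {"vars": vars, "type": type, "object": object}   # note: no getattr

# type's own dict holds __subclasses__ as a descriptor; call it on object.
expr = "vars(type(object))['_' * 2 + 'subclasses' + '_' * 2](object)"
result = eval(expr, {"__builtins__": {}}, env)
print(int in result)
```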


Alex

Aahz

Jan 29, 2005, 11:55:59 AM
to
In article <1gr5osy.7eipfq7xyz72N%ale...@yahoo.com>,
Alex Martelli <ale...@yahoo.com> wrote:
>Aahz <aa...@pythoncraft.com> wrote:
>> Alex Martelli deleted his own attribution:
>>>
>>> >>> object.__subclasses__()

>>
>> One thing my company has done is written a ``safe_eval()`` that uses a
>> regex to disable double-underscore access.
>
>will the regex catch getattr(object, 'subclasses'.join(['_'*2]*2)...?-)

Heheh. No. Then again, security is only as strong as its weakest link,
and that quick hack makes this part of our application as secure as the
rest.

Bernhard Herzog

Jan 29, 2005, 2:48:12 PM
to
ale...@yahoo.com (Alex Martelli) writes:

> OK then -- vars(type(object)) is a dict which has [[the unbound-method
> equivalent of]] object.__subclasses__ at its entry for key
> '__subclasses__'. Scratch 'vars' in addition to 'getattr'. And 'eval'
> of course, or else building up the string 'object.__subclasses__' (in a
> way the regex won't catch) then eval'ing it is easy. I dunno, maybe I'm
> just being pessimistic, I guess...

You can defeat the regexp without any builtin besides object:

>>> eval("# coding: utf7\n"
"+AG8AYgBqAGUAYwB0AC4AXwBfAHMAdQBiAGMAbABhAHMAcwBlAHMAXwBf-")
<built-in method __subclasses__ of type object at 0x81010e0>
>>>

Bernhard


--
Intevation GmbH http://intevation.de/
Skencil http://skencil.org/
Thuban http://thuban.intevation.org/

Skip Montanaro

Jan 29, 2005, 3:35:36 PM
to Alex Martelli, pytho...@python.org

Alex> I dunno, maybe I'm just being pessimistic, I guess...

No, I think you are being realistic. I thought one of the basic tenets of
computer security was "that which is not expressly allowed is forbidden".
Any attempt at security that attempts to find and plug the security holes
while leaving the basic insecure system intact is almost certainly going to
miss something.

Skip

Alex Martelli

Jan 29, 2005, 5:23:10 PM
to
Skip Montanaro <sk...@pobox.com> wrote:

I guess security is drastically different from all other programming
spheres because you DO have an adversary, who you should presume to be
at least as clever as you are. In most tasks, good enough is good
enough and paranoia doesn't pay; when an adversary IS there, only the
paranoid survive...;-)


Alex

Christophe Cavalaria

Jan 29, 2005, 6:05:41 PM
to
Steven Bethard wrote:

> Fuzzyman wrote:
> > Cameron Laird wrote:
> > [snip..]
> >
> >>This is a serious issue.
> >>
> >>It's also one that brings Tcl, mentioned several
> >>times in this thread, back into focus. Tcl presents
> >>the notion of "safe interpreter", that is, a sub-
> >>ordinate virtual machine which can interpret only
> >>specific commands. It's a thrillingly powerful and
> >>correct solution to the main problem Jeff and others
> >>have described.
> >
> > A better (and of course *vastly* more powerful but unfortunately only
> > a dream ;-) is a similarly limited python virtual machine.....
>
> Yeah, I think there are a lot of people out there who would like
> something like this, but it's not quite clear how to go about it. If
> you search Google Groups, there are a lot of examples of how you can use
> Python's object introspection to retrieve "unsafe" functions.
>
> I wish there was a way to, say, exec something with no builtins and with
> import disabled, so you would have to specify all the available
> bindings, e.g.:
>
> exec user_code in dict(ClassA=ClassA, ClassB=ClassB)
>
> but I suspect that even this wouldn't really solve the problem, because
> you can do things like:
>
> py> class ClassA(object):
> ... pass
> ...
> py> object, = ClassA.__bases__
> py> object
> <type 'object'>
> py> int = object.__subclasses__()[2]
> py> int
> <type 'int'>
>
> so you can retrieve a lot of the builtins. I don't know how to retrieve
> __import__ this way, but as soon as you figure that out, you can then
> do pretty much anything you want to.
>
> Steve

Wouldn't it be better to attach some kind of access-rights marker to every
code object, and to create an opcode that calls a function with reduced
access rights? After all, security would be easier to achieve if you
prevented the execution of all the dangerous code rather than trying to
hide all the entry points to it.
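
To show how hard it is to hide the entry points, here is a sketch of the quoted introspection trick in current Python 3 syntax. The variable names are made up for illustration; the technique is the one from the quoted session, started from a tuple literal so that no name needs to be bound at all.

```python
# Code an untrusted user might submit: no names, no builtins, just a literal.
untrusted = "found = ().__class__.__bases__[0].__subclasses__()"

# Run it with an empty builtins table and no other bindings.
namespace = {"__builtins__": {}}
exec(untrusted, namespace)

# Back in trusted code: see which classes the user code got hold of.
names = {cls.__name__ for cls in namespace["found"]}
```

Even with every builtin removed, the user code ends up holding the direct subclasses of object, int among them.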

Nick Coghlan

Jan 29, 2005, 8:59:39 PM
to Python List
Alex Martelli wrote:
> It didn't seem to me that Steven's question was so restricted; and since
> he thanked me for my answer (which of course is probably inapplicable to
> some custom interpreter that's not written yet) it appears to me that my
> interpretation of his question was correct, and my answer useful to him.

Yes, I'd stopped following the thread for a bit, and the discussion had moved
further afield than I realised :)

> If you _can_ execute (whatever) in a separate process, then an approach
> based on BSD's "jail" or equivalent features of other OS's may be able
> to give you all you need, without needing other restrictions to be coded
> in the interpreter (or whatever else you run in that process).

I think that's where these discussions have historically ended. . . making a
Python-specific sandbox gets complicated enough that it ends up making more
sense to just use an OS-based sandbox that lets you execute arbitrary binaries
relatively safely.

The last suggestion I recall along these lines was chroot() plus a monitoring
daemon that killed the relevant subprocess if it started consuming too much
memory or looked like it had got stuck in an infinite loop.
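
A minimal sketch of the "separate process plus limits" idea, assuming a Unix-like system where the standard resource module is available. It does not do chroot() or a jail; it only caps CPU time and address space and adds a wall-clock timeout in the parent. The function name run_limited and its default values are assumptions for illustration.

```python
import resource
import subprocess
import sys

def _lower_soft_limit(which, value):
    """Lower the soft limit, never asking for more than the hard limit."""
    soft, hard = resource.getrlimit(which)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(which, (value, hard))

def run_limited(code, cpu_seconds=2, mem_bytes=512 * 1024 * 1024, wall_seconds=5):
    """Run a Python snippet in a child process with resource limits applied."""
    def set_limits():
        # Runs in the child just before exec; limits apply to the child only.
        _lower_soft_limit(resource.RLIMIT_CPU, cpu_seconds)
        _lower_soft_limit(resource.RLIMIT_AS, mem_bytes)

    return subprocess.run(
        [sys.executable, "-I", "-c", code],
        preexec_fn=set_limits,
        capture_output=True,
        text=True,
        timeout=wall_seconds,  # parent-side guard; kills the child on expiry
    )
```

A child that spins forever is stopped by the kernel once it exceeds the CPU limit, and the timeout covers a child that simply sleeps. Neither limit restricts what files or network the child can reach; that is what the chroot or jail part would be for.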

Alex Martelli

Jan 30, 2005, 3:22:07 AM
to
Nick Coghlan <ncog...@iinet.net.au> wrote:
...

> > If you _can_ execute (whatever) in a separate process, then an approach
> > based on BSD's "jail" or equivalent features of other OS's may be able
> > to give you all you need, without needing other restrictions to be coded
> > in the interpreter (or whatever else you run in that process).
>
> I think that's where these discussions have historically ended. . . making a
> Python-specific sandbox gets complicated enough that it ends up making more
> sense to just use an OS-based sandbox that lets you execute arbitrary binaries
> relatively safely.
>
> The last suggestion I recall along these lines was chroot() plus a monitoring
> daemon that killed the relevant subprocess if it started consuming too much
> memory or looked like it had got stuck in an infinite loop.

"Yes, but" -- that ``if'' at the start of this quote paragraph of mine
is, I believe, a meaningful qualification. It is not obvious to me that
all applications and platforms can usefully execute untrusted Python
code in a separate jail'd process; so, I think there would still be use
cases for an in-process sandbox, although it's surely true that making
one would not be trivial.


Alex

Jack Diederich

Jan 30, 2005, 9:49:19 AM
to pytho...@python.org
On Sun, Jan 30, 2005 at 11:59:39AM +1000, Nick Coghlan wrote:
> Alex Martelli wrote:
> >It didn't seem to me that Steven's question was so restricted; and since
> >he thanked me for my answer (which of course is probably inapplicable to
> >some custom interpreter that's not written yet) it appears to me that my
> >interpretation of his question was correct, and my answer useful to him.
>
> Yes, I'd stopped following the thread for a bit, and the discussion had
> moved further afield than I realised :)
>
> >If you _can_ execute (whatever) in a separate process, then an approach
> >based on BSD's "jail" or equivalent features of other OS's may be able
> >to give you all you need, without needing other restrictions to be coded
> >in the interpreter (or whatever else you run in that process).
>
> I think that's where these discussions have historically ended. . . making a
> Python-specific sandbox gets complicated enough that it ends up making more
> sense to just use an OS-based sandbox that lets you execute arbitrary
> binaries relatively safely.
>
> The last suggestion I recall along these lines was chroot() plus a
> monitoring daemon that killed the relevant subprocess if it started
> consuming too much memory or looked like it had got stuck in an infinite
> loop.
>

The Xen virtual server[1] was recently mentioned on Slashdot[2].
It is more lightweight and faster than full scale machine emulators because
it uses a modified system kernel (so it only works on *nixes it has been
ported to). You can set the virtual memory of each instance to keep
programs from eating the world. I don't know about CPU, you might still
have to monitor & kill instances that peg the CPU.

If anyone does this, a HOWTO would be appreciated!

-Jack

Nick Craig-Wood

Jan 30, 2005, 3:30:01 PM
to
Jack Diederich <ja...@performancedrivers.com> wrote:
> The Xen virtual server[1] was recently mentioned on Slashdot[2].
> It is more lightweight and faster than full scale machine emulators because
> it uses a modified system kernel (so it only works on *nixes it has been
> ported to).

...it also uses python for its control programs.

--
Nick Craig-Wood <ni...@craig-wood.com> -- http://www.craig-wood.com/nick
