Binary Sandbox of Native Methods

Klaus D. Witzel

Aug 21, 2008, 5:36:51 AM

to Moebius project discussion

In a system which runs on dynamically generated native methods,
special care must be taken when referencing objects (through their
oops) and deriving other oops from them.

This holds even for the constants {false. nil. true. self. } and for
small integers; the latter because of their tag, without which a small
integer would appear as a bogus oop and thereby compromise the
integrity of the whole system.
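
A minimal Python sketch of the tagging concern. It assumes a Squeak-style scheme (low bit set marks a small integer, object pointers are 4-byte aligned); the thread does not say which scheme Moebius uses, and the names `tag`, `untag`, `is_small_integer` and `looks_like_object_pointer` are hypothetical.

```python
TAG_BIT = 1      # assumption: low bit set marks a small integer
ALIGNMENT = 4    # assumption: object pointers are 4-byte aligned


def tag(value):
    """Encode a machine integer as a small-integer oop."""
    return (value << 1) | TAG_BIT


def untag(oop):
    """Decode a small-integer oop back into a machine integer."""
    if not is_small_integer(oop):
        raise ValueError("not a small-integer oop")
    return oop >> 1


def is_small_integer(word):
    return (word & TAG_BIT) == TAG_BIT


def looks_like_object_pointer(word):
    """An untagged, aligned word cannot be told apart from an object pointer."""
    return (word & TAG_BIT) == 0 and word % ALIGNMENT == 0


raw = 12  # a machine word that was never tagged
print(is_small_integer(tag(raw)))      # True: the tagged form is recognised
print(looks_like_object_pointer(raw))  # True: the raw form passes for a pointer
```

The second result is the problem case: a collector or debugger that sees the raw word 12 would treat it as the address of an object.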

It should therefore be considered to validate dynamically generated
native methods without executing their hardware instructions, checking
that the method indeed runs inside its binary sandbox and that no part
of it runs outside of it.

I leave open when this has to take place; it must at least be
possible to do the checks on demand (from a debugger session, for
example) and/or before storing the method so that it can become
executable.

One possible consequence is that compile-time literals must indeed
reside in a literals array (perhaps as part of the method as is the
case with Smalltalk CompiledMethods).

Another possible check is that values assigned to temps / iVars, or
passed as arguments, etc., are indeed oops and not just integers
resulting from (pointer) arithmetic with one or more uninterpreted
operands.
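
The kind of check described above can be sketched in Python as a pass over the code that never executes it. Everything here is an assumption for illustration: the tuple-based instruction format, the operation names and the `validate` function are hypothetical and not part of Moebius. The sketch handles straight-line code only; a real verifier would need data-flow analysis across branches.

```python
OOP, RAW = "oop", "raw"


def validate(instructions, arg_count=0):
    """Return a list of violations; an empty list means the method passed."""
    kinds = {f"arg{i}": OOP for i in range(arg_count)}
    kinds["self"] = OOP
    errors = []

    def require_oop(pc, reg):
        if kinds.get(reg) != OOP:
            errors.append(f"{pc}: {reg} is not known to be an oop")

    for pc, (op, *operands) in enumerate(instructions):
        if op == "load_literal":      # dst := literals[index]
            dst, _index = operands
            kinds[dst] = OOP
        elif op == "load_slot":       # dst := slot of base (slots assumed to hold oops)
            dst, base, _offset = operands
            require_oop(pc, base)
            kinds[dst] = OOP
        elif op == "untag":           # dst := machine word taken from a small integer
            dst, src = operands
            require_oop(pc, src)
            kinds[dst] = RAW
        elif op == "tag":             # dst := small-integer oop built from a word
            dst, _src = operands
            kinds[dst] = OOP
        elif op == "add":             # arithmetic always yields an uninterpreted word
            dst, _a, _b = operands
            kinds[dst] = RAW
        elif op == "store_slot":      # slot of base := src
            base, _offset, src = operands
            require_oop(pc, base)
            require_oop(pc, src)
        elif op == "return":
            (src,) = operands
            require_oop(pc, src)
        else:
            errors.append(f"{pc}: unknown operation {op}")
    return errors


good = [
    ("load_slot", "t0", "self", 0),
    ("untag", "t1", "t0"),
    ("add", "t2", "t1", "t1"),
    ("tag", "t3", "t2"),
    ("store_slot", "self", 0, "t3"),
    ("return", "self"),
]
bad = good[:4] + [("store_slot", "self", 0, "t2"), ("return", "self")]

print(validate(good))  # []
print(validate(bad))   # ['4: t2 is not known to be an oop']
```

The method is free to do any arithmetic it likes on raw words; the check only constrains what becomes visible in slots and return values.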

------------------------

Whatever prior work you have seen on binary sandboxes for dynamically
generated native code, please let me know. TIA

Cheers,
Klaus

P.S. it seems some authors "just" want to solve "the" security problem
but not for dynamically generated native methods.

Igor Stasenko

Aug 21, 2008, 6:50:25 AM

to moebius-proje...@googlegroups.com

2008/8/21 Klaus D. Witzel <klaus....@cobss.com>:

>
> In a system which runs on dynamically generated native methods,
> special care must be taken about referencing objects (through their
> oops) and deriving other oops from them.
>
> This is so even for constants {false. nil. true. self. } and small
> integers and the latter because of their tag, the absence of which
> would present it as a bogus oop which then would compromise the
> integrity of the whole system.
>
> It therefore should be considered to validate dynamically generated
> native methods without executing their hw-instructions, for checking
> that the method indeed runs in its binary sandbox and no part of it
> runs outside of its binary sandbox.
>
> I let it open at what time this has to take place, it must at least be
> possible to do the checks on demand (from a debugger session, for
> example) and/or before storing the method so that it can become
> executable.
>
> One possible consequence is that compile-time literals must indeed
> reside in a literals array (perhaps as part of the method as is the
> case with Smalltalk CompiledMethods).
>
> Another possible check is, that oops assigned to temps / iVars /
> passed as arguments, etc, are indeed oops and not just integers
> resulting from (pointer-)arithmetics with one or more uninterpreted
> operands.
>

I don't quite know how to answer you in this regard :)
Here are some tricks which I use to write 'mostly' valid code. Of course the
chance of error is still high, because you are allowed to do anything
within a native method.

Currently, during simulation, I check that any read/store address
is inside the simulated address space.
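
A minimal Python sketch of such a bounds check, assuming a word-addressed simulated memory with a fixed base address. The class `SimulatedMemory` and its layout are hypothetical; the actual Moebius simulator is not shown in this thread.

```python
class SimulatedMemory:
    """Flat simulated address space that rejects out-of-range accesses."""

    def __init__(self, base, size_in_words, word_size=4):
        self.base = base
        self.word_size = word_size
        self.words = [0] * size_in_words

    def _index(self, address):
        offset = address - self.base
        if offset % self.word_size != 0:
            raise ValueError(f"unaligned access at {address:#x}")
        index = offset // self.word_size
        if not 0 <= index < len(self.words):
            raise IndexError(f"address {address:#x} is outside the simulated space")
        return index

    def read(self, address):
        return self.words[self._index(address)]

    def write(self, address, value):
        self.words[self._index(address)] = value


memory = SimulatedMemory(base=0x1000, size_in_words=4)
memory.write(0x1004, 7)
print(memory.read(0x1004))  # 7
```

A read or write at 0x1010 (one word past the end) or at 0x0FFC (one word before the base) raises `IndexError` instead of touching anything.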

With the ' thisContext ifInlined: [] ifNotInlined: []' directive I can
say what to do depending on whether the method is inlined or not.
This helps me avoid tagging/detagging in places where methods are inlined.

For instance:

sizeInWords
<native>
^ 3 beTagged

Here, if the method is called from the language side (using a polymorphic
send), it returns a SmallInteger oop.
But if the method is inlined, it returns the machine word 3.
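
The two behaviours can be mimicked in Python. This is only a model of the described semantics: the `inlined` flag stands in for what the compiler knows at the call site, and the tag scheme (low bit set for small integers) is an assumption, not something stated in this thread.

```python
def tag(value):
    """Assumed small-integer encoding: shift left and set the low bit."""
    return (value << 1) | 1


def size_in_words(inlined):
    """Model of `^ 3 beTagged`: the tag is applied only for a real send."""
    word = 3
    return word if inlined else tag(word)


print(size_in_words(inlined=True))   # 3, a raw machine word
print(size_in_words(inlined=False))  # 7, the small-integer oop for 3
```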

I use the following rules when writing code:
- native method arguments can be arbitrary machine words only if the
method is inlined.
- if the method is called as usual, all arguments and the value returned
from the method should be valid oops.

The #ifInlined:ifNotInlined: directive helps me avoid slipping non-oop
values into places where an oop or SmallInteger is expected, and avoid
performing operations which are not intended to be used when you send a
message from the language side.
For instance you can send CompiledMethod>>enter:, but it will do
nothing (^ self), while if inlined it does the right thing.

> ------------------------
>
> Whatever prior work you have seen on binary sandboxes for dynamically
> generated native code, please let me know. TIA
>

Well, it is really hard to tell what analysis can be applied to check a
method's validity w/o running it, especially for native ones: you should
allow any low-level operation in them, otherwise there is not much sense
in having native methods :)
In most code I use #oopAt:[put:], which is a shortcut for a read/write
at (self + offset).
Potentially, you can put checks in these methods to verify that the value
read from or written to the given address is a valid oop.
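
A minimal Python sketch of what checked accessors could look like. The `Heap` class, its fields and the validity rule (a tagged small integer, or the start address of a known object) are assumptions for illustration; only the selector names #oopAt: and #oopAt:put: come from the message above.

```python
class Heap:
    """Toy heap whose slot accessors refuse words that are not valid oops."""

    def __init__(self):
        self.words = {}             # address -> stored word
        self.object_starts = set()  # addresses where an object begins

    def is_valid_oop(self, word):
        # Assumed rule: low bit set means small integer, else it must point at an object.
        return (word & 1) == 1 or word in self.object_starts

    def oop_at(self, base, offset):
        word = self.words[base + offset]
        if not self.is_valid_oop(word):
            raise ValueError(f"slot {base + offset:#x} holds a non-oop {word:#x}")
        return word

    def oop_at_put(self, base, offset, word):
        if not self.is_valid_oop(word):
            raise ValueError(f"refusing to store non-oop {word:#x}")
        self.words[base + offset] = word


heap = Heap()
heap.object_starts.update({0x1000, 0x2000})
heap.oop_at_put(0x1000, 4, 0x2000)    # pointer to a known object: accepted
heap.oop_at_put(0x1000, 8, 7)         # tagged small integer: accepted
print(hex(heap.oop_at(0x1000, 4)))    # 0x2000
```

Storing 0x2004, which is neither tagged nor the start of an object, raises `ValueError`. Such checks cost time on every access, so they fit a simulator or debug build better than production code.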

Also, since object contents can be a mix of oops and non-oops and
there is no dictated object format, you can't check the validity of an
object's contents without running native code and seeing what it returns.
An object's contents are a black box from outside.
You can send a message like #enumerateReferences: and then check whether
the object references valid oops.
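
A minimal Python sketch of that cooperative check. The selector #enumerateReferences: is from the message above; the class `MixedObject`, the function names and the validity rule (tagged small integer, or start address of a known object) are hypothetical.

```python
def is_valid_oop(word, object_starts):
    """Assumed rule: tagged small integer, or the start of a known object."""
    return (word & 1) == 1 or word in object_starts


class MixedObject:
    """Body mixes raw words and oops; only the object knows which is which."""

    def __init__(self, raw_words, references):
        self.raw_words = list(raw_words)
        self.references = list(references)

    def enumerate_references(self, visit):
        # The object itself decides what to report as a reference.
        for word in self.references:
            visit(word)


def invalid_references(obj, object_starts):
    """Ask the object for its references and collect those that are not valid oops."""
    bad = []

    def check(word):
        if not is_valid_oop(word, object_starts):
            bad.append(word)

    obj.enumerate_references(check)
    return bad


known = {0x1000, 0x1010}
obj = MixedObject(raw_words=[0x1234], references=[0x1000, 7, 0x1008])
print([hex(w) for w in invalid_references(obj, known)])  # ['0x1008']
```

The raw word 0x1234 is never examined, because the object does not report it as a reference.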

> Cheers,
> Klaus
>
> P.S. it seems some authors "just" want to solve "the" security problem
> but not for dynamically generated native methods.
> >
>

--
Best regards,
Igor Stasenko AKA sig.

Klaus D. Witzel

Aug 22, 2008, 4:09:26 AM

to Moebius project discussion

On Aug 21, 12:50 pm, Igor Stasenko wrote:
> 2008/8/21 Klaus D. Witzel :
...

> I don't know what to answer to you in this regard :)
> Here some tricks which i use to write 'mostly' valid code. Of course a
> chance of error is still high - because you allowed to do anything
> within native method.

Yes, and I am not thinking about restricting *how* native code can do
things, only *what* it is allowed to let other parties see => GC,
debugger, other friends.

> Currently, during simulation, i checking that any read/store address
> is inside simulated address space.

Good.

> With ' thisContext ifInlined: [] ifNotInlined: []' directive i can
> tell, what to do depending if method inlined or not.
> This helps me to avoid tagging/detagging in places where you inline methods.
>
> For instance:
>
> sizeInWords
> <native>
> ^ 3 beTagged
>
> here, if method is called from language side (using polymorphic send),
> it returns a smallinteger oop.
> But if given method inlined - it returns machine word == 3.

Then logic dictates that #beTagged is superfluous and its name
misleading?

> I using following rules when writing code:
> - native method arguments can be arbitrary machine words only if
> method is inlined.

Yup.

> If method is called as usual - all arguments and returned value from
> method should be valid oop.

Right.

> The #ifInlined:ifNotInlined: helps me to avoid slip non-oop values in
> places where an oop or smallinteger expected, or avoid performing
> operations which not intended to be used when you sent a message from
> language.
> For instance you can send a CompiledMethod>>enter: but it will do
> nothing (^ self), while if inlined it does right thing.

Yes, I recall that this was achieved in the previous email discussions,
good.

> > ------------------------
>
> > Whatever prior work you have seen on binary sandboxes for dynamically
> > generated native code, please let me know. TIA
>
> Well, it is really hard to tell what analysis can be applied to check
> method's validity w/o running it, especially native ones - you should
> allow any low-level operation in it, otherwise there is not much sense
> in having native methods :)

When the Binary Sandbox is designed carefully, the latter (any low-
level operation) should be possible without compromising the former
(validation of oops).

> In most code i using #oopAt:[put:] , which a shortcut for (self +
> offset) read/write.
> Potentially, you can put checks in these methods, to check that value
> read/written from/to given address is valid oop.

That an oop is valid when read from/written to a slot is nowadays
validated with existing approaches and procedures, like

- http://www.cs.cmu.edu/~modelcheck/cbmc/

> Also, since object contents can be a mix of oops and non-oops and

No. Then at GC time the system would crash. Perhaps you mean, mix of
oops and handles?

> there is no dictated object format - you can't check an oop contents
> validity without running native code and seeing what it returns. An
> object contents is a black box from outside.

Could you define interoperability for the co-existence of the host VM and
its guest Moebius? That would be a good start.

Igor Stasenko

Aug 22, 2008, 11:21:50 AM

to moebius-proje...@googlegroups.com

2008/8/22 Klaus D. Witzel <klaus....@cobss.com>:

>
> On Aug 21, 12:50 pm, Igor Stasenko wrote:
>> 2008/8/21 Klaus D. Witzel :
> ...
>> I don't know what to answer to you in this regard :)
>> Here some tricks which i use to write 'mostly' valid code. Of course a
>> chance of error is still high - because you allowed to do anything
>> within native method.
>
> Yes, and I don't think about restricting *how* native code can do
> things, only *what* it can be allowed to let see other parties => GC,
> debugger, other friends.
>

Please, can you be more specific? I don't understand what you mean by that.
The native code works with CPU & memory. There is a set of CPU/memory
related operations supported by the compiler.
From that set you can tell what a method can do and what it can't.
But you can't guarantee that a method will operate on valid oops.

For Smalltalk methods things are easier, since once you have done most of
the work you don't need to care about the validity of the compiled code.
But for native methods, it is interesting how you could guarantee that
using some automation. I don't see how automation can help with that.

Well, I am confused about which sandboxes you are talking about.
You can run a method in the simulator and catch most errors, I presume.
You can catch errors at runtime (protection faults etc).
But determining errors without running anything... ??
Some code will work if you provide valid input. And there is the
problem: I think you can't provide such input automatically, because
it may not cover all possible variants. Writing tests is a solution,
but you have to write them manually.

>> In most code i using #oopAt:[put:] , which a shortcut for (self +
>> offset) read/write.
>> Potentially, you can put checks in these methods, to check that value
>> read/written from/to given address is valid oop.
>
> That an oop is valid when read from/written to a slot, is nowadays
> validated with existing approaches and procedures, like
>
> - http://www.cs.cmu.edu/~modelcheck/cbmc/
>
>> Also, since object contents can be a mix of oops and non-oops and
>
> No. Then at GC time the system would crash. Perhaps you mean, mix of
> oops and handles?
>

A mix of anything. It would not crash the system if GC is performed in
cooperation with the objects.
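
A minimal Python sketch of a mark phase that cooperates with objects in this way. The `Node` class and `mark` function are hypothetical and only model the idea; the collector follows what each object reports and never interprets the raw words in its body.

```python
class Node:
    """Object whose body mixes raw machine words with references."""

    def __init__(self, name, raw_words=(), references=()):
        self.name = name
        self.raw_words = list(raw_words)    # opaque to the collector
        self.references = list(references)  # reported on request only

    def enumerate_references(self, visit):
        for ref in self.references:
            visit(ref)


def mark(roots):
    """Return the reachable objects, found only through enumerate_references."""
    marked = {}
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if id(obj) in marked:
            continue
        marked[id(obj)] = obj
        obj.enumerate_references(pending.append)
    return list(marked.values())


leaf = Node("leaf", raw_words=[0xDEADBEEF])
root = Node("root", raw_words=[12], references=[leaf, leaf])
unreachable = Node("unreachable")

print(sorted(node.name for node in mark([root])))  # ['leaf', 'root']
```

The raw words 12 and 0xDEADBEEF are never followed as pointers, which is why a mix of contents does not crash this collector. The price is that correctness now depends on every object reporting its references faithfully.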

>> there is no dictated object format - you can't check an oop contents
>> validity without running native code and seeing what it returns. An
>> object contents is a black box from outside.
>
> Could you define interoperability for co-existence of the host VM and
> its guest Moebius, this would be a good start.
>

Err.. what is the host VM in an environment where there is no VM at all (Moebius)?
Or do you mean interoperability with the OS?
