
branch dump


Michael Collins

Nov 11, 2002, 10:02:06 AM
to perl6-i...@perl.org
Hi,

This may be an ignorant statement since I just joined this list, but I noticed
that the parrot "branch" assembly instruction doesn't work and sometimes causes
a core dump on Linux 2.4.

------------------
example 1:
set I0, 16
branch 3
print "a"
print "b"
print "c"
print "d"
print "\n"
end

result 1: prints nothing, just ends
------------
example 2:
set I0, 16
branch 7
print "a"
print "b"
print "c"
print "d"
print "\n"
end

result 2: causes parrot to core dump (segmentation fault)
-----
Since this is my first posting, I hope I haven't spoken ignorantly. I'm interested
in getting involved with this project and I'd kind of like to know how things
work. Got any advice for me?

Michael W. Collins


Dan Sugalski

Nov 11, 2002, 10:42:27 AM
to mcol...@bestweb.net, perl6-i...@perl.org
At 10:02 AM -0500 11/11/02, Michael Collins wrote:
>Hi,
>
>This may be an ignorant statement since I just joined this list, but I noticed
>that the parrot "branch" assembly instruction doesn't work and sometimes causes
>a core dump on Linux 2.4.

Oh, it works, you just need to understand it properly. :) This is one
of the reasons to use labels in a hand-rolled assembly program. So,
let's look at your code, shall we?

>------------------
>example 1:
>set I0, 16
>branch 3
>print "a"
>print "b"
>print "c"
>print "d"
>print "\n"
>end

Simple, straightforward, easy. With one mistake, that of counting.
Branch offsets are relative to the PC (IC?) at the start of the op.
Words used to encode parameters to ops also count. So, let's show this
with offsets, counted from the branch op:

>set I0, 16
>branch 3
 0      1
>print "a"
 2     3
>print "b"
 4     5
>print "c"
 6     7
>print "d"
>print "\n"
>end


So the branch 3 sets the PC to point at the constant "a". Now, we don't
inline string constants, so that's really an offset in the current
segment's string table. In this case, it's the first constant in the
constant table, so it has an offset of 0. Which just happens to be
the end opcode number. :)
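
To make the counting concrete, here is a minimal Python sketch of example 1 laid out as words. It assumes one word per opcode and one per argument, as described above. The opcode numbers for set, branch and print are made up for illustration; only end = 0 comes from the explanation above. The names CODE, BRANCH_PC and branch_target_word are hypothetical and are not part of Parrot.

```python
# Hypothetical opcode numbers, except END = 0, which the thread states.
END, SET, BRANCH, PRINT = 0, 1, 2, 3

# String constants are not inlined; the code stores an index into this table.
CONSTANTS = ["a", "b", "c", "d", "\n"]

# Example 1, one word per opcode and one word per argument (assumed encoding).
CODE = [
    SET, 0, 16,   # set I0, 16   (register I0 encoded as 0)
    BRANCH, 3,    # branch 3
    PRINT, 0,     # print "a"
    PRINT, 1,     # print "b"
    PRINT, 2,     # print "c"
    PRINT, 3,     # print "d"
    PRINT, 4,     # print "\n"
    END,          # end
]

BRANCH_PC = 3  # word position of the branch opcode in CODE


def branch_target_word(code, offset):
    """Return the word a branch lands on, counting from the branch opcode."""
    return code[BRANCH_PC + offset]


# branch 3 lands on the argument of print "a": table index 0, which reads as end.
print(branch_target_word(CODE, 3) == END)    # True
# branch 2 lands on the print opcode itself.
print(branch_target_word(CODE, 2) == PRINT)  # True
```

Under these assumptions, an offset of 2, 4 or 6 lands on a print opcode, while 3, 5 or 7 lands on a constant-table index that the interpreter would then read as an opcode number. The sketch does not model what opcode 2 really is in Parrot, so it says nothing about why example 2 crashes.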

All you need to do is change the offset a bit to point to an opcode
and you'll be fine.

Two tools you may find handy are disassemble.pl (which disassembles a
bytecode file) and the -t switch to parrot, which traces execution.
Both are really useful for diagnosing Odd Problems. (And if this is
actual assembly code you're writing, use labels for branch
destinations, though real offsets are fine if you're planning on
emitting the bytecode directly.)
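
Labels spare you the counting because the assembler does it for you. The sketch below is a toy label resolver in Python, not Parrot's assembler: the `NAME:` label syntax, the function name resolve_labels, and the one-word-per-opcode-and-argument rule are assumptions for illustration, and splitting arguments on commas is deliberately naive.

```python
def resolve_labels(lines):
    """Replace `branch LABEL` with an offset relative to the branch opcode.

    Assumes every opcode and every argument occupies exactly one word.
    """
    # Pass 1: record the word address of each label and each instruction.
    labels, instrs, pc = {}, [], 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = pc
            continue
        op, _, rest = line.partition(" ")
        args = [a.strip() for a in rest.split(",")] if rest else []
        instrs.append((pc, op, args))
        pc += 1 + len(args)

    # Pass 2: rewrite label operands of branch as relative word offsets.
    out = []
    for addr, op, args in instrs:
        if op == "branch" and args and args[0] in labels:
            args = [str(labels[args[0]] - addr)]
        out.append(" ".join([op, ", ".join(args)]).strip())
    return out


SOURCE = """
set I0, 16
branch SKIP
print "a"
print "b"
SKIP:
print "c"
print "d"
print "\\n"
end
""".splitlines()

for line in resolve_labels(SOURCE):
    print(line)
```

With the label placed before print "c", the resolver emits branch 6, which is the offset of that print opcode counted from the branch. Whether skipping to "c" is what the original example intended is a guess.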
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
d...@sidhe.org                         have teddy bears and even
                                      teddy bears get drunk

Gopal V

Nov 11, 2002, 10:55:41 AM
to perl6-i...@perl.org
If memory serves me right, Dan Sugalski wrote:
> All you need to do is change the offset a bit to point to an opcode
> and you'll be fine.

Hmm... you mean to say that a jump to a non-instruction is valid ? ..

We've had the verifiability question hashed out ... but jump target
validation is one of the simplest cases..... This could be a serious issue
if Parrot starts using precompiled .pbc files (instead of the .pm model).

Also it must be relatively simple to check for right ? ... A special
option (for the sake of speed freaks ?) parrot -fverify hello.pbc ?

Excuse my ignorance if such a thing already exists ...
Gopal
--
The difference between insanity and genius is measured by success

Dan Sugalski

Nov 11, 2002, 5:14:44 PM
to perl6-i...@perl.org
At 9:25 PM +0530 11/11/02, Gopal V wrote:
>If memory serves me right, Dan Sugalski wrote:
>> All you need to do is change the offset a bit to point to an opcode
>> and you'll be fine.
>
>Hmm... you mean to say that a jump to a non-instruction is valid ? ..

Sure. Or at least not forbidden.

>We've had the verifiability question hashed out ... but jump target
>validation is one of the simplest cases..... This could be a serious issue
>if Parrot starts using precompiled .pbc files (instead of the .pm model).

No, it's not a problem. There are two potential cases to deal with.

1) We trust the code
2) We don't trust the code

The first is the common case for us. We're either coming straight
from source, or from code that we are reasonably sure is fine. In
that case, why bother verifying? Sure, there may be compiler errors,
but having every single invocation of bytecode check for compiler
errors is a bit excessive.

The second is the less common case, but given the number of things we
can't verify (it's valid to give branch and jump register args) the
time spent isn't going to get you anywhere, since you need to check
at runtime anyway. It would take a full scan of the bytecode, of course, and
you'd need to figure out where each and every instruction starts, which
can be costly. (We can't use any table in the bytecode, as what makes
that table any more valid than the code itself? :)
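
As a rough illustration of what such a load-time check involves, the Python sketch below scans a word stream once to find where every instruction starts, then checks constant branch offsets against that set. It is not Parrot code: the opcode numbers (other than end = 0), the BRANCH_REG opcode standing in for a branch with a register argument, and the function names are all assumptions.

```python
# Hypothetical opcode numbers and argument counts; only END = 0 is from the thread.
END, SET, BRANCH, PRINT, BRANCH_REG = 0, 1, 2, 3, 4
ARG_COUNTS = {END: 0, SET: 2, BRANCH: 1, PRINT: 1, BRANCH_REG: 1}


def instruction_starts(code):
    """Scan the whole word stream once and return the set of opcode positions."""
    starts, pc = set(), 0
    while pc < len(code):
        if code[pc] not in ARG_COUNTS:
            raise ValueError(f"unknown opcode {code[pc]} at word {pc}")
        starts.add(pc)
        pc += 1 + ARG_COUNTS[code[pc]]
    if pc != len(code):
        raise ValueError("last instruction is truncated")
    return starts


def check_branches(code):
    """Return (bad, unverifiable) lists of branch positions.

    bad: constant branches whose target is not the start of an instruction.
    unverifiable: register branches, whose target is only known at run time.
    """
    starts = instruction_starts(code)
    bad, unverifiable = [], []
    for pc in sorted(starts):
        if code[pc] == BRANCH and pc + code[pc + 1] not in starts:
            bad.append(pc)
        elif code[pc] == BRANCH_REG:
            unverifiable.append(pc)
    return bad, unverifiable


# Example 1 from the thread under the assumed encoding: branch 3 hits an argument.
EXAMPLE_1 = [SET, 0, 16, BRANCH, 3, PRINT, 0, PRINT, 1,
             PRINT, 2, PRINT, 3, PRINT, 4, END]
# A branch through register I0 cannot be judged without running the code.
WITH_REGISTER = [SET, 0, 6, BRANCH_REG, 0, END]

print(check_branches(EXAMPLE_1))      # ([3], [])
print(check_branches(WITH_REGISTER))  # ([], [3])
```

The second result is the point made above: the scan costs a full pass over the code and still leaves register branches to be checked at run time.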

We're already pretty much forced to do full runtime checking, so if
code branches somewhere it shouldn't when it's running in a
restricted container, we'll just catch the bad access and kill the
interpreter. Not much else for it, if we need to guard against the
malicious, and unfortunately we do.
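
A toy version of that run-time approach, in Python and under the same made-up encoding (only end = 0 is taken from the thread): every fetch is checked as the code runs, and a stray branch raises an error instead of being validated in advance. BadAccess and run_checked are hypothetical names, and registers are not modeled.

```python
# Hypothetical opcode numbers and argument counts; only END = 0 is from the thread.
END, SET, BRANCH, PRINT = 0, 1, 2, 3
ARG_COUNTS = {END: 0, SET: 2, BRANCH: 1, PRINT: 1}


class BadAccess(Exception):
    """Raised when running code strays outside what it may touch."""


def run_checked(code, constants):
    """Run `code`, checking every fetch at run time; return the printed text."""
    out, pc = [], 0
    while True:
        # Check the opcode fetch and its arguments before using them.
        if not 0 <= pc < len(code) or code[pc] not in ARG_COUNTS:
            raise BadAccess(f"bad opcode fetch at word {pc}")
        op = code[pc]
        if pc + ARG_COUNTS[op] >= len(code):
            raise BadAccess(f"truncated instruction at word {pc}")
        if op == END:
            return "".join(out)
        if op == BRANCH:
            pc += code[pc + 1]  # relative to the branch opcode itself
            continue
        if op == PRINT:
            index = code[pc + 1]
            if not 0 <= index < len(constants):
                raise BadAccess(f"bad constant index {index}")
            out.append(constants[index])
        # SET is a no-op here because registers are not modeled.
        pc += 1 + ARG_COUNTS[op]


CONSTANTS = ["a", "b", "c", "d", "\n"]
EXAMPLE_1 = [SET, 0, 16, BRANCH, 3, PRINT, 0, PRINT, 1,
             PRINT, 2, PRINT, 3, PRINT, 4, END]
WILD = [BRANCH, 100, END]

print(repr(run_checked(EXAMPLE_1, CONSTANTS)))  # '' (lands on a 0, read as end)
try:
    run_checked(WILD, CONSTANTS)
except BadAccess as err:
    print("killed:", err)
```

Note that example 1 is not caught here: landing on a word that happens to be a valid opcode is legal as far as the checks are concerned, which matches the "not forbidden" answer earlier in the thread.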

>Also it must be relatively simple to check for right ? ... A special
>option (for the sake of speed freaks ?) parrot -fverify hello.pbc ?

No, the option'd be more that you enable verification rather than
disable it. Speed is the default.

>Excuse my ignorance if such a thing already exists ...

Only plans, which aren't as solid as they ought to be. I'll fix that.

Gopal V

Nov 11, 2002, 6:27:53 PM
to perl6-i...@perl.org
If memory serves me right, Dan Sugalski wrote:
> Sure. Or at least not forbidden.

k ...

> that case, why bother verifying?

Hmm.... wouldn't the JIT benefit from a pre knowledge of basic blocks
and types or some information ? ... (I seem to think so ...).

> at runtime anyway. It would take a full scan of the bytecode, of course, and
> you'd need to figure out where each and every instruction starts, which
> can be costly.

Can't that be added onto the JIT'ing process ? ... viz. during conversion,
check for jump targets ?..

I still have this assumption that JITs need to maintain some sort of
basic block identification for peephole optimisations ?..

Or is that totally irrelevant for register VMs ? ... (this is the first
register VM I have encountered...)

> >option (for the sake of speed freaks ?) parrot -fverify hello.pbc ?
>
> No, the option'd be more that you enable verification rather than
> disable it. Speed is the default.

Yup I meant just that .... not "-noverify", but "-verify" :-)...

So, Parrot is more secure than perl is ? (that being your benchmark).

Angel Faus

Nov 11, 2002, 7:04:44 PM
to Gopal V, perl6-i...@perl.org
> Hmm.... wouldn't the JIT benefit from a pre knowledge of basic
> blocks and types or some information ? ... (I seem to think so
> ...).

I would think so, because if, for example, the JIT wants to do a full
register allocation to map parrot registers to machine registers, it
would certainly need information about basic blocks.
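
For readers unfamiliar with the term, basic-block boundaries can be recovered from a word stream with the usual textbook "leader" rule: the first instruction, every branch target, and every instruction that follows a branch each start a block. The Python sketch below applies that rule to a toy encoding. It is not how Parrot's JIT works; the opcode numbers (other than end = 0) and the function name are assumptions, and it trusts branch targets to be valid.

```python
# Hypothetical opcode numbers and argument counts; only END = 0 is from the thread.
END, SET, BRANCH, PRINT = 0, 1, 2, 3
ARG_COUNTS = {END: 0, SET: 2, BRANCH: 1, PRINT: 1}


def basic_block_leaders(code):
    """Return the sorted word positions at which basic blocks start."""
    leaders = {0}  # the first instruction always starts a block
    pc = 0
    while pc < len(code):
        op = code[pc]
        next_pc = pc + 1 + ARG_COUNTS[op]
        if op == BRANCH:
            leaders.add(pc + code[pc + 1])  # the branch target starts a block
        if op in (BRANCH, END) and next_pc < len(code):
            leaders.add(next_pc)            # so does whatever follows the branch
        pc = next_pc
    return sorted(leaders)


# Example 1 with the offset changed to 6 so the branch lands on print "c".
FIXED = [SET, 0, 16, BRANCH, 6, PRINT, 0, PRINT, 1,
         PRINT, 2, PRINT, 3, PRINT, 4, END]

print(basic_block_leaders(FIXED))  # [0, 5, 9]
```

Here the blocks start at the set, at print "a" (unreachable, but still a block), and at print "c".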

(I am talking of a complete register allocation, that would re-do the
original register allocation of imc with the actual number of
registers available in the machine)

On the other hand, the JIT could certainly regenerate this information
from the imc code, which is probably going to be stored somewhere
anyway.

-angel

Leopold Toetsch

Nov 12, 2002, 2:21:48 AM
to af...@corp.vlex.com, Gopal V, perl6-i...@perl.org
Angel Faus wrote:

>>Hmm.... wouldn't the JIT benefit from a pre knowledge of basic
>>blocks and types or some information ? ... (I seem to think so
>>...).
>
> I would think so, because if, for example, the JIT wants to do a full
> register allocation to map parrot registers to machine registers, it
> would certainly need information about basic blocks.


JIT does register allocation already, but not per basic block (which
JIT has too). Allocation is more fine-grained, per JITed section; these
sections are either basic blocks or consist of JITed code only.

> On the other hand, the JIT could certainly regenerate this information
> from the imc code, which is probably going to be stored somewhere
> anyway.


But right, IMCC could help here, e.g. by assigning registers with top-down
priority: I0, I1, ... In could be the top N used registers for this
block, which JIT just remaps to processor registers.

Calling external (non-JITed) code would still need to load/restore these.

> -angel


leo


Dan Sugalski

Nov 12, 2002, 4:38:18 PM
to perl6-i...@perl.org
At 4:57 AM +0530 11/12/02, Gopal V wrote:
>If memory serves me right, Dan Sugalski wrote:
> > that case, why bother verifying?
>
>Hmm.... wouldn't the JIT benefit from a pre knowledge of basic blocks
>and types or some information ? ... (I seem to think so ...).

Oh, sure. But whether the metadata is trustable is an interesting
question, as is whether the JIT can generate code that's safe to
execute from an unsafe base. It's distinctly possible that when
running in safe mode you don't get the JIT.

> > at runtime anyway. It would take a full scan of the bytecode, of course, and
> > you'd need to figure out where each and every instruction starts, which
> > can be costly.
>
>Can't that be added onto the JIT'ing process ? ... viz. during conversion,
>check for jump targets ?..
>
>I still have this assumption that JITs need to maintain some sort of
>basic block identification for peephole optimisations ?..
>
>Or is that totally irrelevant for register VMs ? ... (this is the first
>register VM I have encountered...)

More like I'm not expecting to use the JIT for untrusted code. I'm
not sure we'll be able to reasonably use the CG core, though I expect
we probably will.

The JIT can likely use basic block info in normal circumstances. I
leave that up to the JIT folks--if there's useful metadata they can
do things with, we can see about getting it into the bytecode.

>So, Parrot is more secure than perl is ? (that being your benchmark).

Oh, absolutely not. Some benchmarks are too poor to consider. :)

VMS is my benchmark system. I want the safe interpreters to be as
safe as a locked down VMS system. Whether we get there or not is an
open question, but it's where we're trying to get.
