Newsgroups: comp.lang.misc
From: BGB <cr88...@hotmail.com>
Date: Wed, 17 Oct 2012 17:12:12 -0500
Local: Wed, Oct 17 2012 6:12 pm
Subject: Re: new interpreter ("Fast RIR")
On 10/17/2012 5:50 AM, Rod Pemberton wrote:
> "BGB" <cr88...@hotmail.com> wrote in message
BGBScript has been in-use and incrementally developed since about 2004.
> news:k5khbh$f71$1@news.albasani.net...
> ...
>> the idea would be to develop a new faster threaded-code backend for the
> As you may know, I've been developing an ITC (threaded code) Forth
I have been using threaded-code since around November 2011.
so, the point now is mostly looking into taking a different and more
note that, unlike Forth or similar, the BGBScript bytecode follows
the prior interpreter basically operated as a sort of
this time I am factoring this out and instead using "traces".
>> as-is, the interpreter already translates the bytecode into a
>> stack-based threaded-code format (which is faster than running it
>> directly via a "switch()", but still could be somewhat faster).
> Well, that's good to know!
I went away from using switch a while ago, mostly as it takes a fairly
> Most Forth's coded in C use a switch() since that's the C way of doing
severe performance hit for every loop iteration.
threaded code is a little faster, but all of those function
sadly, there is no really "good" option short of writing the whole
>> so, the general idea would be to "unwind" the stack-based bytecode
>> (vaguely like JVM or AVM2 bytecode) into using a register-machine model
>> (more like Dalvik or Parrot), and essentially use a register-machine to
>> execute the code.
> Have you tried any of this already? Have you analyzed how many stacked
typically, all of the items in a typical expression will end up on the
stack. this will likely exceed the number of x86 registers, but the threaded
if a JIT were used, likely it would use a mapping heuristic to figure
> From my own experiences, I can "see" issues with needing a register
> allocator, keeping track of which data is in which registers, spilling of
> registers when you run out of available registers, etc.
this would be more for a native code generator, not so much for a
register-IR, where the registers aren't so much "registers" in the CPU
sense, but are more closely related to "temporary variables".
so, you can avoid the need to "spill" by simply having 256 (or more)
as the need for a full register allocator can also be avoided at this stage
>> any ideas for how to make such a design idea faster are welcome (apart
>> maybe from putting everything in globals, which yes, can make things
>> faster, but kills thread-safety, and also doesn't help much on ELF
>> targets).
> I ran into a few issues with making my ITC interpreter faster in C.
the "register" keyword doesn't really usually make much of a difference,
> The first issue is that the "register" keyword only works on "auto" or
> The second issue is:
> "If I can't use the 'register' keyword on globals, and I've got four
> The first answer to that is something C doesn't have: nested procedures.
so I don't really use it.
>> apparently, people working in similar situations in the past, were able
>> to get a roughly 25%-30% speedup by translating to a register IR (for an
>> interpreter). this was for Java ByteCode.
> ...
>> I also realized some while looking at retrofitting a few ideas onto the
checking for stack overflow is just the sort of cost I am trying to
> Or, stack check on push/pop...
> Or, just abort...
avoid here.
it is the same sort of issue as checking if an exception-state variable
typically, a stack might be accessed like:
but, if we add something like:
then suddenly lots of clock cycles are being wasted just performing this
and, if our dispatch loop is something like:
that "&& !ctx->ret" also adds a bit of cost.
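The three cases described above (a plain stack access, the same access with an overflow check added, and a dispatch loop that tests "!ctx->ret" on every iteration) can be sketched roughly as follows. This is only an illustration under assumptions: the struct layout, the stack size of 256, and every name except the "ret" field are hypothetical and not taken from the actual interpreter.

```c
#include <stddef.h>

/* Hypothetical context; only the "ret" field name comes from the post. */
typedef struct Ctx {
    int stack[256];
    int stackpos;   /* index of the next free slot */
    int ret;        /* nonzero means "stop" (return or error state) */
} Ctx;

typedef struct Op Op;
struct Op {
    Op *(*fn)(Ctx *ctx, Op *op);  /* handler returns the next op */
    Op *next;
    int imm;
};

/* Plain access: a single store plus an increment. */
void push_fast(Ctx *ctx, int v)
{
    ctx->stack[ctx->stackpos++] = v;
}

/* Same access with an overflow check: an extra compare and branch
 * on every single push. */
void push_checked(Ctx *ctx, int v)
{
    if (ctx->stackpos >= 256) { ctx->ret = -1; return; }
    ctx->stack[ctx->stackpos++] = v;
}

Op *op_push(Ctx *ctx, Op *op)
{
    push_checked(ctx, op->imm);
    return op->next;
}

/* Dispatch loop: the "&& !ctx->ret" test is paid once per opcode. */
void run(Ctx *ctx, Op *op)
{
    while (op && !ctx->ret)
        op = op->fn(ctx, op);
}
```

Each check is cheap on its own; the cost comes from paying it on every push and every dispatched opcode.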
here is the inner-loop for the in-development interpreter:
op=tr->ops;
}
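For illustration, one possible shape for a trace-based loop over a register file is sketched below. It is a guess at the general technique, not the interpreter's real code: the Sk* names, the "n_ops" and "next" fields, and the idea of checking for termination only between traces are all assumptions. Only "tr->ops", "ctx->ri" and the a/b/c operand fields echo names visible in this post.

```c
#include <stddef.h>

typedef struct SkContext SkContext;
typedef struct SkOpcode SkOpcode;
typedef struct SkTrace SkTrace;

struct SkOpcode {
    void (*fn)(SkContext *ctx, SkOpcode *op);
    int a, b, c;        /* source registers a and b, destination c */
};

struct SkTrace {
    SkOpcode *ops;      /* straight-line run of opcodes */
    int n_ops;
    SkTrace *next;      /* hypothetical: following trace, NULL to stop */
};

struct SkContext {
    int ri[256];        /* integer "registers", really temporaries */
};

void sk_op_add_i(SkContext *ctx, SkOpcode *op)
{
    ctx->ri[op->c] = ctx->ri[op->a] + ctx->ri[op->b];
}

void sk_op_mul_i(SkContext *ctx, SkOpcode *op)
{
    ctx->ri[op->c] = ctx->ri[op->a] * ctx->ri[op->b];
}

/* Opcodes inside a trace run back to back with no per-op status
 * check; control is only re-examined when moving between traces. */
void sk_run(SkContext *ctx, SkTrace *tr)
{
    while (tr) {
        SkOpcode *op = tr->ops;
        int n = tr->n_ops;
        while (n--) {
            op->fn(ctx, op);
            op++;
        }
        tr = tr->next;
    }
}
```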
and the add-integer handler:
BGBFRIR_API void FRIR_ThOp_ADD_I(FRIR_Context *ctx, FRIR_Opcode *op)
{
    ctx->ri[op->c]=ctx->ri[op->a]+ctx->ri[op->b];
}
>> design changes from previously:
>> 'A' type was changed from being a raw pointer to a pointer-sized
>> integer, mostly as it became apparent that the interpreter logic would
>> require an excessive amount of casting in C, so it would be easier just
>> to treat it as an integer type and cast back to a pointer as-needed.
> Exactly. You can't dereference a pointer type in C and end up with the
this more has to do with the new interpreter having a lot of operations
for pointer arithmetic, like "lea" (for calculating addresses) and
"xloada" (for loading the address of a field into a register) and similar.
an integer type makes more sense, but leaves the minor annoyance of sign
at the IR level, the 'A' (address type) is partly disjoint from the
there are operations like: "conv_a2l" and "conv_l2a" to convert between
but, as-is, IR would look something like:
there may likely end up being a few more special cases, like:
or such...
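
As a rough illustration of treating the 'A' type as a pointer-sized integer, the sketch below does address arithmetic on integers and casts back to a pointer only at the point of the load. The sk_* function names and signatures are hypothetical stand-ins loosely modelled on the "lea", "conv_a2l" and "conv_l2a" operations mentioned above. The use of a signed intptr_t is an assumption, and the integer/pointer round trip relies on the usual flat-address behaviour of mainstream compilers.

```c
#include <stdint.h>

/* Assumption: the 'A' type is represented as a signed pointer-sized integer. */
typedef intptr_t sk_addr;

/* lea-like operation: base + index*scale, all in integer arithmetic,
 * so no pointer casts are needed here. */
sk_addr sk_lea(sk_addr base, int64_t index, int scale)
{
    return base + (sk_addr)(index * scale);
}

/* Conversions between the address type and a 64-bit integer type. */
int64_t sk_conv_a2l(sk_addr a) { return (int64_t)a; }
sk_addr sk_conv_l2a(int64_t l) { return (sk_addr)l; }

/* The cast back to a real pointer happens only where memory is touched. */
int32_t sk_load_i(sk_addr a)
{
    return *(int32_t *)a;
}
```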