MetaCC and its role in the OpenEm project


TrackRX

Jul 30, 2006, 11:36:51 AM
to openem_dev
Hi

I thought it would be good to present my view of the MetaCC tool and the
future development of this project.

When I entered the project as a developer, I proposed to take part in
developing emulation for the Intel 8086 CPU (with the aim of simulating
the original IBM PC machine and the old games written for it).

I had previous experience writing emulators (and even compilers) for
simple stack-machine-based VMs (virtual machines), where I had total
control over the binary format of individual opcodes, but not in
emulating real CPU opcodes such as those of the Intel x86.

The freedom to choose my own binary encoding allowed me to design the
bytecode in such a way that it would be easy to decode (and emulate), so
I had little trouble decoding it.

Basically, the emulation problem itself was secondary. The most complex
problem was decoding the instructions: recognizing them and recovering
their _meaning_ from the code bit stream.

When I entered this project, I knew of two methods applicable to the
decoding task: the switch()-driven approach, and the table-lookup-driven
approach.

The switch approach works by looking at the first byte of an
instruction, then performing switch() {...} on it in order to execute a
different code path based on that byte. Each "case" then takes care of
decoding all instructions starting with that particular byte.
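A minimal sketch of the switch()-driven approach might look like this. The Cpu struct, the handler bodies, and the choice of handled opcodes are illustrative only (not MetaCC output), though 0x40 (INC AX) and 0xB8 (MOV AX, imm16) are real 8086 encodings:

```cpp
#include <cstdint>
#include <cstddef>

// Illustrative switch()-driven decoder, handling two real 8086 opcodes.
struct Cpu {
    uint16_t ax = 0;
    const uint8_t* code = nullptr;
    size_t ip = 0;

    uint8_t fetch() { return code[ip++]; }

    void step() {
        uint8_t op = fetch();       // first byte selects the code path
        switch (op) {
        case 0x40:                  // INC AX
            ax += 1;
            break;
        case 0xB8:                  // MOV AX, imm16
            ax = fetch();
            ax |= uint16_t(fetch()) << 8;   // little-endian immediate
            break;
        default:
            // ...every remaining first byte needs its own case
            break;
        }
    }
};
```

Even at this toy scale, the duplication is visible: the MOV family alone (0xB0-0xBF) would need sixteen nearly identical cases.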

The table-lookup approach works by doing a lookup in a table of 256
function pointers, indexed by the value of the first byte of an
instruction, and executing the correct function, which takes care of
decoding just like the different "cases" in the switch()-based approach.
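The table-driven variant of the same sketch, again with illustrative names and only two real 8086 opcodes filled in:

```cpp
#include <cstdint>
#include <cstddef>

// Illustrative table-driven decoder: 256 function pointers,
// indexed by the first byte of the instruction.
struct Vm {
    uint16_t ax = 0;
    const uint8_t* code = nullptr;
    size_t ip = 0;
    uint8_t fetch() { return code[ip++]; }
};

using Handler = void (*)(Vm&);

static void op_inc_ax(Vm& vm) { vm.ax += 1; }   // 0x40: INC AX
static void op_mov_ax_imm(Vm& vm) {             // 0xB8: MOV AX, imm16
    vm.ax = vm.fetch();
    vm.ax |= uint16_t(vm.fetch()) << 8;
}
static void op_unhandled(Vm&) { /* trap or report */ }

static Handler dispatch[256];

static void init_dispatch() {
    for (int i = 0; i < 256; ++i) dispatch[i] = op_unhandled;
    dispatch[0x40] = op_inc_ax;
    dispatch[0xB8] = op_mov_ax_imm;
}

static void step(Vm& vm) {
    dispatch[vm.fetch()](vm);       // one lookup, no switch
}
```

The dispatch itself is simpler than a switch, but the 256 handler slots still have to be written and filled one by one.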

This works well for simple bytecode in which most instructions are
identified by their first byte alone. Once the instruction set becomes
large and complex, the code dealing with individual instruction
*variations* becomes dispersed more and more among different cases
(values of the first byte), and the amount and complexity of the code
paths dealing with different "cases" grows geometrically, which leads
to huge and hardly manageable code.

The problem of code size could be solved the "brute-force" way, by
having a lot of developers write and verify the code for the emulator,
but this requires the project to find and train many skilled and
motivated people. Writing the cases is tedious and not very
interesting, so finding good candidates (especially if they are not
paid for it) would not be easy.

The problem of increased code complexity (the handling of a single
instruction's variations dispersed among multiple "cases" of the
switch, or handler functions in the table-based method) is worse.
Finding and fixing bugs becomes harder as the complexity of the
emulated bytecode rises.

The problem is actually multiplied in our project, since we want to
emulate not just one single CPU (which we could, somehow, with a lot of
effort, manage using "brute-force", manual methods). We want to emulate
_many_ of them, which raises the stakes even higher for this particular
project.

So, both the table-driven and switch-based approaches work, but they
seem to require too much effort from the developers. The main problem
is the size and complexity of the code that converts the bytecode
description from the way it appears in the CPU datasheet into spaghetti
"switch" or "table-driven" code, which quickly becomes unmanageable.

So the problems we face are, in a nutshell:

1. It's not easy to map the bytecode formats as they are given in CPU
data sheets to the actual emulator code.
2. The same instruction often requires the programmer to write the same
code over and over for different "case" labels or "case" handler
functions, with small differences among the cases, which creates hardly
manageable code (each fix, if required, must be performed in many
places).
3. The number of case labels and corresponding code snippets is high,
forcing programmers to write a lot of repetitive code and increasing
the chance of introducing bugs (the more you write, the more
opportunities for a bug or a simple typo, which is very hard to debug
later).
4. The source code of the emulator becomes huge (many thousands of
lines of code), which is hard to manage and update.

The MetaCC tool is meant to come to the rescue here. It provides a more
or less unified model for describing CPU bytecode and the C code for
emulating it.

The tool lets you write the source CPU description file in a compact
and intuitive way, and then compile it into a more obscure (but still
very readable, well-commented) C++ code module (a .h file) which you
can simply #include into your emulator application without modification
and get the emulator functionality right away.

Here is what it offers to solve the problems outlined above:

1. It automatically generates code for the various permutations of the
same instruction for you, saving you from typing the same thing again
and again in the source ==> saves development time and prevents many
bugs. Examples:
a. For the 8- and 16-bit versions of the x86 MOV instruction, for
example, you would have to write individual case labels if you wrote
the code manually, because the first byte of the instruction differs.
In a MetaCC description, you write it once, and the code for the two
cases (8- and 16-bit) is generated automagically for you.
b. Most two-argument arithmetic/logic x86 instructions (except TEST,
for some reason) use the same bit pattern, differing by a 3-bit
bitfield (for the immediate-operand forms it sits in the middle of the
second(!) byte of the instruction).
So, there are 2^3 = 8 different arithmetic ops. Taking into account
that there are at least two cases for each op (8- and 16-bit) and each
case can have a different direction (reg->mem, mem->reg, reg<->reg,
immed->mem, etc.), the number of "case" labels one needs to write grows
geometrically.
With MetaCC, one writes two or three templated patterns, and these
are automatically expanded by the tool to cover all possible
combinations.
In practice, two patterns written as source for MetaCC expand into
32 case labels with different code in each, all of which would have to
be written manually by the programmer without the MetaCC tool. Just
think about it!
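As a concrete illustration of the regularity such patterns exploit, here is a hand-written sketch decoding the immediate forms of the x86 arithmetic group (opcodes 0x80 = 8-bit, 0x81 = 16-bit), where the 3-bit operation selector sits in bits 5:3 of the second (ModRM) byte. The struct and enum names are illustrative, not MetaCC output:

```cpp
#include <cstdint>

// Sketch: decoding the immediate forms of the x86 arithmetic group.
// One shift-and-mask covers all eight operations at once.
// (Trailing underscores avoid clashing with C++ alternative tokens.)
enum AluOp { ADD, OR_, ADC, SBB, AND_, SUB, XOR_, CMP };

struct Decoded {
    AluOp op;            // which of the 8 arithmetic operations
    bool wide;           // false = 8-bit (0x80), true = 16-bit (0x81)
    unsigned mod, rm;    // remaining ModRM fields (addressing mode)
};

Decoded decode_group1(uint8_t opcode, uint8_t modrm) {
    Decoded d;
    d.wide = (opcode & 1) != 0;          // 0x80 vs 0x81
    d.op   = AluOp((modrm >> 3) & 0x7);  // the 3-bit selector
    d.mod  = (modrm >> 6) & 0x3;
    d.rm   = modrm & 0x7;
    return d;
}
```

One such routine replaces what would otherwise be a separate case body per operation, per width, hand-copied with small differences.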

2. It automatically generates code to extract individual bit
sub-patterns from a detected pattern and present them to the programmer
as regular integer values (I mean BitFields).
This code is usually a bit-shift followed by an and (e.g. "(byte >> 3)
& 0x3"), which is cryptic and bug-prone. With MetaCC you won't need to
worry about the details. Just use the value in your emulating code (to
select a given register by its index embedded in the bits of the
opcode, for example).
==> Again, this prevents bugs and relieves the programmer from
writing cryptic code again and again (read: it reduces the possibility
of making a bug in the process).
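The kind of shift-and-mask code the generator takes off your hands can be sketched as a small hand-rolled equivalent. The helper names are illustrative, not what MetaCC actually emits; the ModRM layout (mod:2, reg:3, r/m:3) is the real x86 one:

```cpp
#include <cstdint>

// Hand-rolled equivalent of generated bitfield extraction:
// pull a `width`-bit field starting at bit `lsb` out of a byte.
inline unsigned bits(uint8_t byte, unsigned lsb, unsigned width) {
    return (byte >> lsb) & ((1u << width) - 1u);
}

// Example: the x86 ModRM byte is laid out as mm_rrr_nnn.
struct ModRM { unsigned mod, reg, rm; };

inline ModRM split_modrm(uint8_t b) {
    return { bits(b, 6, 2),    // mod: top 2 bits (addressing mode)
             bits(b, 3, 3),    // reg: middle 3 bits
             bits(b, 0, 3) };  // rm : low 3 bits
}
```

Writing this once per field, per instruction, by hand is exactly the repetitive, typo-prone work being described; the generated code does it from the pattern description.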

3. The generated source code is still huge (there's nothing we can do
about that, except maybe give up some performance in exchange for
slightly more clarity), but since it's (auto)annotated by the tool,
it's often easy to map a faulty place (where a crash occurs, for
example) back to the corresponding place in the original .dsc file and
fix the problem.

I just had to explain my motivation for writing the tool, and maybe
help others understand its importance, especially for this particular
project.

XEP
