decgen: A little something that might help with libcpu

Jeffrey Lee

May 25, 2010, 3:52:36 PM
to lib...@googlegroups.com
Hi guys,

After talking with Michael a bit I've heard that you're interested in
moving away from your hand-crafted instruction decoders and instead using
something a bit more robust/automated/maintainable. So I'm here to do a
bit of shameless advertising for the instruction decoder generator I've
been working on for the past few weeks :)

The tool is called decgen, and it can be found here on my website:

http://www.phlamethrower.co.uk/riscos/decgen.php

Basically, to use decgen you write one or more 'encoding files' describing
the encodings of each instruction, and one or more 'action files'
containing blocks of C code that should be executed whenever the
corresponding instruction is encountered. decgen uses the data in the
encoding files to generate a decision tree which can be used for rapid
classification of instructions. It then spits out a C file containing a
function that walks that tree and performs the indicated
encoding-specific action whenever a leaf node is reached (i.e. when the
instruction has been decoded).
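
To give a rough idea of the "walk a decision tree, act at the leaf" shape,
here's a minimal hand-written sketch in C. It is not decgen's actual output
or file format; the 8-bit toy instruction set and every name in it
(toy_op, toy_node, toy_tree, toy_decode) are made up purely for
illustration. Each interior node tests one bitfield and each leaf names the
decoded instruction, which is where an action block would run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-bit toy ISA (an assumption, not a real architecture):
 *   00xxxxxx ADD    01xxxxxx SUB
 *   100xxxxx LOAD   101xxxxx STORE
 *   11xxxxxx undefined
 */
typedef enum { OP_ADD, OP_SUB, OP_LOAD, OP_STORE, OP_UNDEFINED } toy_op;

typedef struct {
    int     is_leaf;   /* 1 = leaf node, 0 = interior node              */
    toy_op  op;        /* leaf: the decoded instruction                 */
    uint8_t shift;     /* interior: bit position of the field to test   */
    uint8_t mask;      /* interior: mask applied after shifting         */
    uint8_t child[4];  /* interior: next node index for each field value */
} toy_node;

static const toy_node toy_tree[] = {
    /* 0 */ { 0, OP_UNDEFINED, 6, 0x3, { 1, 2, 3, 4 } }, /* test bits 7:6 */
    /* 1 */ { 1, OP_ADD,       0, 0,   { 0, 0, 0, 0 } },
    /* 2 */ { 1, OP_SUB,       0, 0,   { 0, 0, 0, 0 } },
    /* 3 */ { 0, OP_UNDEFINED, 5, 0x1, { 5, 6, 0, 0 } }, /* test bit 5    */
    /* 4 */ { 1, OP_UNDEFINED, 0, 0,   { 0, 0, 0, 0 } },
    /* 5 */ { 1, OP_LOAD,      0, 0,   { 0, 0, 0, 0 } },
    /* 6 */ { 1, OP_STORE,     0, 0,   { 0, 0, 0, 0 } },
};

/* Walk the tree from the root until a leaf is reached. */
toy_op toy_decode(uint8_t instr)
{
    size_t n = 0;
    while (!toy_tree[n].is_leaf) {
        const toy_node *node = &toy_tree[n];
        n = node->child[(instr >> node->shift) & node->mask];
        assert(n < sizeof toy_tree / sizeof toy_tree[0]);
    }
    /* A generated decoder would run the encoding's action block here;
     * this sketch just reports which instruction was matched. */
    return toy_tree[n].op;
}
```

In this sketch toy_decode(0xA0) walks root -> node 3 -> node 6 and reports
OP_STORE. A real generated decoder could just as well be emitted as nested
switch statements; the table form is only one way to picture it.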

Although there's no guarantee that decgen will produce a decoder that's
faster than a hand-crafted one, it does offer two main advantages:
guaranteed correctness and ease of extensibility. The encoding files you
give to decgen don't dictate the structure of the decoder, so you can
add and remove instructions without fear of breaking the decoding logic.
Plus the validation and verification steps that decgen performs ensure
that there aren't any ambiguities or gaps in the instruction set
definition.
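
As a rough illustration of what checking for ambiguities and gaps can mean,
here's a small C sketch. It is a generic brute-force technique under my own
assumptions, not a description of how decgen does its checks: each encoding
is reduced to a hypothetical (mask, value) pair over an 8-bit instruction
word, and all the names (toy_enc, toy_overlap, toy_count_overlaps,
toy_count_gaps) are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding: instr matches when (instr & mask) == value. */
typedef struct {
    uint8_t mask;
    uint8_t value;
} toy_enc;

int toy_matches(toy_enc e, uint8_t instr)
{
    return (instr & e.mask) == e.value;
}

/* Two encodings are ambiguous if some word matches both, i.e. they
 * agree on every bit that both of them constrain. */
int toy_overlap(toy_enc a, toy_enc b)
{
    /* Well-formed encodings have no value bits outside their mask. */
    assert((a.value & ~a.mask) == 0);
    assert((b.value & ~b.mask) == 0);
    return ((a.value ^ b.value) & (a.mask & b.mask)) == 0;
}

/* Number of encoding pairs that can match the same instruction word. */
unsigned toy_count_overlaps(const toy_enc *encs, size_t n)
{
    unsigned count = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (toy_overlap(encs[i], encs[j]))
                count++;
    return count;
}

/* Number of 8-bit words matched by no encoding at all (brute force,
 * which is only practical because the toy word is so small). */
unsigned toy_count_gaps(const toy_enc *encs, size_t n)
{
    unsigned gaps = 0;
    for (unsigned w = 0; w < 256; w++) {
        int hit = 0;
        for (size_t i = 0; i < n; i++)
            if (toy_matches(encs[i], (uint8_t)w))
                hit = 1;
        if (!hit)
            gaps++;
    }
    return gaps;
}
```

For example, the pair {0xC0, 0x00} and {0x80, 0x00} overlaps (every word
starting 00 matches both) and leaves all 128 words with the top bit set
uncovered. For 32-bit instructions you'd want something smarter than
enumerating every word, but the pairwise overlap test stays the same.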

I know that for a while Orlando has been working on a 'UPCL' language
for describing the behaviour of instructions; as I understand it the goal
is to produce a UPCL compiler that translates the instruction definitions
into C/C++ code which either emulates the instruction outright or
generates LLVM bitcode for it. If that's the case then it should be fairly
trivial to have the UPCL tools output action files suitable for use with
decgen, so that you can use the two together without any hassle. And just
like you can create UPCL files by transcribing the pseudocode from (e.g.)
the ARM architecture reference manual, you can also create decgen encoding
files by transcribing the encoding information from the reference manual
too. (In fact, I've already done the hard part of transcribing the 1800+
ARM/Thumb encodings for you, so even if you only use decgen for your ARM
frontend it will still save you many hours of work.)

Of course decgen is still very new, so it's a bit rough around the edges,
and lacks some features that could be useful for certain architectures
(e.g. at the moment it only supports fixed-size instructions). There may
also be a few bugs lurking around, although I'd hope that the release
I made a couple of days ago will have fixed all the big ones.

I'll be using decgen to produce a number of instruction set decoders for
use in/with RISC OS, so you can expect it to see a fair amount of
development over the coming months. If other people are interested in
helping develop decgen then I'd be happy to move it to somewhere like
SourceForge.

Let me know what you guys think!

Cheers,

- Jeffrey
