switch/case statement vs function pointer table

1,129 views
Skip to first unread message

Ulf Magnusson

unread,
Oct 12, 2011, 5:59:47 PM10/12/11
to andro...@googlegroups.com
Hi,

I'm writing an NES emulator for fun. In my main 6502 emulation loop I
need to dispatch on an 8-bit opcode to instruction-specific handlers.
There seems to be two common approaches:

1. Use a switch/case statement, i.e.,

switch(read_opcode(pc++)) {
case <instruction 1>: ...
case <instruction 2>: ...
}

2. Use a table of function pointers to handlers, i.e.,

typedef void (*instruction_handler)();
instruction_handler instruction_handlers[0x100] = { ... };
...
instruction_handlers[read_opcode(pc++)]();

Which approach is likely to generate the most efficient code with GCC
on an ARM Android device? Does the fact that the code needs to be
relocatable come in in any way?

I realize I should profile to figure out if CPU emulation is actually
worth optimizing compared to other things (from what I've heard, it's
usually negligible), but it's a fun topic anyway.

/Ulf

Carlos A. M. dos Santos

unread,
Oct 12, 2011, 11:08:32 PM10/12/11
to andro...@googlegroups.com

Use an array of label pointers as a jump table. It will be faster than
calling functions and even slightly faster than the switch/case (much
faster, if the number of opcodes is large).

See http://gcc.gnu.org/onlinedocs/gcc-4.4.6/gcc/Labels-as-Values.html#Labels-as-Values

--
"The flames are all long gone, but the pain lingers on"

Ulf Magnusson

unread,
Oct 13, 2011, 3:19:19 PM10/13/11
to andro...@googlegroups.com

How does that differ from the kind of jump table the compiler might
generate for a switch?

I'm planning on having this compile with at least VC++ as well by the
way, so might be messy to use GCC extensions :/

/Ulf

Tim Mensch

unread,
Oct 13, 2011, 3:54:25 PM10/13/11
to andro...@googlegroups.com
On 10/13/2011 1:19 PM, Ulf Magnusson wrote:
> How does that differ from the kind of jump table the compiler might
> generate for a switch?
>
> I'm planning on having this compile with at least VC++ as well by the
> way, so might be messy to use GCC extensions :/

switch tends to be compiled to be quite optimal -- if you've got 100%
coverage of a region, it probably is just as fast as the array of jump
labels.

Unfortunately, the only way to find out would be to benchmark actual code.

If I were doing it, and I wanted it to be as fast as possible, I'd code
it all three ways using some a generative programming technique -- you
could do it using C macros, for instance. All of the code that DOES
anything would live in short functions with an optional "inline"
attribute, so that GCC/MSVC can inline them into the switch statement,
or keep them as distinct functions in the case of the function pointer
solution.

But honestly, I'd probably not optimize this until I knew there was an
issue -- and I trust switch enough to let the compiler pick the best way
to optimize it. You can always move to a different approach later if you
find you need additional speed -- especially if the switch is something
that could pretty trivially be generated from a list of opcode macros
(each function that DOES something shares a name with the opcode macro,
for instance).

Tim

Ulf Magnusson

unread,
Oct 13, 2011, 4:03:52 PM10/13/11
to andro...@googlegroups.com

Yeah, that's the way I've tried to arrange things, with core
instruction logic in separate inline functions, and then some macro
magic to generate versions for all the different addressing modes.
Should be fairly simple to try out different approaches later on.

/Ulf

Carlos A. M. dos Santos

unread,
Oct 13, 2011, 9:09:05 PM10/13/11
to andro...@googlegroups.com
On Thu, Oct 13, 2011 at 4:19 PM, Ulf Magnusson <ulfa...@gmail.com> wrote:
> On Thu, Oct 13, 2011 at 5:08 AM, Carlos A. M. dos Santos
> <unix...@gmail.com> wrote:
[...]

>> Use an array of label pointers as a jump table. It will be faster than
>> calling functions and even slightly faster than the switch/case (much
>> faster, if the number of opcodes is large).
>>
>> See http://gcc.gnu.org/onlinedocs/gcc-4.4.6/gcc/Labels-as-Values.html#Labels-as-Values
>
> How does that differ from the kind of jump table the compiler might
> generate for a switch?

That will depend on the compiler. Last time I checked (GCC 3.x,
cross-compiling for MIPS and Coldfire processors) the hand crafted
version was better. GCC probably generates better object code now.

> I'm planning on having this compile with at least VC++ as well by the
> way, so might be messy to use GCC extensions :/

You can use preprocessor macros and conditional compilation here.

Reply all
Reply to author
Forward
0 new messages