Improving efficiency of a bytecode interpreter

Mark R. Bannister

unread,

Oct 27, 2009, 6:43:04 AM10/27/09

to

I'm developing a new programming language, and at the core of this
language is a bytecode interpreter (yes, I've invented the bytecode
too, as well as assembly code and an assembler ...)

It's all written in C.

My question is, can anyone think of a way of making engine.c (see link
below) smaller and more efficient? This is the heart of the
interpreter, it's a big switch with switches inside it, a bit like
this pseudo-code:

switch (opcode)
{
case for each bytecode instruction with similar arguments:
process arguments
perform command
}

What troubles me is that a lot of the code for processing arguments is
very similar, but because different instructions take different
arguments I can't readily process them up-front. Take a look at the
code itself and if you have any useful suggestions please send them my
way.

http://prose.cvs.sourceforge.net/viewvc/prose/prose/src/prose/engine.c?view=markup

Thanks,
Mark Bannister.
(cambridge at sourceforge dot net)

Richard Heathfield

unread,

Oct 27, 2009, 7:57:19 AM10/27/09

to

In
<64a6511a-0b84-4ddd...@f16g2000yqm.googlegroups.com>,
Mark R. Bannister wrote:

> I'm developing a new programming language, and at the core of this
> language is a bytecode interpreter (yes, I've invented the bytecode
> too, as well as assembly code and an assembler ...)
>
> It's all written in C.

Almost 10,000 lines of C, which means you'll struggle to find anyone
prepared to do an in-depth analysis for you for free. This is even
more true because it's awkward to extract your C from your HTML so
that it can be loaded easily into a decent editor.

It doesn't help that your code is written in a rather archaic style.
Nor is it particularly helpful to have your macro definitions
scattered throughout the code.

> My question is, can anyone think of a way of making engine.c (see
> link below) smaller and more efficient?

Firstly, despite the 10,000 lines, "smaller" isn't really your problem
(in terms of line count, that is). Your problem is not so much size
as complexity. Many project teams mandate a limit of 1000 lines per
source file, which is perhaps a little restrictive, but you have a
single *function* that is almost ten times that long! So your first
step should be to identify logical groupings of code that you can
move off into their own functions. Don't worry for now about
refactoring for commonality - just refactor for function size! Once
you have your functions down to a reasonable size (and have re-tested
to identify bugs introduced in the process), you can start to work on
efficiency. A profiler is a lot more useful to you if you have many
small functions rather than one colossal monolith.

<snip>

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within

Richard

unread,

Oct 27, 2009, 8:16:03 AM10/27/09

to

Richard Heathfield <r...@see.sig.invalid> writes:

> In
> <64a6511a-0b84-4ddd...@f16g2000yqm.googlegroups.com>,
> Mark R. Bannister wrote:
>
>> I'm developing a new programming language, and at the core of this
>> language is a bytecode interpreter (yes, I've invented the bytecode
>> too, as well as assembly code and an assembler ...)
>>
>> It's all written in C.
>
> Almost 10,000 lines of C, which means you'll struggle to find anyone
> prepared to do an in-depth analysis for you for free. This is even
> more true because it's awkward to extract your C from your HTML so
> that it can be loaded easily into a decent editor.
>
> It doesn't help that your code is written in a rather archaic style.
> Nor is it particularly helpful to have your macro definitions
> scattered throughout the code.

Good old Heathfield.

First slap the outstretched hand, kick him in the head and then settle
down to offer your experience ....

>
>> My question is, can anyone think of a way of making engine.c (see
>> link below) smaller and more efficient?
>
> Firstly, despite the 10,000 lines, "smaller" isn't really your problem
> (in terms of line count, that is). Your problem is not so much size
> as complexity. Many project teams mandate a limit of 1000 lines per
> source file, which is perhaps a little restrictive, but you have a
> single *function* that is almost ten times that long! So your first
> step should be to identify logical groupings of code that you can
> move off into their own functions. Don't worry for now about
> refactoring for commonality - just refactor for function size! Once
> you have your functions down to a reasonable size (and have re-tested
> to identify bugs introduced in the process), you can start to work on
> efficiency. A profiler is a lot more useful to you if you have many
> small functions rather than one colossal monolith.
>
> <snip>

Unbelievable. A lecture on using functions.

LOL!

--
"Avoid hyperbole at all costs, its the most destructive argument on
the planet" - Mark McIntyre in comp.lang.c

Mark R. Bannister

unread,

Oct 27, 2009, 8:35:33 AM10/27/09

to

On Oct 27, 11:57 am, Richard Heathfield <r...@see.sig.invalid> wrote:
<snip>

> So your first
> step should be to identify logical groupings of code that you can
> move off into their own functions. Don't worry for now about
> refactoring for commonality - just refactor for function size!

<snip>

Thanks Richard, yes I have considered this. However, what is the
overhead of adding a function? I obviously want my bytecode to
execute as efficiently as possible. If every single bytecode
instruction launches a function to perform its role, it's not going to
be so quick is it? That's why one big function monolith. The other
problem with adding functions is it then requires all kinds of
different variables required to keep the engine running to be passed
to the function. I could encapsulate these in one big structure and
pass a single pointer, but then I'm making more data available to each
function than it really needs, this goes against my nature (usually I
only want to pass the data I know the function will need).

So, functions aside, are there any other suggestions from anyone
else? Speed is of the essence, I don't want to do anything that will
slow it down (execution speed before style, Richard).

jacob navia

unread,

Oct 27, 2009, 9:49:24 AM10/27/09

to

Mark R. Bannister a �crit :

I think the first step is to build an array of 255 function pointers. For each position
in the array, write a function that does the work of the corresponding byte code.

Then, instead of a switch you write:

(functionTable[bytecode])();

Then, to avoid the overhead of returning to call again a function write as the last
statement of each function

(functionTable[bytecode])();

That will jump to the next function. Problem is, that is not a jump, but a call.
Eventually, you could run out of stack space. You can only get away with this schema
if you get a compiler that allows taking the address of a label and putting them
in an array. Gcc does that.

Another way is to write the core in assembler. I did that for my byte interpreter of lisp,
and the speed was way beyond what C can ever achieve.

James Kuyper

unread,

Oct 27, 2009, 10:24:38 AM10/27/09

to

Mark R. Bannister wrote:
> On Oct 27, 11:57�am, Richard Heathfield <r...@see.sig.invalid> wrote:
> <snip>
>> So your first
>> step should be to identify logical groupings of code that you can
>> move off into their own functions. Don't worry for now about
>> refactoring for commonality - just refactor for function size!
> <snip>
>
> Thanks Richard, yes I have considered this. However, what is the
> overhead of adding a function? I obviously want my bytecode to
> execute as efficiently as possible.

If you've got a single function that is almost 10,000 lines long, you've
almost certainly got lots of design errors and other bugs to work out.
Simplify the code by breaking it into smaller pieces, each of which can
be properly understood. Only after you have a coherent overall design up
and working correctly should you worry about details of efficiency.

> be so quick is it? That's why one big function monolith. The other
> problem with adding functions is it then requires all kinds of
> different variables required to keep the engine running to be passed
> to the function. I could encapsulate these in one big structure and
> pass a single pointer, but then I'm making more data available to each
> function than it really needs, this goes against my nature (usually I
> only want to pass the data I know the function will need).

That's why that's the wrong thing to do. Created many different data
structures, each of them collecting together functionally related pieces
of data. Then any given function will generally need only a few of those
structures, and most functions will make use of a fair fraction of all
of the members of any structure they do use.

> So, functions aside, are there any other suggestions from anyone
> else? Speed is of the essence, I don't want to do anything that will
> slow it down (execution speed before style, Richard).

That's exactly the wrong attitude: good programming style leads to
elegant design and accurate code, and makes the code easier to maintain.
Don't worry about efficiency until you've got other, more important
matters such as correctness and clarity dealt with.

BGB / cr88192

unread,

Oct 27, 2009, 10:25:44 AM10/27/09

to

"Mark R. Bannister" <chap...@yahoo.com> wrote in message
news:ff6e8f3a-4c58-4c21...@a32g2000yqm.googlegroups.com...

On Oct 27, 11:57 am, Richard Heathfield <r...@see.sig.invalid> wrote:
<snip>
> So your first
> step should be to identify logical groupings of code that you can
> move off into their own functions. Don't worry for now about
> refactoring for commonality - just refactor for function size!
<snip>

<--

Thanks Richard, yes I have considered this. However, what is the
overhead of adding a function? I obviously want my bytecode to
execute as efficiently as possible. If every single bytecode
instruction launches a function to perform its role, it's not going to
be so quick is it? That's why one big function monolith. The other
problem with adding functions is it then requires all kinds of
different variables required to keep the engine running to be passed
to the function. I could encapsulate these in one big structure and
pass a single pointer, but then I'm making more data available to each
function than it really needs, this goes against my nature (usually I
only want to pass the data I know the function will need).

So, functions aside, are there any other suggestions from anyone
else? Speed is of the essence, I don't want to do anything that will
slow it down (execution speed before style, Richard).

-->

with this code, you will not lose much.

noting as how damn near every case is a full block (with variables, if
statements, loops, and such...), the relative cost of adding a function is
trivial.

it can be noted that, on average, if/for/while/... are fairly expensive vs
other operations (whereas your code uses them endlessly).

putting state in a big structure:
I, and probably most others I know of, also do this...

other ideas:
don't do high-level opcode decoding for each opcode, or if you really must
have high-level opcode decoding, maybe put it in its own code. this is what
I do for x86 machine code, where 1 piece of code is responsible for decoding
opcodes, and another piece of code for executing them.

so, in this case, I decode opcodes, and then run them through a switch based
on argument configuration (Reg, RM, Imm, RegRM, RMReg, Imm, RegImm, RMImm,
...), and then through another big switch (per-opcode number).

typically, each of these has its own functions, so the argument
configuration switch calls the function for handling the particular
configuration, which does a big switch for handling the various opcodes.

another route is to make each opcode have fixed-form arguments (sort of like
in the JVM's JBC, MSIL, ...). where, if one has different arguments, they
also have different opcode names and numbers.

another thing:
if possible, try to make operations fairly close to atomic.
if it is an atomic operation, then maybe it goes in the switch;
if it is not (as in, involves a big ugly chunk of logic code), then maybe it
is better served by a function.

FWIW, I handled pretty much the entire integer subset of x86 in around 2
kloc (for the per-instruction code), with about 700 lines going just into
dealing with REP/REPE/REPZ opcode forms (I give them a lot of special
treatment...).

most of the rest is simple operations, although a lot of this function calls
(given x86 has fairly complicated register-handling logic, and the great
terror known as 'eflags' which is awkward to simulate effectively...), so a
lot of code for dealing with registers and eflags is done elsewhere (as well
as the code for managing the virtual address space, which is essentially
independent of the interpreter, mostly so that they don't step on each
other, ...).

it is then about 1.5 kloc for the code to simulate the x87 FPU (including
conversion code, 80-bit float operations being done primarily via integer
math, ...).

granted, there is a 1.2 kloc chunk of code dedicated primarily to
getting/setting registers (various types, sizes, sign vs zero extension,
...).

...

and, in all this, I realize that x86 is very difficult to interpret
efficiently, vs nearly any other bytecode.

most of my bytecodes could be interpreted almost entirely with a big dumb
switch and maybe 1 or 2 statements for each case...

or such...

BGB / cr88192

unread,

Oct 27, 2009, 10:37:41 AM10/27/09

to

"Richard Heathfield" <r...@see.sig.invalid> wrote in message
news:tOudncpQP99BQ3vX...@bt.com...

> In
> <64a6511a-0b84-4ddd...@f16g2000yqm.googlegroups.com>,
> Mark R. Bannister wrote:
>
>> I'm developing a new programming language, and at the core of this
>> language is a bytecode interpreter (yes, I've invented the bytecode
>> too, as well as assembly code and an assembler ...)
>>
>> It's all written in C.
>
> Almost 10,000 lines of C, which means you'll struggle to find anyone
> prepared to do an in-depth analysis for you for free. This is even
> more true because it's awkward to extract your C from your HTML so
> that it can be loaded easily into a decent editor.
>
> It doesn't help that your code is written in a rather archaic style.
> Nor is it particularly helpful to have your macro definitions
> scattered throughout the code.
>

FFS this code is terrible, IMHO...

this is also another example of code which would probably break my compiler,
oh well...

>> My question is, can anyone think of a way of making engine.c (see
>> link below) smaller and more efficient?
>
> Firstly, despite the 10,000 lines, "smaller" isn't really your problem
> (in terms of line count, that is). Your problem is not so much size
> as complexity. Many project teams mandate a limit of 1000 lines per
> source file, which is perhaps a little restrictive, but you have a
> single *function* that is almost ten times that long! So your first
> step should be to identify logical groupings of code that you can
> move off into their own functions. Don't worry for now about
> refactoring for commonality - just refactor for function size! Once
> you have your functions down to a reasonable size (and have re-tested
> to identify bugs introduced in the process), you can start to work on
> efficiency. A profiler is a lot more useful to you if you have many
> small functions rather than one colossal monolith.
>

agreed...

when each switch statement includes its own block, functions are advised.
typically (in C at least), function call overhead is only really noticed
when the functions are trivially simple, as in:

int foo()
{ return(bar()); }

even then, it is small...

and, apparently a bit larger on Intel than AMD CPU's, although it seems
Intel CPUs are faster per-clock for unbranching code, which in my test has a
2.10 GHz Intel CPU doing slightly better on average than a 2.31 GHz AMD CPU,
except for short functions, such as mutex locking and unlocking, ... which
apparently manage to 'pwn' the Intel...

but, if one's code has, say, variables and if statements, ... then a
function call is likely to make little or no real difference (and may
actually improve performance for reasons internal to the compiler...).

or such...

Mark R. Bannister

unread,

Oct 27, 2009, 10:49:58 AM10/27/09

to

On Oct 27, 2:25 pm, "BGB / cr88192" <cr88...@hotmail.com> wrote:
<snip>

> another thing:
> if possible, try to make operations fairly close to atomic.
> if it is an atomic operation, then maybe it goes in the switch;
> if it is not (as in, involves a big ugly chunk of logic code), then maybe it
> is better served by a function.

I think you've hit the nail on the head. Thank you everyone for your
comments, they have been helpful and given me something to think
about. I know engine.c is horrible at the moment. It didn't start
off that way, it's just grown slowly over the years.

I will try to piece my engine variables into a few more structures
(ultimately they need to be anyway so I can easily save and restore
the current execution thread). Any operations that are not atomic
will go into functions, that I will get the compiler to inline if
possible.

Hey ho, more heavy shifting work to do, and just when I thought I was
on the verge of the next exciting breakthrough ..... :-)

pete

unread,

Oct 27, 2009, 10:53:07 AM10/27/09

to

BGB / cr88192 wrote:

> So, functions aside, are there any other suggestions from anyone
> else? Speed is of the essence, I don't want to do anything that will
> slow it down (execution speed before style, Richard).

If you are able to replace a giant switch
with an array of function pointers,
then that may or may not be a speed increase.

--
pete

Richard

unread,

Oct 27, 2009, 11:12:29 AM10/27/09

to

James Kuyper <james...@verizon.net> writes:

> Mark R. Bannister wrote:
>> On Oct 27, 11:57�am, Richard Heathfield <r...@see.sig.invalid> wrote:
>> <snip>
>>> So your first
>>> step should be to identify logical groupings of code that you can
>>> move off into their own functions. Don't worry for now about
>>> refactoring for commonality - just refactor for function size!
>> <snip>
>>
>> Thanks Richard, yes I have considered this. However, what is the
>> overhead of adding a function? I obviously want my bytecode to
>> execute as efficiently as possible.
>
> If you've got a single function that is almost 10,000 lines long, you've
> almost certainly got lots of design errors and other bugs to work
> out.

What total nonsense. It could well be auto generated.

Seebs

unread,

Oct 27, 2009, 11:22:44 AM10/27/09

to

On 2009-10-27, Mark R. Bannister <chap...@yahoo.com> wrote:
> So, functions aside, are there any other suggestions from anyone
> else? Speed is of the essence, I don't want to do anything that will
> slow it down (execution speed before style, Richard).

Actually, he's probably right.

Here's the thing. Functions MIGHT have overhead. But they also might help
the compiler make good optimization decisions. I wouldn't bet on the code
slowing down AT ALL from a conversion to functions. It might actually be
faster.

As an example: Imagine that you currently have a giant switch table. And you
replace it with a table of function pointers. There have existed compilers
such that, when you do this, and write:

*(func[opcode])(...);

that your performance will INCREASE. Dramatically.

Because some compilers have tended to implement switch statements as a huge
string of if/else if/else if/else if, in practice. :)

The thing is... No, you're wrong. Not execution speed before style.

The order is:
1. Clarity.
2. Good profiling results.
3. NOW you can try to optimize.

Please review Henry Spencer's 10 Commandments for C Programmers. In
particular, the stuff about premature optimization. You have committed
some EXTREMELY premature optimization here, and it's going to make your
life hell.

-s
--
Copyright 2009, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!

BGB / cr88192

unread,

Oct 27, 2009, 11:28:37 AM10/27/09

to

"pete" <pfi...@mindspring.com> wrote in message
news:L9GdnSMbMqrMlHrX...@earthlink.com...

replacing a giant switch with an array of function pointers:
no, this will probably not improve speed.

since an array of pointers is also very position sensitive, it may also be
much more difficult to work with (if manually produced), whereas one can
fairly easily work with a switch (adding/removing items, ...).
internally, compilers tend to handle switches as an array of pointers
anyways, although it is typically to specific statements (usually a big glob
of statements placed after the dispatch logic for the switch AFAIK).

however, if each switch item is simply a function call, then it is possible
there might be a slight speed increase (as well as a reducing the
machine-code size, due to not having a whole bunch of function-call related
instruction sequences), but the speed increase is not likely to be
significant (from the POV of the CPU, both sets of instruction sequences
would likely be fairly similar either way).

(well, except in my compiler, which is kind of lame in that it does not use
jump tables, as I never got around to this...).

> --
> pete

Mark R. Bannister

unread,

Oct 27, 2009, 11:33:02 AM10/27/09

to

On Oct 27, 3:22 pm, Seebs <usenet-nos...@seebs.net> wrote:
> As an example: Imagine that you currently have a giant switch table. And you
> replace it with a table of function pointers. There have existed compilers
> such that, when you do this, and write:
>
> *(func[opcode])(...);
>
> that your performance will INCREASE. Dramatically.

If I did this, I'd have to pass all the same variables to every
function, wouldn't I? I have a large number of variables used by the
processing engine, which different instructions may or may not
manipulate. How could I implement a table of function pointers, but
still have the flexibility to send only the right arguments to the
right functions?

BGB / cr88192

unread,

Oct 27, 2009, 11:47:22 AM10/27/09

to

"James Kuyper" <james...@verizon.net> wrote in message
news:hc6vra$ern$1...@news.eternal-september.org...
> Mark R. Bannister wrote:

>> On Oct 27, 11:57?am, Richard Heathfield <r...@see.sig.invalid> wrote:
>> <snip>
>>> So your first
>>> step should be to identify logical groupings of code that you can
>>> move off into their own functions. Don't worry for now about
>>> refactoring for commonality - just refactor for function size!
>> <snip>
>>
>> Thanks Richard, yes I have considered this. However, what is the
>> overhead of adding a function? I obviously want my bytecode to
>> execute as efficiently as possible.
>
> If you've got a single function that is almost 10,000 lines long, you've
> almost certainly got lots of design errors and other bugs to work out.
> Simplify the code by breaking it into smaller pieces, each of which can be
> properly understood. Only after you have a coherent overall design up and
> working correctly should you worry about details of efficiency.
>

it was said elsethread that this was produced over a period of years,
meaning that likely it is at least acceptably stable in this respect...

>> be so quick is it? That's why one big function monolith. The other
>> problem with adding functions is it then requires all kinds of
>> different variables required to keep the engine running to be passed
>> to the function. I could encapsulate these in one big structure and
>> pass a single pointer, but then I'm making more data available to each
>> function than it really needs, this goes against my nature (usually I
>> only want to pass the data I know the function will need).
>
> That's why that's the wrong thing to do. Created many different data
> structures, each of them collecting together functionally related pieces
> of data. Then any given function will generally need only a few of those
> structures, and most functions will make use of a fair fraction of all of
> the members of any structure they do use.
>

my usual approach is to have a central "context" which may or may not
contain sub-structures.
in my JBC interpreter, the 'frame' structure was generally passed instead,
since oddly the frame tended to be a more central structure than the main
context...

>> So, functions aside, are there any other suggestions from anyone
>> else? Speed is of the essence, I don't want to do anything that will
>> slow it down (execution speed before style, Richard).
>
> That's exactly the wrong attitude: good programming style leads to elegant
> design and accurate code, and makes the code easier to maintain. Don't
> worry about efficiency until you've got other, more important matters such
> as correctness and clarity dealt with.

more so, smaller and cleaner code is usually much easier to optimize.

since, although a function call is not free, it is much easier to apply
fixes and optimizations to cleanly centralized chunks of code than to go and
find and fix all of these repetitions.

I will qualify something here:
IMO this applies primarily to the small scale, and in the case of very large
masses of code (by this I mean projects > maybe 200 kloc or so), it may be
better to actually use a copy-paste-specialize approach, since
modularization and isolation becomes a much more effective strategy for
"distant" regions of code than does centralization (sharing code in this
case makes a mess of modularization).

or, stated another way:
centralize and integrate at the small scale (< 25-50 kloc);
modularize and isolate at larger scales.

similarly, centralization improves flexibility at the small scale (by
reducing redundant code, ...);
whereas, centralization hurts flexibility at larger scales (by essentially
"crystalizing" the codebase);
...

however, on a much larger scale, and in a different way, this pattern seems
to repeat (me considering more "massive" projects, such as the GNU and Linux
world-space, ...). although, this is more in terms of disjoint projects and
inter-project dependencies.

or, at least, this is my personal experience...

Richard Heathfield

unread,

Oct 27, 2009, 12:09:11 PM10/27/09

to

In <slrnhee43f.83l...@guild.seebs.net>, Seebs wrote:

> On 2009-10-27, Mark R. Bannister <chap...@yahoo.com> wrote:
>> So, functions aside, are there any other suggestions from anyone
>> else? Speed is of the essence, I don't want to do anything that
>> will slow it down (execution speed before style, Richard).
>
> Actually, he's probably right.

Oi, Seebs, less of the "probably"!

Mark R. Bannister

unread,

Oct 27, 2009, 12:05:30 PM10/27/09

to

On Oct 27, 3:47 pm, "BGB / cr88192" <cr88...@hotmail.com> wrote:
> "James Kuyper" <jameskuy...@verizon.net> wrote in message
<snip>

> it was said elsethread that this was produced over a period of years,
> meaning that likely it is at least acceptably stable in this respect...

I've been working on the project since 2002, although engine.c, the
subject of this thread, began life in Oct 2006. Yes it's very stable.

If I were to functionise the code for each instruction, how clever is
the compiler when it comes to inlining functions? What I mean is, say
I have a variable called program_ptr. This is, by no surprise, the
program pointer. Each instruction needs to read the data at the
program pointer and advance the pointer, because each instruction may
have a different number of arguments (some instructions just keep
reading arguments until there are no more - you could literally pass
hundreds of arguments to some instructions if so required).

Currently program_ptr is a register. In the current scenario, it
remains as such. When functionised, of course, it can no longer be a
register as I need to pass it to a function. But what happens when
that function is inlined by an optimiser? If I've gone to the bother
of wrapping the program_ptr up into a structure to pass it to a
function that has been inlined, am I not wasting time wrapping up
something that only then needs to be immediately unwrapped? Or are
compilers clever enough to miss out the wrapping/unwrapping stage when
inlining? I suppose I'll only know for sure by trying it and then
disassembling the result....

There again I could take the approach I've taken for ages, which is
"who cares it's 9,942 lines today, it works, leave it alone".
Especially when one has a 3-month old baby and less & less time for
hobbyist programming :-)

BGB / cr88192

unread,

Oct 27, 2009, 12:12:15 PM10/27/09

to

"Mark R. Bannister" <chap...@yahoo.com> wrote in message
news:652795f3-1b3b-4470...@37g2000yqm.googlegroups.com...

On Oct 27, 3:22 pm, Seebs <usenet-nos...@seebs.net> wrote:
> As an example: Imagine that you currently have a giant switch table. And
> you
> replace it with a table of function pointers. There have existed compilers
> such that, when you do this, and write:
>
> *(func[opcode])(...);
>
> that your performance will INCREASE. Dramatically.

<--

If I did this, I'd have to pass all the same variables to every
function, wouldn't I? I have a large number of variables used by the
processing engine, which different instructions may or may not
manipulate. How could I implement a table of function pointers, but
still have the flexibility to send only the right arguments to the
right functions?

-->

in my interpreter, I split it up like this:
there is the 'Context' which holds the general state (actually, it is
composed partly of references to other structures, which hold more specific
state for specific subsystems, such as the MMU, ...);
then there is 'DecodeOp', which basically holds info relevant to the current
opcode (its opcode number, arguments, ...).

so, if you want to get, say, EAX (or maybe R11W, CR4, ...), then this info
is grabbed from the context.
it you want to know, say, which register is in 'Reg', or where RM points,
then this is figured from the DecodeOp (note: I have a special function to
figure out the value of RM at a given moment).

(it is worth noting as a technical impact of this: because I am only using a
single value for a field which can handle both registers and memory
operands, internally I ended up memory-mapping the registers, although not
as byte-addressable memory...).

actually, I ended up reserving the whole 3GB-4GB region for "internal" uses,
so:
0x00000000-0x7FFFFFFF: reserved for a simulated process;
0x80000000-0xBFFFFFFF: reserved for shared memory / ... (maybe, could be
made general-use);
0xC0000000-0xFFFFFFFF: reserved for internal use (current largest use:
swizzling references between disjoint address spaces, handles, ...).

0x100000000-... : only available for x86-64 simulation (yet, the prior
mapping remains in this case).

note: all of this is not strictly authentic (AKA: different from a real
CPU), however, the point of this interpreter is for userspace code, not as
an emulator (I can then safely diverge on points which would not effect
userspace code...).

BGB / cr88192

unread,

Oct 27, 2009, 12:16:55 PM10/27/09

to

"jacob navia" <ja...@nospam.org> wrote in message
news:hc6tol$bu1$1...@aioe.org...

yep.

this is also an advantage of having a dynamic assembler available, as then
one can procedurally generate fine-tuned interpreter logic.

however, I did not do this for my x86 interpreter, choosing instead to write
it in C to have a working version first (ASM is all or nothing, and if one
messes up somewhere in terms of overall design/... well then, there is a bit
of a mess...).

Gordon Burditt

unread,

Oct 27, 2009, 12:28:03 PM10/27/09

to

>My question is, can anyone think of a way of making engine.c (see link
>below) smaller and more efficient? This is the heart of the
>interpreter, it's a big switch with switches inside it, a bit like
>this pseudo-code:

You might make it smaller by grouping the opcodes by type of arguments
and handling the types in common code. That assumes that there
*are* groups of opcodes that have common argument types. If there
aren't, perhaps your bytecode design should be re-examined.

This will likely make the code smaller, but it probably won't speed it
up, unless you get a *LOT* of speedup from better locality of code.
You'll have an extra switch.

>switch (opcode)
>{
>case for each bytecode instruction with similar arguments:
> process arguments
> perform command
>}

char argtype[256] = { 1, 2, 1, .... };

switch (argtype[opcode])
{
case for each group with similar arguments:
process common arguments for this group;
}
switch (opcode)
{
case for each bytecode instruction :
process arguments
perform command
}

Where argtype[opcode] represents what group this particular opcode
falls in. For instance, you might have several instructions that
each refer to a destination register and a source virtual memory
with a 32-bit offset and an optional index register (e.g. load,
add, subtract, bitwise AND, bitwise OR, etc.). You can calculate
the addresses of these without having to look at the specific
instruction.

>What troubles me is that a lot of the code for processing arguments is
>very similar, but because different instructions take different
>arguments I can't readily process them up-front. Take a look at the
>code itself and if you have any useful suggestions please send them my
>way.

That's way too much code to look at.

bartc

unread,

Oct 27, 2009, 12:34:44 PM10/27/09

to

"Mark R. Bannister" <chap...@yahoo.com> wrote in message

news:64a6511a-0b84-4ddd...@f16g2000yqm.googlegroups.com...

> I'm developing a new programming language, and at the core of this
> language is a bytecode interpreter (yes, I've invented the bytecode
> too, as well as assembly code and an assembler ...)
>
> It's all written in C.
>
> My question is, can anyone think of a way of making engine.c (see link
> below) smaller and more efficient? This is the heart of the
> interpreter, it's a big switch with switches inside it, a bit like
> this pseudo-code:
>
> switch (opcode)
> {
> case for each bytecode instruction with similar arguments:
> process arguments
> perform command
> }
>
> What troubles me is that a lot of the code for processing arguments is
> very similar, but because different instructions take different
> arguments I can't readily process them up-front. Take a look at the
> code itself and if you have any useful suggestions please send them my
> way.

The main structural problems have already been addressed; the code is so
inefficient at present that offloading the execution of each main switch
case to a function is not a significant overhead. (And you should use enums
for opcodes not hex constants.)

You haven't seen said what performance speedups you're looking for. How fast
is it on a scale of C to Python at present? (Assuming your language does
similar things.)

For speed, you want to make as few runtime decisions as possible. From what
I can see, after each opcode is decoded, you have a series of if statements
and while loops to determine what to do next. And a call to a macro
PROGRAM_NEXT_ARG which looks a mess of code, which if it doesn't simplify
down is going to be a killer.

Does your bytecode have multiple operands (number not known until runtime)
and of a category (not type; that's a separate matter) not known until
runtime either? That will be slow to execute. Dedicated opcodes for each
combination is faster.

If your bytecode does use dynamic stuff like this, you need to be able to
determine things using a simple int value, to index a switch/array/function
array, not use random logic.

There are many other things, but at present the code seems very inefficient;
you want to completely minimise the number of lines executed per opcode
(unless it is a slow opcode anyway, such as implementing a=b where b is some
huge data).

When things get a bit tighter, you might want to look at changing how
opcodes are stored in the bytecode. Replace the index codes (00, 01, etc),
with the actual addresses of functions that deal with that opcode. Then the
main loop might look like this:

typedef void (*(*FFF))(void);

do {
(**(FFF)(pcptr))();
} while (1);

if you can find of neat way of exiting, otherwise it's more like: do ...
while (!stopped)

(At present I use asm code for the main dispatch code in my interpreters. To
execute the equivalent of your noop opcode takes just two x86 instructions
in overhead. In C code it would be an empty function with no parameters or
locals.)

--
Bartc

bartc

unread,

Oct 27, 2009, 1:02:59 PM10/27/09

to

Well, you just use globals, although that is frowned upon in this group (but
then so are 10000-line functions..).

Parameters only slow things down anyway.

One problem with switch (opcode) is that there will be code generated to
ensure opcode is in range (there may or may not be a way of turning this
off). And of course it needs to be in a dispatch loop.

One approach I tried with gcc was to make use of it's label addresses, then
this line:

goto *pcptr;

would jump to the next handler for this opcode. That handler would then need
to step pcptr to the next instruction (my bytecode interleaved 'opcodes' and
operands; opcodes were replaced by handler addresses). It could then also
execute goto *pcptr itself, the dispatch loop disappears!

To structure the code, I put each opcode handler into a function (with no
parameters or automatic locals), and try to goto that function, however that
didn't work, since gcc did some weird things with function prologs.

In the end I had to go outside of C to achieve this: fast threaded code and
encapsulating handlers inside 'functions' even though the functions are
never called and never returned from!

--
Bartc

Gene

unread,

Oct 27, 2009, 1:54:06 PM10/27/09

to

> http://prose.cvs.sourceforge.net/viewvc/prose/prose/src/prose/engine....

>
> Thanks,
> Mark Bannister.
> (cambridge at sourceforge dot net)

Optimizations at this level are always extremely dependent on the
compiler/processor/cache combination. A good compiler goes through a
lot of trouble to make switch statements efficient. For most
processors, a switch with many cases having similar values will be
translated to a jump table. This allows the dispatch to occur with
(usually) a couple of instructions. It's hard to do better with
dispatching, even hand-coding in assembly language. You might get a
tiny improvement by making sure your opcodes are in a continguous
stretch starting at zero.

You should handle the common argument processing with functions
declared static (local) in the same file. Such functions are nicely
inlinable by nearly all modern compilers. Then it's easy to test with
inlining compiler flags (like -finline-functions in gcc) set both on
and off. Now and then, you will actually get speedier execution with
the functions _not_ inlined for code cache reasons. If you open code
all the argument processing, you don't get to try this.

You'll note that performance-critical systems will generally have an
"arch" directory containing a gazzilion versions of inner loop code--
sometimes in assembly langage--customized for each operating
environment.

One other suggestion is to look at open source bytecode interpreters
for ideas. From past reading, Perl and Lua seem to be well-regarded
for speed, but I have no data to support this. You could look at one
of the "shoot out" sites for comparisons.

Mark R. Bannister

unread,

Oct 27, 2009, 4:32:29 PM10/27/09

to

On Oct 27, 5:02 pm, "bartc" <ba...@freeuk.com> wrote:
<snip>

> Well, you just use globals, although that is frowned upon in this group (but
> then so are 10000-line functions..).

Can't use globals, because ultimately the engine will be multi-
threaded.

So many people are offering so many useful suggestions, thank you.
You've all given me lots of ideas. Some of them are not feasible,
some of them I'll try. I'll see how I get on.

bartc

unread,

Oct 27, 2009, 4:57:58 PM10/27/09

to

Mark R. Bannister wrote:
> On Oct 27, 5:02 pm, "bartc" <ba...@freeuk.com> wrote:
> <snip>
>> Well, you just use globals, although that is frowned upon in this
>> group (but then so are 10000-line functions..).
>
> Can't use globals, because ultimately the engine will be multi-
> threaded.

In that case then perhaps just one parameter pointing to a complete
environment for that activation of the interpreter (I don't understand
threads).

In C terms, that probably means putting everything (that is specific to the
activation) into a giant struct. In practice that can mean a very slight
performance hit from having this extra pointer access.

--
Bartc

Eric Sosman

unread,

Oct 27, 2009, 6:17:46 PM10/27/09

to

James Kuyper wrote:
> Mark R. Bannister wrote:
>> On Oct 27, 11:57�am, Richard Heathfield <r...@see.sig.invalid> wrote:
>> <snip>
>>> So your first
>>> step should be to identify logical groupings of code that you can
>>> move off into their own functions. Don't worry for now about
>>> refactoring for commonality - just refactor for function size!
>> <snip>
>>
>> Thanks Richard, yes I have considered this. However, what is the
>> overhead of adding a function? I obviously want my bytecode to
>> execute as efficiently as possible.
>
> If you've got a single function that is almost 10,000 lines long, you've
> almost certainly got lots of design errors and other bugs to work out.
> Simplify the code by breaking it into smaller pieces, each of which can
> be properly understood. Only after you have a coherent overall design up
> and working correctly should you worry about details of efficiency.

It's somewhat a matter of opinion, but I'll venture to
disagree with both James Kuyper and Richard Heathfield in this
instance. Ordinarily, I'd agree that a ten thousand-line function,
even a five thousand-line function, is too big and almost sure to
be too snarled. But for a function of the kind Mark Bannister
describes, I think the objection based on size largely (sorry)
disappears: Total size is no longer a good measure of complexity.

What Mark has is a function body that looks something like

{
preamble();
while (something) {
switch (doodad) {
case 0: ...
case 1: ...
...
case 999: ...
default: abort();
}
}
postamble();
}

... and with a structure like this the "size" to consider is
roughly the total size of preamble(), the gnarliest `case', and
postamble(). One thousand `case's of about ten lines each,
and there you have your ten thousand-line function, but the
conceptual burden is light.

I've come across (and written) big `switch' constructs like
this in three contexts: Lexical scanners, interpreters, and
state machines. (There are certainly others, but these three
seem most susceptible to `case' proliferation.) In none of these
situations does the total number of `case's significantly affect
the conceptual complexity -- state machines can get hairy, but
most of the hair grows on the state-to-state transitions, not on
the relatively bald bulk of the function containing them all.

Size doesn't *always* matter, lads.

> That's exactly the wrong attitude: good programming style leads to
> elegant design and accurate code, and makes the code easier to maintain.
> Don't worry about efficiency until you've got other, more important
> matters such as correctness and clarity dealt with.

"Elegant" is a word sometimes applicable, but more often
used as a synonym for "The Devil's in the details, and I shun
the Devil." There's an old joke about an engineer, a physicist,
and a mathematician asked to solve an aerodynamic drag problem:

The engineer proposes to build a detailed full-scale model
of the automobile, put it in a wind tunnel, and measure the
drag directly. It'll take several weeks, but it's The Way.

The physicist pooh-poohs the engineer's naïveté and says
he can write a huge system of nonlinear partial differential
equations for the drag and solve them on a supercomputer. It'll
take a solid week to run the solution (two, counting the re-run
after the bug is fixed), but the answer will come out Sooner.

The mathematician smiles indulgently at his colleagues'
folly, and says "Consider a spherical Ferrari ..."

The mathematician's approach was elegant. IMHO, we can
do without that sort of elegance.

--
Eric Sosman
eso...@ieee-dot-org.invalid

Eric Sosman

unread,

Oct 27, 2009, 6:25:21 PM10/27/09

to

bartc wrote:
> Mark R. Bannister wrote:
>> On Oct 27, 3:22 pm, Seebs <usenet-nos...@seebs.net> wrote:
>>> As an example: Imagine that you currently have a giant switch table.
>>> And you replace it with a table of function pointers. There have
>>> existed compilers such that, when you do this, and write:
>>>
>>> *(func[opcode])(...);
>>>
>>> that your performance will INCREASE. Dramatically.
>>
>> If I did this, I'd have to pass all the same variables to every
>> function, wouldn't I? I have a large number of variables used by the
>> processing engine, which different instructions may or may not
>> manipulate. How could I implement a table of function pointers, but
>> still have the flexibility to send only the right arguments to the
>> right functions?
>
> Well, you just use globals, although that is frowned upon in this group
> (but then so are 10000-line functions..).
>
> Parameters only slow things down anyway.

Globals will probably slow them down even more. (Of course,
neither your opinion nor mine is supported by the C language
itself; measurement wins.)

> One problem with switch (opcode) is that there will be code generated to
> ensure opcode is in range (there may or may not be a way of turning this
> off). And of course it needs to be in a dispatch loop.

Again, the C language takes no position on this. But it
seems odd to fret about one compare-and-jump-not-taken while
blithely ignoring the *two* jumps plus linkage overhead plus
register-to-memory spills that a function table will likely
involve. (And it's well beyond odd to worry about "a dispatch
loop" around a switch statement while conveniently forgetting
that a solution based on function pointers needs that very
same loop ...)

--
Eric Sosman
eso...@ieee-dot-org.invalid

BGB / cr88192

unread,

Oct 27, 2009, 6:47:24 PM10/27/09

to

"Eric Sosman" <eso...@ieee-dot-org.invalid> wrote in message
news:hc7rih$466$1...@news.eternal-september.org...

> James Kuyper wrote:
>> Mark R. Bannister wrote:

did you actually look at the code?...

> I've come across (and written) big `switch' constructs like
> this in three contexts: Lexical scanners, interpreters, and
> state machines. (There are certainly others, but these three
> seem most susceptible to `case' proliferation.) In none of these
> situations does the total number of `case's significantly affect
> the conceptual complexity -- state machines can get hairy, but
> most of the hair grows on the state-to-state transitions, not on
> the relatively bald bulk of the function containing them all.
>
> Size doesn't *always* matter, lads.
>

I don't think you have looked at it...

look before you comment...

it is a good deal more terrible, more like:

{
declarations...
...
switch(foo)
{
case 0x0a:
{
declarations...
...
while(p=foo)
{
...
switch(*p)
{
...
}
}
...
more conditionals, loops, switches, ...
...
}
}

more-or-less (not exact, but close enough), but it is scary-bad code
organization...

I have seen some plenty big switches of the type you describe, and this was
not one of them...

>> That's exactly the wrong attitude: good programming style leads to
>> elegant design and accurate code, and makes the code easier to maintain.
>> Don't worry about efficiency until you've got other, more important
>> matters such as correctness and clarity dealt with.
>
> "Elegant" is a word sometimes applicable, but more often
> used as a synonym for "The Devil's in the details, and I shun
> the Devil." There's an old joke about an engineer, a physicist,
> and a mathematician asked to solve an aerodynamic drag problem:
>
> The engineer proposes to build a detailed full-scale model
> of the automobile, put it in a wind tunnel, and measure the
> drag directly. It'll take several weeks, but it's The Way.
>

yep, yep...

> The physicist pooh-poohs the engineer's na�vet� and says

> he can write a huge system of nonlinear partial differential
> equations for the drag and solve them on a supercomputer. It'll
> take a solid week to run the solution (two, counting the re-run
> after the bug is fixed), but the answer will come out Sooner.
>

I have SEEN this approach to problems...

I am taking a physics class, and IMO the way a lot of things are approached
makes little sense from the POV of a programmer like myself...

> The mathematician smiles indulgently at his colleagues'
> folly, and says "Consider a spherical Ferrari ..."
>
> The mathematician's approach was elegant. IMHO, we can
> do without that sort of elegance.
>

yep.

but, alas, it is still better advised that you look over the code before
making comments that show that you haven't...

I have written an x86 interpreter, and so I have seen big-ass switches, but
what was presented was something altogether different...

> --
> Eric Sosman
> eso...@ieee-dot-org.invalid

BGB / cr88192

unread,

Oct 27, 2009, 6:57:34 PM10/27/09

to

"Eric Sosman" <eso...@ieee-dot-org.invalid> wrote in message

news:hc7s1o$8kj$1...@news.eternal-september.org...

> bartc wrote:
>> Mark R. Bannister wrote:
>>> On Oct 27, 3:22 pm, Seebs <usenet-nos...@seebs.net> wrote:
>>>> As an example: Imagine that you currently have a giant switch table.
>>>> And you replace it with a table of function pointers. There have
>>>> existed compilers such that, when you do this, and write:
>>>>
>>>> *(func[opcode])(...);
>>>>
>>>> that your performance will INCREASE. Dramatically.
>>>
>>> If I did this, I'd have to pass all the same variables to every
>>> function, wouldn't I? I have a large number of variables used by the
>>> processing engine, which different instructions may or may not
>>> manipulate. How could I implement a table of function pointers, but
>>> still have the flexibility to send only the right arguments to the
>>> right functions?
>>
>> Well, you just use globals, although that is frowned upon in this group
>> (but then so are 10000-line functions..).
>>
>> Parameters only slow things down anyway.
>
> Globals will probably slow them down even more. (Of course,
> neither your opinion nor mine is supported by the C language
> itself; measurement wins.)
>

depends on the compiler and OS...

globals are likely to be faster on Win32 and Win64, since code direct-links
against the globals.

locals and arguments are likely to be faster on Linux, since Linux uses
position-independent-code by default, which tends to involve (in the 32-bit
case), fetching EIP and calculating the position of the GOT, and then
accessing said globals indirectly via the GOT.

this means that, almost invariably, a global will require a double-indirect
access on Linux (first GOT then value), wheras a local is a plain indirect
access (rel ESP or EBP), whereas a global is direct on Win32/64.

>> One problem with switch (opcode) is that there will be code generated to
>> ensure opcode is in range (there may or may not be a way of turning this
>> off). And of course it needs to be in a dispatch loop.
>
> Again, the C language takes no position on this. But it
> seems odd to fret about one compare-and-jump-not-taken while
> blithely ignoring the *two* jumps plus linkage overhead plus
> register-to-memory spills that a function table will likely
> involve. (And it's well beyond odd to worry about "a dispatch
> loop" around a switch statement while conveniently forgetting
> that a solution based on function pointers needs that very
> same loop ...)
>

yep, yep...

ASM is about the only real way for a "fast" interpreter, but is it worth the
cost?
for most things, probably not...

> --
> Eric Sosman
> eso...@ieee-dot-org.invalid

James Kuyper

unread,

Oct 27, 2009, 7:20:29 PM10/27/09

to

Eric Sosman wrote:
> James Kuyper wrote:
...

>> That's exactly the wrong attitude: good programming style leads to
>> elegant design and accurate code, and makes the code easier to
>> maintain. Don't worry about efficiency until you've got other, more
>> important matters such as correctness and clarity dealt with.
>
> "Elegant" is a word sometimes applicable, but more often
> used as a synonym for "The Devil's in the details, and I shun
> the Devil." There's an old joke about an engineer, a physicist,
> and a mathematician asked to solve an aerodynamic drag problem:

For me, in a programming context, an "elegant" way of doing something is
a way that clearly and obviously does precisely what needs to be done,
nothing more, nothing less. All too often, we have to settle for unclear
methods, or methods that waste time doing something only to undo it
later. It has nothing to do with avoiding the details. When I can find a
method of doing something that is truly elegant, I treasure it.

BGB / cr88192

unread,

Oct 27, 2009, 7:28:44 PM10/27/09

to

"Gordon Burditt" <gordon...@burditt.org> wrote in message
news:ksmdnf8KIcYOgnrX...@posted.internetamerica...

> >My question is, can anyone think of a way of making engine.c (see link
>>below) smaller and more efficient? This is the heart of the
>>interpreter, it's a big switch with switches inside it, a bit like
>>this pseudo-code:
>
> You might make it smaller by grouping the opcodes by type of arguments
> and handling the types in common code. That assumes that there
> *are* groups of opcodes that have common argument types. If there
> aren't, perhaps your bytecode design should be re-examined.
>

yep, I think the OP did something weird here...

> This will likely make the code smaller, but it probably won't speed it
> up, unless you get a *LOT* of speedup from better locality of code.
> You'll have an extra switch.
>

actually, it probably can help speed up the code, since one can essentially
batch together a bunch of things that couldn't otherwise be done, but this
depends a lot on the interpreter design.

>>switch (opcode)
>>{
>>case for each bytecode instruction with similar arguments:
>> process arguments
>> perform command
>>}
>
> char argtype[256] = { 1, 2, 1, .... };
>
> switch (argtype[opcode])
> {
> case for each group with similar arguments:
> process common arguments for this group;
> }
> switch (opcode)
> {
> case for each bytecode instruction :
> process arguments
> perform command
> }
>
>
> Where argtype[opcode] represents what group this particular opcode
> falls in. For instance, you might have several instructions that
> each refer to a destination register and a source virtual memory
> with a 32-bit offset and an optional index register (e.g. load,
> add, subtract, bitwise AND, bitwise OR, etc.). You can calculate
> the addresses of these without having to look at the specific
> instruction.
>

I personally used a strategy (for my x86 interpreter) where the first switch
essentially branched off to different functions, each doing their own
versions of the second switch.

granted, it is worth noting that I did the opcode decoding separate from the
interpretation, mostly as the way the x86 opcode encodings work is, well,
messy...

so, splitting up the 'decode' and 'interpret' steps helps greatly reduce the
overall complexity, since fairly generic logic can be used for decoding the
opcodes, and the interpretation step is mostly about getting to the right
place and doing whatever (no worries over the complexities of opcode
encoding rules, ...).

however, my current opcode decoder is likely to be terribly buggy and slow,
and I may need at some point to figure how to optimize it... (one remote
possibility here being auto-generated mega-switches, and some very different
processing logic, another possibility being a specialized lookup table,
possibly matching on the first 1 or 2 bytes...).

the current decoder is based on table-based lookups and a lot of ugly
hacked-together logic code (I had hacked code from my disassembler into an
opcode decoder...).

>
>>What troubles me is that a lot of the code for processing arguments is
>>very similar, but because different instructions take different
>>arguments I can't readily process them up-front. Take a look at the
>>code itself and if you have any useful suggestions please send them my
>>way.
>
> That's way too much code to look at.
>

yep, I skimmed...

bartc

unread,

Oct 27, 2009, 7:58:36 PM10/27/09

to

Eric Sosman wrote:
> bartc wrote:
>> Mark R. Bannister wrote:
>>> On Oct 27, 3:22 pm, Seebs <usenet-nos...@seebs.net> wrote:
>>>> As an example: Imagine that you currently have a giant switch
>>>> table. And you replace it with a table of function pointers. There
>>>> have existed compilers such that, when you do this, and write:
>>>>
>>>> *(func[opcode])(...);
>>>>
>>>> that your performance will INCREASE. Dramatically.
>>>
>>> If I did this, I'd have to pass all the same variables to every
>>> function, wouldn't I? I have a large number of variables used by
>>> the processing engine, which different instructions may or may not
>>> manipulate. How could I implement a table of function pointers, but
>>> still have the flexibility to send only the right arguments to the
>>> right functions?
>>
>> Well, you just use globals, although that is frowned upon in this
>> group (but then so are 10000-line functions..).
>>
>> Parameters only slow things down anyway.
>
> Globals will probably slow them down even more. (Of course,
> neither your opinion nor mine is supported by the C language
> itself; measurement wins.)

That seems counterintuitive. You'd think that pushing a parameter somewhere
(or maybe two parameters, or maybe ten), creating a local stack frame and
uncreating it on return (as I assume you will suggest that using local
statics is slower), would take a bit longer than, um, not doing any of
that...

>> One problem with switch (opcode) is that there will be code
>> generated to ensure opcode is in range (there may or may not be a
>> way of turning this off). And of course it needs to be in a dispatch
>> loop.
>
> Again, the C language takes no position on this. But it
> seems odd to fret about one compare-and-jump-not-taken while
> blithely ignoring the *two* jumps plus linkage overhead plus
> register-to-memory spills that a function table will likely
> involve. (And it's well beyond odd to worry about "a dispatch
> loop" around a switch statement while conveniently forgetting
> that a solution based on function pointers needs that very
> same loop ...)

Well I do go on to mention possible ways of eliminating the dispatch loop,
although standard C makes that difficult.

(And a simple dispatch loop using a function pointer can be unrolled more
easily.)

The best I could manage with a switch (using gcc 3.4.5 a couple of years
ago), was I think 4 or 5 instructions for the switch, including a jump, plus
at the end of the case handler, a jump for the break (I assume this was
optimised to go straight to the start of the loop, I don't think it
optimised it by replicating the switch jump code). Some half-dozen
instructions overhead per opcode decode.

The best I could achieve with asm was 2 instructions overhead per opcode,
using one dedicated register. Of course these counts are only significant
for opcodes with very simple handlers. And for a bytecode design that
doesn't have complex operand decoding.

--
Bartc

Richard Harter

unread,

Oct 27, 2009, 11:37:09 PM10/27/09

to

On Tue, 27 Oct 2009 08:33:02 -0700 (PDT), "Mark R. Bannister"
<chap...@yahoo.com> wrote:

>On Oct 27, 3:22=A0pm, Seebs <usenet-nos...@seebs.net> wrote:
>> As an example: =A0Imagine that you currently have a giant switch table. =
>=A0And you
>> replace it with a table of function pointers. =A0There have existed compi=

>lers
>> such that, when you do this, and write:
>>
>> *(func[opcode])(...);
>>

>> that your performance will INCREASE. =A0Dramatically.

>
>If I did this, I'd have to pass all the same variables to every
>function, wouldn't I? I have a large number of variables used by the
>processing engine, which different instructions may or may not
>manipulate. How could I implement a table of function pointers, but
>still have the flexibility to send only the right arguments to the
>right functions?

You could use stdarg.h and variable length calling sequences.
Ask the language mavens to spell out the gory details of the
syntax.

Richard Harter, c...@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
Kafka wasn't an author;
Kafka was a prophet!

Phil Carmody

unread,

Oct 28, 2009, 3:25:08 AM10/28/09

to

James Kuyper <james...@verizon.net> writes:
> Eric Sosman wrote:
>> James Kuyper wrote:
> ...
>>> That's exactly the wrong attitude: good programming style leads to
>>> elegant design and accurate code, and makes the code easier to
>>> maintain. Don't worry about efficiency until you've got other, more
>>> important matters such as correctness and clarity dealt with.
>>
>> "Elegant" is a word sometimes applicable, but more often
>> used as a synonym for "The Devil's in the details, and I shun
>> the Devil." There's an old joke about an engineer, a physicist,
>> and a mathematician asked to solve an aerodynamic drag problem:
>
> For me, in a programming context, an "elegant" way of doing something
> is a way that clearly and obviously does precisely what needs to be
> done, nothing more, nothing less.

I'd say a design which recognises a likely greater class of
problems that this one is a single precise instance of, and
which can be clearly and cleanly extended in the direction
of that greater class would often be preferable.

So, for example, atoi8() and atoi10() may be what you were
asked for, but atoiN(...,N) might be the elegant solution.

> All too often, we have to settle for
> unclear methods, or methods that waste time doing something only to
> undo it later. It has nothing to do with avoiding the details. When I
> can find a method of doing something that is truly elegant, I treasure
> it.

Yup.

Phil
--
Any true emperor never needs to wear clothes. -- Devany on r.a.s.f1

James Kuyper

unread,

Oct 28, 2009, 6:03:22 AM10/28/09

to

Phil Carmody wrote:
...

> I'd say a design which recognises a likely greater class of
> problems that this one is a single precise instance of, and
> which can be clearly and cleanly extended in the direction
> of that greater class would often be preferable.
>
> So, for example, atoi8() and atoi10() may be what you were
> asked for, but atoiN(...,N) might be the elegant solution.

While atoiN(..., N) has a virtue that I approve of, I would call that
virtue generality, not elegance. But that's hardly an issue worth
arguing about.

Nick Keighley

unread,

Oct 28, 2009, 6:13:06 AM10/28/09

to

On 27 Oct, 12:16, Richard <rgrd...@gmail.com> wrote:

> Unbelievable. A lecture on using functions.

maybe someone who writes a 10,000 line function needs a lecture on
using functions

Nick Keighley

unread,

Oct 28, 2009, 6:58:06 AM10/28/09

to

well currently all the blocks of code for operands can access all the
variables. So the function version will be no worse. As a first go I'd
simply pass the whole virtual register set to the operand handlers.
later you worry about approriate visibility.

Do you have a test harness? Life becomes much easier if you can run
the test, modify the code, run the test. Running the test should
involve pressing one button (ok, a couple of keystrokes on a decent
cli), and if it's ok return nothing but "Test ok".

Eric Sosman

unread,

Oct 28, 2009, 8:40:52 AM10/28/09

to

bartc wrote:
> Eric Sosman wrote:
>> bartc wrote:

>>> [...]

>>> Well, you just use globals, although that is frowned upon in this
>>> group (but then so are 10000-line functions..).
>>>
>>> Parameters only slow things down anyway.
>>
>> Globals will probably slow them down even more. (Of course,
>> neither your opinion nor mine is supported by the C language
>> itself; measurement wins.)
>
> That seems counterintuitive. You'd think that pushing a parameter
> somewhere (or maybe two parameters, or maybe ten), creating a local
> stack frame and uncreating it on return (as I assume you will suggest
> that using local statics is slower), would take a bit longer than, um,
> not doing any of that...

Actually, what I'd think is that passing things around
in globals prevents the compiler from passing them around in
registers, forcing actual memory accesses instead. Since the
CPU runs a few hundred times faster than memory ...

Even if you're lucky and all the globals wind up in a nearby
cache, access to them will be slower than if they were sitting
in the CPU already. But if the compiler has just computed `pc++'
and has the program counter in a register and is about to call
a function, it must first spill the register back to the global
`pc' location lest the function need it. And when the function
returns, the compiler must re-fetch the global `pc' just in case
the function changed it. Both the store and the fetch (and any
fetches and stores within the called function) are potentially
avoidable if the compiler can "see" all the accesses, which will
be more likely if they're all inside one function.

But anyhow: As I mentioned earlier, this sort of thing is
not about the C language, but about specific implementations
thereof. Measurement *still* wins.

--
Eric Sosman
eso...@ieee-dot-org.invalid

Moi

unread,

Oct 28, 2009, 9:01:48 AM10/28/09

to

On Tue, 27 Oct 2009 03:43:04 -0700, Mark R. Bannister wrote:

> I'm developing a new programming language, and at the core of this
> language is a bytecode interpreter (yes, I've invented the bytecode too,
> as well as assembly code and an assembler ...)
>
> It's all written in C.
>

> My question is, can anyone think of a way of making engine.c (see link
> below) smaller and more efficient? This is the heart of the
> interpreter, it's a big switch with switches inside it, a bit like this
> pseudo-code:
>

> switch (opcode)
> {
> case for each bytecode instruction with similar arguments:
> process arguments
> perform command
> }
>

> What troubles me is that a lot of the code for processing arguments is
> very similar, but because different instructions take different
> arguments I can't readily process them up-front. Take a look at the
> code itself and if you have any useful suggestions please send them my
> way.
>

> http://prose.cvs.sourceforge.net/viewvc/prose/prose/src/prose/engine.c?view=markup

>
> Thanks,
> Mark Bannister.
> (cambridge at sourceforge dot net)

I scrolled trough the code, and found that 95 % of it
dealt with handling arguments/registers/addressing modes, with a lot of common code.

Personally I would put as much as possible of the instruction set into data-tables
, and create some helper functions to find out instruction length / addressing modes / register usage.
That would reduce the big switch to a few lines per case.
(if you use the opcode to index an opcode-table, you could even leave out the big switch)
When needed you could always inline these functions.

If that is still too slow, you could use these same tables to generate code.

HTH,
AvK

Tim Rentsch

unread,

Oct 28, 2009, 9:52:41 AM10/28/09

to

"Mark R. Bannister" <chap...@yahoo.com> writes:

> I'm developing a new programming language, and at the core of this
> language is a bytecode interpreter (yes, I've invented the bytecode
> too, as well as assembly code and an assembler ...)
>
> It's all written in C.
>
> My question is, can anyone think of a way of making engine.c (see link
> below) smaller and more efficient? This is the heart of the
> interpreter, it's a big switch with switches inside it, a bit like
> this pseudo-code:
>
> switch (opcode)
> {
> case for each bytecode instruction with similar arguments:
> process arguments
> perform command
> }
>
> What troubles me is that a lot of the code for processing arguments is
> very similar, but because different instructions take different
> arguments I can't readily process them up-front. Take a look at the
> code itself and if you have any useful suggestions please send them my
> way.
>
> http://prose.cvs.sourceforge.net/viewvc/prose/prose/src/prose/engine.c?view=markup

I don't know if you've thought of this, but the way the question is
asked basically precludes the most useful answers. The most signicant
factors in performance have to do with program design at higher
levels. But, here we are, presented with a program where all the
higher level decisions have been made, and _now_ there's a request for
help with performance? It's too late in the development cycle. If
you really want or need better performance (and I mean really better,
not just marginally better), the way to start is to reconsider some
higher level design questions. For example, can the bytecode be
transformed to a different intermediate form that allows faster
interpretation (in the spirit of JIT compiling)? Very likely it can.

Recommendation: read "The Mythical Man-Month". If you've already
read it, read it again. Representation is the essence of programming.

Gordon Burditt

unread,

Oct 28, 2009, 6:26:50 PM10/28/09

to

I see nothing wrong with a 10,000 line function if the purpose of
the function is to generate a fixed 10,000 line text output (e.g.
one of the shorter software license agreements for certain companies)
and it needs to be reviewable for accuracy by a non-programmer (e.g.
a lawyer) before anyone is allowed to compile it. You can do this
with 10,000 consecutive calls to puts(), and no funny backslash
sequences that lawyers don't understand.

Performance is not an issue for this function. It might work better
for its intended purpose if it slowed down to first-grade reading
speed, although the text is more suitable for a first-year law
student.

No, it's not acceptable to put the license agreement in a separate
file, open it, and output it. (It's too easy for someone to replace
the license with a copy of the GPL or the Pirate Bay logo).

BGB / cr88192

unread,

Oct 28, 2009, 7:16:14 PM10/28/09

to

"Gordon Burditt" <gordon...@burditt.org> wrote in message

news:P-Sdncy2AM23WHXX...@posted.internetamerica...

if I were doing this, I would probably develop things with the liscense as a
separate text file, and at compile time essentially compress and C-ify the
document via a tool (it is possible to do pure-ASCII text compression for
use in string constants, although the ratio is not as good at with a more
traditional compressor).

for example, one could use an LZ77-based algo with '$' as a run escape, and
maybe base-64 encoded runs.
"LOLZ$/AD"

could emit LOLZ 16 times ("LOLZLOLZLOLZ"...).

(note, this scheme would have a 4k window and a max match of 64).

similarly, decoding would be fairly easy.

or such...

Mark R. Bannister

unread,

Nov 22, 2009, 4:18:31 AM11/22/09

to

As discussed a few weeks ago, I took on board the helpful comments
that I received from this group and re-engineered my bytecode
interpreter's engine.c so that all bytecode instructions were handled
by functions and called via a jump table.

Here is the result:
http://prose.cvs.sourceforge.net/viewvc/prose/prose/src/prose/engine.c?view=markup

As you will see, engine.c is no longer 10,000 lines long :-)

I'd appreciate any further suggestions you may have that will help
improve the efficiency of the interpreter.

Many thanks,
Mark.

Richard Heathfield

unread,

Nov 22, 2009, 1:07:39 PM11/22/09

to

In <e73fb85d-daf2-4d6f...@j4g2000yqe.googlegroups.com>,
Mark R. Bannister wrote:

> As discussed a few weeks ago, I took on board the helpful comments
> that I received from this group and re-engineered my bytecode
> interpreter's engine.c so that all bytecode instructions were
> handled by functions and called via a jump table.
>
> Here is the result:
>
http://prose.cvs.sourceforge.net/viewvc/prose/prose/src/prose/engine.c?view=markup
>
> As you will see, engine.c is no longer 10,000 lines long :-)

Looks like you've turned it into a maintainable program. Assuming it
still works, you deserve some congratulations.

> I'd appreciate any further suggestions you may have that will help
> improve the efficiency of the interpreter.

You've already dealt with the most difficult inefficiency -
unmaintainability. Next step: profile it, see where the bottleneck
is.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within

Richard

unread,

Nov 22, 2009, 1:06:57 PM11/22/09

to

Richard Heathfield <r...@see.sig.invalid> writes:

> In <e73fb85d-daf2-4d6f...@j4g2000yqe.googlegroups.com>,
> Mark R. Bannister wrote:
>
>> As discussed a few weeks ago, I took on board the helpful comments
>> that I received from this group and re-engineered my bytecode
>> interpreter's engine.c so that all bytecode instructions were
>> handled by functions and called via a jump table.
>>
>> Here is the result:
>>
> http://prose.cvs.sourceforge.net/viewvc/prose/prose/src/prose/engine.c?view=markup
>>
>> As you will see, engine.c is no longer 10,000 lines long :-)
>
> Looks like you've turned it into a maintainable program. Assuming it
> still works, you deserve some congratulations.
>
>> I'd appreciate any further suggestions you may have that will help
>> improve the efficiency of the interpreter.
>
> You've already dealt with the most difficult inefficiency -
> unmaintainability. Next step: profile it, see where the bottleneck
> is.

And hope it doesn't require a complete rewrite.

Redesigning a program to appease people on *only* how it looks is
nonsense. It (rewrites for maintainability) MUST be combined with
efficiency considerations or you'll be utilising that "maintainability"
a lot more often and sooner than you imagined.

C is chosen as it is an efficient "bare metal" language in most
cases. Naive design listening to his "never optimise too early" nonsense
can often counter that.

--
"Avoid hyperbole at all costs, its the most destructive argument on
the planet" - Mark McIntyre in comp.lang.c

Malcolm McLean

unread,

Nov 22, 2009, 1:57:54 PM11/22/09

to

"Richard" <rgr...@gmail.com> wrote in message

> Naive design listening to his "never optimise too early" nonsense
> can often counter that.
>

It's an unusual situation, since the bottleneck is not in a small inner
loop, but in a huge switch. Microptimisation will probably make a big
difference.
(In fact you proably want to right-shift the bytecode and use that to
calcualte a jump label, throwing away space. This can only be done in
assembly.

William Ahern

unread,

Nov 22, 2009, 4:15:08 PM11/22/09

to

Richard <rgr...@gmail.com> wrote:
> Richard Heathfield <r...@see.sig.invalid> writes:
> > In <e73fb85d-daf2-4d6f...@j4g2000yqe.googlegroups.com>,
> > Mark R. Bannister wrote:
> >
> >> As discussed a few weeks ago, I took on board the helpful comments
> >> that I received from this group and re-engineered my bytecode
> >> interpreter's engine.c so that all bytecode instructions were
> >> handled by functions and called via a jump table.
> >>

<snip>

> And hope it doesn't require a complete rewrite.
>
> Redesigning a program to appease people on *only* how it looks is
> nonsense. It (rewrites for maintainability) MUST be combined with
> efficiency considerations or you'll be utilising that "maintainability"
> a lot more often and sooner than you imagined.

> C is chosen as it is an efficient "bare metal" language in most
> cases. Naive design listening to his "never optimise too early" nonsense
> can often counter that.

Ultimately he could change his jump table back to a switch statement and
declare his functions static inline (or rely on compiler heuristics to
inline). That would put him more-or-less back to square zero. That's trivial
work, and easily done if profiling requires it (indeed, profiling often
requires the ability to easily toy around with ideas). Abstraction and
readability is important if, for instance, he wanted to implement threaded
dispatch. Implementing such things is made much easier if the scope of the
modifications--instruction pointer manipulation--is well circumscribed. But,
I suppose a very simplistic copy+paste of code into functions might not have
accomplished that.

Mark R. Bannister

unread,

Nov 24, 2009, 8:16:34 AM11/24/09

to

On 22 Nov, 18:07, Richard Heathfield <r...@see.sig.invalid> wrote:
> Looks like you've turned it into a maintainable program. Assuming it
> still works, you deserve some congratulations.

Yes of course it still works! I have an extensive test suite (make
check) and have ensured that the product is still working as before.
I've taken so long because I only have half hour train journeys in the
morning to do my coding...

On 22 Nov, 21:15, William Ahern <will...@wilbur.25thandClement.com>
wrote:

> Ultimately he could change his jump table back to a switch statement and
> declare his functions static inline (or rely on compiler heuristics to
> inline). That would put him more-or-less back to square zero.

I investigated this. It would seem that I'd have to put all the
functions in a single source file for inlining to work. That would
take me back to a 10,000 line engine.c again. Or should I #include
them? I don't think I've ever seen a project that #includes a .c
source file, only .h.

There's something else I'm struggling with. I'm using gcc (various
versions including 4.1.2) and I'm getting these nuisance compile
warnings:

../../../prose-0.7.0/src/prose/stringtype.c:1793: warning: passing
argument 2 of 'mSlist_init_properties' from incompatible pointer type
../../../prose-0.7.0/src/prose/stringtype.c:1793: warning: passing
argument 3 of 'mSlist_init_properties' from incompatible pointer type
../../../prose-0.7.0/src/prose/stringtype.c:1793: warning: passing
argument 4 of 'mSlist_init_properties' from incompatible pointer type

These are examples. I get a lot of these warnings from numerous
source files.

Argument 2, to take one particular example, of mSlist_init_properties,
is declared in src/libIvor/slist.c as:

int (*cmpkeys)(const VOIDP *key1, const VOIDP *key2);

where VOIDP in this case is 'void'. This is basically giving a singly-
linked list library routine an entry-point for comparing two pieces of
arbitrary data.

In stringtype.c, I'm calling this function with argument 2: STRcmp.
STRcmp is a bytestring comparison routine declared as:

int STRcmp(const STRING *ss1, const STRING *ss2);

where STRING is a typedef for a structure called stringtype_struct.

So, I see where the error is coming from. The compiler is expecting a
function that takes void pointers, and I'm giving it a function that
takes different pointers instead. The code works fine and I've
ignored the warnings for many years. However, it would be nice to get
rid of these warnings from the compiler output. Any suggestions as to
how to get around this would be very much appreciated. Wouldn't it be
nice if I could use a cast....

Best regards,
Mark.

Ben Bacarisse

unread,

Nov 24, 2009, 10:35:02 AM11/24/09

to

"Mark R. Bannister" <chap...@yahoo.com> writes:

<snip>

> There's something else I'm struggling with. I'm using gcc (various
> versions including 4.1.2) and I'm getting these nuisance compile
> warnings:
>
> ../../../prose-0.7.0/src/prose/stringtype.c:1793: warning: passing
> argument 2 of 'mSlist_init_properties' from incompatible pointer type

<snip>

> Argument 2, to take one particular example, of mSlist_init_properties,
> is declared in src/libIvor/slist.c as:
>
> int (*cmpkeys)(const VOIDP *key1, const VOIDP *key2);
>
> where VOIDP in this case is 'void'.

OK, but that is an odd name. What is the P for? If VOIDP is not void
then some of what I say below will be off the mark a little.

> This is basically giving a singly-
> linked list library routine an entry-point for comparing two pieces of
> arbitrary data.
>
> In stringtype.c, I'm calling this function with argument 2: STRcmp.
> STRcmp is a bytestring comparison routine declared as:
>
> int STRcmp(const STRING *ss1, const STRING *ss2);
>
> where STRING is a typedef for a structure called stringtype_struct.
>
> So, I see where the error is coming from. The compiler is expecting a
> function that takes void pointers, and I'm giving it a function that
> takes different pointers instead. The code works fine and I've
> ignored the warnings for many years.

Just so you know, I have used systems where this would break unless
you use a cast to convert the function back to its real type when you
call it with struct * arguments.

> However, it would be nice to get
> rid of these warnings from the compiler output. Any suggestions as to
> how to get around this would be very much appreciated. Wouldn't it be
> nice if I could use a cast....

There are two choices. (a) Cast the pointer to the type expected by
the function but be sure to cast it back to whatever is the real type
of the function when you call it. This is messy. (b) Make all your
functions have the type of the pointer that you actually pass, and
convert the void *s internally. This can be done without a cast.
I.e. you re-write:

int STRcmp(const STRING *ss1, const STRING *ss2) { ... }

to be

int STRcmp(const void *vp1, const void *vp2)
{
const STRING *ss1 = vp1, *ss2 = vp2;
/* body as before */
}

--
Ben.

Mark R. Bannister

unread,

Nov 24, 2009, 12:18:43 PM11/24/09

to

On 24 Nov, 15:35, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:

> "Mark R. Bannister" <chapte...@yahoo.com> writes:
> <snip>
> > Argument 2, to take one particular example, of mSlist_init_properties,
> > is declared in src/libIvor/slist.c as:
>
> > int (*cmpkeys)(const VOIDP *key1, const VOIDP *key2);
>
> > where VOIDP in this case is 'void'.
>
> OK, but that is an odd name. What is the P for? If VOIDP is not void
> then some of what I say below will be off the mark a little.

Well, VOIDP could be a char on a system that doesn't support void. I
guess when I started the project (a long time ago), I was not making
any assumptions about the capabilities of your compiler ... now I've
implemented a function jump table in an array, I've negated that. So
at some point, when I find the time, I can trawl through and remove
support for very old compilers.

<snip>

> There are two choices. (a) Cast the pointer to the type expected by
> the function but be sure to cast it back to whatever is the real type
> of the function when you call it. This is messy. (b) Make all your
> functions have the type of the pointer that you actually pass, and
> convert the void *s internally. This can be done without a cast.
> I.e. you re-write:
>
> int STRcmp(const STRING *ss1, const STRING *ss2) { ... }
>
> to be
>
> int STRcmp(const void *vp1, const void *vp2)
> {
> const STRING *ss1 = vp1, *ss2 = vp2;
> /* body as before */
> }

STRcmp is used elsewhere as a comparison function in its own right.
If I used void pointers as parameters in STRcmp, I'd lose the type-
checking. I'm trying to avoid writing two different STRcmp functions,
one with STRING pointers, one with void pointers. That seems
unnecessary duplication.

Seebs

unread,

Nov 24, 2009, 1:02:51 PM11/24/09

to

On 2009-11-24, Mark R. Bannister <chap...@yahoo.com> wrote:
> I investigated this. It would seem that I'd have to put all the
> functions in a single source file for inlining to work. That would
> take me back to a 10,000 line engine.c again. Or should I #include
> them? I don't think I've ever seen a project that #includes a .c
> source file, only .h.

Nothing wrong with a huge source module with tons of small functions in it.

> So, I see where the error is coming from. The compiler is expecting a
> function that takes void pointers, and I'm giving it a function that
> takes different pointers instead. The code works fine and I've
> ignored the warnings for many years. However, it would be nice to get
> rid of these warnings from the compiler output. Any suggestions as to
> how to get around this would be very much appreciated. Wouldn't it be
> nice if I could use a cast....

You can.

But!

It is undefined behavior to call a function through a pointer of a different
sort. So if you have a function which is actually declared to take a
char * argument, and you call it through a pointer with a void * argument,
you lose.

Solution:

int my_example_strcmp(const void *v1, const void *v2) {
unsigned char *s1 = v1, *s2 = v2;
while (*s1 && *s2 && *s1 == *s2)
++s1, ++s2;
return *s1 - *s2;
}

Convert the arguments in the function, so its signature matches what the
function pointer looks like.

-s
--
Copyright 2009, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!

Ben Bacarisse

unread,

Nov 24, 2009, 1:15:20 PM11/24/09

to

"Mark R. Bannister" <chap...@yahoo.com> writes:

The use (a) and live with all the casts. I don't see another way to
avoid the problem.

--
Ben.

Bill Cunningham

unread,

Nov 24, 2009, 2:07:36 PM11/24/09

to

"Richard" <rgr...@gmail.com> wrote in message

news:hebuk2$7og$1...@news.eternal-september.org...

> C is chosen as it is an efficient "bare metal" language in most
> cases. Naive design listening to his "never optimise too early" nonsense
> can often counter that.

On C. .5/10

Richard

unread,

Nov 24, 2009, 2:15:39 PM11/24/09

to

"Bill Cunningham" <nos...@nspam.invalid> writes:

Bill, we're talking about C. Go away.

Mark R. Bannister

unread,

Nov 25, 2009, 4:55:25 AM11/25/09

to

On 24 Nov, 18:15, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:

> "Mark R. Bannister" <chapte...@yahoo.com> writes:
> > On 24 Nov, 15:35, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
> <snip>
> >> There are two choices. (a) Cast the pointer to the type expected by
> >> the function but be sure to cast it back to whatever is the real type
> >> of the function when you call it. This is messy.

<snip>

> > STRcmp is used elsewhere as a comparison function in its own right.
> > If I used void pointers as parameters in STRcmp, I'd lose the type-
> > checking. I'm trying to avoid writing two different STRcmp functions,
> > one with STRING pointers, one with void pointers. That seems
> > unnecessary duplication.
>

> Then use (a) and live with all the casts. I don't see another way to
> avoid the problem.

Can you give me an example of what you mean by (a)? I don't mean to
be thick, maybe I'm missing something, but I don't see how that would
work.

Thanks,
Mark.

Ben Bacarisse

unread,

Nov 25, 2009, 6:26:06 AM11/25/09

to

"Mark R. Bannister" <chap...@yahoo.com> writes:

Lets start by naming two function types to simplify the examples:

typedef int cmp_voidp(const void *, const void *);
typedef int cmp_stringp(const struct String *, const struct String *);

You have a function that takes one or more parameters with a type of
"pointer to cmp_voidp":

int example(cmp_voidp *f)
{
/* ... */
}

The call

exmaple(STRcmp);

gives you a warning, because STRcmp is of type cmp_stringp and
evaluates to a pointer of type cmp_stringp * rather than cmp_voidp *.
To avoid the warning you must first cast the call:

example((cmp_voidp *)STRcmp)

You might be tempted to stop here since the warnings have gone away,
but that is the danger of casts. Inside example, you must cast the
pointer back to the type of the function actually being called. Since
the whole point of this is that 'example' gets passed lots of
different function types you will have lots of different casts decided
when you know what type of function you are calling:

int example(cmp_voidp *f)
{
/* ... */
switch (data->type) {
case TYPE_VOIDP:
return f(p1, p2); /* correct type already */
case TYPE_STRINGP:
return ((cmp_stringp *)f)(p1, p2);
/* ... */
}
}

In the second case, p1 and p2 are converted to pointers of the right
type because the function pointer has been cast. Without the cast, a
call like

example((cmp_voidp *)STRcmp)

causes STRcmp to be called with void * parameters silently interpreted
as struct pointers. This works on most machines you will find but is
technically wrong and does break on some odd hardware.

--
Ben.

Mark R. Bannister

unread,

Nov 27, 2009, 9:15:27 AM11/27/09

to

On 25 Nov, 11:26, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
> > Can you give me an example of what you mean by (a)? I don't mean to
> > be thick, maybe I'm missing something, but I don't see how that would
> > work.
>
> Lets start by naming two function types to simplify the examples:
>
> typedef int cmp_voidp(const void *, const void *);
> typedef int cmp_stringp(const struct String *, const struct String *);

<snip>

Thanks Ben, this is all new to me. Despite programming C for almost
15 years and having some very good C reference books, I'm embarrassed
to say in all that time I never knew that you could cast function
types or even typedef functions. Well, I'm always open to learning
and trying new things, so here we go ..... !

Ben Bacarisse

unread,

Nov 27, 2009, 9:43:23 AM11/27/09

to

"Mark R. Bannister" <chap...@yahoo.com> writes:

It is much more common to typedef function pointer types rather than
the function types themselves, but there are some advantages to the
latter. For example, it lets you declare a bunch of similar functions
very compactly:

typedef void binary_op(struct state *, struct value, struct value);

extern binary_op op_add, op_sub, op_mul, op_div /*...*/;

static binary_op *op_table[] = {
op_add, op_sub, op_mul, op_div /*...*/
};

The translation unit that defines the op_xxx functions needs to have
the parameters written out in full every time (with names this time)
but there is no need to do that just to declare them all.

--
Ben.

Keith Thompson

unread,

Nov 27, 2009, 11:37:39 AM11/27/09

to

"Mark R. Bannister" <chap...@yahoo.com> writes:

But you can't cast (or otherwise convert) function types. You can
cast pointer-to-function types.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"