
Getting the size of a C function


john

Jan 22, 2010, 5:53:18 PM
Hi,

I need to know the size of a function or module because I need to
temporarily relocate the function or module from flash into sram to
do firmware updates.

How can I determine that at runtime? The
sizeof( myfunction)
generates an error: "size of function unknown".

Thanks.

Bob

Jan 22, 2010, 6:03:32 PM
If you give us the details on your target and tools, someone here will
surely be able to help you do the thing you *actually* want to do.
Bob

David Empson

Jan 22, 2010, 6:03:18 PM
john <jo...@nospam.com> wrote:

> I need to know the size of a function or module because I need to
> temporarily relocate the function or module from flash into sram to
> do firmware updates.

In general, C does not provide a mechanism to find the size of a
function. Some compilers might implement sizeof(function) but it is not
standard C.

If your compiler always outputs functions to the object code in the same
order as they appear in the source code, you could take the address of
the next function and the address of the function in question, convert
them to (char *) and get the difference between them. This assumes you
never rearrange your source code - comment well!

If your compiler outputs functions in a somewhat unpredictable order
then this won't work.

The technique I used for a similar problem was to examine the object
code to determine the size of the function manually, added a safety
margin to allow for potential code growth, and embedded that as a
constant in the source code. It then needs to be re-checked after source
changes (or a revised compiler) to confirm that the size hasn't grown
too much.

--
David Empson
dem...@actrix.gen.nz

WangoTango

Jan 22, 2010, 6:04:03 PM
In article <hjda8u$t4k$1...@speranza.aioe.org>, jo...@nospam.com says...
Good question, and I would like to know if there is an easy way to do it
during runtime, and a portable way would be nice too. I would probably
look at the map file and use the size I calculated from there, but
that's surely not runtime.

You can get the starting address of the function pretty easy, but how
about the end? Hmmm, gotta' think about that.

Jim

jacob navia

Jan 22, 2010, 6:50:37 PM
john wrote:

(1)
There is the method already mentioned that subtracts two function
addresses. If your compiler is "well behaved" that could work
except for the last function in the module...

(2)
Another method is to generate an assembly listing and insert at the end
of each function a "marker", by just using (the equivalent of)
.byte 0,1,2,3,4,5,6,7,8,9,8,7,6,5,4,3,2,1
Then, at runtime, you load the code and search for the terminator marker.
Obviously the terminator should contain at least one illegal instruction,
to be sure that it doesn't appear in the code itself.

(3)
Yet another method is to generate a linker map table and read the
size of each function from the table, which amounts to method (1) but
at compile time.

(4) Another method is to locate all the function prologues and
function epilogues of the functions in the code you generate.
Locating the prologue means searching for the sequence of
instructions that the compiler generates for each function start,
probably the saving of some registers and the allocating of stack
space for the local variables.
Caveat: It could be that for certain functions the compiler
doesn't generate any prologue... specially if the function
doesn't call any other functions and receives no arguments...

Locating the epilogue means searching for the return instruction
Caveat: It could be that the compiler generates several...
You should find the last one, before you find a prologue.

From all those possibilities, option (3) looks the most
promising one to me. Method (1) isn't very precise, and
there is the problem of the last function in a compilation unit.

Method 2 is a PITA since you have to generate the assembly,
insert the markers, re-assemble...

Method (4) needs a disassembler, and a LOT of parsing work,
and it is very sensitive to compilation options.

Vladimir Vassilevsky

Jan 22, 2010, 6:54:29 PM

john wrote:

Several times I encountered the following construction:
//--------------------------
void your_function(void)
{
}

void next_function(void)
{
}

void (*fu)(void) = next_function;
void (*bar)(void) = your_function;

size_of_your_function = ((int)fu) - ((int)bar);
//-------------------------

Of course, this is not guaranteed to work, as it depends on many things;
however, I've seen that solution used in bootloaders.


Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com

BGB / cr88192

Jan 22, 2010, 7:40:28 PM

"john" <jo...@nospam.com> wrote in message
news:hjda8u$t4k$1...@speranza.aioe.org...

my recommendation:
in this case, it might actually be better advised to generate the function
as a chunk of arch-specific ASM or machine code (ASM is preferable IMO, but
requires an assembler...), which could then be located wherever (such as the
heap).

the reason for suggesting this is that, for many archs, relocating compiled
code (for example, via memcpy) may very well cause it to break. at least
with custom ASM, one can be more certain that the code will survive the
relocation.

another possibility would be to compile some code itself as a relocatable
module (such as an ELF or COFF object or image or whatever is common on the
arch), which can then be stored as a glob of binary data (this can be done
fairly easily by writing a tool to convert the module into an array of
bytes in C syntax which can be linked into the image). when needed, this
module is itself relocated to the target address, and jumped to.

this would allow more complex modules to be used (and is less effort in the
non-trivial case than writing it in ASM or raw machine code would be).


keep in mind that there are no really "good" or "general purpose" ways to do
these sorts of tasks.


> Thanks.


Grant Edwards

Jan 22, 2010, 8:32:35 PM
On 2010-01-23, BGB / cr88192 <cr8...@hotmail.com> wrote:

> in this case, it might actually be better advised to generate the function
> as a chunk of arch-specific ASM or machine code (ASM is preferable IMO, but
> requires an assembler...), which could then be located wherever (such as the
> heap).

IMO, the "right" thing to do is to tell the compiler to put the
function into a separate section and then have it linked so
that it's "located" to run in RAM at the proper address but
stored in ROM.

That way you know the code will work correctly when it's run
from RAM. Defining appropriate symbols in the linker command
file will allow the program to refer to the start and end of
the section's address in ROM.

The OP needs to spend some time studying the manuals for his
compiler and linker.

Keith Thompson

Jan 22, 2010, 8:40:55 PM
WangoTango <Asga...@mindspring.com> writes:
> In article <hjda8u$t4k$1...@speranza.aioe.org>, jo...@nospam.com says...
>> I need to know the size of a function or module because I need to
>> temporarily relocate the function or module from flash into sram to
>> do firmware updates.
>>
>> How can I determine that at runtime? The
>> sizeof( myfunction)
>> generates an error: "size of function unknown".
>>
> Good question, and I would like to know if there is an easy way to do it
> during runtime, and a portable way would be nice too. I would probably
> look at the map file and use the size I calculated from there, but
> that's surely not runtime.
>
> You can get the starting address of the function pretty easy, but how
> about the end? Hmmm, gotta' think about that.

You can't even portably assume that &func is the memory address of the
beginning of the function. I think there are systems (AS/400) where
function pointers are not just machine addresses.

Given whatever it is you're doing, you're probably not too concerned
with portability, so that's likely not to be an issue. But there's no
portable way in C to determine the size of a function, so you're more
likely to get help somewhere other than comp.lang.c.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Tim Wescott

Jan 22, 2010, 11:23:51 PM
On Sat, 23 Jan 2010 12:03:18 +1300, David Empson wrote:

> john <jo...@nospam.com> wrote:
>
>> I need to know the size of a function or module because I need to
>> temporarily relocate the function or module from flash into sram to do
>> firmware updates.
>
> In general, C does not provide a mechanism to find the size of a
> function. Some compilers might implement sizeof(function) but it is not
> standard C.
>
> If your compiler always outputs functions to the object code in the same
> order as they appear in the source code, you could take the address of
> the next function and the address of the function in question, convert
> them to (char *) and get the difference between them. This assumes you
> never rearrange your source code - comment well!

I've seen it done like this:

whatever my_eeprom_burning_code()
{
// insert deathless prose here
}

void end_my_eeprom_burning_code(void)
{
}

As long as the second function doesn't get optimized away or moved,
you're home free.

--
www.wescottdesign.com

Tim Wescott

Jan 22, 2010, 11:26:53 PM

Check your tools -- newer ones will often let you set the segment of a
function (usually with something like "#pragma ramcode"), and many of
those will automatically load a function into ram at startup.

If you can't do it on a function-by-function basis, you may be able to do
it file-by-file, or coerce the linker to relocate the text segment from
one whole object file into a segment of your choosing.

Then you either put that segment (with just your magic function(s)) into
RAM yourself, or you find out that your linker/startup code will do it
for you.

--
www.wescottdesign.com

David Empson

Jan 23, 2010, 12:07:34 AM
Tim Wescott <t...@seemywebsite.com> wrote:

Except if the compiler outputs the functions in reverse order, as one
I've used does (which means you need a "begin_my_eeprom_burning_code"
dummy function instead). You need to know the pattern generated by your
particular compiler, which might depend on factors other than the order
the functions appear in the source code.

--
David Empson
dem...@actrix.gen.nz

Nobody

Jan 23, 2010, 12:10:18 AM
On Fri, 22 Jan 2010 22:53:18 +0000, john wrote:

> I need to know the size of a function or module because I need to
> temporarily relocate the function or module from flash into sram to
> do firmware updates.

Do you need to be able to run it from RAM? If so, simply memcpy()ing it
may not work. And you would also need to copy anything which the function
calls (just because there aren't any explicit function calls in the source
code, that doesn't mean that there aren't any in the resulting object code).

Mark Borgerson

Jan 23, 2010, 12:28:55 AM
In article <pan.2010.01.23....@nowhere.com>,
nob...@nowhere.com says...
At the expense of a few words of code and a parameter, you could do


int MoveMe(...., bool findend){
    if(!findend){

        // do all the stuff the function is supposed to do

    } else Markend();
}


Where Markend is a function that pulls the return
address off the stack and stashes it somewhere
convenient. Markend may have to have some
assembly code. External code can then
subtract the function address from the address
stashed by Markend(), add a safety margin, and
know how many bytes to move to RAM.


Mark Borgerson

Paul Keinanen

Jan 23, 2010, 12:57:14 AM
On Fri, 22 Jan 2010 22:53:18 +0000 (UTC), john <jo...@nospam.com>
wrote:

>I need to know the size of a function or module because I need to
>temporarily relocate the function or module from flash into sram to
>do firmware updates.

Do you actually want to execute that function while it is in RAM, or
just store some bytes somewhere safe during the update?

If you intend to execute it, the code must be position independent, e.g.
using PC-relative branches. However, for accessing fixed addresses, such
as memory-mapped peripherals, some absolute addressing mode must be
used; PC-relative addressing cannot be used there.


jacob navia

Jan 23, 2010, 3:03:37 AM
Mark Borgerson wrote:

Sorry Mark but this is totally WRONG!

The return address contains the address where the CPU RETURNS TO
when the current function is finished, not the end of the
current function!!!

The return address will be in the middle of another function: the one
that CALLED this one.

Gordon Burditt

Jan 23, 2010, 3:44:53 AM
>In general, C does not provide a mechanism to find the size of a
>function. Some compilers might implement sizeof(function) but it is not
>standard C.

There might not even be a clearly-defined *definition* for "the size of
a function". One obvious problem is if it's inlined several times in
several places. Where, exactly, does the function "start"? Where
does it "end"?

It's also possible that code is shared between functions.
If several functions have many places where the code does
something like:

errno = EINVAL;
unlock_critical_section();
return -1;

the compiler might generate code for that once and branch to it
from all of the functions. This is more likely to happen if "return"
isn't a one-instruction method of restoring registers, a stack frame
pointer if applicable, and adjusting the stack, if there is one.
It's the embedded processors that are more likely to need multiple
instructions to return from a function. They are also more likely
to need aggressive optimization for code space.

You could end up with the strange math that a compilation unit
containing 4 functions has 16k of code, but any one of the functions
needs 10k of code.

>If your compiler always outputs functions to the object code in the same
>order as they appear in the source code,

That's a rather dubious assumption in the presence of inlining (and
C90 doesn't have a way to stop inlining). Also, the assumption
that a function consists of a contiguous block of code (worry about
the *data* later) dedicated to that function only is not guaranteed
to hold. In practice, it will probably work OK if you don't turn on
aggressive optimization.

Stefan Reuther

Jan 23, 2010, 7:05:39 AM
john wrote:
> I need to know the size of a function or module because I need to
> temporarily relocate the function or module from flash into sram to
> do firmware updates.
>
> How can I determine that at runtime?

You can't in standard C, because functions are not contiguous objects.

Most environments have some way of placing a function in a special
section (using pragmas or things like __attribute__), and a possibility
to acquire position and size of that section (using linker magic).

In general, you cannot assume a function generates just a single blob of
assembly code in the ".text" sections. For example, functions containing
string or floating-point literals, or large switches, often generate
some data in ".rodata", static variables end up in ".data" or ".bss",
and if you're doing C++, you'll get some exception handling tables as well.


Stefan

Flash Gordon

Jan 23, 2010, 7:15:16 AM
jacob navia wrote:
> john wrote:
>> Hi,
>>
>> I need to know the size of a function or module because I need to
>> temporarily relocate the function or module from flash into sram to
>> do firmware updates.
>>
>> How can I determine that at runtime? The
>> sizeof( myfunction)
>> generates an error: "size of function unknown".
>>
>> Thanks.
>
> (1)

<snip lots of methods>

> Method (4) needs a disassembler, and a LOT of parsing work,
> and it is very sensitive to compilation options.

You forgot to mention the method which, in my experience, is by far the
best, most reliable, and easiest method.

Read the manual!

This is NOT a glib suggestion. On the one occasion when I needed to do
something similar, but for different reasons, I read the manual and, lo
and behold, the implementation documented a nice and relatively easy way
to achieve what I wanted. In fact, using any other method was almost
guaranteed to produce a function that did not work correctly. After all,
there could be references to absolute addresses which would be wrong
after the code is moved!

In my case, it was a matter of compiling the function into a specific
section (I can't remember how now) and then telling the linker to locate
the section at one address for programming into the ROM, but to set up
all the addresses as if it had been placed in another section. Then some
link-time constants (I can't remember the details) were used for moving
it. As I say, it was all fully documented in the manuals!

There is every chance that someone on comp.arch.embedded might know how
to do it on the target platform, if the OP specifies the target platform.
--
Flash Gordon

BGB / cr88192

Jan 23, 2010, 11:30:41 AM

"Grant Edwards" <inv...@invalid.invalid> wrote in message
news:hjdjjj$njp$1...@reader1.panix.com...

> On 2010-01-23, BGB / cr88192 <cr8...@hotmail.com> wrote:
>
>> in this case, it might actually be better advised to generate the
>> function
>> as a chunk of arch-specific ASM or machine code (ASM is preferable IMO,
>> but
>> requires an assembler...), which could then be located wherever (such as
>> the
>> heap).
>
> IMO, the "right" thing to do is to tell the compiler to put the
> function into a separate section and then have it linked so
> that it's "located" to run in RAM at the proper address but
> stored in ROM.
>
> That way you know the code will work correctly when it's run
> from RAM. Defining approprate symbols in the linker command
> file will allow the program to refer to the start and end of
> the section's address in ROM.
>

this is a little closer to the second option, of having a secondary image
file embedded as data...


> The OP needs to spend some time studying the manuals for his
> compiler and linker.
>

this is, assuming the linker or image format actually supports the "separate
section" idea...

dunno about ELF, but PE/COFF would not support this, since it would require
breaking some of the internal assumptions of the file format (for example,
that the image is contiguous from ImageBase to ImageBase+ImageSize, ...).

ELF may have similar restrictions (actually, I think most ELF images are
position independent anyways, so one could relocate and adjust the GOT for
an image easily enough).

(note that embedding an additional PE/COFF or ELF image would not likely be
"that difficult", and the formats are not particularly difficult to work
with).
a fixed-address PE/COFF image is likely an easy case, since one can copy the
contents of the sections and then call into it.

for fixed-address, producing a raw binary image (supported by GNU ld, ...)
is also probably a good option, since in this case the resulting image can
be copied as a raw chunk of data (no need to relocate or worry about
file-format), and jumped into.


can't say so much about other file formats though...


James Harris

Jan 23, 2010, 1:26:17 PM
On 23 Jan, 08:03, jacob navia <ja...@nospam.org> wrote:

...

> > At the expense of a few words of code and a parameter, you could do
>
> > int MoveMe(...., bool findend){
> >    if(!findend){
>
> >    // do all the stuff the function is supposed to do
>
> >    } else Markend();
>
> > }
>
> > Where Markend is a function that pulls the return
> > address off the stack and stashes it somewhere
> > convenient.

...

> Sorry Mark but this is totally WRONG!
>
> The return address contains the address where the CPU RETURNS TO
> when the current function is finished, not the end of the
> current function!!!

So in Mark's example what will it be in Markend()?

> The return address will be in the middle of another function, that CALLED
> this one.

i.e. Moveme()?

James

Mark Borgerson

Jan 23, 2010, 1:28:06 PM
In article <hjeah3$10g$1...@speranza.aioe.org>, ja...@nospam.org says...
> Mark Borgerson wrote:

I think you missed a few points:

Inside Markend, the return address on the stack will be the address
just after the call to Markend, which was purposely located at the end
of MoveMe. The next few instructions after the call to Markend will be
the return from MoveMe (an RTS or equivalent, with stack cleanup).

So inside Markend, the return address on the stack is an address near
the end of MoveMe. It is that address that you need to save and make
available for the computation of the function length.

In assembly, the code in Moveme might look like this:

0900 MoveMe: sub.l #8, SP // make room for 8 bytes of locals
0904 test.l R14 // check the findend parameter in R14
0908 bne lbl1; // if true, just find end of function
....
.... // all the work of Moveme goes here
.... // and gets executed when findend is zero
....
1000 bra lbl2 // skip the markend call
1004 lbl1: bsr Markend
1008 lbl2: add.l #8, SP // clean up 8 bytes of local variables
1012 rts // return from MoveMe

When Markend is called at 1004, the address 1008 gets pushed on the
stack.

Inside Markend, you could do:

2040 Markend: Move SP, NearEnd // NearEnd is a global variable
2044 RTS

Someplace else, could do

MMLength = NearEnd - (unsigned long)&Moveme + 4;


When I was teaching introductory M68K assembly language, I used
to give exam problems with nested subroutine calls like this---some
with pushed local variables, and ask the students to show
the contents of the stack at some point in the function.
Those questions really separated the As from the Bs and
Cs!

NOTE: You have to make sure that your compiler doesn't convert
the Markend function to an inline sequence of instructions.


Mark Borgerson

Mark Borgerson

Jan 23, 2010, 1:35:07 PM
In article <hjes74...@stefan.msgid.phost.de>, stefa...@arcor.de
says...
That's a real good point. If the OP's goal was just to move the
function code--and not necessarily execute it after movement, he
may not care whether the bytes in the .rodata, .data, or .bss
segments get moved.

If the function has to be moved and executed, then it had better be
able to access the data in the .rodata, .data and .bss segments, or
else not use data in any of those segments that are in flash memory.

If you're moving the function to RAM because you can't execute
from Flash while updating flash, the function being moved
could be written to use only variables and data in RAM. This
might be the case if the function being moved is the Flash
write routine.

Now that I think about it, I may use this approach in writing
a firmware update routine for the MSP430---which has the
restrictions mentioned above.


Mark Borgerson

Ben Pfaff

Jan 23, 2010, 1:46:26 PM
Mark Borgerson <mborg...@comcast.net> writes:

> At the expense of a few words of code and a parameter, you could do
>
>
> int MoveMe(...., bool findend){
> if(!findend){
>
> // do all the stuff the function is supposed to do
>
> } else Markend();
>
> }
>
>
> Where Markend is a function that pulls the return
> address off the stack and stashes it somewhere
> convenient. Markend may have to have some
> assembly code. External code can then
> subtract the function address from the address
> stashed by Markend(), add a safety margin, and
> know how many bytes to move to RAM.

You seem to be assuming that the compiler emits machine code that
is in the same order as the corresponding C code, i.e. that the
call to Markend() will occur at the end of MoveMe(). This is not
a good assumption.
--
"A lesson for us all: Even in trivia there are traps."
--Eric Sosman

Mark Borgerson

Jan 23, 2010, 2:29:04 PM
In article <MPG.25c4df83a...@news.eternal-september.org>,
mborg...@comcast.net says...

Yikes! I'll have to mark myself down 5 points!!!

That should be
2040 Markend: Move @SP, NearEnd // NearEnd is a global variable

I need to save the data pointed to by the stack pointer, not the
contents of the stack pointer itself.


>
> Someplace else, could do
>
> MMLength = NearEnd - (unsigned long)&Moveme + 4;
>
>
> When I was teaching introductory M68K assembly language, I used
> to give exam problems with nested subroutine calls like this---some
> with pushed local variables, and ask the students to show
> the contents of the stack at some point in the function.
> Those questions really separated the As from the Bs and
> Cs!
>
> NOTE: You have to make sure that your compiler doesn't convert
> the Markend function to an inline sequence of instructions.
>
>

I also realized that, on the MSP430, I don't even need the
function call. At the end of the function whose
length I want to determine, I simply add the assembly
language:

mov PC, NearEnd


Both these methods do require some assembly language and
are processor dependent. The compiler that I'm using on
the MSP430 (Imagecraft), allows inline assembly, so
the instruction above would be

asm("mov PC, %NearEnd\n"); // the % is used to reference a C variable


I'm reasonably confident that I can use this technique to move
a flash-write routine, but I will have to be very careful
about using global variables, since the compiler produces
PC relative references to global and static variables. Those
references will be hosed when the code is moved.

Mark Borgerson

Mark Borgerson

Jan 23, 2010, 2:37:26 PM
In article <87y6jop...@blp.benpfaff.org>, b...@cs.stanford.edu
says...
I'll paraphrase the old Reagan maxim: "assume, but verify". I
did a test run with an MSP-430 compiler and the call was at
the end. For that particular processor, as I later discovered
and noted in another post, you don't even need the
function call. You can save the contents of the PC at the
end of the function with a line of assembly.


This would certainly be a dangerous technique on a processor
with multi-threading and possible out-of-order execution.
I think it will work OK on the MSP430 that is the CPU where
I am working on a flash-burning routine.


Mark Borgerson

bartc

Jan 23, 2010, 2:46:47 PM

"Mark Borgerson" <mborg...@comcast.net> wrote in message
news:MPG.25c428eeb...@news.eternal-september.org...

> In article <pan.2010.01.23....@nowhere.com>,
> nob...@nowhere.com says...
>> On Fri, 22 Jan 2010 22:53:18 +0000, john wrote:
>>
>> > I need to know the size of a function or module because I need to
>> > temporarily relocate the function or module from flash into sram to
>> > do firmware updates.
>>
>> Do you need to be able to run it from RAM? If so, simply memcpy()ing it
>> may not work. And you would also need to copy anything which the function
>> calls (just because there aren't any explicit function calls in the
>> source
>> code, that doesn't mean that there aren't any in the resulting object
>> code).
>>
>>
> At the expense of a few words of code and a parameter, you could do
>
>
> int MoveMe(...., bool findend){
> if(!findend){
>
> // do all the stuff the function is supposed to do
>
> } else Markend();
>
> }
>

If you're going to add a special parameter (and assume the return type is
compatible with a return address), it might be possible to use gcc's feature
of obtaining the address of a label.

Then findend can return the address of a label placed near the closing brace
of the function (which possibly may be less likely to be rearranged than a
function call).

int MoveMe(...., bool findend){
if(findend) return (int)&&endoffunction;

// do all the stuff the function is supposed to do

endoffunction:
return 0;
}

--
Bartc

Ben Pfaff

Jan 23, 2010, 2:59:14 PM
Mark Borgerson <mborg...@comcast.net> writes:

> In article <87y6jop...@blp.benpfaff.org>, b...@cs.stanford.edu
> says...
>> Mark Borgerson <mborg...@comcast.net> writes:
>> You seem to be assuming that the compiler emits machine code that
>> is in the same order as the corresponding C code, i.e. that the
>> call to Markend() will occur at the end of MoveMe(). This is not
>> a good assumption.
>

> This would certainly be a dangerous technique on a processor
> with multi-threading and possible out-of-order execution.
> I think it will work OK on the MSP430 that is the CPU where
> I am working on a flash-burning routine.

Threading and out-of-order execution has little if anything to do
with it. The issue is the order of the code emitted by compiler,
not the order of the code's execution.
--
Ben Pfaff
http://benpfaff.org

Mark Borgerson

Jan 23, 2010, 4:17:45 PM
In article <87tyucp...@blp.benpfaff.org>, b...@cs.stanford.edu
says...
> Mark Borgerson <mborg...@comcast.net> writes:
>
> > In article <87y6jop...@blp.benpfaff.org>, b...@cs.stanford.edu
> > says...
> >> Mark Borgerson <mborg...@comcast.net> writes:
> >> You seem to be assuming that the compiler emits machine code that
> >> is in the same order as the corresponding C code, i.e. that the
> >> call to Markend() will occur at the end of MoveMe(). This is not
> >> a good assumption.
> >
> > This would certainly be a dangerous technique on a processor
> > with multi-threading and possible out-of-order execution.
> > I think it will work OK on the MSP430 that is the CPU where
> > I am working on a flash-burning routine.
>
> Threading and out-of-order execution has little if anything to do
> with it. The issue is the order of the code emitted by compiler,
> not the order of the code's execution.
>
But wouldn't an optimizing compiler generating code for a
complex processor be more likely to optimize in
a way that changes the order of operations? I think
that might apply particularly to a call to a function
that returns no result to be used in a specific
place inside the outer function.


Mark Borgerson

Willem

Jan 23, 2010, 4:35:45 PM
Mark Borgerson wrote:
) In article <87tyucp...@blp.benpfaff.org>, b...@cs.stanford.edu
) says...
)> Mark Borgerson <mborg...@comcast.net> writes:
)>
)> > In article <87y6jop...@blp.benpfaff.org>, b...@cs.stanford.edu
)> > says...
)> >> Mark Borgerson <mborg...@comcast.net> writes:
)> >> You seem to be assuming that the compiler emits machine code that
)> >> is in the same order as the corresponding C code, i.e. that the
)> >> call to Markend() will occur at the end of MoveMe(). This is not
)> >> a good assumption.
)> >
)> > This would certainly be a dangerous technique on a processor
)> > with multi-threading and possible out-of-order execution.
)> > I think it will work OK on the MSP430 that is the CPU where
)> > I am working on a flash-burning routine.
)>
)> Threading and out-of-order execution has little if anything to do
)> with it. The issue is the order of the code emitted by compiler,
)> not the order of the code's execution.
)>
) But woudn't an optimizing compiler generating code for a
) complex processor be more likely to compile optimize in
) a way that changed the order of operations? I think
) that might apply particularly to a call to a function
) that returns no result to be used in a specific
) place inside the outer function.

More specifically, it could generate code like this:
(example in pseudocode)

(begin MoveMe)
TEST var
SKIP NEXT on zero
JUMP Markend
... ; the rest of the code
RETURN
(end MoveMe)


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT

Jon Kirwan

Jan 23, 2010, 5:10:09 PM

Ben quite correctly brought you up short on the right point.
Your example was, just to refresh ourselves:

>: int MoveMe ( ...., bool findend ) {
>: if ( !findend ) {
>: // do normal function stuff
>: } else
>: Markend();
>: }

Let's divert from this for a moment and take the case of a
for-loop in c. It looks like:

>: for ( init-block; condition; iterate-block )
>: body-block;

A compiler will often translate this into this form:

>: init-block;
>: goto A;
>: B: body-block;
>: C: iterate-block;
>: A: if ( condition ) goto B;
>: D:

(The reason for the C label is to support the continue-
statement and the reason for the D label is to support a
break-statement, of course.)

The straight interpretation would have been more like this:

>: init-block;
>: A: if ( !condition ) goto D;
>: B: body-block;
>: C: iterate-block;
>: goto A;
>: D:

But note that the execution of the for-loop's main body,
presumed by the compiler to have "many iterations" as a
reasonable guess, includes execution for the "goto A"
statement in each and every iteration. But so is, in effect,
the conditional test, too. In other words, it takes longer
to execute the body, even if that only means the execution of
one jump instruction. It's more efficient to redesign the
model used by the compiler to the first example I gave,
merely because the c compiler takes the position that the
added one-time execution of the first "goto A" will be the
lower cost approach (which it almost always will be.)

Now let's assume that the compiler takes the position that
the first case of an if-statement section is the more
frequently travelled one. In other words, when the
conditional case is executed, it will more often be "true"
than "false." The model used might very well then be to
convert:

>: if ( condition )
>: s1-block;
>: else
>: s2-block;

into:

>: if ( !condition ) goto A;
>: s2-block;
>: goto B;
>: A: s1-block;
>: B:

This provides s1-block execution with one less jump and
therefore lets it execute slightly faster with the idea that
it is the preferred path.

So let's revisit your example again in this light:

>: int MoveMe ( ...., bool findend ) {
>: if ( !findend ) {
>: // do normal function stuff
>: } else
>: Markend();
>: }

This _may_ be taken by a c compiler to be:

>: int MoveMe ( ...., bool findend ) {
>: if ( findend ) goto A;
>: Markend();
>: goto B;
>: A: // do normal function stuff
>: B:
>: }

Leaving your function call to Markend not exactly where you'd
have liked to see it occur.

An old book you can pick up talking about a method used to
_explicitly_ inform the compiler about statistics of branch
likelihoods is the Ph.D. thesis by John Ellis:

http://mitpress.mit.edu/catalog/item/default.asp?tid=5098&ttype=2

Worth a read, some snowy day.

Jon

Mark Borgerson

Jan 23, 2010, 6:07:23 PM
In article <slrnhlmqth....@turtle.stack.nl>, wil...@stack.nl
says...

I've actually seen constructs like that intentionally
coded in assembly language, since it saves the
address push and pop you would need in a branch
to a subroutine. I haven't seen it recently
in compiler output, but that may be because I
limit optimization to make debugging easier. Since
I do limited numbers of systems in a niche market,
I save money by spending a few extra dollars on
more memory and CPU cycles if it saves me a few
hours of debugging time.


In any of these instances, I would certainly review
the assembly code to make sure the compiler was doing
what I intended in the order I wanted. Maybe programmers
in comp.lang.c don't do that as often as programmers
in comp.arch.embedded. ;-)


Mark Borgerson

Mark Borgerson

Jan 23, 2010, 6:18:25 PM
In article <fvqml5lfdi277l5ri...@4ax.com>,
jo...@infinitefactors.org says...

I've also run across main processing loops such as

void MainLoop(void) {
    while(1){
        get user input
        execute commands
    }
    MarkEnd();
}

where MarkEnd doesn't appear in the generated machine
code, because the compiler, even at lowest optimization
setting, recognizes that the code after the loop
will never get executed.

That could certainly occur. I would be interested in the logic
that could come to the conclusion that one or the other
of the branches would be more likely to occur. I guess the
compiler could check all the calls to MoveMe and compare the
number of times the findend parameter was true and false. However that
might be pretty difficult if a variable was used.

Still, a good reason, as I've said in other posts, to look
at the resulting assembly language. I did it for one
MSP430 compiler, and it worked the way I wanted. YMMV.

I wonder how many compilers would make that kind of optimization
and under which optimization settings.


>
> An old book you can pick up talking about a method used to
> _explicitly_ inform the compiler about statistics of branch
> likelihoods is the Ph.D. thesis by John Ellis:
>
> http://mitpress.mit.edu/catalog/item/default.asp?tid=5098&ttype=2
>
> Worth a read, some snowy day.

Those are pretty rare in Corvallis. Only one or two so far this winter.
Now, rainy days----those I get in plentitude!
>
Mark Borgerson


Jon Kirwan

Jan 23, 2010, 7:44:58 PM

Yes, of course. That is another possibility. The intended
function may be essentially the "main loop" of the code and
as such never returns. However, whether or not MarkEnd()
were optimized out, it wouldn't ever get executed anyway. So
you'd never get the address stuffed into something useful...
and so it doesn't even matter whether the compiler kept
the function call. So it makes a good case against your
approach for an entirely different reason than optimization
itself.

I wasn't suggesting that the optimizer includes a feature
where it "tries" to adduce the likelihood. I was suggesting
the idea that the compiler writer makes the 'a priori'
decision that it is.

Think of it this way. Ignorant of application specific
information, the compiler writer has two options to take when
considering the if..else case's approach. Regardless of
which way the compiler author chooses, one of the two blocks
will get a run-time preference. So, does the compiler author
_choose_ to prefer the if-case or the else-case? Without
knowledge, which way would _you_ decide to weigh in on?
Either way you go, you are making a choice. No escaping that
fact.

Now, the Bulldog compiler provides a way for the author of
the code to supply known information _or_ to use run-time
profiling to provide that information, automatically. But
I'm not talking about this case. That's for another
discussion. I only pointed that out for leisure reading. Not
as a point in this specific discussion.

>I guess the
>compiler could check all the calls to MoveMe and compare the
>number of times the findend parameter was true and false. However that
>might be pretty difficult if a variable was used.

Run-time profiling could provide that information. But that
wasn't anything I wanted you worrying over in this talk. It
distracts from the central point -- which is that a compiler
writer, sans application knowledge and sans anything in the
compiler or compiler syntax provided to the application coder
to better inform him/her about which way to go, must choose.
Either prefer the if-case or prefer the else-case. There is
no other option. So which way would you go?

>Still, a good reason, as I've said in other posts, to look
>at the resulting assembly language. I did it for one
>MSP430 compiler, and it worked the way I wanted. YMMV.

Indeed. I think the point here is that one is left entirely
to the vagaries of the compiler author. And on that point,
they may decide to go either way. There is NOTHING in the c
language itself to help them decide which is better.

>I wonder how many compilers would make that kind of optimization
>and under which optimization settings.

I think this question is moot. The point I was making
remains even _without_ optimizations that may help inform the
compiler about frequency of execution. So there is no need
to argue this point.

>> An old book you can pick up talking about a method used to
>> _explicitly_ inform the compiler about statistics of branch
>> likelihoods is the Ph.D. thesis by John Ellis:
>>
>> http://mitpress.mit.edu/catalog/item/default.asp?tid=5098&ttype=2
>>
>> Worth a read, some snowy day.
>
>Those are pretty rare in Corvallis. Only one or two so far this winter.
>Now, rainy days----those I get in plentitude!
>
>Mark Borgerson

Hehe. I live near Mt. Hood at an elevation of about 1000'
ASL. So I get three feet of snow and ice, from time to time.
I've had to use my JD 4320 tractor on more than one occasion!
:)

Jon

Mark Borgerson

Jan 23, 2010, 9:41:37 PM
In article <uq4nl51thfeujhkgk...@4ax.com>,
jo...@infinitefactors.org says...

> On Sat, 23 Jan 2010 15:18:25 -0800, Mark Borgerson
> <mborg...@comcast.net> wrote:
>
<<SNIP discussion of unpredictable, but legal, compiler behavior>>

> >
> >I've also run across main processing loops such as
> >
> >void MainLoop(void) {
> > while(1){
> > get user input
> > execute commands
> > }
> > MarkEnd();
> >}
> >
> >where MarkEnd doesn't appear in the generated machine
> >code, because the compiler, even at lowest optimization
> >setting, recognizes that the code after the loop
> >will never get executed.
>
> Yes, of course. That is another possibility. The intended
> function may be essentially the "main loop" of the code and
> as such never returns. However, whether or not MarkEnd()
> were optimized out, it wouldn't ever get executed anyway. So
> you'd never get the address stuffed into something useful...
> and so it doesn't even matter whether the compiler kept
> the function call. So it makes a good case against your
> approach for an entirely different reason than optimization
> itself.

Well, I would not dream of using this approach on a function
that never returns. OTOH a flash-update routine had better
return, or it won't be particularly useful (unless your goal
is to test the write-endurance of the flash) ;-)
>
<<SNIP>>


> >> An old book you can pick up talking about a method used to
> >> _explicitly_ inform the compiler about statistics of branch
> >> likelihoods is the Ph.D. thesis by John Ellis:
> >>
> >> http://mitpress.mit.edu/catalog/item/default.asp?tid=5098&ttype=2
> >>
> >> Worth a read, some snowy day.
> >
> >Those are pretty rare in Corvallis. Only one or two so far this winter.
> >Now, rainy days----those I get in plentitude!
> >
> >Mark Borgerson
>
> Hehe. I live near Mt. Hood at an elevation of about 1000'
> ASL. So I get three feet of snow and ice, from time to time.
> I've had to use my JD 4320 tractor on more than one occasion!
> :)
>

Mark Borgerson

Grant Edwards

Jan 23, 2010, 10:37:37 PM
On 2010-01-23, BGB / cr88192 <cr8...@hotmail.com> wrote:
>
> "Grant Edwards" <inv...@invalid.invalid> wrote in message
> news:hjdjjj$njp$1...@reader1.panix.com...
>> On 2010-01-23, BGB / cr88192 <cr8...@hotmail.com> wrote:
>>
>>> in this case, it might actually be better advised to generate
>>> the function as a chunk of arch-specific ASM or machine code
>>> (ASM is preferable IMO, but requires an assembler...), which
>>> could then be located wherever (such as the heap).
>>
>> IMO, the "right" thing to do is to tell the compiler to put
>> the function into a separate section and then have it linked
>> so that it's "located" to run in RAM at the proper address but
>> stored in ROM.
>>
>> That way you know the code will work correctly when it's run
>> from RAM. Defining approprate symbols in the linker command
>> file will allow the program to refer to the start and end of
>> the section's address in ROM.
>
> this is a little closer to the second option, of having a
> secondary image file embedded as data...

Yup, it's pretty much exactly that.

>> The OP needs to spend some time studying the manuals for his
>> compiler and linker.
>
> this is, assuming the linker or image format actually supports
> the "separate section" idea...

Every C compiler/toolchain I've used for embedded systems
development for the past 25 years supported things like that.
If his tools don't support multiple sections, then the first
order of business is to find a decent toolchain.

> dunno about ELF,

ELF supports multiple sections, and I've done exactly such
things with ELF-based toolchains (Gnu binutils and GCC) when
working on stuff like bootloaders where the memory map changes
completely part-way through the program as the memory
controller gets configured.

> but PE/COFF would not support this, since it would require
> breaking some of the internal assumptions of the file format
> (for example, that the image is continuous from ImageBase to
> ImageBase+ImageSize, ...).
>
> ELF may have similar restrictions (actually, I think most ELF
> images are position independent anyways,

That depends on the compiler options and linker command file.
In my experience, "executable" ELF files on embedded systems
(images that are ready to load into RAM and run) are generally
not relocatable.

> can't say so much about other file formats though...

The COFF-based toolchains I've used all seem to support
multiple sections, but that may have been due to
vendor-specific extensions.

--
Grant


BGB / cr88192

Jan 24, 2010, 12:44:57 AM

"Grant Edwards" <inv...@invalid.invalid> wrote in message
news:hjgfa1$jcd$1...@reader1.panix.com...

> On 2010-01-23, BGB / cr88192 <cr8...@hotmail.com> wrote:
>>
>> "Grant Edwards" <inv...@invalid.invalid> wrote in message
>> news:hjdjjj$njp$1...@reader1.panix.com...
>>> On 2010-01-23, BGB / cr88192 <cr8...@hotmail.com> wrote:
>>>

<snip>

>>
>> this is a little closer to the second option, of having a
>> secondary image file embedded as data...
>
> Yup, it's pretty much exactly that.
>

ok.


>>> The OP needs to spend some time studying the manuals for his
>>> compiler and linker.
>>
>> this is, assuming the linker or image format actually supports
>> the "separate section" idea...
>
> Every C compiler/toolchain I've used for embedded systems
> development for the past 25 years supported things like that.
> If his tools don't support multiple sections, then the first
> order of business is to find a decent toolchain.
>

well, I haven't personally had much experience with embedded systems, so I
am not sure here.


>> dunno about ELF,
>
> ELF supports multiple sections, and I've done exactly such
> things with ELF-based toolchains (Gnu binutils and GCC) when
> working on stuff like bootloaders where the memory map changes
> completely part-way through the program as the memory
> controller gets configured.
>

yes, I know it has multiple sections, but AFAIK it is generally assumed that
the final image is in a contiguous region of memory (with the sections
generally packed end-to-end), at least in the cases I have seen. granted,
those cases have usually been x86 ELF built as PIC (the default
build for Linux).


>> but PE/COFF would not support this, since it would require
>> breaking some of the internal assumptions of the file format
>> (for example, that the image is continuous from ImageBase to
>> ImageBase+ImageSize, ...).
>>
>> ELF may have similar restrictions (actually, I think most ELF
>> images are position independent anyways,
>
> That depends on the compiler options and linker command file.
> In my experience, "executable" ELF files on embedded systems
> (images that are ready to load into RAM and run) are generally
> not relocatable.
>

interesting...

well, I have really only seen ELF on Linux on x86, and there it is almost
invariably position-independent.

granted, I don't know what other systems do...


>> can't say so much about other file formats though...
>
> The COFF-based toolchains I've used all seem to support
> multiple sections, but that may have been due to
> vendor-specific extensions.
>

COFF has multiple sections, but PE/COFF (in particular) also has ImageBase
and ImageSize fields (I forget their exact official names right off), which
are located in the optional header, which is mandatory in PE/COFF, and also
contains things like the subsystem (Console, GUI, ...), and references to
the import and export tables (related to DLL's), ...

AFAIK, PE/COFF also tends to assume that the image is contiguous between
these addresses, and also that all loadable sections be between them (doing
otherwise could break the DLL / EXE loader). however, they may support
additional "non-loadable" sections, which AFAIK need not obey this (but are
usually ignored by the loader).

granted, to really know, I would have to dig around more closely in the
PE/COFF spec (and Microsoft's sometimes confusing writing style, which
caused great fun in a few cases when trying to debug my custom EXE/DLL
loader...).

however, I can't say much about how much of this is common with other
variants of COFF (IOW: the ones which don't necessarily begin with an MS-DOS
stub, ...).

nevermind the added layer of hackery needed for .NET ...

I guess it all depends on whether the particular linker for the
particular target supports non-contiguous images, or whether alternative
means would be needed instead...

or such...

Flash Gordon

Jan 24, 2010, 5:36:49 AM

<snip>

Sometimes you *do* write such functions so they never return. Whilst
reprogramming the flash it keeps kicking the watchdog, but it stops when
it's finished and the watchdog resets the system thus booting it in to
the new code. Or it might branch to the reset (or power-up) vector
rather than return. In fact, returning could easily be impossible
because the code from which the function was called is no longer there!
--
Flash Gordon

Flash Gordon

Jan 24, 2010, 5:05:51 AM
BGB / cr88192 wrote:
> "Grant Edwards" <inv...@invalid.invalid> wrote in message
> news:hjgfa1$jcd$1...@reader1.panix.com...
>> On 2010-01-23, BGB / cr88192 <cr8...@hotmail.com> wrote:
>>> "Grant Edwards" <inv...@invalid.invalid> wrote in message

<snip>

> well, I haven't personally had much experience with embedded systems, so I
> am not sure here.

Then believe people who have...

>>> dunno about ELF,
>> ELF supports multiple sections, and I've done exactly such
>> things with ELF-based toolchains (Gnu binutils and GCC) when
>> working on stuff like bootloaders where the memory map changes
>> completely part-way through the program as the memory
>> controller gets configured.
>>
>
> yes, I know it has multiple sections, but AFAIK it is generally assumed that
> the final image is in a contiguous region of memory (with the sections
> generally packed end-to-end), at least in the cases I have seen. granted, in
> cases I have seen, ELF has usually been x86 and PIC as well (the default
> build for Linux).

It simply isn't true for embedded systems.

>>> but PE/COFF would not support this, since it would require
>>> breaking some of the internal assumptions of the file format
>>> (for example, that the image is continuous from ImageBase to
>>> ImageBase+ImageSize, ...).
>>>
>>> ELF may have similar restrictions (actually, I think most ELF
>>> images are position independent anyways,
>> That depends on the compiler options and linker command file.
>> In my experience, "executable" ELF files on embedded systems
>> (images that are ready to load into RAM and run) are generally
>> not relocatable.
>
> interesting...
>
> well, I have really only seen ELF on Linux on x86, and there it is almost
> invariably position-independent.
>
> granted, I don't know what other systems do...

Then believe people who do...

<snip>

> I guess it all depends then on whether the particular linker for the
> particular target supports non-continuous images then, or if alternative
> means would be needed instead...
>
> or such...

Or you could believe people with experience on embedded systems. It is a
common requirement to have non-contiguous sections, and sections which
are loaded in to one location but run from another, and all sorts of
funky things. Sometimes you can execute code faster from RAM than ROM,
so you move the code at run-time, having boot loaders which get code
from one place (sometimes on a different processor) and put it in
another is common, and I've even had gaps in the ROM where there was
RAM! All of which means having separate sections which are not adjacent.

It's all specific to the given tool-chain as to the best way to achieve
it though.
--
Flash Gordon

Mark Borgerson

Jan 24, 2010, 12:19:01 PM
In article <efkt27x...@news.flash-gordon.me.uk>,
sm...@spam.causeway.com says...
Hmmmm. I hadn't thought of that watchdog idea, since TI recommends
shutting off the watchdog and disabling interrupts while programming
flash.

I also agree about the return not being normal---there's probably not
much chance returning to the address on the stack is going to
work out, so a reset is probably the best idea after a firmware
update.

I should have said that I wouldn't use this idea on a function
designed to run forever---or at least not one that the compiler
might think runs forever. I would also examine the
resulting code to make sure the compiler was doing what I
intended.


I think that the ideas I have described will work on some
processors and compilers for some functions, but not
on all compilers for all processors and functions. If you
do a lot of embedded systems programming, restrictions
like that are nothing new.


Mark Borgerson

Albert van der Horst

Jan 24, 2010, 3:02:18 PM
In article <hjda8u$t4k$1...@speranza.aioe.org>, john <jo...@nospam.com> wrote:
>Hi,

>
>I need to know the size of a function or module because I need to
>temporarily relocate the function or module from flash into sram to
>do firmware updates.
>
>How can I determine that at runtime? The
>sizeof( myfunction)
>generates an error: "size of function unknown".

Admit it: you're trying to do something that can't be done in C.
By far the simplest approach is to generate assembler code and
add a small amount of instrumentation to it.
Start by accessing the function through a subroutine pointer.
Then you can store an SRAM address there when needed.

>
>Thanks.


--
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

David Brown

Jan 24, 2010, 3:00:18 PM
Keith Thompson wrote:
> WangoTango <Asga...@mindspring.com> writes:
>> In article <hjda8u$t4k$1...@speranza.aioe.org>, jo...@nospam.com says...

>>> I need to know the size of a function or module because I need to
>>> temporarily relocate the function or module from flash into sram to
>>> do firmware updates.
>>>
>>> How can I determine that at runtime? The
>>> sizeof( myfunction)
>>> generates an error: "size of function unknown".
>>>
>> Good question, and I would like to know if there is an easy way to do it
>> during runtime, and a portable way would be nice too. I would probably
>> look at the map file and use the size I calculated from there, but
>> that's surely not runtime.
>>
>> You can get the starting address of the function pretty easy, but how
>> about the end? Hmmm, gotta' think about that.
>
> You can't even portably assume that &func is the memory address of the
> beginning of the function. I think there are systems (AS/400) where
> function pointers are not just machine addresses.
>

Closer to comp.arch.embedded, &func may not be the memory address of a
function on smaller micros with more than 64KB (or sometimes 64K words)
of flash. gcc for the AVR, for example, uses trampolines for function
pointers on devices with more than 64K words flash - &func gives the
address of a jump instruction in the lower 64K memory, which jumps to
the real function. That way you can use 16-bit function pointers with
larger memories.

> Given whatever it is you're doing, you're probably not too concerned
> with portability, so that likely not to be an issue. But there's no
> portable way in C to determine the size of a function, so you're more
> likely to get help somewhere other than comp.lang.c.
>

David Brown

Jan 24, 2010, 4:44:49 PM
Mark Borgerson wrote:
> In article <pan.2010.01.23....@nowhere.com>,
> nob...@nowhere.com says...
>> On Fri, 22 Jan 2010 22:53:18 +0000, john wrote:
>>
>>> I need to know the size of a function or module because I need to
>>> temporarily relocate the function or module from flash into sram to
>>> do firmware updates.
>> Do you need to be able to run it from RAM? If so, simply memcpy()ing it
>> may not work. And you would also need to copy anything which the function
>> calls (just because there aren't any explicit function calls in the source
>> code, that doesn't mean that there aren't any in the resulting object code).
>>
>>
> At the expense of a few words of code and a parameter, you could do
>
>
> int MoveMe(...., bool findend){
> if(!findend){
>
> // do all the stuff the function is supposed to do
>
> } else Markend();
>
> }
>
>
> Where Markend is a function that pulls the return
> address off the stack and stashes it somewhere
> convenient. Markend may have to have some
> assembly code. External code can then
> subtract the function address from the address
> stashed by Markend(), add a safety margin, and
> know how many bytes to move to RAM.
>
>
> Mark Borgerson
>

Anything that relies on the compiler being stupid, or deliberately
crippled ("disable all optimisations") or other such nonsense is a bad
solution. It is conceivable that it might happen to work - /if/ you can
get the compiler in question to generate bad enough code. But it is
highly dependent on the tools in question, and needs to be carefully
checked at the disassembly level after any changes.

In this particular example of a highly risky solution, what happens when
the compiler generates proper code? The compiler is likely to generate
the equivalent of :

int MoveMe(..., bool findend) {
    if (findend) "jump" Markend();

    // do all the stuff
}

Or perhaps it will inline Markend, MoveMe, or both. Or maybe it will
figure out that MoveMe is never called with "findend" set, and thus
optimise away that branch. All you can be sure of, is that there is no
way you can demand that a compiler produces directly the code you
apparently want it to produce - C is not assembly.

David Brown

Jan 24, 2010, 4:53:01 PM

You get good and bad compilers for all sorts of processors, and even a
half-decent one will be able to move code around if it improves the
speed or size of the target - something that can apply on any size of
processor.

<snip>

>
> In any of these instances, I would certainly review
> the assembly code to make sure the compiler was doing
> what I intended in the order I wanted. Maybe programmers
> in comp.lang.c don't do that as often as programmers
> in comp.arch.embedded. ;-)
>

I don't know about typical "comp.lang.c" programmers, but typical
"comp.arch.embedded" programmers use compilers that generate tight code,
and they let the compiler do its job without trying to force the tools
into their way of thinking. At least, that's the case for good embedded
programmers - small and fast code means cheap and reliable
microcontrollers in this line of work. And code that has to be
disassembled and manually checked at every change is not reliable or
quality code.

James Harris

Jan 24, 2010, 6:13:15 PM
On 24 Jan, 21:44, David Brown <david.br...@hesbynett.removethisbit.no>
wrote:
...

> Anything that relies on the compiler being stupid, or deliberately
> crippled ("disable all optimisations") or other such nonsense is a bad
> solution.

I *think* Mark is aware of the limitations of his suggestion but there
seems to be no C way to solve the OP's problem. It does sound like the
problem only needs to be solved as a one-off in a particular
environment.

That said, what about taking function pointers for all functions and
sorting their values? It still wouldn't help with the size of the last
function. Can we assume the data area would follow the code? I guess
not.

James

Jon Kirwan

Jan 24, 2010, 6:19:52 PM

You give me a great way to segue into something. There are
cases where you simply have no other option than to do
exactly that. I'll provide one example. There are others.

I was working on a project using the PIC18F252 processor and,
at the time, the Microchip c compiler was in its roughly-v1.1
incarnation. We'd spent about 4 months in development time
and the project was nearing completion when we discovered an
intermittent (very rarely occurred) problem in testing. Once
in a while, the program would emit strange outputs that we
simply couldn't understand when closely examining and walking
through the code that was supposed to generate that output.
It simply wasn't possible. Specific ASCII characters were
being generated that simply were not present in the code
constants.

In digging through the problem, by closely examining the
generated assembly output, I discovered one remarkable fact
that led me to imagine a possibility that might explain
things. The Microchip c compiler was using static variables
for compiler temporaries. And it would _spill_ live
variables that might be destroyed across a function call into
them. They would be labelled something like __temp0 and the
like.

There was _no_ problem when the c compiler was doing that for
calls made to functions within the same module, because they
had anticipated that there might be more than one compiler
temporary needed in nested calls and they added the extra
code in the c compiler to observe if a descendant function,
called by a parent, would also need to spill live variables
and would then construct more __temp1... variables to cover
that case. Not unlike what good 8051 compilers might do when
generating static variable slots for nested call parameters
for efficiency (counting spills all the way down, so to
speak.)

However, when calling functions in _other_ modules, where the
c compiler had _no_ visibility about what it had already done
over there on a separate compilation, it had no means to do
that and, of course, there became a problem. What was
spilled into __temp0 in module-A was also spilled into
__temp0 in module-B and, naturally, I just happened to have a
case where that became a problem under the influence of
interrupt processing. I had completely saved _all_ registers
at the moment of the interrupt code before attempting to call
any c functions, of course. That goes without saying. But
I'd had _no_ idea that I might have to save some statics
which may, or may not, at the time be "live."

Worse, besides the fact that there was no way I could know in
advance which naming the c compiler would use in any
circumstance, the c compiler chose these names in such a way
that they were NOT global or accessible either to c code or
to assembly. I had to actually _observe_ in the linker file
the memory location where they resided and make sure that the
interrupt routine protected them, as well.

This required me to document a procedure where every time we
made a modification to the code that might _move_ the
location of these compiler-generated statics, we had to
update a #define constant to reflect it, and then recompile
again.

Got us by.

Whether it is _reliable_ or not would be another debate. The
resulting code was very reliable -- no problems at all.
However, the process/procedures we had to apply were not
reliable, of course, because we might forget to apply the
documented procedure before release. So on that score, sure.

Life happens. Oh, well.

Jon

bartc

Jan 24, 2010, 6:37:35 PM

"James Harris" <james.h...@googlemail.com> wrote in message
news:c448f39c-2775-4ea5...@b2g2000yqi.googlegroups.com...

You'd need to sort *all* the functions of an application (including
non-global functions), and there would still be the possibility that some
function or other stuff you don't know about resides between 'consecutive'
functions f() and g().

Reading f() might be alright but overwriting it would be tricky.

--
Bartc

Jon Kirwan
Jan 24, 2010, 6:46:50 PM

In general, no universally "good" assumptions exist. That is
partly because the very idea of "moving a function" in memory
at run-time is not yet well-defined by those discussing it
here.

Any given function may have the following:

code --> Code is essentially strings of constants. It may
reside in a von-Neumann memory system or a Harvard one. It
therefore may be readable by other code, or not. Many of the
Harvard implementations include a special instruction or a
special pointer register, perhaps, to allow access to the
code space memory. But not all do. In general, it may not
even be possible to read and move code. Even in von-Neumann
memory systems where, in theory there is no problem, the code
may have been "distributed" in pieces. An example here would
be an implementation I saw with Metaware's c compiler where
they had extended it to support a type of co-routine called
an 'iterator.' In this case, the body-block of a for-loop
would be moved outside the function's code region into a
separate function so that their implementation could call the
for-loop body through their very excellently considered
support mechanism for iterators. You'd need to know where
that part was, as well, to meaningfully move things.

constants --> A function may include instanced constants
(which a smart compiler may "understand" from something like
'const int aa= 5;', if it also finds that some other code
takes an address to 'aa'.) These may also need to be moved.
Especially if one is trying to download an updated function
into ram before flashing it for permanence as a "code update"
procedure. These constants may also be placed either in
von-Neumann memory systems and be accessed via PC-relative or
absolute memory locations -- itself a potential bag of worms
-- or in Harvard code space if the processor supports
accessing it or in Harvard data space, otherwise, especially
if there is some of that which is non-volatile.

static initialized data --> A function may include instanced
locations that must be initialized prior to main(), but where
the actual values of these instances are located in some
general collection place used by who-knows-what code in the
crt0 library routine that does this job of pre-initing. Once
again, more issues to deal with and wonder about.

And that's just what trips off my tongue to start.

It's a tough problem to solve generally. To do it right, the
language semantics (and syntax, most likely, as well) itself
would need to be expanded to support it. That could be done,
I suppose. But I imagine a lot of gnashing of teeth along
the way.

Jon

James Harris
Jan 24, 2010, 7:08:14 PM
On 22 Jan, 22:53, john <j...@nospam.com> wrote:

> I need to know the size of a function or module because I need to
> temporarily relocate the function or module from flash into sram to
> do firmware updates.
>

> How can I determine that at runtime? The
> sizeof( myfunction)
> generates an error: "size of function unknown".

...

On 24 Jan, 23:37, "bartc" <ba...@freeuk.com> wrote:
> "James Harris" <james.harri...@googlemail.com> wrote in message

...

> > there seems to be no C way to solve the OP's problem.

...

> > That said, what about taking function pointers for all functions and
> > sorting their values? It still wouldn't help with the size of the last
> > function. Can we assume the data area would follow the code? I guess
> > not.
>
> You'd need to sort *all* the functions of an application (including
> non-global functions), and there would still be the possibility that some
> function or other stuff you don't know about resides between 'consecutive'
> functions f() and g().
>
> Reading f() might be alright but overwriting it would be tricky.

Since you've commented, Bart, do you have any thoughts on making
metadata about functions available in a programming language? Maybe
you already do this in one of your languages.

The thread got me thinking that if a function is a first-class object
perhaps some of its attributes should be transparent. Certainly its
code size and maybe its data size too; possibly its location, maybe a
signature for its input and output types. Then there are other
attributes such as whether it is in byte code or native code, whether
it is relocatable or not, what privilege it needs etc.

If portability is not needed a function object could also be
decomposed to individual instruction or subordinate function objects.
I'm not saying I like this idea - portability is a key goal for me -
but I'm just offering some ideas for comment.

Any thoughts on what's hot and what's not?

Followups set to only comp.lang.misc.

James

Mark Borgerson
Jan 24, 2010, 11:48:04 PM
In article <RdadncEft4Y4IsHW...@lyse.net>,
david...@hesbynett.removethisbit.no says...
That's true. But it is also true that you can verify that a particular
compiler DOES produce the desired code and use that code effectively.
For embedded programming, it doesn't particularly matter if 50
other compilers don't produce what you want, as long as the compiler
you are using does.

Mark Borgerson

Mark Borgerson
Jan 24, 2010, 11:58:46 PM
In article <fbCdncswcJUMXMHW...@lyse.net>,
david...@hesbynett.removethisbit.no says...
None of that is at odds with writing a flash update routine once,
verifying that the end of the code is properly marked and using
the code. If you are worried about changes in optimization
levels for future compiles, you can generate a binary library
for the flash update function and link that in future applications.
AFAIK, linking a library does not generally result in any change
in the binary code of the library if the library was generated
from position-independent code. (And if you're going to copy
a function to a different location for execution, it had better
be position-independent.)


That said, I will have to look very carefully at some of the
MSP430 code that I have generated---the compiler may access
I/O locations using PC-relative addressing. That would totally
mess up code that got simply copied to RAM. However, that's
an altogether different problem than simply finding the length
of the function.

Mark Borgerson


David Brown
Jan 25, 2010, 4:21:59 AM

In embedded development, /every/ rule has an exception, except this one :-).

There are definitely times when you have to manually check your outputs,
or write code that only works with specific compiler options, or add
assembly code hacks that rely on details of the compiler working. But
you don't do it unless you have no better way - you certainly don't
design in your hacks at the first step.

Another rule for embedded development is always know your tools, and
preferably pick /good/ tools. Microchip are known to be good for many
things - the quality of their 16-bit PIC C compilers is definitely not
one of them.

David Brown
Jan 25, 2010, 4:46:56 AM
On 25/01/2010 00:13, James Harris wrote:
> On 24 Jan, 21:44, David Brown<david.br...@hesbynett.removethisbit.no>
> wrote:
> ...
>> Anything that relies on the compiler being stupid, or deliberately
>> crippled ("disable all optimisations") or other such nonsense is a bad
>> solution.
>
> I *think* Mark is aware of the limitations of his suggestion but there
> seems to be no C way to solve the OP's problem. It does sound like the
> problem only needs to be solved as a one-off in a particular
> environment.
>

You are correct that there is no standard C way to solve the problem.
But for the majority of compilers used in embedded development, there
are ways that will reliably solve this problem when working /with/ the
compiler, rather than /against/ the compiler. We are not trying to get
a highly portable solution here, but it is always better to find a
design that could be reused if possible. And it is always better to
work with the features of your toolset, especially when there is no
standard C solution, rather than trying to find ways to limit your tools.

For this problem, the best solution is generally to use a specific
section for the functions in question. This can often be done using the
gcc "__attribute__" syntax (even for non-gcc compilers), or by using
compiler-specific pragmas. Any tools suitable for embedded development
will support something to this effect, and give you control over the
linking and placement of the function (this is assuming, of course, you
are working with a microcontroller that supports execution from ram).

The details of how you do this depend on the situation. For example,
you may be happy to dedicate the required ram space to the function, or
you may want to copy it into ram only when needed. The former case is
the easiest, as you can arrange for the linker to put the code in flash,
but linked as though it were in ram. There is no need for any
position-independent code, and you can happily debug and step through
the code in ram. You can often "cheat" and put the code in the ".data"
section, then you don't even have to think about the linker file or
copying over the function - the C startup code handles that (since it
treats the function like initialised data). With gcc on the msp430, you
have your function defined something like this:

static void __attribute__ ((critical, section(".data"))) progflash(...)


Of course, you still have to ensure that the function doesn't call other
functions - or that these are also in ram. And it is worth checking the
disassembly here if you are not sure - it is easy to accidentally
include library functions calls. But the difference is that you have a
reliable and safe way to achieve the effect you want, that is
independent of details such as the compiler flags or precise compiler
version, and will continue to work even if the source is changed.
Because you are working /with/ the tools, you can take full advantage of
debugging and optimisation. And though the details may vary for
different processors or toolchains, the principle can be re-used. As
with all code that cannot be implemented in standard C, there is always
the possibility of this solution failing with future compilers or
different devices, and you must check the results carefully - but this
is the best you can get.

> That said, what about taking function pointers for all functions and
> sorting their values? It still wouldn't help with the size of the last
> function. Can we assume the data area would follow the code? I guess
> not.
>

You can't make any assumptions about the ordering of code or data. You
cannot, practically speaking, make function pointers for all functions
without a great deal of effort, and making an unnecessary pointer to a
function cripples the compiler's optimisations of that function and of
the functions that call it.

David Brown
Jan 25, 2010, 4:50:41 AM
On 25/01/2010 05:48, Mark Borgerson wrote:
> In article<RdadncEft4Y4IsHW...@lyse.net>,

True enough - but it /does/ matter that the compiler you are using
produces the code you want each of the 50 times you change and compile
the program, or when you change the compiler flags and compile them, or
when you update the compiler and recompile (I recommend keeping exactly
the same compiler version for any given project, but sometimes that is
not practical). If you have code that relies on working around the
compiler, you need to check it /every/ time, and you are never able to
take advantage of your tools to generate the best code.


Jon Kirwan
Jan 25, 2010, 5:26:13 AM
On Mon, 25 Jan 2010 10:21:59 +0100, David Brown
<da...@westcontrol.removethisbit.com> wrote:

>On 25/01/2010 00:19, Jon Kirwan wrote:

>> <snip>


>> You give me a great way to segue into something. There are
>> cases where you simply have no other option than to do
>> exactly that. I'll provide one example. There are others.
>
>In embedded development, /every/ rule has an exception, except this one :-).

:)

>There are definitely times when you have to manually check your outputs,
>or write code that only works with specific compiler options, or add
>assembly code hacks that rely on details of the compiler working. But
>you don't do it unless you have no better way - you certainly don't
>design in your hacks at the first step.

Just to be argumentative (no other good reason, really), one
of my applications requires equal execution times across two
code edges. In other words, the execution time must be
constant regardless which branch is taken. c doesn't provide
for that, quite simply. So the very first thing I do porting
this application to a new processor is to ensure that I can
achieve this well, or if not, exactly what the variability
will be (because I must then relax the clocking rate to
account for it.) It's one of those unknowns that must be
locked down, immediately.

So yes, I hack at the very first step in this case. But I'm
just toying. In general, I take your point here.

.....

As an aside, one of the first things I may do with a new c
compiler and target is to explore methods to support process
semantics. The c language doesn't provide quite a number of
very useful semantics, this being one of them.

(Another I enjoy the use of is named, link-time constants.
They are not variable instances, in case you are confused
about my wording here. Instead, they are much like #define
in c except that these constants are link-time, not compile-
time, and if you change them there is no need to recompile
all the c code that uses them. You just change one file that
creates those constants and re-link. The linker patches in
the values, directly. Saves recompile time. Probably every
assembler supports them, and every linker _must_ support
them. But c does not provide syntax to access the semantic
that is available in its own linker.)

With cooperative switching (and I use that where possible,
because it is much easier to implement and support) I may be
able to write fairly simple routines in assembly to support
it (a dozen lines, or two.) But there is no escaping the
idea that whatever I do there relies on details about the
compiler. Different compilers on the MSP430, for example,
make different choices about register assignments, which must
be preserved across calls, which are scratchable, and which
are used to optionally pass parameters (and the conditions
under which registers may be chosen to pass them.)

With pre-emptive switching, it opens up a Pandora's box.
Library routines that may use static memory, for example. But
if pre-emptive switching is a part of the product, then I
face the problems squarely and usually up front in the
development. It's crucial to know exactly what works and how
well it works, right away.

I also enjoy the use of coroutine thunking, from time to
time. This, and process semantics, make for clear, very
readable code that works well and is able to be maintained by
a broader range of programmers (so long as they don't try and
rewrite the core o/s code, of course.)

I still take your point. But I hope you don't mind a small
moment of banter just to add to your suggestion that every
rule has exceptions, including the rule of not hacking things
at the outset. ;)

>Another rule for embedded development is always know your tools, and
>preferably pick /good/ tools. Microchip are known to be good for many
>things - the quality of their 16-bit PIC C compilers is definitely not
>one of them.

><snip>

Well, there is that. I cannot defend their use of _static_
memory for compiler temporaries, as they chose to do. It's
unconscionable. Their argument to me (one or two of those
who actually _wrote_ its code) was that it led to faster
emitted code -- in short, it appeared to show off their parts
better. And they felt they "had it covered."

Well, they were wrong and a false bargain was made.

I'm sure they aren't the only ones guilty of choosing to sell
the smell of sizzle over the quality of meat, though. Not by
a long shot.

Jon

David Brown
Jan 25, 2010, 7:13:37 AM
On 25/01/2010 11:26, Jon Kirwan wrote:
> On Mon, 25 Jan 2010 10:21:59 +0100, David Brown
> <da...@westcontrol.removethisbit.com> wrote:
>
>> On 25/01/2010 00:19, Jon Kirwan wrote:
>>> <snip>
>>> You give me a great way to segue into something. There are
>>> cases where you simply have no other option than to do
>>> exactly that. I'll provide one example. There are others.
>>
>> In embedded development, /every/ rule has an exception, except this one :-).
>
> :)
>
>> There are definitely times when you have to manually check your outputs,
>> or write code that only works with specific compiler options, or add
>> assembly code hacks that rely on details of the compiler working. But
>> you don't do it unless you have no better way - you certainly don't
>> design in your hacks at the first step.
>
> Just to be argumentative (no other good reason, really), one

Being argumentative /is/ a good reason if it makes us think.

> of my applications requires equal execution times across two
> code edges. In other words, the execution time must be
> constant regardless which branch is taken. c doesn't provide
> for that, quite simply. So the very first thing I do porting
> this application to a new processor is to ensure that I can
> achieve this well, or if not, exactly what the variability
> will be (because I must then relax the clocking rate to
> account for it.) It's one of those unknowns that must be
> locked down, immediately.
>
> So yes, I hack at the very first step in this case. But I'm
> just toying. In general, I take your point here.
>

That's an example of when you need special consideration. My point is
that you only do that sort of thing if you have no better way to
implement the required functionality.

> .....
>
> As an aside, one of the first things I may do with a new c
> compiler and target is to explore methods to support process
> semantics. The c language doesn't provide quite a number of
> very useful semantics, this being one of them.
>
> (Another I enjoy the use of is named, link-time constants.
> They are not variable instances, in case you are confused
> about my wording here. Instead, they are much like #define
> in c except that these constants are link-time, not compile-
> time, and if you change them there is no need to recompile
> all the c code that uses them. You just change one file that
> creates those constants and re-link. The linker patches in
> the values, directly. Saves recompile time. Probably every
> assembler supports them, and every linker _must_ support
> them. But c does not provide syntax to access the semantic
> that is available in its own linker.)
>

Are you talking about using constants in your code which are evaluated
at link time, much in the way that static addresses are handled? Maybe
I've misunderstood you, but that strikes me as a poor way to handle what
are really compile-time constants - it's bad modularisation and
structure (sometimes a single file is the best place to put these
constants - but it should be because that's the best place, not because
you want to fit some weird way of compiling). It is highly
non-standard, potentially leading to confusion and maintenance issues.
It also limits the compiler's options for optimising the code. And if
re-compilation time is a serious issue these days, you need to consider
getting better tools (PC and/or compiler), or making better use of them
(better makefile setup, or use ccache).

Of course, it is always fun getting your tools to do interesting things
in unusual ways - but it's not always a good idea for real work.

> With cooperative switching (and I use that where possible,
> because it is much easier to implement and support) I may be
> able to write fairly simple routines in assembly to support
> it (a dozen lines, or two.) But there is no escaping the
> idea that whatever I do there relies on details about the
> compiler. Different compilers on the MSP430, for example,
> make different choices about register assignments, which must
> be preserved across calls, which are scratchable, and which
> are used to optionally pass parameters (and the conditions
> under which registers may be chosen to pass them.)
>

Yes, these are more examples of where you need to work with the compiler
details.

It is certainly perfectly possible to use static memory for compiler
temporaries, and it will certainly be faster than the normal alternative
(temporaries on a stack) for many small processors. But it has to be
implemented correctly!

WangoTango
Jan 25, 2010, 10:26:08 AM
In article <lnzl453...@nuthaus.mib.org>, ks...@mib.org says...

> WangoTango <Asga...@mindspring.com> writes:
> > In article <hjda8u$t4k$1...@speranza.aioe.org>, jo...@nospam.com says...
> >> I need to know the size of a function or module because I need to
> >> temporarily relocate the function or module from flash into sram to
> >> do firmware updates.
> >>
> >> How can I determine that at runtime? The
> >> sizeof( myfunction)
> >> generates an error: "size of function unknown".
> >>
> > Good question, and I would like to know if there is an easy way to do it
> > during runtime, and a portable way would be nice too. I would probably
> > look at the map file and use the size I calculated from there, but
> > that's surely not runtime.
> >
> > You can get the starting address of the function pretty easy, but how
> > about the end? Hmmm, gotta' think about that.
>
> You can't even portably assume that &func is the memory address of the
> beginning of the function. I think there are systems (AS/400) where
> function pointers are not just machine addresses.
>
> Given whatever it is you're doing, you're probably not too concerned
> with portability, so that likely not to be an issue. But there's no
> portable way in C to determine the size of a function, so you're more
> likely to get help somewhere other than comp.lang.c.

First off, I think you might be confusing me with the OP, and he did
cross post to comp.arch.embedded. Anyway, I agree that this is a lot
trickier than just using the sizeof operator or doing some pointer math.
That's why it is a head scratcher. 'I' think that this is a very
target/compiler specific issue/problem, but I have been wrong before.
That's why I left it open for someone that may have already cracked this
nut. I don't have any experience doing what the OP wants to do, and
haven't experimented in any way shape or form. I think I would write
the function in assembly so I knew EXACTLY what was going on and I knew
there were no external dependencies on library code or jumps to other
functions, and go from there.


Mark Borgerson
Jan 25, 2010, 10:41:28 AM
In article <4b5d6978$0$6274$8404...@news.wineasy.se>,
da...@westcontrol.removethisbit.com says...
Unless you put the function into a separately-compiled library to
be linked in when you build the program the next 50 times. If
you change compilers, you may have to rebuild and verify the
library.


Mark Borgerson

Jon Kirwan
Jan 25, 2010, 3:42:54 PM
On Mon, 25 Jan 2010 13:13:37 +0100, David Brown
<da...@westcontrol.removethisbit.com> wrote:

Understood, and agreed.

Well, you are of course correct in the sense that a specific
constant value shouldn't be scattered throughout a series of
modules like casting dust to the winds. It's not a good
idea. Your point is wisely made. However, you are also
wrong in suggesting, once again, some absolute rule that
_always_ applies. In this case, my point remains because
there is _some_ need for the semantic. It doesn't matter if
there are better ways for most things, if there are some
times a need for this semantic.

I think you understood me, correctly. Just in case there is
any question at all, I'm talking about this semantic, if you
are familiar with the Microsoft assembler:

QUANTUM EQU 47
PUBLIC QUANTUM

You can't do that in c. There is no syntax for it.
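[The usual C-side workaround is a sketch like the following; the symbol name is illustrative and the --defsym spelling is GNU ld's, other linkers differ. Note the result is still not a C integer constant expression (no array sizes or case labels), which is part of why the language has no direct syntax for this.]

```c
/* C side: the *address* of an external symbol carries the value.
   The symbol is never dereferenced. */
#include <stdint.h>

extern const char quantum_sym;
#define QUANTUM ((unsigned)(uintptr_t)&quantum_sym)

/* Link step (GNU ld):
       gcc ... -Wl,--defsym,quantum_sym=47
   Changing the 47 needs only a re-link -- the linker patches the
   value into every reference, just as with the MASM EQU/PUBLIC pair. */
```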

In the above example, this constant might be the default
number of timer ticks used per process quantum in a round
robin arrangement. But as you say, you are correct to
suggest that this kind of value usually only needs placement
in a single module, so the advantage may arguably be reduced
to a theoretical one, not a practical one. (Though I suppose
I could always posit a specific case where this QUANTUM might
be used in several reasonable places.)

However, there are times where there are values which may be
required in several modules. These may be field masks and
init values, for example, of hardware registers or software
control flags. It's not always the case that writing a
specific subroutine to compose them for you is the better
solution. Sometimes, it's better to expose the constants,
broadly speaking, and use them in a simple, constant-folding,
c language way. Libraries in c are riddled with these.

In addition, these public link-time constants can be used to
conditionally include or exclude code sections. In fact,
almost every compiler uses this fact in one way or the other.
CRT0, in particular, may take advantage of such features to
conditionally include or exclude initialization code for
libraries which may, or may not, have been linked in. And
most linkers support the concept in some fashion -- because
it is needed.

And yes, I'd sometimes like c-level access to it.

>> With cooperative switching (and I use that where possible,
>> because it is much easier to implement and support) I may be
>> able to write fairly simple routines in assembly to support
>> it (a dozen lines, or two.) But there is no escaping the
>> idea that whatever I do there relies on details about the
>> compiler. Different compilers on the MSP430, for example,
>> make different choices about register assignments, which must
>> be preserved across calls, which are scratchable, and which
>> are used to optionally pass parameters (and the conditions
>> under which registers may be chosen to pass them.)
>
>Yes, these are more examples of where you need to work with the compiler
>details.

Yes. No question.

Well, _if_ one is going to use statics _then_ of course it
has to be implemented correctly! Who could argue otherwise?

The problem is in the _doing_ of that. It requires (or at
least I imagine so, right now, being ignorant of a better
way) looking at the entire program block to achieve. And
that is a bit of a step-change away from the usual c compiler
mode of operation. It _might_ be implemented in the linker
stage, I suppose. Though I'm struggling to imagine something
a little less than a Rube Goldberg contraption to get there
in the linker side.

>> Well, they were wrong and a false bargain was made.
>>
>> I'm sure they aren't the only ones guilty of choosing to sell
>> the smell of sizzle over the quality of meat, though. Not by
>> a long shot.
>>
>> Jon

As an aside, I have a lot of other things I'd like in c or
c++ which just aren't there. For example, I dearly miss
having access to thunking semantics in c or c++ (which would
NOT break the c/c++ program model in any way, shape, or form
and could easily be implemented as part of either language
with no dire impacts at all). I might use this for efficient
iterators (don't imagine that I'm talking about std library
iterators here, which are somewhat similar in use but in no
way similar in their implementation details -- they are much
less efficient). There is no good
reason I can think of not to have it and its utility is
wonderful. (I'd be so happy to talk about it, at some point,
as the examples are excellent and easily shown.)

Jon

David Kelly
Jan 25, 2010, 4:45:18 PM
On 1/22/10 7:32 PM, Grant Edwards wrote:
> On 2010-01-23, BGB / cr88192<cr8...@hotmail.com> wrote:
>
>> in this case, it might actually be better advised to generate the function
>> as a chunk of arch-specific ASM or machine code (ASM is preferable IMO, but
>> requires an assembler...), which could then be located wherever (such as the
>> heap).
>
> IMO, the "right" thing to do is to tell the compiler to put the

> function into a separate section and then have it linked so
> that it's "located" to run in RAM at the proper address but
> stored in ROM.
>
> That way you know the code will work correctly when it's run
> from RAM. Defining approprate symbols in the linker command
> file will allow the program to refer to the start and end of
> the section's address in ROM.
>
> The OP needs to spend some time studying the manuals for his
> compiler and linker.

Ditto.

If the function is not linked to the RAM run address then it probably
won't work.

Did this a year or so ago with Metrowerks and an HC12 but don't remember
quite how. It wasn't terribly hard once one knows what is needed. The
linker properly generated constant "variables" containing start and end
addresses for me. Quite properly it would not generate the runtime code
for moving the image from FLASH to RAM as this code was not intended to
permanently reside in RAM.
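[The arrangement described here maps directly onto GNU ld terms: give the section a run (VMA) address in RAM and a load (LMA) address in flash, and let the linker define the boundary symbols. A sketch; the section, symbol, and memory-region names are mine, and RAM/FLASH must be regions already defined in the script's MEMORY block.]

```
/* GNU ld script fragment: run address in RAM, load address in FLASH. */
.ramfunc : {
    __ramfunc_start = .;
    *(.ramfunc)
    __ramfunc_end = .;
} > RAM AT > FLASH
__ramfunc_load = LOADADDR(.ramfunc);
```

C startup code would then declare `extern char __ramfunc_start, __ramfunc_end, __ramfunc_load;` and copy with `memcpy(&__ramfunc_start, &__ramfunc_load, (size_t)(&__ramfunc_end - &__ramfunc_start));` before the function is first called.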

Grant Edwards
Jan 25, 2010, 5:23:29 PM

You've also got to be careful in RAM-resident routine not to
write any code that generates library calls. On a 16-bit CPU,
doing long arithmetic will likely generate a library call, so
be mindful of which integer types you use.

--
Grant Edwards grante Yow! FROZEN ENTREES may
at be flung by members of
visi.com opposing SWANSON SECTS ...

David Brown
Jan 26, 2010, 3:14:50 AM

Constant values should, like everything else, be declared and defined in
the place that makes most sense for the structure of the program. That
may mean just locally within a module or function, or in a module's
header file, in a program-global header, or occasionally declared in a
header and defined in an implementation file (for better data hiding,
though perhaps missed optimisation opportunities). So the only rule
here is to put them in the right place for the program, not just because
it shaves a quarter second off the re-compile time.

> modules like casting dust to the winds. It's not a good
> idea. Your point is wisely made. However, you are also
> wrong in suggesting, once again, some absolute rule that
> _always_ applies. In this case, my point remains because

Rules in embedded development are never absolute - we have both said as
much in this thread. But there are plenty of rules, written and
unwritten, that are strong enough to state as though they always apply.
If you feel you need to break them, you do so when you have clear and
reasoned arguments why your software will be better with the rule broken.

> there is _some_ need for the semantic. It doesn't matter if
> there are better ways for most things, if there are some
> times a need for this semantic.
>
> I think you understood me, correctly. Just in case there is
> any question at all, I'm talking about this semantic, if you
> are familiar with the Microsoft assembler:
>
> QUANTUM EQU 47
> PUBLIC QUANTUM
>
> You can't do that in c. There is no syntax for it.
>

I can't imagine a time when I would need to do that, or any problem it
might solve. Sometimes you want symbols that are defined at the linker
level to be exported to C - the start and end of a section, for example
- but I don't see any reason to pass constants around within the C
program itself in this way. I suppose you might have an assembly module
which exports constants that you then want to use in C, but it would be
better to use a common #define that is available to both C code and the
assembly code.

> In the above example, this constant might be the default
> number of timer ticks used per process quantum in a round
> robin arrangement. But as you say, you are correct to
> suggest that this kind of value usually only needs placement
> in a single module, so the advantage may arguably be reduced
> to a theoretical one, not a practical one. (Though I suppose
> I could always posit a specific case where this QUANTUM might
> be used in several reasonable places.)
>

Can you give me an example in which this is actually a required way to
handle such constants? As I said above, you could have "#define QUANTUM
47" in a header that is included by both the C code and assembly code as
needed. (And if your assembler doesn't like that sort of syntax, get a
better assembler. If that fails, between the C preprocessor and the
assembler's macro capabilities, you should be able to concoct a common
way of including the constants. And if that also fails, write a script
that is called by your Makefile and generates the required headers and
include files.)
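As a concrete sketch of that shared-header approach (the file name is made up, and the `__ASSEMBLER__` guard is a gcc-toolchain convention - gcc predefines it when preprocessing `.S` files; other toolchains spell this differently):

```c
/* quantum.h - one definition, visible to both C and assembly sources.
 * The __ASSEMBLER__ guard keeps C-only declarations away from the
 * assembler (gcc-style toolchains assumed). */
#ifndef QUANTUM_H
#define QUANTUM_H

#define QUANTUM 47  /* default timer ticks per process quantum */

#ifndef __ASSEMBLER__
/* hypothetical C-only declaration, invisible to the assembler */
extern void scheduler_set_quantum(int ticks);
#endif

#endif /* QUANTUM_H */
```

An assembly source run through the C preprocessor can then `#include "quantum.h"` and use `QUANTUM` directly, and the C code sees the same value at compile time.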

> However, there are times where there are values which may be
> required in several modules. These may be field masks and
> init values, for example, of hardware registers or software
> control flags. It's not always the case that writing a
> specific subroutine to compose them for you is the better
> solution. Sometimes, it's better to expose the constants,
> broadly speaking, and use them in a simple, constant-folding,
> c language way. Libraries in c are riddled with these.
>

I agree entirely that exposing the global constants is often the best
way of using such values - I hate these silly little "accessor"
functions people write because they think that global data (variable or
constant) is somehow "bad", and it's better to write an inefficient and
unclear global function instead. But your constants should be available
to the compiler at compile time if at all possible - having them
available only at link time wastes your compiler's strengths.

> In addition, these public link-time constants can be used to
> conditionally include or exclude code sections. In fact,
> almost every compiler uses this fact in one way or the other.
> CRT0, in particular, may take advantage of such features to
> conditionally include or exclude initialization code for
> libraries which may, or may not, have been linked in. And
> most linkers support the concept in some fashion -- because
> it is needed.
>
> And yes, I'd sometimes like c-level access to it.
>

On devices for which I write my own CRT0 or other pre-C startup code, I
write it in C. Typically there's a couple of lines of assembly to set
the stack pointer and jump to the C startup function, but things like
clearing .bss and copying .data are all done in C. If I wanted sections
that may or may not be included, I'd use C - either with #if's, or by
relying on the compiler to eliminate dead code. And some symbols, such
as section start and end points, are passed from the linker into the C
code - but only those that /must/ be passed in that way.
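The shape of that C-level startup code is roughly this (a sketch only - the real boundary symbols come from the linker script; here they are simulated with plain arrays so the copy/zero logic stands on its own):

```c
/* Minimal C-level startup work: copy .data's initial image from flash
 * to its run address in RAM, then zero .bss.  On real hardware the
 * arrays below would be section-boundary symbols exported by the
 * linker script; they are faked here so the logic can be read (and
 * run) in isolation. */
#include <stddef.h>

static const unsigned char data_load[4] = { 1, 2, 3, 4 };  /* image in flash */
static unsigned char data_start[4];                        /* .data in RAM   */
static unsigned char bss_start[4] = { 9, 9, 9, 9 };        /* pretend-dirty  */

static void crt0_c_init(void)
{
    /* copy .data: run address gets the load-time image */
    for (size_t i = 0; i < sizeof data_start; ++i)
        data_start[i] = data_load[i];

    /* clear .bss */
    for (size_t i = 0; i < sizeof bss_start; ++i)
        bss_start[i] = 0;
}
```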

Apparently the quality controllers and testers of your particular
compiler would argue otherwise!

> The problem is in the _doing_ of that. It requires (or at
> least I imagine so, right now, being ignorant of a better
> way) looking at the entire program block to achieve. And
> that is a bit of a step-change away from the usual c compiler
> mode of operation. It _might_ be implemented in the linker
> stage, I suppose. Though I'm struggling to imagine something
> a little less than a Rube Goldberg contraption to get there
> in the linker side.
>

It certainly isn't an easy problem to use statics for temporary data in
a way that makes efficient use of memory - and of course, in a way that
is safe and correct. I'm sure Walter Banks could tell you all about it
(though he might consider it a trade secret).

>>> Well, they were wrong and a false bargain was made.
>>>
>>> I'm sure they aren't the only ones guilty of choosing to sell
>>> the smell of sizzle over the quality of meat, though. Not by
>>> a long shot.
>>>
>>> Jon
>
> As an aside, I have a lot of other things I'd like in c or
> c++ which just aren't there. For example, I dearly miss
> having access to thunking semantics in c or c++ (which does
> NOT break the c/c++ program model in any way, shape, or form
> and could easily be implemented as part of either language
> with no dire impacts at all. I might use this for efficient
> iterators (don't imagine that I'm talking about std library
> iterators here, which are somewhat similar in use but in no
> way similar in their implementation details -- they are much
> less efficient), so also do I miss this. There is no good
> reason I can think of not to have it and its utility is
> wonderful. (I'd be so happy to talk about it, at some point,
> as the examples are excellent and easily shown.)
>

What do you mean by "thunking" in this context? The term has several
meanings, as far as I know. If your answer is going to take more than a
dozen lines (you are better known for your in-depth explanations than
your short summaries!), it should probably be in its own thread.

mvh.,

David


Jon Kirwan

Jan 26, 2010, 5:58:45 AM
On Tue, 26 Jan 2010 09:14:50 +0100, David Brown
<da...@westcontrol.removethisbit.com> wrote:

Let's just leave it here as an agreement to disagree, then. I
believe you are sincere in considering what I've already
written, probably doing me more justice than I may deserve.
But since you still cannot gather, I have to assume it's my
fault in writing poorly and that the effort may require more
time than I care to have for a semantic only of some small
value. We've already made more of it than the newsgroup's
time is worth. So I'm fine leaving the topic behind and just
saying that I have used the semantic before to good use and
sometimes miss it in c. There's no payoff here. Let's just
leave it as a mild disagreement over a not-terribly-important
issue.

I think they simply didn't see it, before. A failure to
imagine as well as they should have done. Nothing more.

>> The problem is in the _doing_ of that. It requires (or at
>> least I imagine so, right now, being ignorant of a better
>> way) looking at the entire program block to achieve. And
>> that is a bit of a step-change away from the usual c compiler
>> mode of operation. It _might_ be implemented in the linker
>> stage, I suppose. Though I'm struggling to imagine something
>> a little less than a Rube Goldberg contraption to get there
>> in the linker side.
>
>It certainly isn't an easy problem to use statics for temporary data in
>a way that makes efficient use of memory - and of course, in a way that
>is safe and correct. I'm sure Walter Banks could tell you all about it
>(though he might consider it a trade secret).

I'm sure he knows a great deal more about the complexities
than I do, so you are most certainly right that he could tell
me about the subject. I've never claimed expertise here.
Merely the ability to observe specific failures when I see
poor results from poorly imagined solutions.

>>>> Well, they were wrong and a false bargain was made.
>>>>
>>>> I'm sure they aren't the only ones guilty of choosing to sell
>>>> the smell of sizzle over the quality of meat, though. Not by
>>>> a long shot.
>>>>
>>>> Jon
>>
>> As an aside, I have a lot of other things I'd like in c or
>> c++ which just aren't there. For example, I dearly miss
>> having access to thunking semantics in c or c++ (which does
>> NOT break the c/c++ program model in any way, shape, or form
>> and could easily be implemented as part of either language
>> with no dire impacts at all. I might use this for efficient
>> iterators (don't imagine that I'm talking about std library
>> iterators here, which are somewhat similar in use but in no
>> way similar in their implementation details -- they are much
>> less efficient), so also do I miss this. There is no good
>> reason I can think of not to have it and its utility is
>> wonderful. (I'd be so happy to talk about it, at some point,
>> as the examples are excellent and easily shown.)
>
>What do you mean by "thunking" in this context? The term has several
>meanings, as far as I know. If your answer is going to take more than a
>dozen lines (you are better known for your in-depth explanations than
>your short summaries!), it should probably be in its own thread.

There are some excellent discussions already available. See,
for example, the Metaware C (and Pascal) compiler manuals and
their implementation of an iterator semantic, as well as their
extensive discussion (with well-made examples) of its
abundant benefits. There is also a somewhat different, but
also interesting discussion made by Randall Hyde in The Art
of Assembly manuals he generously put together some years
back. (Don't worry yourself thumbing through Microsoft's use
of the term.)

...

The following case below is not nearly as well thought out,
syntax wise, as Metaware's implementation (in other words,
don't mistake it as a complete syntax designed by experts)
but it may get the seed of a point across.

for ( p in Primes( 20, 70 ) )
    printf( "%d\n", p );

The code for Primes() might be, in short-hand, something like
this below. (Please excuse my abuse of basic knowledge about
prime numbers by using an increment by 1 in the for loop or
any other tests for starting or ending on an even number only
because I want to keep the code with as few lines as is
reasonable to make the point. The idea is the point, not the
implementation here.)

integer Primes( int a, int b ) {
    int i;
    for ( i = a; i <= b; ++i )
        if ( IsPrime( i ) )
            yield( i );
    return;
}

I'm intentionally being brief, as well as leaving out a short
discussion of each line. Partly, because longer examples may
interfere with the central points. Partly, because you
shouldn't need any discussion and should be able to almost
instantly see what the above code "means" without that extra.
It's in a form that should be plain without a manual.

The place to focus isn't so much on examining Primes(), but
instead more by imagining a wide variety of the types of
for() loops which may require the mechanism itself. (Not my
poor example, which may or may not be useful to anyone.)

For example, you might prefer to imagine that Primes() is
instead a routine that yields all the nodes of some arbitrary
tree or graph using some very specific walking mechanism. If
you use your imagination and let it take you for a ride, then
perhaps the point may be clarified.

If that gets you towards where I'm pointing, then the next
question is how would you implement this in assembly code,
consistent with c compilers you are aware of and in such a
way that does _not_ break an existing linker in the process?

On the other hand, if this doesn't do it for you at all -- if
in short, your imagination isn't moved by those examples
beyond the short distance I walked with them -- then let me
commend again Metaware's c/pascal implementations and Hyde's
AofA documentation before further discussion continues.

But less than the above made my imagination literally spin
with ideas when I first came across it. So maybe the above
is enough. I hope so. It's also closely connected in a
roundabout way to the idea of nested functions, à la Pascal.
(If you see the connection, then I think you've probably got
the larger picture.)
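In plain standard C, the closest spelling of that example makes the for-body an ordinary function handed to the iterator, with each yield becoming an indirect call. This is only a sketch of the idea - the whole point of the semantic is that the compiler writes this plumbing for you and the body keeps access to the enclosing locals:

```c
#include <stdbool.h>
#include <stdio.h>

static bool IsPrime(int n)
{
    if (n < 2)
        return false;
    for (int j = 2; j * j <= n; ++j)
        if (n % j == 0)
            return false;
    return true;
}

/* The iterator: each call through `body` plays the role of yield(i). */
static void Primes(int a, int b, void (*body)(int p))
{
    for (int i = a; i <= b; ++i)
        if (IsPrime(i))
            body(i);
}

static void print_prime(int p)   /* the hoisted for-body */
{
    printf("%d\n", p);
}
```

Calling `Primes(20, 70, print_prime)` then behaves like the for-in loop above, minus the syntax.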

Jon

David Kelly

Jan 26, 2010, 11:45:48 AM
On 1/25/10 4:23 PM, Grant Edwards wrote:
> On 2010-01-25, David Kelly<n4...@Yahoo.com> wrote:

[...]

>> If the function is not linked to the RAM run address then it
>> probably won't work.
>>
>> Did this a year or so ago with Metrowerks and an HC12 but
>> don't remember quite how. It wasn't terribly hard once one
>> knows what is needed. The linker properly generated constant
>> "variables" containing start and end addresses for me. Quite
>> properly it would not generate the runtime code for moving the
>> image from FLASH to RAM as this code was not intended to
>> permanently reside in RAM.
>
> You've also got to be careful in RAM-resident routine not to
> write any code that generates library calls. On a 16-bit CPU,
> doing long arithmetic will likely generate a library call, so
> be mindful of which integer types you use.

Yes, lots of manual attention and review is needed as the code which is
to be relocated to RAM can not depend on anything else. Most compilers
and linkers will happily use any variable or routine elsewhere in the
project.

Would be a good idea to reset the stack pointer at the start of the RAM
code. Globally disable interrupts too, as sometimes it's hard to
individually disable every single interrupt source on an MCU. Also do
not forget the interrupt vector table (or whatever means is used to
route interrupt service routines).

David Brown

Jan 26, 2010, 5:31:15 PM
Jon Kirwan wrote:
> On Tue, 26 Jan 2010 09:14:50 +0100, David Brown
> <da...@westcontrol.removethisbit.com> wrote:
>
>> On 25/01/2010 21:42, Jon Kirwan wrote:
>>> On Mon, 25 Jan 2010 13:13:37 +0100, David Brown
>>> <da...@westcontrol.removethisbit.com> wrote:
>>>
<snip>

Okay. I'm not sure whether I am misunderstanding you, or disagreeing
with you, but I'm happy to leave it for now. Maybe the topic will turn
up another time, and it will all suddenly become obvious.

<snip>

Ah, what you are talking about is often called a generator (for example,
in Python or JavaScript), or perhaps a closure. A generator is somewhat
like a lazily evaluated list, although it could be generalised (for
example, do the parameter values have to be the same for repeat calls as
they are for the initial call?).

I've used generators in Python - they are a very nice way to solve some
kinds of problems. Unfortunately, they are not easy to implement
cleanly in a stack-based language because a general implementation
requires that the generator (or "thunk", if you prefer) has its own
stack for arbitrary local variables and state. Thus most languages that
implement generators use garbage collection.

You can get some of the features of generators in C++ using a class to
encapsulate the generator's state in class data. The newer lambda
syntax makes it a little neater, and comes somewhat closer to generators
or closures - but there are limitations. You can't implement them in
general without substantial helper code (as seen in boost's libraries)
or a major change in the structure of the language (including garbage
collection).

In short, generators (and closures) are a very nice high-level concept -
you need a high level language (or, somewhat ironically, assembly) to
use them, and C / C++ are not suitable.

Jon Kirwan

Jan 26, 2010, 6:14:43 PM

I only introduced it as an aside, to start. It's just not
that important.

You are way off the reservation, already. Let me suggest you
think about the _implementation_ for a moment. Maybe that
will clarify what I'm pointing towards. For one thing,
garbage collection has _NOTHING_ whatever to do with it. If
you think so, you are far off-point.

>You can get some of the features of generators in C++ using a class to
>encapsulate the generator's state in class data. The newer lambda
>syntax makes it a little neater, and comes somewhat closer to generators
>or closures - but there are limitations. You can't implement them in
>general without substantial helper code (as seen in boost's libraries)
>or a major change in the structure of the language (including garbage
>collection).
>
>In short, generators (and closures) are a very nice high-level concept -
>you need a high level language (or, somewhat ironically, assembly) to
>use them, and C / C++ are not suitable.

I recommend that reading I suggested. Take the AofA one
first. It's easier to get ahold of. It gets into the
details of implementation, but does NOT deal with it as a c
level concept. So you will have to fend for yourself there
until you can get ahold of Metaware docs. (Or, I might be
tempted to copy some of them for this purpose.)

Anyway, I can see I've sent you spinning in the wrong
direction. Take a breath, read AofA on the topic of thunks
and the nearby related chapters to it. That should provide
an idea about implementation. Not the _whole_ idea, by the
way. As it might be done in c, it involves the concept of
nested functions (which you clearly don't yet see) without
the use of the specific syntax you are used to seeing for
them (it's entirely hidden at the c language level, but
explicit at the assembly level.) If you _see_ this much, we
are probably on the same page.

Jon

David Brown

Jan 27, 2010, 8:26:35 AM
On 27/01/2010 00:14, Jon Kirwan wrote:
> On Tue, 26 Jan 2010 23:31:15 +0100, David Brown
> <david...@hesbynett.removethisbit.no> wrote:
>
>> Jon Kirwan wrote:
>>> On Tue, 26 Jan 2010 09:14:50 +0100, David Brown
>>> <da...@westcontrol.removethisbit.com> wrote:
>>>
>>>> On 25/01/2010 21:42, Jon Kirwan wrote:
>>>>> On Mon, 25 Jan 2010 13:13:37 +0100, David Brown
>>>>> <da...@westcontrol.removethisbit.com> wrote:

>> <snip>
>>

I don't have these books and manuals you are referring to, nor do I have
easy access to them. If you have web links of interest then I'll
happily look at them - but I am not going to find and order books just
to read a few pages. This discussion is interesting, but there's a
limit to what is a practical and appropriate use of time and money for a
discussion.

> Anyway, I can see I've sent you spinning in the wrong
> direction. Take a breath, read AofA on the topic of thunks
> and the nearby related chapters to it. That should provide
> an idea about implementation. Not the _whole_ idea, by the
> way. As it might be done in c, it involves the concept of
> nested functions (which you clearly don't yet see) without
> the use of the specific syntax you are used to seeing for
> them (it's entirely hidden at the c language level, but
> explicit at the assembly level.) If you _see_ this much, we
> are probably on the same page.
>

Nested functions are perfectly possible in some extensions to C - in
particular, gcc supports them (since gcc also supports Ada, which has
nested functions, much of the gcc structure already supports nested
functions, and thus the C and C++ front-ends can get them almost for free).

Nested functions, C++ classes, the new C++ lambda syntax, etc., are all
ways to implement a limited form of generator or iterator. Compiler
extensions can be used to make a nicer syntax, and to automate some of
the manual work involved. But without some sort of multiple stack
system or garbage collection, you have serious limitations. I don't
mean to say that these ideas are not useful despite the limitations -
just that you cannot add proper flexible generators to a language with
the sort of structure of C or C++ without fundamental changes to the way
the language works - the compiler would need to be free to allocate (and
free) dynamic memory as needed, rather than through explicit malloc /
new calls.

It could well be that what you call "thunking" really means "generators
with various limitations", in which case you are right that garbage
collection is not needed, and it's reasonably easy to figure out several
good implementations. But the term "thunking" is not a well known or
well-defined expression, and is used in many different ways by different
people - I have no idea how the author of a particular book you've read
happens to use it.

To look at some more general generators, and see why they can be used
much more freely in a language like Python than they can in C or C++,
let's vary your Primes function, using Python syntax so that we have
actual working code:

import math

def IsPrime(i) :
    if i < 2 :
        return False
    for j in range(2, int(math.sqrt(i)) + 1) :
        if (i % j) == 0 :
            return False
    return True

def Primes(a, b) :
    i = a
    if (i == 2) :
        yield i
    if (i % 2 == 0) :
        i = i + 1
    while ( i <= b ) :
        if (IsPrime(i)) :
            yield i
        i = i + 1
    return

for p in Primes(1, 20) :
    print(p)

To make this implementation of Primes work, the Primes closure has to
include information about where the execution currently is in the Primes
function, and in general it must also track any local variables. This
becomes increasingly difficult for more complex functions, especially
when doing it manually - you have to define a struct (or C++ class) to
hold all the local data, as well as a current "state" which is used for
jumping to the correct re-entry point on later calls. A language
extension could hide and automate much of this, however.
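Done by hand in C, the pattern described above looks something like this - a struct for the closure data plus a resumable "next" function (a sketch of the pattern only, not any particular compiler's output; the names are made up):

```c
#include <stdbool.h>

struct primes_gen { int i, b; };   /* the closure: loop state + bound */

static bool IsPrime(int n)
{
    if (n < 2)
        return false;
    for (int j = 2; j * j <= n; ++j)
        if (n % j == 0)
            return false;
    return true;
}

static void primes_init(struct primes_gen *g, int a, int b)
{
    g->i = a;
    g->b = b;
}

/* Resume where the previous call left off; true while values remain. */
static bool primes_next(struct primes_gen *g, int *out)
{
    while (g->i <= g->b) {
        int n = g->i++;
        if (IsPrime(n)) {
            *out = n;
            return true;
        }
    }
    return false;
}
```

Here the "state" is just the loop counter; a generator with several yield points would also need an explicit state tag to pick the right re-entry point, which is exactly the bookkeeping a language extension could hide.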

If you were using such generators in a real program, you would want to
use structured and modular programming - and then things get difficult.
To use generators in a stack-based language, you would have to
allocate a structure containing all the local data on the caller
function's stack - that means you (either the programmer figuring out
the closure data manually, or the extended compiler) need access to the
implementation when declaring the generation and using it. With a
garbage collecting language, the generator itself would allocate space
on the heap as needed - the caller need not know anything about the details.

You start getting really high-level programming when you can pass
generators around as parameters and return values. This is something
that cannot be done with a stack model - if a function returns a
generator (or any function which requires closure data), the closure
data must exist even after the calling stack frame has exited. There
are ways to implement this without a general garbage collection facility
(for example, a pointer to a clean-up function could be passed up or
down the call chain while the closure itself is on the heap). But
basically, complex function manipulation like this needs more advanced
automatic control of the memory that you get in C or C++.


Jon Kirwan

Jan 27, 2010, 3:15:29 PM
On Wed, 27 Jan 2010 14:26:35 +0100, David Brown
<da...@westcontrol.removethisbit.com> wrote:

Go here:
http://webster.cs.ucr.edu/AoA/Windows/PDFs/0_PDFIndexWin.html

Then download and read all of Volume 5.
http://webster.cs.ucr.edu/AoA/Windows/PDFs/Volume5.pdf
http://webster.cs.ucr.edu/AoA/Windows/PDFs/Thunks.pdf
http://webster.cs.ucr.edu/AoA/Windows/PDFs/Iterators.pdf
http://webster.cs.ucr.edu/AoA/Windows/PDFs/Coroutines.pdf
http://webster.cs.ucr.edu/AoA/Windows/PDFs/ParameterImplementation.pdf
http://webster.cs.ucr.edu/AoA/Windows/PDFs/LexicalNesting.pdf
http://webster.cs.ucr.edu/AoA/Windows/PDFs/V5Questions.pdf

I apologize for not doing this earlier. I had expected that
you already knew about AofA and its availability on the web.
Had I known you didn't know about it, I would have
immediately provided you the links. Again, my sincere
apologies for not doing this earlier.

>> Anyway, I can see I've sent you spinning in the wrong
>> direction. Take a breath, read AofA on the topic of thunks
>> and the nearby related chapters to it. That should provide
>> an idea about implementation. Not the _whole_ idea, by the
>> way. As it might be done in c, it involves the concept of
>> nested functions (which you clearly don't yet see) without
>> the use of the specific syntax you are used to seeing for
>> them (it's entirely hidden at the c language level, but
>> explicit at the assembly level.) If you _see_ this much, we
>> are probably on the same page.
>
>Nested functions are perfectly possible in some extensions to C - in
>particular, gcc supports them (since gcc also supports Ada, which has
>nested functions, much of the gcc structure already supports nested
>functions, and thus the C and C++ front-ends can get them almost for free).

Yes, but as a general rule I don't always have the option of
using gcc. Customers sometimes already have existing tools
they want used, for example. There are other reasons, too.
So it's not a general solution. Just an interesting one.

>Nested functions, C++ classes, the new C++ lambda syntax, etc., are all
>ways to implement a limited form of generator or iterator. Compiler
>extensions can be used to make a nicer syntax, and to automate some of
>the manual work involved. But without some sort of multiple stack
>system or garbage collection, you have serious limitations. I don't
>mean to say that these ideas are not useful despite the limitations -
>just that you cannot add proper flexible generators to a language with
>the sort of structure of C or C++ without fundamental changes to the way
>the language works - the compiler would need to be free to allocate (and
>free) dynamic memory as needed, rather than through explicit malloc /
>new calls.
>
>It could well be that what you call "thunking" really means "generators
>with various limitations", in which case you are right that garbage
>collection is not needed, and it's reasonably easy to figure out several
>good implementations. But the term "thunking" is not a well known or
>well-defined expression, and is used in many different ways by different
>people - I have no idea how the author of a particular book you've read
>happens to use it.

Hopefully, the above will tell you more.

In fact, that's what is vital. In the implementation done by
Metaware's compilers, it was very well handled and the
implementation was quite general and nestable to any depth
without the programmer worrying over details such as that.
It's all simply kept as stack frame contexts, just as normal
functions do. The difference is that a thunk is used,
instead, to move back and forth in order to preserve the
stack context while the iterator remains "live." Once the
iterator completes, though, the stack is unwound in the usual
way and the context disappears just as you would expect for
any function call.

>If you were using such generators in a real program, you would want to
>use structured and modular programming - and then things get difficult.

Things do not get difficult. I've used metaware's compiler
tools and there was NO difficulty involved. It's _exactly_
like using c, except you've got a wonderful additional
semantic to handle, in beautiful and efficient ways, concepts
like walking graphs. The idea of "data hiding" is expanded
to also now include "algorithm hiding," but in a very light
weight fashion that is entirely consistent with the c
worldview.

> To use generators in a stack-based language, you would have to
>allocate a structure containing all the local data on the caller
>function's stack - that means you (either the programmer figuring out
>the closure data manually, or the extended compiler) need access to the
>implementation when declaring the generation and using it. With a
>garbage collecting language, the generator itself would allocate space
>on the heap as needed - the caller need not know anything about the details.

Got it. I understand your point now about garbage collection
-- makes sense. But the way Metaware handles it is beautiful
and doesn't require any of that. It's entirely handled
within the standard c style program model with a single stack
and all the usual, normal stack frame elements. The body of
a for loop is placed into a separate, nested function within
the body of the enclosing function. The iterator is called
using all the usual means, but includes a pointer to the
body. The iterator may itself call any number of other
functions, as well as other iterators if it likes, which may
be nested down the stack to any depth you want. When a yield
takes place, it is really a call to the for-body nested
function but with the stack frame pointer set to the
enclosing function so all the usual local variables are
appropriately accessible off of the base pointer reference
that all c compilers may normally use. The nested function
returns rather normally, restoring the frame back to the down
stream end of the stack. If the for-body temporarily stores
on the stack, it does so at the end of course and obviously
must restore it before returning. But that's just basic,
anyway.
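That lowering can be pictured in portable C by making the frame explicit: the compiler hoists the for-body into its own function, and every yield calls it with a pointer back to the enclosing frame. (Metaware used the real hardware frame pointer; the struct below is just the visible stand-in, and all the names are illustrative.)

```c
struct frame { long sum; };   /* locals of the enclosing function */

/* The hoisted for-body: runs "inside" the caller's frame via fp. */
static void for_body(int p, struct frame *fp)
{
    fp->sum += p;
}

/* The iterator; body(i, fp) is the yield. */
static void Range(int a, int b,
                  void (*body)(int, struct frame *), struct frame *fp)
{
    for (int i = a; i <= b; ++i)
        body(i, fp);
}

/* The "enclosing function": its local variable lives in the frame
 * struct so the for-body can reach it across the yield. */
static long SumTo(int n)
{
    struct frame f = { 0 };
    Range(1, n, for_body, &f);
    return f.sum;
}
```

Everything stays on the one ordinary stack, and when the iterator returns, the frame unwinds normally - which is why no garbage collection is involved.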

>You start getting really high-level programming when you can pass
>generators around as parameters and return values. This is something
>that cannot be done with a stack model - if a function returns a
>generator (or any function which requires closure data), the closure
>data must exist even after the calling stack frame has exited. There
>are ways to implement this without a general garbage collection facility
>(for example, a pointer to a clean-up function could be passed up or
>down the call chain while the closure itself is on the heap). But
>basically, complex function manipulation like this needs more advanced
>automatic control of the memory that you get in C or C++.

So let me think about this for a second. Passing a generator
would involve being able to return not only a pointer to
code, but also its entire current context (and any such
context of all activation records still active at the time?)
In other words, it's not just a generator at its initial
point, but one that may have already been used for a bit but
hasn't yet completed and so it can be returned to a caller
for more continued use? Interesting, and I gather the
additional value here.

However, as an _embedded_ programmer usually working on tiny
micros, not workstation-competent board-level systems, I'm
focused upon very modest but very useful extensions where I
_know_ I have good application and where I don't have to pay
for it with a significant change to the existing models I
have grown to know well and fully understand and trust.

That said, I'd be interested in seeing how to implement
something like that which would work in ways where the
run-time execution duration is entirely predictable and
invariant (knowing, obviously, the initial conditions for the
generator.) I think you hint towards this, but I'd need to
see a specific implementation.

In the meantime, you might look at the PDFs I've referred you
towards. They aren't that long and are quite detailed. They
do NOT show you the implementation used by Metaware, but I
can talk about that.

Jon

David Brown

Jan 28, 2010, 4:30:58 AM

Thanks - that makes a /huge/ difference! Of course, now I just need the
time to read it. As I've only had a brief look at it (I read most of
the chapter on thunks), I may be misjudging it here, but I have
difficulty seeing the relevance of the book at this time. I can see the
point of a DOS assembly book long ago, and I can see the point of a
general book on assembly for multiple architectures. But I can't think
of any reason (other than for fun) why anyone would write software in
assembly for the x86 - and certainly not for Windows. There are
certainly times when you might want to /use/ assembly on an x86 -
speeding up critical loops, for example - but not to write entire
programs. The HLA concept strikes me as a waste of time in this day and
age.

Having said that, some of the concepts (such as in the chapters you have
indicated) are interesting and have wider applications. Coroutines are
useful devices - it's just that the implementation details of how to use
them with HLA x86 assembly are irrelevant to reality. Had the author
shown how to use them in C, Java, Python, or even in an artificial HLL,
it would have been more useful.


Anyway, now I see what you mean by the term "thunk" - and it is clear
from the book that these are limited devices that are basically
equivalent to a C++ class with initialisation of private data values and
a single method (or alternatively a C function that takes a struct
pointer). Useful, but hardly revolutionary. Your proposed syntax for
them in C is, however, neat and elegant - that would be a useful
addition to the C language.

>>> Anyway, I can see I've sent you spinning in the wrong
>>> direction. Take a breath, read AofA on the topic of thunks
>>> and the nearby related chapters to it. That should provide
>>> an idea about implementation. Not the _whole_ idea, by the
>>> way. As it might be done in c, it involves the concept of
>>> nested functions (which you clearly don't yet see) without
>>> the use of the specific syntax you are used to seeing for
>>> them (it's entirely hidden at the c language level, but
>>> explicit at the assembly level.) If you _see_ this much, we
>>> are probably on the same page.
>>
>> Nested functions are perfectly possible in some extensions to C - in
>> particular, gcc supports them (since gcc also supports Ada, which has
>> nested functions, much of the gcc structure already supports nested
>> functions, and thus the C and C++ front-ends can get them almost for free).
>
> Yes, but as a general rule I don't always have the option of
> using gcc. Customers sometimes already have existing tools
> they want used, for example. There are other reasons, too.
> So it's not a general solution. Just an interesting one.
>

Agreed. I use various gcc extensions if I think they improve the code
(with the emphasis here on improving the source code rather than the
target code - that's a bonus). I haven't used nested functions - they
often make code less readable because it becomes unclear where
different functions start and end.

I agree here that such a syntax and compiler-aided handling of the
details would give you a very nice way to use these "thunks" - much more
convenient than doing things manually in C or C++.  I suspect you could
get a fair way with "normal C" using a system similar to Adam Dunkels'
protothreads - but integrating it into the language would be best.

>> If you were using such generators in a real program, you would want to
>> use structured and modular programming - and then things get difficult.
>
> Things do not get difficult. I've used metaware's compiler
> tools and there was NO difficulty involved. It's _exactly_
> like using c, except you've got a wonderful additional
> semantic to handle, in beautiful and efficient ways, concepts
> like walking graphs. The idea of "data hiding" is expanded
> to also now include "algorithm hiding," but in a very light
> weight fashion that is entirely consistent with the c
> worldview.
>
>> To use generators in a stack-based language, you would have to
>> allocate a structure containing all the local data on the caller
>> function's stack - that means you (either the programmer figuring out
>> the closure data manually, or the extended compiler) need access to the
>> implementation when declaring the generation and using it. With a
>> garbage collecting language, the generator itself would allocate space
>> on the heap as needed - the caller need not know anything about the details.
>
> Got it. I understand your point now about garbage collection
> -- makes sense. But the way Metaware handles it is beautiful
> and doesn't require any of that. It's entirely handled
> within the standard c style program model with a single stack
> and all the usual, normal stack frame elements. The body of

The Metaware implementation, as far as I can see, is limited to
situations where the thunk's frame can be allocated on the stack (or
possibly as a statically allocated region). That is certainly the
situation described in AofA. That is, of course, an entirely reasonable
limitation for an extension to C.

My view of such concepts has come down from higher-level languages like
Python (and also functional programming languages), in which you have
much more general capability in how you work with function-like objects
and closures. From that angle, these "thunks" look limited, because you
need a compiler and run-time that handle dynamic memory (typically some
sort of garbage collection, though that's not strictly necessary) to
implement the capabilities I have in mind. But when you are thinking of
these as an upwards extension of C, I can see them being a very useful
addition to the language.

That's correct. You can see here how this requires the generator's
local frame to remain valid after the function that created it has
exited - that means it has to exist outside the main stack. Thus for
this sort of thing to be handled directly by the language and the
compiler, rather than through explicit "new" or "malloc" calls, the
compiler has to have a direct understanding and control of dynamic memory.

> However, as an _embedded_ programmer usually working on tiny
> micros, not workstation-competent board-level systems, I'm
> focused upon very modest but very useful extensions where I
> _know_ I have good application and where I don't have to pay
> for it with a significant change to the existing models I
> have grown to know well and fully understand and trust.
>

I agree here - and I would appreciate the addition to C of the sort of
capabilities you have been describing. For embedded systems, it is
important to be able to understand the implementation for the code you
write, and that is possible for "thunks" as you have described them. On
a PC, it (typically) doesn't matter if the software takes a few extra MB
of run space, and runs through a byte code virtual machine - thus I
program in Python and take advantage of the language's power to write
shorter code.

bartc

Jan 28, 2010, 6:19:09 AM

"David Brown" <da...@westcontrol.removethisbit.com> wrote in message
news:4b615962$0$3853$8404...@news.wineasy.se...

> On 27/01/2010 21:15, Jon Kirwan wrote:

>> http://webster.cs.ucr.edu/AoA/Windows/PDFs/0_PDFIndexWin.html

> Thanks - that makes a /huge/ difference! Of course, now I just need the
> time to read it. As I've only had a brief look at it (I read most of the
> chapter on thunks), I may be misjudging it here, but I have difficulty
> seeing the relevance of the book at this time. I can see the point of a
> DOS assembly book long ago, and I can see the point of a general book on
> assembly for multiple architectures. But I can't think of any reason
> (other than for fun) why anyone would write software in assembly for the
> x86 - and certainly not for Windows. There are certainly times when you
> might want to /use/ assembly on an x86 - speeding up critical loops, for
> example - but not to write entire programs. The HLA concept strikes me as
> a waste of time in this day and age.

Assembler is 100% flexible compared to any HLL, even C, so sometimes it
makes life easier.

While you probably wouldn't write applications in it, there are types of
programs which do have a big proportion of assembler (in my case, these are
interpreters).

Hyde's HLA is not for everyone, but I use a form of HLA (inline assembler
within an HLL) which makes writing large amounts of assembler much less
painful.

And, if you are working on a language product which generates assembler
code, then you need to understand how it works even if you are not manually
writing the code yourself.

--
Bartc

Jon Kirwan

Jan 28, 2010, 6:26:07 AM
On Thu, 28 Jan 2010 10:30:58 +0100, David Brown
<da...@westcontrol.removethisbit.com> wrote:

Sweet. Now if we can just convince those c-standard folks!

By the way, just so you know, Metaware's founder (one of
them, at least) was Dr. Frank DeRemer. He's known well for
his Ph.D. thesis on LALR parsing, "Practical translators for
LR(k) languages," MIT, Cambridge, Massachusetts, 1969. He
and Dr. Tom Pennello went on to write some tools for compiler
compilers and an article called "Efficient Computation of
LALR(1) Look-Ahead Sets," TOPLAS, vol 4, no 4, in October
1982. Which was around the time, I think, that Metaware was
becoming a reality of sorts.

I very much enjoyed my conversations and learned a few things
from them (especially Tom), back around that time. They were
generous with their time and help and willingness to teach.

If you wonder how a paper that doesn't use LALR in its title
is about that, take a look at the wiki page here:

http://en.wikipedia.org/wiki/LALR_parser_generator

Dr. DeRemer invented LALR.

Jon

Jon Kirwan

Jan 28, 2010, 6:29:13 AM
On Thu, 28 Jan 2010 11:19:09 GMT, "bartc" <ba...@freeuk.com>
wrote:

This last paragraph makes an excellent point, regardless of
how one may take the rest of what you say (which I also
consider well-spoken.)

Jon

David Brown

Jan 28, 2010, 8:06:07 AM

I am not saying there is no place for assembly - for small systems,
assembly can still be a good choice (and I have done a lot of assembly
programming on small systems through the years). There are also parts
of large systems that are best done in assembly. And of course you
should understand assembly when working with embedded systems, and as
you say, a compiler writer is going to have to be an assembly expert.
But the days of writing large applications in x86 assembly for PCs (this
book is targeting x86 assembly for windows) are long gone, bar a few
specialist applications or keen enthusiasts.

Walter Banks

Jan 28, 2010, 9:31:51 AM

bartc wrote:

> Assembler is 100% flexible compared to any HLL, even C, so sometimes it
> makes life easier.
>
> While you probably wouldn't write applications in it, there are types of
> programs which do have a big proportion of assembler (in my case, these are
> interpreters).
>
> Hyde's HLA is not for everyone, but I use of form of HLA (inline assembler
> within a HLL) which makes writing large amounts of assembler much less
> painful.

Embedded assembly in a C program becomes a lot more manageable
when the assembler and C compiler share the same symbol table.
Getting rid of address mangling between C and asm and giving the
asm full access to C's symbol table makes it a lot easier to
add embedded assembler to an application.

Regards,

Walter..
--
Walter Banks
Byte Craft Limited
http://www.bytecraft.com


Albert van der Horst

Feb 1, 2010, 11:56:18 AM
In article <Noe8n.32589$Ym4....@text.news.virginmedia.com>,

bartc <ba...@freeuk.com> wrote:
>
>"David Brown" <da...@westcontrol.removethisbit.com> wrote in message
>news:4b615962$0$3853$8404...@news.wineasy.se...
>> On 27/01/2010 21:15, Jon Kirwan wrote:
>
>>> http://webster.cs.ucr.edu/AoA/Windows/PDFs/0_PDFIndexWin.html
>
>> Thanks - that makes a /huge/ difference! Of course, now I just need the
>> time to read it. As I've only had a brief look at it (I read most of the
>> chapter on thunks), I may be misjudging it here, but I have difficulty
>> seeing the relevance of the book at this time. I can see the point of a
>> DOS assembly book long ago, and I can see the point of a general book on
>> assembly for multiple architectures. But I can't think of any reason
>> (other than for fun) why anyone would write software in assembly for the
>> x86 - and certainly not for Windows. There are certainly times when you
>> might want to /use/ assembly on an x86 - speeding up critical loops, for
>> example - but not to write entire programs. The HLA concept strikes me as
>> a waste of time in this day and age.
>
>Assembler is 100% flexible compared to any HLL, even C, so sometimes it
>makes life easier.
>
>While you probably wouldn't write applications in it, there are types of
>programs which do have a big proportion of assembler (in my case, these are
>interpreters).
>
>Hyde's HLA is not for everyone, but I use of form of HLA (inline assembler
>within a HLL) which makes writing large amounts of assembler much less
>painful.

That is not HLA. HLA is IMHO a bizarre concept: abstraction
added on top of an assembler, while assembler is the pinnacle
of concreteness.
If you go for assembler, you should complement it with macros.
Macros may be a pain to use, but they are extremely flexible
and the result remains horribly concrete.

I don't think this is just an opinion, and I would add this to
Wikipedia, but I have some stakes here.

1]
I have a Forth system, where macros help me keep the same
source across 16/32/64 bit, linux/msdos/mswindows/standalone.
It is legitimate for a language implementation to be written
in assembler (maybe complemented with parts written in the
language itself.) (The alternative is using C as a portable
assembler, but that leaves C itself to be written basically
in assembler.)

2]
I also have a (dis)assembler that doesn't hide the (i86) difference
between the two encodings of a register-to-register move, e.g.
   MOV BX,AX
(No, not LEA.)
This makes it suitable for reverse engineering stealth viruses.

>
>And, if you are working on a language product which generates assembler
>code, then you need to understand how it works even if you are not manually
>writing the code yourself.

Also this notion should be more prominent on Wikipedia pages.
Sometimes it gives the impression that those using assembler are
behind the times instead of supplying the foundation for the
whole IT industry.

>
>--
>Bartc
>


--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
