
Code size in AVR: ASM vs Imagecraft C?


car...@marte.cinv.iteso.mx

Mar 15, 2000
Hi!

Has anyone used the C compiler from Imagecraft?
I want to know how much larger the object file will be
compared with pure assembly. Maybe in percentages.

Thanks!


Lars Wictorsson

Mar 15, 2000
Hi,

> Has anyone used the C compiler from Imagecraft?
> I want to know how much larger the object file will be
> compared with pure assembly. Maybe in percentages.

I don't think there is such a thing as a fair comparison, since it
all depends on how skilled the programmer writing the
assembler is.

Instead it is better to compare C compilers from
different vendors for a specific target.

IMHO

/Lars

----------------------------------------------------------
LAWICEL / SWEDEN Phone : +46 (0)451 - 598 77
Lars Wictorsson Fax : +46 (0)451 - 598 78
E-mail: la...@lawicel.com WWW : http://www.lawicel.com

Embedded hardware/software together with 8051/C16x/AVR and
smart distributed I/O with CAN (Controller Area Network).
See CANDIP at http://www.lawicel.com/candip/ AVR+SJA1000
----------------------------------------------------------

Georg Michel

Mar 15, 2000

AFAIK the free GNU C compiler generates the best optimized code. I wrote
the same program in assembler and C. The C code was about 0.8 times
larger. You can download gcc from http://medo.fov.uni-mb.si/mapp

Regards
Georg

Petri Juntunen

Mar 15, 2000
Georg Michel <mic...@ipp.mpg.de> wrote:
: AFAIK the free GNU C compiler generates the best optimized code. I wrote
: the same program in assembler and C. The C code was about 0.8 times
: larger. You can download gcc from http://medo.fov.uni-mb.si/mapp

But how about the AVR's "native" compiler, IAR? If someone has it, it
would be nice to see how it compares to avr-gcc.

Namely, I am completely happy with avr-gcc despite the fact that I have
to start the compiler from the command line or write Makefiles. I would
be even happier if someone could prove that avr-gcc makes tighter code
than IAR...

--
Petri

Ted Wood

Mar 15, 2000

car...@marte.cinv.iteso.mx wrote:
> Has anyone used the C compiler from Imagecraft?
> I want to know how much larger the object file will be
> compared with pure assembly. Maybe in percentages.
>

Just started using the Imagecraft compiler. It seems to produce pretty
tight code. Haven't done any comparison with assembler. However, you
might be interested in this page, which compares different tools for the
AVR:
http://www3.igalaxy.net/~jackt/avr_compilers.htm
Cheers
TW
--
Any views expressed in this message are those of
the individual sender, except where the sender
specifically states them to be the views of
Sortex Ltd.

Georg Michel

Mar 15, 2000

For us the problem was that the IAR C compiler costs money and does not
run under Linux. A general assessment of the cross gcc in comparison to
commercial compilers can be found at
http://www.embedded.com/2000/0002/0002feat2.htm

Georg

Lars Wictorsson

Mar 15, 2000
> > Has anyone used the C compiler from Imagecraft?
> > I want to know how much larger the object file will be
> > compared with pure assembly. Maybe in percentages.
> >
> Just started using the Imagecraft compiler. It seems to produce pretty
> tight code. Haven't done any comparison with assembler. However, you
> might be interested in this page, which compares different tools for the
> AVR:
> http://www3.igalaxy.net/~jackt/avr_compilers.htm

This page is almost half a year old and hasn't been updated
with the latest versions of the compilers, so the numbers
are misleading for many tools.

/Lars


Richard F. Man

Mar 15, 2000
If you absolutely need the tightest code possible, use assembly. However, ease
of maintenance, and the possible use of better algorithms (because you don't
have to worry about a lot of the low-level details), will tip the scales in
favor of C most of the time.

Regarding code efficiency, RIGHT at this moment (3/14/00) it is indeed true
that avr-gcc generates the best code, even compared to IAR at $$$. However,
an Open Source compiler comes with a heavy price in terms of lack of ease of
use, lack of official support, lack of documentation, etc. In other words, it
is not for everyone. If it works for you, great!

RIGHT at this moment (3/14/00) avr-gcc is also more efficient than ICCAVR.
However, we support the full set of math.h functions, full debugging with AVR
Studio, including data watchpoints, plus an easy-to-use IDE, documentation,
and quick support.

On top of that, within a month we will be releasing an optimizer that should
bring our code size to parity with IAR and maybe even with avr-gcc. This
optimization is a perfect fit for something like the AVR to decrease the code
size, and is something not in the GCC code at all. Is all this worth the $199
for a STD version and $499 for a PRO version? You decide - you can download a
fully functional 30-day demo from our website.

Petri Juntunen wrote:

--
// richard
http://www.imagecraft.com

Lars Wictorsson

Mar 15, 2000
Hi Georg,

> AFAIK the free GNU C compiler generates the best optimized code. I wrote
> the same program in assembler and C. The C code was about 0.8 times
> larger. You can download gcc from http://medo.fov.uni-mb.si/mapp

Does this mean that the C code was 80% of the size of the assembler code,
meaning that the C compiler generated tighter code, or what do you mean
when you say "0.8 times larger"?

/Lars


Georg Michel

Mar 15, 2000


0.8 times larger means 1.8 times as big, i.e. the C code was 180% of the
size of the assembler code.

Georg

David Brown

Mar 15, 2000
What exactly is the difference between the standard and Pro versions of the
Imagecraft AVR compiler? The website is very clear on the price
difference - not so clear on what the package actually contains.

Richard F. Man wrote in message <38CF672C...@imagecraft.com>...

>> : AFAIK the free GNU C compiler generates the best optimized code. I wrote
>> : the same program in assembler and C. The C code was about 0.8 times
>> : larger. You can download gcc from http://medo.fov.uni-mb.si/mapp
>>

Bernhard

Mar 15, 2000
I recently did a test with IAR C, ICC and GCC on a project I am working
on. The results were:
IAR: 1720 bytes (optimized for size)
IAR: 1860 bytes (optimized for speed)
GCC: 2185 bytes (optimized for size, -Os)
GCC: 2215 bytes (optimized for speed, -O3)
ICC: 2794 bytes (default optimization)
Since I am working with the AT90S2313, only the IAR code fits in the
device. What surprised me is that GCC has smaller code in most
functions, but the overhead and library functions seem to increase the
size. I am not very familiar with GCC. Perhaps someone who is more
experienced with GCC can give me a hint for more optimization. I used
the -Os switch for size optimization.

Bernhard

Jyrki O Saarinen

Mar 15, 2000
Bernhard <Bernhard.Winter@no_spam.kst.siemens.de> wrote:

: device. What surprised me is that GCC has smaller code in most
: functions, but the overhead and library functions seem to increase the
: size.

Shouldn't you be testing by compiling to an object file, without
linking in any libraries?

Which library functions did you use? It's known that newlib (which
you probably used as the C library) isn't the smallest one.

Bernhard

Mar 15, 2000
For me, it is the running program on the chip that counts. A fast function
is useless when the rest doesn't fit into the device.
I used the standard makefile from Volker Oth's GCC examples from the
'Atmel for Dummies' web page
(http://members.xoom.com/volkeroth/index_e.htm).
!!! Thanks to Volker, a great page and easy to install !!!
As far as I can see the following files were linked in addition to mine:
gcrt1-8515.o
libgcc.a(udivsi3.o)
libgcc.a(divsi3.o)

Bernhard

John Devereux

Mar 15, 2000
On Wed, 15 Mar 2000 02:34:20 -0800, "Richard F. Man"
<ric...@imagecraft.com> wrote:

>If you absolutely need the tightest code possible, use assembly. However, ease
>of maintenance, and the possible use of better algorithms (because you don't
>have to worry about a lot of the low-level details), will tip the scales in
>favor of C most of the time.
>
>Regarding code efficiency, RIGHT at this moment (3/14/00) it is indeed true
>that avr-gcc generates the best code, even compared to IAR at $$$.

This is an amazing achievement for a free compiler
designed for 32 bit machines! I would have thought
that the byte fiddling required for an 8 bit
architecture would have made it a relatively poor
performer. GCC-PIC anyone? :)

I would just like to say that I for one appreciate
your candor here, a very refreshing change from
what was going on in certain other threads....


-- John Devereux

jo...@devereux.demon.co.uk

Jyrki O Saarinen

Mar 15, 2000
John Devereux <jo...@devereux.demon.co.uk> wrote:

: This is an amazing achievement for a free compiler
: designed for 32 bit machines! I would have thought
: that the byte fiddling required for an 8 bit
: architecture would have made it a relatively poor
: performer. GCC-PIC anyone? :)

gcc has been developed for such a long time that it's no wonder
its optimizations are very good in the pre-backend stages.

I wouldn't say that 'gcc is designed for 32-bit machines';
the problems with PICs and such lie elsewhere than in the word
size (or the lack of it).

Jyrki O Saarinen

Mar 15, 2000
Bernhard <Bernhard.Winter@no_spam.kst.siemens.de> wrote:

: For me, it is the running program on the chip that counts. A fast function
: is useless when the rest doesn't fit into the device.

True.

: As far as I can see the following files were linked in addition to mine:
: gcrt1-8515.o
: libgcc.a(udivsi3.o)
: libgcc.a(divsi3.o)

Oh, you weren't talking about the C library after all, but about libgcc,
which includes 'emulation code' for operations not implemented in
hardware. You seem to be using unsigned and signed 32-bit integer division.
These emulation routines are generic in gcc and implemented in C, so
commercial compilers often have better hand-tuned emulation libraries
for a specific architecture.
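
[Illustrative aside, not part of the original post: a minimal C sketch of the kind of code that pulls in the libgcc division helpers listed in the link map above, next to a divide-by-a-power-of-two variant that avoids them. The function names are invented for the example.]

/* A 32-bit division by a variable quantity is not a native AVR
   operation, so gcc calls a helper in libgcc (the udivsi3.o seen in
   the link map above). */
unsigned long scale(unsigned long ticks, unsigned long divisor)
{
    return ticks / divisor;
}

/* Dividing an unsigned value by a constant power of two compiles to
   shifts instead, so no libgcc helper is needed for this function. */
unsigned long scale_by_256(unsigned long ticks)
{
    return ticks >> 8;   /* same as ticks / 256 for unsigned values */
}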


Georg Michel

Mar 15, 2000

From your post I learn that you are using Volker Oth's Windows port,
which is derived from the original avr-gcc (Unix) code. Maybe the
original GNU binutils generate smaller binaries than the Windows port.

Georg

Jyrki O Saarinen

Mar 15, 2000
Georg Michel <mic...@ipp.mpg.de> wrote:

: From your post I learn that you are using Volker Oth's Windows port,
: which is derived from the original avr-gcc (Unix) code. Maybe the
: original GNU binutils generate smaller binaries than the Windows port.

They can't produce any smaller binaries. The linker doesn't have much
chance to make a difference here, but the user who forgets to strip the
debugging information and other unnecessary bits from the binary does.
This is why people keep asking "why is hello world compiled with X this
big?".


D. Cook

Mar 15, 2000
Another issue may be vector tables.

I know that in the ATmega series you can use the vector table area if you
are not using the interrupts. Always reserving this space can make the code
a bit bigger.

-- Devin

"Jyrki O Saarinen" <jxsa...@cc.helsinki.fi> wrote in message
news:8ao9ak$o5m$1...@oravannahka.helsinki.fi...

Jyrki O Saarinen

Mar 15, 2000

Georg Michel <mic...@ipp.mpg.de> wrote:

: From your post I learn that you are using Volker Oth's Windows port,
: which is derived from the original avr-gcc (Unix) code. Maybe the
: original GNU binutils generate smaller binaries than the Windows port.

They can't produce any smaller binaries. The linker doesn't have much
chance to make a difference here, but the user who forgets to strip the
debugging information and other unnecessary bits from the binary does.

This is probably why the original poster found that the libgcc code made
his binary so much larger; libgcc is built with debugging information by
default, so running
"strip --strip-all mybinary" will remove all the unneeded bits.


John Devereux

Mar 15, 2000
On 15 Mar 2000 13:59:17 GMT, Jyrki O Saarinen
<jxsa...@cc.helsinki.fi> wrote:

>John Devereux <jo...@devereux.demon.co.uk> wrote:
>
>: This is an amazing achievement for a free compiler
>: designed for 32 bit machines! I would have thought
>: that the byte fiddling required for an 8 bit
>: architecture would have made it a relatively poor
>: performer. GCC-PIC anyone? :)
>
>gcc has been developed for such a long time that it's no wonder
>its optimizations are very good in the pre-backend stages.

Sure, but it would appear that it is possible to
produce quite an efficient "8 bit backend", too,
which I did not realise had been done.

>
>I wouldn't say that 'gcc is designed for 32-bit machines',

Actually I'm pretty sure I read this in the FSF
docs about the objectives for GCC; something to
the effect that "GCC was designed for processors
with 32 bit registers, that can address bytes".

>the problem with PICs and such are elsewhere than the word
>sizes (or the lack of them).

Indeed. Horrendous stack limitations and separate
program / data spaces to start with, I would
guess.


-- John Devereux

jo...@devereux.demon.co.uk

Jyrki O Saarinen

Mar 16, 2000
John Devereux <jo...@devereux.demon.co.uk> wrote:

: Sure, but it would appear that it is possible to
: produce quite an efficient "8 bit backend", too,
: which I did not realise had been done.

Why wouldn't it be possible to produce an efficient 8-bit backend?
There are efficient 16-bit backends too, like the H8.

The AVR port has been contributed to the gcc project officially,
so it will be (I don't know if it already is) in the main source tree.

http://gcc.gnu.org/

: Actually I'm pretty sure I read this in the FSF
: docs about the objectives for GCC; something to
: the effect that "GCC was designed for processors
: with 32 bit registers, that can address bytes".

Maybe this was about the host system that gcc runs on?

About the AVR anyway: I was wondering why they chose
8-bit registers. What's the rationale behind this? Even
if there are 32 of them, in practice one needs larger
word sizes, and thus when using shorts we have
16 registers left, and with ints only 8 of them.

Bernhard

Mar 16, 2000
I would like to try this on my sample project. Where do I have to put
the --strip-all switch? Do I need to rebuild libgcc, or can I use this
switch in my project's makefile when linking in libgcc?

Thanks Bernhard

Jyrki O Saarinen

Mar 16, 2000
Bernhard <Bernhard.Winter@no_spam.kst.siemens.de> wrote:

: I would like to try this on my sample project. Where do I have to put
: the --strip-all switch? Do I need to rebuild libgcc, or can I use this
: switch in my project's makefile when linking in libgcc?

No, you don't have to rebuild libgcc. Just strip your executable
binary with the "strip" utility, which is included in the binutils
package along with as (the assembler), ld (the linker), etc.

Bernhard

Mar 16, 2000
When I run strip over my executable object I get the following error
message:

->BFD: st000305: Error writing stabs !
->strip: st000305: Symbol needs debug section which does not exist

Does this mean that the debug info was already removed? When I run strip
over the object files before linking, the size of these files is
significantly reduced, but they won't link after that procedure.

David Brown

Mar 16, 2000

Jyrki O Saarinen wrote in message <8aptth$9jk$1...@oravannahka.helsinki.fi>...

>
>About the AVR anyway: I was wondering why they chose
>8-bit registers. What's the rationale behind this? Even
>if there are 32 of them, in practice one needs larger
>word sizes, and thus when using shorts we have
>16 registers left, and with ints only 8 of them.
>


It's a matter of applications - for many applications where AVRs are used,
it is far more useful to have 32 8-bit registers than 16 16-bit registers.
In most of the embedded systems I have written, almost all variables fit in
8-bit bytes - 16-bit words or shorts are far less common. Occasionally I
need 24-bit or 32-bit data, but it is rare.

The AVR does have 4 pairs of registers (three 16-bit pointers, and an extra
pair because 4 is a nicer number than 3) that can be treated as 16-bit words
to some extent, such as for adding or subtracting small values (typically
index offsets). While it would be nice to have 16-bit operations directly,
most can easily be done in two instructions. I suspect the main limitation
is the instruction set width - with 16-bit instructions, there is a limit to
the available instruction space while keeping everything as orthogonal as
possible. Atmel have already had to make some compromises to keep within a
single 16-bit word (leading to single-cycle execution) for the majority of
instructions. Support for 16-bit data generally means either
variable-length instructions, or 32-bit wide instructions.

The only real problem with 8-bit registers is that ANSI C specifies that
ints must be at least 16-bits wide. For micros like the AVR, the standard
size should be 8-bit (and on many other 8-bitters, the standard should also
be unsigned).
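
[Illustrative aside, not part of the original post: a minimal C sketch of that point. The fill routines are invented for the example; the idea is that an unsigned char counter fits in a single 8-bit register, while an int counter needs a register pair and 16-bit compares.]

unsigned char buf[10];

/* With "int i" the counter is at least 16 bits wide, i.e. a register
   pair on the AVR, even though the loop never goes past 9. */
void fill_with_int_counter(unsigned char value)
{
    int i;
    for (i = 0; i < 10; i++)
        buf[i] = value;
}

/* An unsigned char counter fits in a single 8-bit register and avoids
   the 16-bit compare/increment sequences. */
void fill_with_char_counter(unsigned char value)
{
    unsigned char i;
    for (i = 0; i < 10; i++)
        buf[i] = value;
}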


Jyrki O Saarinen

Mar 16, 2000
David Brown <david....@westcontrol.com> wrote:

: It's a matter of applications - for many applications where AVRs are used,
: it is far more useful to have 32 8-bit registers than 16 16-bit registers.
: In most of the embedded systems I have written, almost all variables fit in
: 8-bit bytes - 16-bit words or shorts are far less common. Occasionally I
: need 24-bit or 32-bit data, but it is rare.

This seems to depend very much on the application - the term
"embedded system" is very poorly defined. ;)

: is the instruction set width - with 16-bit instructions, there is a limit to
: the available instruction space while keeping everything as orthogonal as

This is true. With two operand sizes the instruction set size
would roughly double, I think.

: possible. Atmel have already had to make some compromises to keep within a
: single 16-bit word (leading to single-cycle execution) for the majority of
: instructions. Support for 16-bit data generally means either
: variable-length instructions, or 32-bit wide instructions.

Or operations which always work on the full register size, as in
most RISCs. What about that ARM MCU which has 16 32-bit
registers and 2-byte instructions?

Jyrki O Saarinen

Mar 16, 2000
Bernhard <Bernhard.Winter@no_spam.kst.siemens.de> wrote:

: ->BFD: st000305: Error writing stabs !
: ->strip: st000305: Symbol needs debug section which does not exist

Sorry, can't help with that.

: Does this mean that the debug info was already removed? When I run strip
: over the object files before linking, the size of these files is
: significantly reduced, but they won't link after that procedure.

Yes, if you do "strip --strip-all", they won't go through the linker,
because there's no information left to tell the linker how to do its
work! See the strip documentation for how to strip your object files
so that they still go through the linker. For libgcc you could try
"strip --strip-debug".

http://www.gnu.org/manual/binutils/html_chapter/binutils_9.html#SEC11

David Brown

unread,
Mar 16, 2000, 3:00:00 AM3/16/00
to

Jyrki O Saarinen wrote in message <8aqbie$mlj$1...@oravannahka.helsinki.fi>...

>David Brown <david....@westcontrol.com> wrote:
>
>: It's a matter of applications - for many applications where AVRs are used,
>: it is far more useful to have 32 8-bit registers than 16 16-bit registers.
>: In most of the embedded systems I have written, almost all variables fit in
>: 8-bit bytes - 16-bit words or shorts are far less common. Occasionally I
>: need 24-bit or 32-bit data, but it is rare.
>
>This seems to depend very much on the application - the term
>"embedded system" is very poorly defined. ;)

Note the qualifier "embedded systems I have written"... You are perfectly
correct, of course. I was thinking mostly of small embedded systems (a
slightly less poorly defined term), in which small 8-bit micros are commonly
used.

>
>: is the instruction set width - with 16-bit instructions, there is a limit to
>: the available instruction space while keeping everything as orthogonal as
>
>This is true. With two operand sizes the instruction set size
>would roughly double, I think.

It's not that bad - an extra bit or two is enough to specify operand size.
But there is a huge difference between a 16-bit instruction set and a 17-bit
set.

>
>: possible. Atmel have already had to make some compromises to keep within a
>: single 16-bit word (leading to single-cycle execution) for the majority of
>: instructions. Support for 16-bit data generally means either
>: variable-length instructions, or 32-bit wide instructions.
>
>Or operations which always work on the full register size, as in
>most RISCs. What about that ARM MCU which has 16 32-bit
>registers and 2-byte instructions?
>


You will always need instructions that operate on smaller sizes. If your
registers are 32-bit, you need to be able to work with bytes and 16-bit
words as well. When your micro has 128 bytes of RAM, you do not want to
have to store everything as 32-bit! Having a few 8 and 16-bit instructions
(just for loading and storing, say) is not really a good solution, as you
would quickly need extra code/time/hardware to handle the thunking.


David Brown

Mar 16, 2000
Bernhard wrote in message <38CF86A5.1846597A@no_spam.kst.siemens.de>...

>I recently did a test with IAR C, ICC and GCC on a project I am working
>on. The results were:
>IAR: 1720 bytes (optimized for size)
>IAR: 1860 bytes (optimized for speed)
>GCC: 2185 bytes (optimized for size, -Os)
>GCC: 2215 bytes (optimized for speed, -O3)
>ICC: 2794 bytes (default optimization)
>Since I am working with the AT90S2313, only the IAR code fits in the
>device. What surprised me is that GCC has smaller code in most
>functions, but the overhead and library functions seem to increase the
>size. I am not very familiar with GCC. Perhaps someone who is more
>experienced with GCC can give me a hint for more optimization. I used
>the -Os switch for size optimization.
>
>Bernhard
>

If anyone who has some AVR compilers (especially IAR, ICC and GCC, as they
seem to be the mainstream compilers) has the time and inclination, I think a
lot of us would be very interested in some comparison details, such as how
well these compilers optimise code. Any answers or opinions on the
following questions would be much appreciated.

Do the compilers use registers for passing data to functions, or do they use
standard C stack frames?

Do they automatically assign local variables to registers even without the
register keyword? If so, are they smart enough to use single registers
rather than pairs when "int" is specified but not needed (e.g., int i; for
(i = 0; i < 10; i++) {...};)?


Do they always use the same registers for everything, or do they split
register usage across the whole set (for example, if parameters are always
passed in R6 and R7, and function return values are always in R0 and R1,
then there is going to be a lot of unnecessary swapping of data in and out
of these registers)?

Do they support struct parameters and return values?

Are there any odd limitations in the implementation of the language (some
compilers specify arbitrary limits on expression complexity, for example)?

Do they implement bit instructions well, especially when using bitfields?
For example, with the code:
struct {char b0 : 1; char b1 : 1; } bits;
bits.b1 = bits.b0;
does the compiler generate bit instructions or rotating, masking and logical
operations?

Are there additional keywords for supporting bit types directly?

Are there additional keywords for interrupt routines? Is it possible to
reserve registers for use in interrupt routines?


If anyone can be bothered, I would be very interested to see the resulting
assembly from the following C code - it tests several basic features of the
compiler, especially pointer manipulation.

char Test(char v) {
char a[10];
static b[12];
int i; /* Could also try with
unsigned char i */

for (i = 0; i < 10; i++) {
a[i] = b[i + 2];
};
};


A lot of this could be included in
http://www3.igalaxy.net/~jackt/avr_compilers.htm , which has some
comparisons between different AVR tools. But as another poster has pointed
out, this site has not been updated for over 6 months.

Jyrki O Saarinen

Mar 16, 2000
David Brown <david....@westcontrol.com> wrote:

: You will always need instructions that operate on smaller sizes. If your
: registers are 32-bit, you need to be able to work with bytes and 16-bit
: words as well. When your micro has 128 bytes of RAM, you do not want to
: have to store everything as 32-bit! Having a few 8 and 16-bit instructions

Of course storage of variables would be done as small as possible.
But you are thinking in terms of very small systems; I'm thinking
of mid-size systems (I'm involved in a rather large H8/S project).
I think this just shows the diversity of embedded systems...

About your post concerning compiler benchmarking: does anyone have
an idea what would make a good and fair test program for comparing
compiler output quality? I would be interested in comparing gcc
vs. IAR on the H8/S platform.
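
[Illustrative aside, not part of the original post: one possible shape for such a small, self-contained test kernel. The name and algorithm are invented for the example; it exercises byte access, 16-bit arithmetic, shifting and a counted loop, which 8- and 16-bit compilers handle quite differently.]

/* Rotate-and-add checksum over a byte buffer; only an example of a
   benchmark kernel, not a proposed standard test. */
unsigned short checksum16(const unsigned char *data, unsigned char len)
{
    unsigned short sum = 0;
    unsigned char i;

    for (i = 0; i < len; i++) {
        sum = (unsigned short)((sum << 1) | (sum >> 15));  /* rotate left by 1 */
        sum += data[i];
    }
    return sum;
}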

David Brown

Mar 16, 2000

Jyrki O Saarinen wrote in message <8aqgac$r94$2...@oravannahka.helsinki.fi>...


The diversity of applications and microcontrollers makes it very difficult
to provide general tests. I mentioned a number of points that I think would
be relevant for AVR C compilers - had I been looking at compilers for a
CPU32, which I use on higher end systems, I would have asked different
questions. Maybe you need to start a new thread comparing compilers for the
H8/S.


D. Cook

Mar 16, 2000

"Jyrki O Saarinen" <jxsa...@cc.helsinki.fi> wrote in message
news:8aptth$9jk$1...@oravannahka.helsinki.fi...


> John Devereux <jo...@devereux.demon.co.uk> wrote:
>
> : Sure, but it would appear that it is possible to
> : produce quite an efficient "8 bit backend", too,
> : which I did not realise had been done.
>
> Why wouldn't it be possible to produce an efficient 8-bit backend?
> There are efficient 16-bit backends too, like the H8.

The problem is usually the number of registers. Stacks tend to be small and
expensive (time wise) on most 8-bit systems.

The AVR shines because it doesn't require the stack (except for returns)
unless things get way out of hand.

-- Devin


Marc

Mar 16, 2000
> This is an amazing achievement for a free compiler
> designed for 32 bit machines! I would have thought
> that the byte fiddling required for an 8 bit
> architecture would have made it a relatively poor
> performer. GCC-PIC anyone? :)

The AVR architecture is very straightforward. I had to implement
a 32x16 = 48 bit multiplication in assembler once, for example, and
it was done in less than 10 minutes. There is no "byte fiddling"
like you know from the PIC. Forget the headaches :-)

Add, sub, shift, all the supported 8bit operations do n-bit as well:

add Q0,P0 ; 24 bit add
adc Q1,P1
adc Q2,P2
brne total_result_not_zero

subi P0,(1234567890>> 0) ; 56 bit sub immediate
sbci P1,(1234567890>> 8)
sbci P2,(1234567890>>16)
sbci P3,(1234567890>>24)
sbci Q0,(1234567890>>32)
sbci Q1,(1234567890>>40)
sbci Q2,(1234567890>>48)


Marc

Mar 16, 2000
> The only real problem with 8-bit registers is that ANSI C specifies that
> ints must be at least 16-bits wide. For micros like the AVR, the standard
> size should be 8-bit (and on many other 8-bitters, the standard should also
> be unsigned).

In my own programs (even on the PC) I almost never use "int". I made
typedefs that explicitly define sign and size, such as "uchar", "uword"
or "slong", for example. The only situation where I use "int" is when I
use ANSI standard library code. This happens only seldom, because most
of my PC programs are for DOS and self-contained, with no ANSI code
except for file I/O and printf() usually.

Those programs port quickly and are very memory efficient on the AVR.
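
[Illustrative aside, not part of the original post: a sketch of size-explicit typedefs of the kind described above. uchar, uword and slong are the names mentioned in the post; the rest, and the sample use, are invented for the example. C99 code would use the <stdint.h> types instead.]

typedef unsigned char  uchar;   /*  8-bit unsigned */
typedef signed   char  schar;   /*  8-bit signed   */
typedef unsigned short uword;   /* 16-bit unsigned */
typedef signed   short sword;   /* 16-bit signed   */
typedef unsigned long  ulong;   /* 32-bit unsigned */
typedef signed   long  slong;   /* 32-bit signed   */

/* Code written against these types keeps its data sizes when it is
   moved between a PC compiler and an 8-bit target such as the AVR. */
static uword tick_count;

void tick(void)
{
    tick_count++;
}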

Marc

Mar 16, 2000
> Or operations which always work on the full register size, as in
> most RISCs. What about that ARM MCU which has 16 32-bit
> registers and 2-byte instructions?

The point is that an instruction set can either support many
registers of the same width, or a few registers of multiple
widths, or must treat some registers as "special", or must
have a wide instruction word (or multiple words forming one
instruction) to store the extra wishes of the programmer.

In ARM THUMB mode there are a lot of restrictions in regard to
which single register or combination of registers can be used
with a particular instruction. To profit from all 16 registers
(well, 13 general purpose) you often have to spend an extra
instruction to shuffle the data around.

(Not that the AVR does not have preferred registers either) :-)

And THUMB can't operate on smaller register sizes without
clobbering the data in the higher bits. An LDRB loads to the
lower 8 bits only (it is not possible to load directly to bits
16-23, for example) and even erases bits 8-31.

The C166 family is nicer in this respect. It lets you specify
whether the higher bits should be cleared, sign-extended or
left untouched when you load the lower part (for example MOVB vs
MOVBZ), and lets you specifically address the high byte as a
separate register (rh0 for example is the higher part of r0).

The AVR is 8 bit with few exceptions, the THUMB is 32 bit with
a few more exceptions. The C166 is quite good at 8 _and_ 16 bit
data.

As usual, each architecture has its own advantages.


John Devereux

Mar 16, 2000
On Thu, 16 Mar 2000 14:23:49 +0000, Marc
<ma...@aargh.franken.de> wrote:

>> This is an amazing achievement for a free compiler
>> designed for 32 bit machines! I would have thought
>> that the byte fiddling required for an 8 bit
>> architecture would have made it a relatively poor
>> performer. GCC-PIC anyone? :)
>
>The AVR architecture is very straightforward. I had to implement
>a 32x16 = 48 bit multiplication in assembler once, for example, and
>it was done in less than 10 minutes. There is no "byte fiddling"
>like you know from the PIC. Forget the headaches :-)

I have to confess I have never actually used the
PIC, well suited though it may be to small
applications.

It seemed to me to have similarities to the old
8048 (the worst microprocessor in the whole world)
with which I am only too familiar. I still have
nightmares...

For 8 bitters I tend to use 68HC11 and 8051
derivatives since I have compilers for them, but
will definitely look at AVR + GCC (or Imagecraft
ICC) for new projects.


-- John Devereux

jo...@devereux.demon.co.uk

Marc

Mar 16, 2000
> If anyone can be bothered, I would be very interrested to see the resulting
> assembly from the following C code - it tests several basic features of the
> compiler, especially pointer manipulation.
>
> char Test(char v) {
> char a[10];
> static b[12];
> int i; /* Could also try with
> unsigned char i */
>
> for (i = 0; i < 10; i++) {
> a[i] = b[i + 2];
> };
> };

IAR-C generates these errors:

(623) : Warning[14]: Type specifier missing; assumed "int"
(630) : Warning[33]: Local or formal 'v' was never referenced
(630) : Warning[22]: Non-void function: explicit "return" <expression>; expected
(630) : Error[109]: ';' unexpected
(631) : Error[4]: Unexpected end of file encountered

I changed your code to:

char Test(char v) {
    char a[10];
    static char b[12];
    int i;               /* Could also try with
                            unsigned char i */

    i = (int)v;

    for (i = 0; i < 10; i++) {
        a[i] = b[i + 2];
    }

    return a[0];
}

and IAR-C generated this (max optimization for code-size, not speed):

620
621 char Test(char v) {
\ Test:
\ 00000472 9A93 ST -Y,R25
\ 00000474 8A93 ST -Y,R24
\ 00000476 2A97 SBIW R28,LOW(10)
622 char a[10];
623 static char b[12];
624 int i; /* Could also try with
625 unsigned char i */
626
627 i = (int)v;
628
629 for (i = 0; i < 10; i++) {
\ 00000478 8827 CLR R24
\ 0000047A 9927 CLR R25
\ ?0112:
\ 0000047C E82F MOV R30,R24
\ 0000047E F92F MOV R31,R25
\ 00000480 3A97 SBIW R30,LWRD(10)
\ 00000482 64F4 BRGE ?0111
630 a[i] = b[i + 2];
\ 00000484 E82F MOV R30,R24
\ 00000486 F92F MOV R31,R25
\ 00000488 .... SUBI R30,LOW(-((?0110+2)))
\ 0000048A .... SBCI R31,HIGH(-((?0110+2)))
\ 0000048C 0081 LD R16,Z
\ 0000048E EC2F MOV R30,R28
\ 00000490 FD2F MOV R31,R29
\ 00000492 E80F ADD R30,R24
\ 00000494 F91F ADC R31,R25
\ 00000496 0083 ST Z,R16
\ 00000498 0196 ADIW R24,LWRD(1)
\ 0000049A F0CF RJMP ?0112
\ ?0111:
631 }
632
633 return a[0];
\ 0000049C 0881 LD R16,Y
634 }
\ 0000049E 2A96 ADIW R28,LOW(10)
\ 000004A0 8991 LD R24,Y+
\ 000004A2 9991 LD R25,Y+
\ 000004A4 0895 RET
\ ; i R24-R25
\ ; v R16
635

I've re-coded the algorithm in the manner I usually code when I plan
to port to the AVR (IAR):

char Test(char v) {
    char a[10];
    static char b[12];
    char i;
    char *p1;
    char *p2;

    i = v;

    p1 = a;
    p2 = b+2;

    i = 10; do {
        *p1++ = *p2++;
    } while (--i);

    return a[0];
}

When I started to use IAR-C I had a "training day" where I tried several
C constructs and looked at what they compile to. From looking at the
compiler output I learned to predict its behaviour while coding an
algorithm, and changed my programming style towards good results.

The above code is not the result of specific iterative optimization, but
of such habitual pro-IAR coding. There might still be room for further
"tweaked" C source and better output.


The output of the above is:

621 char Test(char v) {
\ Test:
\ 00000472 ........ CALL ?PROLOGUE4_L09
\ 00000476 2A97 SBIW R28,LOW(10)
\ 00000478 0A93 ST -Y,R16
622 char a[10];
623 static char b[12];
624 char i;
625 char *p1;
626 char *p2;
627
628 i = v;
629
630 p1 = a;
\ 0000047A AC2F MOV R26,R28
\ 0000047C BD2F MOV R27,R29
\ 0000047E 1196 ADIW R26,LWRD(1)
631 p2 = b+2;
\ 00000480 .... LDI R30,LOW((?0110+2))
\ 00000482 .... LDI R31,((?0110+2) >> 8)
632
633 i=10; do {
\ 00000484 0AE0 LDI R16,10
\ ?0113:
634 *p1++ = *p2++;
\ 00000486 1191 LD R17,Z+
\ 00000488 1D93 ST X+,R17
635 } while (--i);
\ 0000048A 0A95 DEC R16
\ 0000048C E1F7 BRNE ?0113
636
637 return a[0];
\ 0000048E 0981 LDD R16,Y+1
638 }
\ 00000490 2B96 ADIW R28,LOW(11)
\ 00000492 E4E0 LDI R30,4
\ 00000494 ........ JMP ?EPILOGUE_B4_L09
\ ; i R16
\ ; p2 R30-R31
\ ; p1 R26-R27
639

(?PROLOGUE and ?EPILOGUE are subroutines that save registers at -Y.
They are one of the main contributors to the code size savings when
optimizing for size rather than speed.)


David Brown

Mar 16, 2000

Marc wrote in message <38D0FB47...@aargh.franken.de>...

>I changed your code to:
>
> char Test(char v) {
>     char a[10];
>     static char b[12];
>     int i;               /* Could also try with
>                             unsigned char i */
>
>     i = (int)v;
>
>     for (i = 0; i < 10; i++) {
>         a[i] = b[i + 2];
>     }
>
>     return a[0];
> }

Sorry about the typos! This was pretty much what I meant, with b as a char
array (although it would also be interesting to see what happened with an
int array). And I had thought to simply return v as the result, but forgot
to add the final line. The i = (int)v line is optimised out by the
compiler (a good sign).

>
>and IAR-C generated this (max optimization for code-size, not speed):
>

> [snip]
>


That is really poor code. The compiler should be able to transform the
given code into something much the same as your modified code. Your
modified code produces an optimal inner loop, but to get that you have
pretty much written the assembly yourself. At a rough count, the original
code is about 3 times the size and time of the second code sample.

I seldom use C compilers for 8-bit micros - I prefer to write things in
assembly normally. One of the reasons is that most C compilers produce such
poor code for 8-bit micros. To get anything resembling the speed or
compactness of assembly code, you need to hand-optimise your C code so much
you might as well write assembler. As far as I understood it, Atmel and IAR
worked together during the development of the AVR, to make a microcontroller
that would work well with C code. Features such as three pointers with
auto-increment modes were added specifically for such code, and yet IAR's C
compiler does not use them unless you pretty much force it to! Am I asking
too much of a compiler, expecting an "optimising compiler" to actually do
some optimisation? Are other compilers just as limited?

Anyway, thank you for generating this code - it was very informative.


