Fixed-point Math help

Las

Dec 12, 2004, 6:00:10 PM

Before I start Googling my brains out I thought I would ask the group:

I'm looking for any information on implementing floating-point algorithms in
fixed-point math.

Thanks.

Tim Wescott

Dec 12, 2004, 6:06:00 PM

You've searched Embedded.com?

If not:

Why? Do you really mean floating point, or do you mean fixed-point
non-integer? If you really mean floating point, what toolset are you
using that doesn't have a perfectly good floating point library to use
-- or are you coding in assembly and not C?

For the most part all the processors I've used in the last 8 years have
had decent floating point support. The exceptions have been the 186
using the Borland tools that required a patch from US Software, and the
196 with the old Intel tools that crashed the ICE any time you attempted
floating point operations.

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Andrew Reilly

Dec 13, 2004, 6:53:00 AM

Don't do that. Understand the problems/algorithms and then implement them
directly in fixed point. Sometimes you really do need to carry the
scale around with every computation, but that's actually pretty unusual.
More usually, you can work with some notion of "full scale" and a
corresponding "noise floor" and just leave it at that. If you try to
translate directly from floating point you're unlikely to really
understand the numeric issues and the resulting code will be inefficient
as a result.

Cheers,

--
Andrew

bungalow_steve

Dec 13, 2004, 11:32:51 AM

try http://www.wwnet.net/~stevelim/fixed.html

The basic idea is to look at the maximum/minimum values of your floating
point inputs, and the maximum/minimum values of each computation, and
move the "decimal" point as needed for each computation to prevent
overflow AND preserve the maximum resolution. If you're using a large
enough data type (32 bits) you may be able to get away with keeping the
"decimal" point fixed throughout the calculations. If you are using a
16-bit processor, many calculations can be done with 16 bits, but
things like integrations will need 32 bits.

bungalow_steve

Dec 13, 2004, 11:45:08 AM

For high volume applications, fixed point is usually chosen over
floating point because of the reduced die size, cost, and power
requirements of the processor. I think 90% of DSP processors sold are
still fixed point for this reason. Emulation of floating point math on
a fixed point processor is usually not an option as the throughput is
reduced by 100x or more. In this case, even if the application still
could run with the reduced throughput, it may still be converted to
fixed point math so that the processor clock could be dropped from say,
40 MHz to 1 MHz (reduced power, lower EMI, etc.). For low volume
applications, it is easier to use floating point.

Paul E. Bennett

Dec 13, 2004, 1:40:02 PM

bungalow_steve wrote:

I rarely need floating point either. Someone else has already raised the
issue of the speed hit when you do a soft floating point calculation on
hardware without floating point support. However, if you really do need to
do it then take a look at some Forth floating point code. That is usually
quite good as a basis (mind you, I am already part way there as I use Forth
for most of my systems anyway. Still, the techniques should be
translatable).

In my current project I am using 48bit intermediaries on a 16 bit machine
in order to avoid delving into floating point calculations. It is still a
speed win and maintains the accuracy I need in the result.
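
The wide-intermediate idea can be sketched in C at a smaller scale: multiply into a double-width value, then divide back down. This is in the spirit of Forth's `*/` word, which keeps a double-cell intermediate product. The function name `muldiv16` and the 16/32-bit widths are assumptions for illustration; the project described above uses 48 bits.

```c
#include <stdint.h>

/* Scale a by the rational b/c without losing precision in the product.
   The 16x16 multiply is carried in 32 bits before the divide.
   c must be nonzero and the caller must ensure the quotient fits in
   16 bits; neither condition is checked here. */
static int16_t muldiv16(int16_t a, int16_t b, int16_t c)
{
    int32_t wide = (int32_t)a * b;   /* double-width intermediate */
    return (int16_t)(wide / c);      /* truncates toward zero */
}
```

For instance, `muldiv16(10000, 355, 113)` scales 10000 by roughly pi and yields 31415, even though the intermediate product 3550000 would not fit in 16 bits.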

--
********************************************************************
Paul E. Bennett ....................<email://peb@a...>
Forth based HIDECS Consultancy .....<http://www.amleth.demon.co.uk/>
Mob: +44 (0)7811-639972 .........NOW AVAILABLE:- HIDECS COURSE......
Tel: +44 (0)1235-811095 .... see http://www.feabhas.com for details.
Going Forth Safely ..... EBA. www.electric-boat-association.org.uk..
********************************************************************

bungalow_steve

Dec 13, 2004, 3:36:47 PM

I'm not sure how good Forth floating point code is, but for the
floating point routines I am using on a 16-bit fixed-point DSP,
this is what I'm getting (clock cycles) for fixed vs floating point
math:

Operation        16-bit fixed   Single-precision float
Addition         1 cycle        122 cycles
Subtraction      1 cycle        124 cycles
Multiplication   1 cycle        109 cycles
Division         16 cycles      361 cycles

this is custom floating point assembly code optimized for the processor
from the manufacturer. So I'm basically getting over 100x performance
boost when using fixed point, really hard to throw away that
improvement, though I still use floating point for non critical and
debug purposes.

Paul Keinanen

Dec 13, 2004, 4:34:59 PM

On 13 Dec 2004 12:36:47 -0800, "bungalow_steve"
<bungalo...@yahoo.com> wrote:

>Addition: 16 bit Fixed 1 cycle, Single Precision Float 122 Cycles
>Subtraction: 1 cycle fixed , 124 cycles float
>Multiplication 1 cycle fixed, 109 cycles float
>Division 16 cycles fixed, 361 cycles float

While I agree that doing floating point addition and subtraction in
software can be quite time consuming due to the denormalisation and
normalisation phases, I really do not understand how the
multiplication can take that long. Basically you just multiply the
mantissa and add the exponents.

This should not take too long, unless the mantissa size is larger than
the integer register size. On a 16 bit integer processor, it would be
sensible to use a floating point format with 8 bit exponent and 16 bit
mantissa.
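
A minimal C sketch of such a multiply, assuming a hypothetical non-IEEE format with a 16-bit unsigned mantissa, an 8-bit exponent and a separate sign. The type and function names are invented for the example, and there is no rounding, overflow or underflow handling.

```c
#include <stdint.h>

/* Hypothetical software float:
   value = (-1)^sign * (mant / 65536) * 2^exp.
   A normalized nonzero value has bit 15 of mant set (0.5 <= fraction < 1). */
typedef struct {
    uint16_t mant;  /* unsigned 16-bit mantissa, no hidden bit */
    int8_t   exp;   /* signed power-of-two exponent */
    uint8_t  sign;  /* 0 = positive, 1 = negative */
} sfloat;

/* Multiply two normalized values: multiply the mantissas, add the
   exponents, then renormalize. Exponent range is not checked. */
static sfloat sfloat_mul(sfloat a, sfloat b)
{
    sfloat r = {0, 0, 0};
    if (a.mant == 0 || b.mant == 0)
        return r;                               /* zero result */

    uint32_t p = (uint32_t)a.mant * b.mant;     /* 32-bit fraction */
    int exp = a.exp + b.exp;

    /* Two fractions in [0.5, 1) give a product in [0.25, 1), so at most
       one left shift is needed to set the top bit again. */
    if ((p & 0x80000000u) == 0) {
        p <<= 1;
        exp -= 1;
    }
    r.mant = (uint16_t)(p >> 16);               /* truncate to 16 bits */
    r.exp  = (int8_t)exp;
    r.sign = a.sign ^ b.sign;
    return r;
}
```

With 1.5 encoded as mantissa 0xC000, exponent 1 and 2.0 as mantissa 0x8000, exponent 2, the product comes out as mantissa 0xC000, exponent 2, which is 3.0. The work is one integer multiply, one add and one conditional shift, which is why a format like this can be far cheaper than full IEEE emulation.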

Paul

Andrew Reilly

Dec 13, 2004, 5:15:12 PM

Those sorts of numbers are almost certainly for IEEE-conformant floating
point emulation. So you have full subroutine call overhead, packing and
unpacking the 32-bit (or 64-bit) IEEE format on a 16-bit DSP that wasn't
necessarily designed for such operations, and then taking care of the
special cases (denorms, NaNs and Infs). That would be likely to be very
ugly on most 16-bit fixed point DSPs.

I don't think that the C standard stipulates IEEE arithmetic yet,
does it? Many users probably expect it, though.

--
Andrew

CBFalconer

Dec 13, 2004, 5:52:20 PM

"Paul E. Bennett" wrote:
>
... snip ...

>
> In my current project I am using 48bit intermediaries on a 16 bit
> machine in order to avoid delving into floating point calculations.
> It is still a speed win and maintains the accuracy I need in the
> result.

Doesn't Forth have some other form for handling many of these
problems? Something like rational fractions is jogging my memory.

--
Chuck F (cbfal...@yahoo.com) (cbfal...@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Paul E. Bennett

Dec 13, 2004, 9:35:12 PM

CBFalconer wrote:

> "Paul E. Bennett" wrote:
>>
> ... snip ...
>>
>> In my current project I am using 48bit intermediaries on a 16 bit
>> machine in order to avoid delving into floating point calculations.
>> It is still a speed win and maintains the accuracy I need in the
>> result.
>
> Doesn't Forth have some other form for handling many of these
> problems, something like rational fractions is jiggling my memory.

That is the thing about Forth. If you need it you either find it in the
Forth Scientific Library or, more likely, you end up rolling your own. My
48bit intermediary is quite necessary to maintain the accuracy up until the
point I do the cube root. It is all back to 16 bits then.

bungalow_steve

Dec 13, 2004, 10:12:07 PM

The floating point functions are IEEE-754 compliant (32 bit not 64
bit), with signed zero, signed infinity, NaN (Not a Number) and
denormal support, and operate in the "round to nearest" mode. I
suppose if I rolled my own floating point format as you suggest the
multiply would be much faster.

Tim Wescott

Dec 14, 2004, 10:16:18 AM

bungalow_steve wrote:

I've used floating point on production code for one of four reasons:

1. For startup, to read parameters out of EEPROM and calculate the
appropriate fixed-point parameters in a way that is easily maintainable.

2. For scientific instruments with complicated math and without a need
for said math to happen quickly.

3. In secondary processes that were not time critical, but where
maintainability was enhanced by using floating point.

4. Because, much to my surprise, floating point on a TI '2812 is only a
few times slower (rather than 100x) than fixed point math.
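
Reason 1 above might look something like the following C sketch: floating point is used once at startup to turn a human-readable parameter into a fixed-point constant, and the time-critical code then uses only the integer. The Q15 target format and the name `float_to_q15` are assumptions for illustration.

```c
#include <stdint.h>

/* Convert a parameter in [-1, 1) to Q15, rounding to nearest and
   saturating at the ends of the 16-bit range. Intended to run once at
   startup, not in the time-critical path. */
static int16_t float_to_q15(double x)
{
    double s = x * 32768.0;
    s += (s >= 0.0) ? 0.5 : -0.5;     /* bias so truncation rounds */
    if (s >= 32767.0)
        return INT16_MAX;
    if (s <= -32768.0)
        return INT16_MIN;
    return (int16_t)s;                /* cast truncates toward zero */
}
```

A filter coefficient stored as 0.25 in EEPROM would become 8192, and a value of 1.0, which Q15 cannot represent, saturates to 32767.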

Tim Wescott

Dec 14, 2004, 1:13:52 PM

bungalow_steve wrote:

>> 4. Because, much to my surprise, floating point on a TI '2812 is only
>> a few times slower (rather than 100x) than fixed point math.
>
> Sorry but there is no way floating point emulation on a TI 2812 is only
> a few times slower then fixed, if so TI wouldn't need to make floating
> point DSP's anymore! Floating point on the 2812 (or any fixed point
> dsp) is about 100 times slower then fixed point.
>
Benchmarks, please?

I _do_ have to fess up that I have only compared it to an integer
package that includes bounds checking, which seriously slows down the
integer computation -- and we didn't use it on that processor for
anything other than what I've already advocated; the "real" computation
happened in fixed point.

dalai lamah

Dec 14, 2004, 2:03:00 PM

Il 14 Dec 2004 10:05:09 -0800, bungalow_steve ha scritto:

>> 4. Because, much to my surprise, floating point on a TI '2812 is only
>> a few times slower (rather than 100x) than fixed point math.
>
> Sorry but there is no way floating point emulation on a TI 2812 is only
> a few times slower then fixed, if so TI wouldn't need to make floating
> point DSP's anymore! Floating point on the 2812 (or any fixed point
> dsp) is about 100 times slower then fixed point.

Maybe it's 100 times slower in the worst case, for example a MAC operation
done in assembly with the MAC-specific hardware or in C with double
precision math. But for "generic" operations, if you compare fixed-point
32-bit C code and float C code, the "few times slower" statement looks very
familiar to me. The same applies with C5400 family.

--
asd

Tim Wescott

Dec 14, 2004, 2:41:44 PM

dalai lamah wrote:

This is my experience -- and I failed to point out that I was comparing
MAC-less integer arithmetic to floating point. With a MAC, of course,
integer arithmetic is way faster.

Tim Wescott

Dec 14, 2004, 8:31:33 PM

bungalow_steve wrote:

> TI doesn't publish floating point emulation specs for its fixed point
> processors, that I know of anyhow, it depends on the complier, If you
> have a C complier just write a simple fixed point rountine and compare
> it to a emulated floating point
>
> here is benchmarks for a microchip 16 bit dsp (single cycle integer
> multiplys/adds) that I posted earlier, probably similar in performance
> to a 16 bit TI chip
> http://ww1.microchip.com/downloads/en/DeviceDoc/51459b.pdf
>
No, I'm asking _you_ for the benchmarks that _you_ are using to back up
your ever so strongly voiced opinions.

_I_ know how fast the damn processor is -- when I benchmarked it against
my fixed point math package I nearly fell out of my chair. Its ratio
of floating point math vs. fixed point math is between 20x and 50x
better than a Pentium.

With a fixed point math package that does 1r15 arithmetic, in ANSI C,
with saturation, a Pentium is about 20-50 times faster than it is with
floating point. The '2812 runs neck and neck. It certainly doesn't do
floating point as fast as pure integer math, but it certainly _does_
knock the socks off of anything else I've had occasion to use.

Frankly I would have responded as you did if I hadn't done the
experiment myself.

So, upon what benchmarks are you basing your claim?

Jonathan Kirwan

Dec 15, 2004, 2:22:05 AM

On 14 Dec 2004 20:51:40 -0800, "bungalow_steve" <bungalo...@yahoo.com>
wrote:

>My
>personal experience with Analog Devices fixed point DSP's indicates two
>orders of magnitude difference between fixed point and emulated
>floating point, but I just thought you would be more interested in a
>published benchmark that backs up my claim, thats all.

I tend to agree that one shouldn't rely on P-II comparisons, for example, as a
means of comparing integer vs floating point performance when discussing the
general turf of embedded processors. Nor should one compare apples and oranges.

The OP was asking about how to IMPLEMENT floating point routines using fixed
point, by the way, and this branch here has decidedly moved away from anything
helpful there -- though perhaps still interesting. I'm not sure how Tim's segue
comment addressed this (seems to me it was arguably on topic to suggest
searching google, but otherwise boils down to telling someone that they don't
need to know how to implement floating point because the support is already
there 'so why ask' at all...)

I agree with your comment, "fixed point is usually chosen over floating point
because of the reduced die size, cost, and power requirements of the processor."

In our case, cost was certainly a consideration though not a large one (at
first.) However, power requirements and dissipation were vital issues for us as
well as getting the necessary processing done on time (of course.) I don't
think die size directly mattered, though I'm sure that an excessively large
package would have been a problem then. Package size *is* more important for
some applications I work on, so that puts a more direct pressure on the die, of
course.

I cannot agree with your rejoinder, though, that "Emulation of floating point
math on a fixed point processor is usually not an option as the throughput is
reduced by 100x or more." First off, my very first application on the ADSP-21xx
from Analog Devices dealt with values that are important to maintain over a very
wide dynamic range. At least 16 bits of precision had to be maintained for some
6-8 orders of magnitude, for example. Some kind of floating point features were
essential, even while using a very cool running DSP like these were at the time.
(What kept us from using TI integer DSPs at the time was a different issue that
was inescapable with their parts and required for the application.) So floating
point on a fixed point processor wasn't only an option, it was vital.

One of the things that is glossed over in your comment here is that floating
point processing doesn't have to be used 24/7 by the application. If it were
always the case that the CPU was bottlenecked doing floating point continually,
then yes -- for a given performance level you'd probably be better off with a
DSP supporting floating point if you needed floating point in that way, rather
than using a super-high speed integer processor and emulating it at a similar
rate. But if what you need is modest bursts of floating point operations as
well as very low power requirements and low cost, etc., and when you could well
use the boost of a MAC or fully combinatorial barrel shifter to help it along,
then an integer DSP is probably quite reasonable. The price of adding floating
point in hardware is usually a continuous drain and excessive power consumption,
if you don't need it all the time, and that adds unnecessary cost both to the
processor and all of the surrounding circuitry and dissipation support required.
Further, if your application requires a wide dynamic range, some kind of
floating point support remains.

So, floating point is not only an option on integer processors.. sometimes, it
is a requirement for them.

But you've made me slightly curious. I use ADSP-21xx integer DSPs routinely
(not moved up to BlackFin, though) and my experience using the barrel shifter
with integer operations for floating point purposes hasn't been as bad as 100:1
versus fixed point for operations that reasonably might be considered similar in
precision (but not in dynamic range, of course.) But I write my own code and do
NOT use libraries nor do I use C, and I use the full capability of packing
instructions. Can you be precise about what you are comparing here so I can
consider some specific cases just for my own sake?

Jon

Jonathan Kirwan

Dec 15, 2004, 2:40:21 AM

On Sun, 12 Dec 2004 18:00:10 -0500, "Las" <e.f...@rainbow.net> wrote:

>I'm looking for any information to on implementing float-point algorithms in
>fixed-point math.

No one has directly addressed this, I think. Probably because there is a lot of
truth in the idea that "integer ADC --> processing --> integer DAC" means you
shouldn't insert FP into the "processing" block if you can avoid it. Most
applications just don't have the wide dynamic range that floating point supports
and there are "gotchas" with using floating point that require care to use well.
And why add it, if everything coming in is integer and everything going out is
integer? Just stay in integer... if you can.

But if you are really interested in learning how, Analog Devices has a book on
implementing floating point with their ADSP-21xx processors that I believe may
be able to be downloaded. You can also examine the floating point formats
commonly used, along with a tedious description about special cases, in Intel's
documentation on their processors, available from Intel's web site. Some of the
older DEC manuals included details on implementing floating point, too. (In the
earlier days, teaching programmers about floating point details was important as
it was a required skill for everyday programmers. That's far less true, today.)

I don't have a convenient web site in mind, but Tim Wescott's suggestion of
using google is probably a good one. Use "floating point" and "normalization"
and "denormalization" and "exponent" and "mantissa" and "hidden bit" and perhaps
the four common operations to help track something down. This is more your 'due
diligence,' until you've done this yourself and can explain why it's not getting
you there.

The basic idea is that you have an exponent (signed, twos-complement) and an
unsigned mantissa (with a possible hidden bit for non-zero values) and a
separate sign bit. These can be packed in any format you like or is convenient
to you. Each of these is an integer. There is no explicit radix (decimal)
point, but it is usually assumed for the mantissa at any convenient place and
the exponent then adjusts this, left or right, for - or + values of the
exponent. The mantissa is usually stored 'normalized' which means that it is
shifted until the leading bit is always a '1' (which is always possible unless
the value is actually zero, but that is easily detected.) Some formats simply
throw away the leading bit, because it is always '1', and put it back when
needed in order to add one apparent bit of precision.
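
As a paper-exercise aid, here is a minimal C sketch of packing and unpacking with a hidden bit. The 24-bit layout, the reserved exponent for zero and the function names are all assumptions made up for the example; they do not correspond to IEEE 754 or to any particular library.

```c
#include <stdint.h>

#define EXP_ZERO (-128)   /* assumption: this exponent is reserved to mean 0 */

/* Pack into 24 bits: [23] sign, [22:15] two's-complement exponent,
   [14:0] mantissa with its leading 1 removed (the hidden bit).
   mant must be normalized (bit 15 set) or zero. */
static uint32_t pack_hidden(unsigned sign, int exp, uint16_t mant)
{
    if (mant == 0)
        return (uint32_t)(uint8_t)EXP_ZERO << 15;
    return ((uint32_t)(sign & 1u) << 23)
         | ((uint32_t)(uint8_t)exp << 15)
         | (uint32_t)(mant & 0x7FFFu);
}

/* Unpack, restoring the hidden leading 1 for nonzero values. */
static void unpack_hidden(uint32_t w, unsigned *sign, int *exp, uint16_t *mant)
{
    int e = (int)((w >> 15) & 0xFFu);
    if (e >= 128)
        e -= 256;                       /* sign-extend the 8-bit exponent */
    *sign = (unsigned)((w >> 23) & 1u);
    *exp  = e;
    *mant = (e == EXP_ZERO) ? 0 : (uint16_t)(0x8000u | (w & 0x7FFFu));
}
```

Because the leading 1 is never stored, 15 stored bits carry a 16-bit mantissa. The cost is that zero has no normalized form and needs a special encoding, here the reserved exponent.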

The rest is just software. Try a paper exercise and see where it takes you.
That's a good start, if you plan to try and implement something yourself.
Another choice would be to examine library code -- again, search google.

Jon

Andrew Reilly

Dec 15, 2004, 3:44:03 AM

On Wed, 15 Dec 2004 08:40:21 +0000, Jonathan Kirwan wrote:
> On Sun, 12 Dec 2004 18:00:10 -0500, "Las" <e.f...@rainbow.net> wrote:
>
>>I'm looking for any information to on implementing float-point algorithms in
>>fixed-point math.

> The basic idea is that you have an exponent (signed, twos-complement)
> and an unsigned mantissa (with a possible hidden bit for non-zero
> values) and a separate sign bit. These can be packed in any format you
> like or is convenient to you.

If you're doing floating point on a fixed point DSP, for dynamic range
reasons, and you have no particular reason to comply with IEEE floating
point formats, why would you bother with an unsigned mantissa or implied
leading bit? Is it because you knew that you absolutely needed that extra
bit of precision? One can go a long way with a simple two-word format:
mantissa and exponent, with nothing special about the mantissa, so that
the chip's normal signed multiplies and adds work fine. (I never used it,
but I believe that the Motorola C compiler for the 56000 series used this
format. At least, one of the debuggers knew how to display memory blocks
in that format...)

Many (most?) DSP processors have "normalize" or "count leading
zeros/leading ones" instructions too, which makes the
normalization/alignment process a bit of a slam-dunk.
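
A minimal C sketch of the two-word idea, with normalization and the alignment step of an add. The struct layout and names are assumptions for illustration, not the Motorola format. The loop in `redundant_sign_bits` stands in for the single-cycle instruction a DSP would provide, and the code assumes right shift of a negative value is arithmetic.

```c
#include <stdint.h>

/* Two-word format: value = mant * 2^exp, with an ordinary signed
   16-bit mantissa so normal signed multiplies and adds apply. */
typedef struct {
    int16_t mant;
    int16_t exp;
} tfloat;

/* How far the mantissa can be shifted left without overflowing. */
static int redundant_sign_bits(int16_t m)
{
    uint16_t u = (uint16_t)(m < 0 ? ~m : m);   /* fold negatives */
    int n = 0;
    while (n < 15 && (u & 0x4000u) == 0) {
        u <<= 1;
        n++;
    }
    return n;
}

static tfloat tfloat_normalize(tfloat x)
{
    if (x.mant == 0) {
        x.exp = 0;
        return x;
    }
    int n = redundant_sign_bits(x.mant);
    x.mant = (int16_t)((int32_t)x.mant * ((int32_t)1 << n));
    x.exp  = (int16_t)(x.exp - n);
    return x;
}

/* Add: align the operand with the smaller exponent, add in 32 bits,
   absorb a carry if there is one, then renormalize.
   Assumes >> on negative values is an arithmetic shift. */
static tfloat tfloat_add(tfloat a, tfloat b)
{
    if (a.mant == 0) return b;
    if (b.mant == 0) return a;
    if (a.exp < b.exp) { tfloat t = a; a = b; b = t; }

    int shift = a.exp - b.exp;
    int32_t bm = (shift > 15) ? 0 : (b.mant >> shift);
    int32_t sum = (int32_t)a.mant + bm;

    tfloat r;
    r.exp = a.exp;
    if (sum > INT16_MAX || sum < INT16_MIN) {
        sum >>= 1;
        r.exp++;
    }
    r.mant = (int16_t)sum;
    return tfloat_normalize(r);
}
```

Normalizing the integer 3 gives mantissa 24576 with exponent -13, and adding a normalized 1 to it gives mantissa 16384 with exponent -12, which is 4.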

> Each of these is an integer. There is no
> explicit radix (decimal) point, but it is usually assumed for the
> mantissa at any convenient place and the exponent then adjusts this,
> left or right, for - or + values of the exponent. The mantissa is
> usually stored 'normalized' which means that it is shifted until the
> leading bit is always a '1' (which is always possible unless the value
> is actually zero, but that is easily detected.) Some formats simply
> throw away the leading bit, because it is always '1', and put it back
> when needed in order to add one apparent bit of precision.
>
> The rest is just software. Try a paper exercise and see where it takes
> you. That's a good start, if you plan to try and implement something
> yourself. Another choice would be to examine library code -- again,
> search google.

There are some fairly good introductions on the web (some in pdf, from
memory), but I'm afraid that I don't have them handy. The suggestions to
google are good ones.

Here are a couple of other random suggestions:

If your need for floating point (for dynamic range reasons) is on the
real-time critical path, so it has to be time/power efficient, you can
often get away with what's known as "block floating point". That is, a
collection of calculations, (the passes of an FFT, for example) might
usefully share a single exponent. That doesn't give you quite as even a
dynamic range/precision trade-off as conventional floating point, but it
makes the bulk of the work look more like fixed point, while still having
some of the dynamic range advantage.
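
Block floating point can be sketched in C roughly as follows: a block of wide intermediate results is scaled down to 16-bit mantissas that share one exponent. The function name `bfp_pack` and the 32-to-16-bit widths are assumptions for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Scale a block of 32-bit results to 16-bit mantissas sharing a single
   exponent, so that out[i] * 2^exp approximates in[i].
   Returns the shared exponent. */
static int bfp_pack(const int32_t *in, int16_t *out, size_t n)
{
    /* Find the largest magnitude in the block. */
    uint32_t peak = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t mag = (in[i] < 0) ? (uint32_t)0 - (uint32_t)in[i]
                                   : (uint32_t)in[i];
        if (mag > peak)
            peak = mag;
    }

    /* Smallest shift that brings the peak into the 16-bit range. */
    int exp = 0;
    while ((peak >> exp) > 32767u)
        exp++;

    /* Apply the same scaling to every element (truncating divide). */
    for (size_t i = 0; i < n; i++)
        out[i] = (int16_t)(in[i] / ((int32_t)1 << exp));
    return exp;
}
```

The inner arithmetic between rescaling points stays plain fixed point. Only the peak search and the shared shift, done once per block (for example once per FFT pass), carry the floating-point flavor. Small values in a block with a large peak lose precision, which is the uneven trade-off mentioned above.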

One application area that I am familiar with that requires vast dynamic
range is anything that does pattern matching with hidden-markov models (or
similar). Most of the fixed-point DSP implementations of these algorithms
meet the precision/range trade-off by performing the arithmetic in the log
domain. This requires log() and exp() functions to get in and out, but
the win can be large if a large amount of processing has to take place in
between. [The use of log arithmetic also helps to explain a virtue of
Viterbi searches, as opposed to forward/backward or the like: additions
are replaced by maximums.]
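
A minimal C sketch of one Viterbi step in the log domain. Probabilities are assumed to be stored as integer log-probabilities in some fixed scale, so products become additions and the sum over predecessor paths is replaced by a maximum. The model size, names and scaling are hypothetical.

```c
#include <stdint.h>

#define NSTATES 3   /* hypothetical model size */

/* One Viterbi step with fixed-point log-probabilities:
   next[j] = max over i of (prev[i] + log_trans[i][j]) + log_emit[j].
   Only integer adds and compares are needed. */
static void viterbi_step(const int32_t prev[NSTATES],
                         const int32_t log_trans[NSTATES][NSTATES],
                         const int32_t log_emit[NSTATES],
                         int32_t next[NSTATES])
{
    for (int j = 0; j < NSTATES; j++) {
        int32_t best = prev[0] + log_trans[0][j];
        for (int i = 1; i < NSTATES; i++) {
            int32_t s = prev[i] + log_trans[i][j];
            if (s > best)
                best = s;
        }
        next[j] = best + log_emit[j];
    }
}
```

A forward/backward pass would need a true log-domain addition in place of the maximum, which is more expensive in fixed point.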

Hope some of these rambles help. Or at least offer some more search terms
to help narrow down google's focus.

Cheers,

--
Andrew

Jonathan Kirwan

Dec 15, 2004, 4:04:42 AM

On Wed, 15 Dec 2004 19:44:03 +1100, Andrew Reilly
<andrew-...@areilly.bpc-users.org> wrote:

>If you're doing floating point on a fixed point DSP, for dynamic range
>reasons, and you have no particular reason to comply with IEEE floating
>point formats, why would you bother with an unsigned mantissa or implied
>leading bit?

><snip>

Actually, I don't use hidden bit notation, at all. Everything explicit. The
mantissas are signed (or unsigned) as needed, and I don't use a separate sign
bit in some other field. I was talking generally at that point and hinting at
common format standards.

>Many (most?) DSP processors have "normalize" or "count leading
>zeros/leading ones" instructions too, which makes the
>normalization/alignment process a bit of a slam-dunk.

Yup. On the ADSP-21xx processors I referenced there is a fully combinatorial
32-bit barrel shifter with the ability to find the leading '1' in a single cycle
and report the required shift.

Jon

CBFalconer

Dec 15, 2004, 10:51:38 AM

Jonathan Kirwan wrote:
>
... snip ...

>
> I don't have a convenient web site in mind, but Tim Wescott's
> suggestion of using google is probably a good one. Use "floating
> point" and "normalization" and "denormalization" and "exponent"
> and "mantissa" and "hidden bit" and perhaps the four common
> operations to help track something down. This is more your 'do
> diligence,' until you've done this yourself and can explain why
> it's not getting you there.
>
> The basic idea is that you have an exponent (signed,
> twos-complement) and an unsigned mantissa (with a possible
> hidden bit for non-zero values) and a separate sign bit. These
> can be packed in any format you like or is convenient to you.
> Each of these is an integer. There is no explicit radix (decimal)
> point, but it is usually assumed for the mantissa at any convenient
> place and the exponent then adjusts this, left or right, for - or +
> values of the exponent. The mantissa is usually stored 'normalized'
> which means that it is shifted until the leading bit is always a '1'
> (which is always possible unless the value is actually zero, but
> that is easily detected.) Some formats simply throw away the
> leading bit, because it is always '1', and put it back when needed
> in order to add one apparent bit of precision.
>
> The rest is just software. Try a paper exercise and see where it
> takes you. That's a good start, if you plan to try and implement
> something yourself. Another choice would be to examine library
> code -- again, search google.

You can find a complete example in the Dr. Dobbs Journal archives.
I published a complete system for the 8080 there about 25 years
ago. Its purpose was to supply dynamic range, and used a 16 bit
significand with an 8 bit exponent. The result was much faster
than anything else available at the time, because it could all be
done in registers, and in addition was re-entrant. The system
included i/o procedures, transcendentals, etc. and had
over/underflow detection. The system underwent minor revisions and
continuous use in the ten years or so since publication, and
processed the majority of tests in a 1000 bed hospital for much
longer. I.E. it was reliable and accurate.

Tim Wescott

Dec 15, 2004, 11:10:24 AM

bungalow_steve wrote:
>
>
> The topic at hand is relative performance between emulated floating
> point vs fixed point math executed on the same processor, not sure why
> your talking about Pentiums vs whatever, that totally irrelevant to the
> discussion. I can only guess you didn't understand the .pdf file I
> referenced, or we are talking about two different topics. The reference
> says that a floating point add takes 122 cycles, vs 1 cycle for fixed
> point. This is one example of the 100 to 1 ratio I'm talking about. My

> personal experience with Analog Devices fixed point DSP's indicates two
> orders of magnitude difference between fixed point and emulated
> floating point, but I just thought you would be more interested in a
> published benchmark that backs up my claim, thats all.
>

You misunderstand. Please actually read my posts before you argue with
things I did not say.

That's _your_ topic at hand, and I understand you; for the most part I
agree with you. If you read my retraction, where I remembered (and
fessed up) that I was comparing floating point with a whole integer
package that does slow things down, you'd realize that.

_My_ topic is that however it's done the '2812 in specific is _very_
good at emulated floating point -- probably not 1:1, but I believe it's
way better than 100:1. That's why I didn't waste my time reading the
paper about the dsPIC (unless you're trying to point out that its
relative performance is as good as the '2812? Do you have benchmarks?).

I quoted the speedup (or lack of slowdown) between the '2812 and the
Pentium because the Pentium has your floating point hardware AND IT IS
SLOWER than the '2812.

So far you've quoted the ADI part and the Microchip part, but you
haven't addressed _my_ topic, which is that the '2812 IN SPECIFIC has
better floating point vs. integer performance than anything else I've
personally worked with -- including the Pentium, which has floating
point hardware and should blow it away.

Tim Wescott

Dec 15, 2004, 11:18:54 AM

Jonathan Kirwan wrote:

> On 14 Dec 2004 20:51:40 -0800, "bungalow_steve" <bungalo...@yahoo.com>
> wrote:
>
>
>>My
>>personal experience with Analog Devices fixed point DSP's indicates two
>>orders of magnitude difference between fixed point and emulated
>>floating point, but I just thought you would be more interested in a
>>published benchmark that backs up my claim, thats all.
>
>
> I tend to agree that one shouldn't rely on P-II comparisons, for example, as a
> means of comparing integer vs floating point performance when discussing the
> general turf of embedded processors. Nor should one compare apples and oranges.
>
> The OP was asking about how to IMPLEMENT floating point routines using fixed
> point, by the way, and this branch here is decidedly moved away from anything
> helpful there -- though perhaps still interesting. I'm not sure how Tim's segue
> comment addressed this (seems to me it was arguably on topic to suggest
> searching google, but otherwise boils down to telling someone that they don't
> need to know how to implement floating point because the support is already
> there 'so why ask' at all...)

Yes, we've floated way off the original topic. My comment wasn't that
the OP shouldn't know how floating point was implemented, it was that
the OP could probably save loads of time using a package that someone
else has written and (hopefully at least) debugged. Before he/she goes
haring off writing a bunch of code he/she should make sure that it isn't
already sitting there in his/her toolkit.

If the OP is coding in assembly then the above comment is probably much
less relevant -- but if the OP is coding in C then he/she should
evaluate his/her toolsets for their floating-point capability.

Spehro Pefhany

Dec 15, 2004, 11:32:19 AM

On 15 Dec 2004 08:23:09 -0800, the renowned "bungalow_steve"
<bungalo...@yahoo.com> wrote:

>> And why add it, if everything coming in is integer and everything
>> going out is integer? Just stay in integer... if you can.
>

>The problem is what if you developed/tested all your floating point
>control laws code on a pc based simulator (e.g., Matlab/Simulink) and
>now want to put that code between an A/D and D/A in an embedded
>product. Easier just to drop it into a processor that supports the
>identical floating point format as the simulator then to rewrite it in
>fixed point math.

Cheap, fast(easy), good: pick any two.

Best regards,
Spehro Pefhany
--
"it's the network..." "The Journey is the reward"
sp...@interlog.com Info for manufacturers: http://www.trexon.com
Embedded software/hardware/analog Info for designers: http://www.speff.com

Tim Wescott

Dec 15, 2004, 12:33:13 PM

bungalow_steve wrote:

>> And why add it, if everything coming in is integer and everything
>> going out is integer? Just stay in integer... if you can.
>
> The problem is what if you developed/tested all your floating point
> control laws code on a pc based simulator (e.g., Matlab/Simulink) and
> now want to put that code between an A/D and D/A in an embedded
> product. Easier just to drop it into a processor that supports the
> identical floating point format as the simulator then to rewrite it in
> fixed point math.
>

For light DSP on conventional processors I usually end up writing a
little fixed-point math package that (a) allows multiplication as
fractional numbers and (b) automatically saturates additions and
subtractions. This is usually a much better fit for implementing
filters and the like, yet is much faster than floating point on most
machines (including the Pentium, oddly enough).

It's easy enough to simulate the effects of the fixed-point math in
Simulink or whatnot by the judicious use of quantization. If you get
ambitious enough you can even build a filter library in C or C++, with a
matching library of blocks in your simulation program.
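
A minimal sketch of what such a package might look like, assuming a 16-bit Q15 fractional format (an int16_t n stands for n/32768). The type and function names are made up for illustration and are not taken from any particular library. It shows the two features named above: fractional multiplication and additions/subtractions that saturate instead of wrapping.

```c
#include <stdint.h>

/* Q15 fractional value: an int16_t n represents n / 32768, range [-1, 1). */
typedef int16_t q15_t;

/* Clamp a wider intermediate result into the Q15 range. */
static q15_t q15_saturate(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (q15_t)x;
}

/* Saturating add: clips at full scale instead of wrapping around. */
q15_t q15_add(q15_t a, q15_t b)
{
    return q15_saturate((int32_t)a + (int32_t)b);
}

/* Saturating subtract. */
q15_t q15_sub(q15_t a, q15_t b)
{
    return q15_saturate((int32_t)a - (int32_t)b);
}

/* Fractional multiply: 16x16 -> 32-bit product, rounded back to Q15.
   Assumes >> on a negative int32_t is an arithmetic shift, which is
   implementation-defined in C but true on common compilers. */
q15_t q15_mul(q15_t a, q15_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;   /* Q30 product */
    p += 1 << 14;                          /* round to nearest */
    return q15_saturate(p >> 15);          /* (-1) * (-1) clips just below +1 */
}
```

With this format q15_mul(16384, 16384) is 0.5 * 0.5 and returns 8192 (0.25), while q15_add(30000, 10000) clips to 32767 rather than wrapping to a negative value.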

Jonathan Kirwan

unread,

Dec 15, 2004, 1:38:29 PM12/15/04

to

On Wed, 15 Dec 2004 15:51:38 GMT, CBFalconer <cbfal...@yahoo.com> wrote:

>You can find a complete example in the Dr. Dobbs Journal archives.
>I published a complete system for the 8080 there about 25 years
>ago. Its purpose was to supply dynamic range, and used a 16 bit
>significand with an 8 bit exponent. The result was much faster
>than anything else available at the time, because it could all be
>done in registers, and in addition was re-entrant. The system
>included i/o procedures, transcendentals, etc. and had
>over/underflow detection. The system underwent minor revisions and
>continuous use in the ten years or so since publication, and
>processed the majority of tests in a 1000 bed hospital for much
>longer. I.E. it was reliable and accurate.

It would be wonderful if that were on the web somewhere or in a .DOC from a site
you have. Do you have separate rights to it? Or can you modify it and regain
the right to make it publicly available?

Jon

Jonathan Kirwan

unread,

Dec 15, 2004, 2:44:56 PM12/15/04

to

On 15 Dec 2004 08:23:09 -0800, "bungalow_steve" <bungalo...@yahoo.com>
wrote:

>> And why add it, if everything coming in is integer and everything
>> going out is
>> integer? Just stay in integer... if you can.
>

>The problem is what if you developed/tested all your floating point
>control laws code on a pc based simulator (e.g., Matlab/Simulink) and
>now want to put that code between an A/D and D/A in an embedded
>product. Easier just to drop it into a processor that supports the
>identical floating point format as the simulator than to rewrite it in
>fixed point math.

There's nothing wrong in capitalizing on your prior efforts. If you've already
tested your algorithms extensively using floating point *and* if you will be
using a floating point package that EXACTLY duplicates the behavior you tested
on the PC then you are in luck and it makes sense. But your comment seems just
a little cavalier to me, so I'll expand on my point.

What many people do NOT realize is that floating point operation behavior
introduces unexpected (to some) issues, such as the fact that A*(B-C) is not the
same as A*B-A*C in floating point while it IS the same in integer domains.
Floating point is ALSO not the same in behavior from one implementation to
another, many implementations introducing unexpected behaviors in the lower
order bits that differ widely; including non-monotonicities in various functions
of various sizes, different support for features, and hidden rounding controls
that either aren't well documented and/or aren't even properly initialized at
startup and will have to be tracked down in the target.
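
A small sketch of the A*(B-C) versus A*B-A*C point, assuming IEEE 754 double precision with round-to-nearest-even. The function names are illustrative only. The volatile temporaries are there so that each product is rounded to double before the subtraction, even on a compiler that would otherwise fuse the operations or keep extra precision.

```c
#include <stdint.h>

/* a*(b-c) in double precision. */
double fp_factored(double a, double b, double c)
{
    return a * (b - c);
}

/* a*b - a*c in double precision. Each product is rounded to double
   before the subtraction takes place. */
double fp_expanded(double a, double b, double c)
{
    volatile double ab = a * b;
    volatile double ac = a * c;
    return ab - ac;
}

/* The same two expressions in 64-bit integer arithmetic. */
int64_t int_factored(int64_t a, int64_t b, int64_t c) { return a * (b - c); }
int64_t int_expanded(int64_t a, int64_t b, int64_t c) { return a * b - a * c; }
```

Take a = 3, b = 9007199254740994 (2^53 + 2) and c = 9007199254740992 (2^53). The factored floating point form gives 6, but the expanded form gives 8, because 3*b falls exactly halfway between two representable doubles and is rounded up. Both integer forms give 6.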

Just supporting an "identical floating point format" is often enough NOT CLOSE
to supporting identical behaviors in operating on those formats. And though it
may look good at first blush, you may discover problems well after you've
started to ship product. And then it can become very expensive to remedy.

I spend a LOT of time carefully studying any floating point software package
before I dare to rely on it and I use what I learn about its behavior in my
design analysis for the algorithms. This extra work is one reason I often avoid
floating point entirely. It's hard to do and it takes time to verify the impact
upon equations and to assure one's self through theory that the bounds of the
errors are in the worst case entirely acceptable.

I can't say this about anyone here in particular, but it has been the result of
my long experience in this that programmers I've been exposed to simply do not
have the training (self- or otherwise) to do the analysis carefully, are almost
completely ignorant of the pitfalls in using floating point and are blindly
cavalier and overconfident about its safety, and that they instead depend on
"random" testing of the entire system to assure themselves that things are okay.

Of course, sometimes it is a trivial result that the use of the floating point
is entirely safe -- for example, say, in converting an integer fixed point
internal Fahrenheit value into a rounded/truncated Celsius value for display
(there are all-integer ways, but FP is ... convenient and readable.) But quite
often equations are developed and handed over to programmers by physicists or
engineers who are NOT cognizant of FP "gotchas" and implemented by programmers
who are also NOT cognizant of them and things "slip through the cracks" between
those wielding equations in a perfect mathematical world and those converting
them to coded sequential operations in an embedded processor.
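
For the Fahrenheit example, one of the all-integer ways might look like the sketch below. It assumes the internal value is held in tenths of a degree, which is an illustrative choice and not something stated above; the function name is made up as well.

```c
#include <stdint.h>

/* Convert tenths of a degree Fahrenheit to tenths of a degree Celsius,
   rounded to nearest, using only integer arithmetic.
   C = (F - 32) * 5 / 9, and 32.0 F is 320 in tenths. */
int32_t f_tenths_to_c_tenths(int32_t f_tenths)
{
    int32_t num = (f_tenths - 320) * 5;   /* numerator, still in tenths */
    int32_t half = (num >= 0) ? 4 : -4;   /* half the divisor, same sign as num */
    return (num + half) / 9;              /* C99 division truncates toward zero */
}
```

For instance 98.6 F (986) comes out as 370, that is 37.0 C, and -40.0 F (-400) comes out as -400.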

So I tend to be one of those suggesting that unless you are competent at FP that
you stay away from it. And if your application requires a wide dynamic range
and you need to maintain a similar precision throughout, then by all means use
floating point but think carefully about its use. I consider getting the
algorithms working well on a PC a reasonably good idea to ferret out and
eliminate certain kinds of important errors. But I also do NOT think, assuming
that the same format (and having the same number of bits does NOT mean the 'same
format') is used, that it is then a necessary consequence that all is right with
the world. There are just too many vagaries to contend with and in the floating
point domain, they can rise up and bite you where you least expect it if you
haven't been bothered to think through them to eliminate their threats.

(1) if your application isn't floating-point heavy and most of it is in dealing
with the external world, timing, stuff like that... then perhaps the risks of
being ignorant are lower. But still, (a) it probably isn't really that much
work to just avoid the floating point altogether and (b) your foot print in
memory will benefit from not linking FP code, too, and (c) any testing you did
on the PC for those floating point operations should really be just duplicated
on the target where you *know* it applies (and it won't be so difficult to do,
since they are a small part of the overall application);

(2) if your application is floating-point heavy, then some difference in
handling is just that much more likely to accumulate into a
serious problem. For all that testing you did on the PC, which was decidedly
convenient and important to do, it still cannot be assumed to operate exactly
the same on your target (unless the target is a Pentium, I suppose.) And you'll
need to either carefully think through the differences or else duplicate the
work, perhaps.

Floating point is one of those very convenient Ginsu knives that you see doing
such wonderful things in the hands of a skilled practitioner. But in the hands
of someone who is unskilled and cavalier and ignorant of its dangers, while it
will often do what is expected and thus feed overconfidence in its use, it's
also much more likely to cut off one's finger as chop carrots.

Of course, time to market, etc., will impact choices and risks taken. Your
mileage may vary, etc. Just a word to the wise, is all.

Jon

bungalow_steve

unread,

Dec 14, 2004, 1:05:09 PM12/14/04

to

> 4. Because, much to my surprise, floating point on a TI '2812 is only a
> few times slower (rather than 100x) than fixed point math.
>

Sorry but there is no way floating point emulation on a TI 2812 is only

bungalow_steve

unread,

Dec 14, 2004, 7:53:07 PM12/14/04

to

bungalow_steve

unread,

Dec 14, 2004, 7:45:12 PM12/14/04

to

No, I'm talking about a simple add being 100 times slower. You're saying
floating point is a "few times slower" than fixed point. Ok, I assume a
C5400 performs a 16 bit add in 1 cycle, so you're saying in 2 to 3 cycles
(i.e., few times slower) it can perform the overhead of a subroutine
call, denormalize/normalize and take care of all the special conditions
and return a 32 bit result? Sorry, I can't see it. Do you have an
assembly listing of a C5400 floating point add routine?

bungalow_steve

unread,

Dec 14, 2004, 11:51:40 PM12/14/04

to

The topic at hand is relative performance between emulated floating
point vs fixed point math executed on the same processor, not sure why
you're talking about Pentiums vs whatever, that's totally irrelevant to the
discussion. I can only guess you didn't understand the .pdf file I
referenced, or we are talking about two different topics. The reference
says that a floating point add takes 122 cycles, vs 1 cycle for fixed
point. This is one example of the 100 to 1 ratio I'm talking about. My

bungalow_steve

unread,

Dec 15, 2004, 12:00:36 PM12/15/04

to

I think of it more as a nonrecurring vs recurring expense tradeoff:
rewrite the code in fixed point and save on recurring costs; don't
rewrite and save on nonrecurring costs. Old business problem.

bungalow_steve

unread,

Dec 15, 2004, 11:23:09 AM12/15/04

to

> And why add it, if everything coming in is integer and everything
> going out is integer? Just stay in integer... if you can.

The problem is what if you developed/tested all your floating point

Nicholas O. Lindan

unread,

Dec 15, 2004, 4:29:04 PM12/15/04

to

"CBFalconer" <cbfal...@yahoo.com> wrote

> I published a [floating point package for the 8080 in DDJoCC&O] Its purpose
> was to supply dynamic range, and used a 16 bit significand with an 8 bit
> exponent. ... processed the majority of tests in a 1000 bed hospital...

I found a lot of older (and some current) medical applications use
floating point BCD for results calculation. Data was processed
serially a nibble/digit at a time, as in a 4-bit calculator CPU.

The math-pac was always a home brew and buggy - but then in consulting
all you get to see are other folks' bugs. Nobody hires a consultant to
come in and fix a success (although if I did government work I am sure
that would change).

I have never had a client give a rational reason for using BCD.
Lots of paranoia, but nothing rational.

--
Nicholas O. Lindan, Cleveland, Ohio
Consulting Engineer: Electronics; Informatics; Photonics.
Remove spaces etc. to reply: n o lindan at net com dot com
psst.. want to buy an f-stop timer? nolindan.com/da/fstop/

Nicholas O. Lindan

unread,

Dec 15, 2004, 4:33:08 PM12/15/04

to

> Cheap, fast(easy), good: pick any two.

Equivalent to saying to the client "How would you like your
project: late, over budget or buggy?"

Nicholas O. Lindan

unread,

Dec 15, 2004, 4:39:06 PM12/15/04

to

"Tim Wescott" <t...@wescottnospamdesign.com> wrote

>
> For light DSP on conventional processors I usually end up writing a
> little fixed-point math package that (a) allows multiplication as
> fractional numbers and (b) automatically saturates additions and
> subtractions. This is usually a much better fit for implementing
> filters and the like, yet is much faster than floating point on most
> machines (including the Pentium, oddly enough).

When the filter parameters are constant the fastest method is hardcoding
the math in assembly as sequences of shifts and adds.
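
To illustrate the shifts-and-adds idea, here is a sketch written in C instead of assembly, with coefficients picked only for the example (0.8125 and 1/8). The names are hypothetical. It assumes >> on a negative value is an arithmetic shift, which is implementation-defined in C but usual in practice.

```c
#include <stdint.h>

/* y = 0.8125 * x, with 0.8125 = 1/2 + 1/4 + 1/16 expanded into shifts
   and adds. Each shift truncates, so the result can fall slightly
   below the exact product. */
int32_t scale_0p8125(int32_t x)
{
    return (x >> 1) + (x >> 2) + (x >> 4);
}

/* State for the first-order low-pass y += (x - y) / 8.
   acc holds 8 * y, i.e. y with 3 extra fractional bits, so the
   divide by 8 does not throw away small differences. */
typedef struct { int32_t acc; } lowpass8_t;

/* Feed one sample, return the filtered output. */
int32_t lowpass8_step(lowpass8_t *f, int16_t x)
{
    f->acc += x - (f->acc >> 3);   /* acc += x - y, same as y += (x - y)/8 */
    return f->acc >> 3;
}
```

Starting from a zeroed state and feeding a constant 800, the filter returns 100, then 187, and settles at 800. No multiply instruction is needed in either routine.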

Jim Stewart

unread,

Dec 15, 2004, 4:47:59 PM12/15/04

to

Nicholas O. Lindan wrote:
> "CBFalconer" <cbfal...@yahoo.com> wrote
>
>
>>I published a [floating point package for the 8080 in DDJoCC&O] Its purpose
>>was to supply dynamic range, and used a 16 bit significand with an 8 bit
>>exponent. ... processed the majority of tests in a 1000 bed hospital...
>
>
> I found a lot of older (and some current) medical applications use
> floating point BCD for results calculation. Data was processed
> serially a nibble/digit at a time, as in a 4-bit calculator CPU.
>
> The math-pac was always a home brew and buggy - but then in consulting
> all you get to see are other folks' bugs. Nobody hires a consultant to
> come in and fix a success (although if I did government work I am sure
> that would change).
>
> I have never had a client give a rational reason for using BCD.
> Lots of paranoia, but nothing rational.

If I were to guess, I'd say it comes down to
visualization or perhaps mental laziness.
Everyone can make the jump from ten fingers
to a BCD digit. And not having to write the
conversion routines saves half the work.

Tim Wescott

unread,

Dec 15, 2004, 5:03:58 PM12/15/04

to

Nicholas O. Lindan wrote:

> "Tim Wescott" <t...@wescottnospamdesign.com> wrote
>
>>For light DSP on conventional processors I usually end up writing a
>>little fixed-point math package that (a) allows multiplication as
>>fractional numbers and (b) automatically saturates additions and
>>subtractions. This is usually a much better fit for implementing
>>filters and the like, yet is much faster than floating point on most
>>machines (including the Pentium, oddly enough).
>
>
> When the filter parameters are constant the fastest method is hardcoding
> the math in assembly as sequences of shifts and adds.
>

True, and if I were working on products at high enough volumes, or small
enough processors to justify it, that's just what I'd do. In fact that's
how my first few experiences with doing DSP in conventional processors went.

Aside from those early experiences I've always worked on things that
ship a few hundred units a year at best, and in the context of a large
SW engineering team of whom I'm the most accomplished at DSP. Given
those conditions it's a better economic tradeoff to buy a faster
processor and code in a high-level language -- that keeps the
engineering time down and gives me the hope that I can do other things
with my time than write numeric processing code.

On a DSP the way I've done it is to write a fast vector dot product and
a fast matrix multiply in assembly, then wrap that with C or C++ to
generate the gain vectors (or matrices). I get nearly all the
ease-of-use of the higher level language and nearly all the speed of
assembly, which is pretty nice.
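
As a rough picture of the kernel that would be hand-written in assembly, here is a portable C sketch of a fixed-point dot product. It assumes Q15 data and coefficients and a wide accumulator, much like the multiply-accumulate units on many DSPs; the function name is illustrative only.

```c
#include <stdint.h>
#include <stddef.h>

/* Q15 dot product. Products are Q30 and are summed in a 64-bit
   accumulator so the running sum cannot overflow. The result is
   rounded, converted back to Q15, and saturated. Assumes >> on a
   negative value is an arithmetic shift. */
int16_t dot_q15(const int16_t *coef, const int16_t *data, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += (int32_t)coef[i] * (int32_t)data[i];

    acc = (acc + (1 << 14)) >> 15;          /* round, Q30 -> Q15 */
    if (acc > INT16_MAX) acc = INT16_MAX;   /* clip at full scale */
    if (acc < INT16_MIN) acc = INT16_MIN;
    return (int16_t)acc;
}
```

A FIR filter is then one call per output sample with coef holding the gain vector, and the C or C++ wrapper only has to compute that vector.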

CBFalconer

unread,

Dec 15, 2004, 5:46:19 PM12/15/04

to

I never gave up any rights - when originally published DDJ did not
pay anything. When they published their later book of "DDJ for the
year ..." I gave them further permission to reprint. I lost my
sources some years ago in a disk crash, although I had promulgated
them to some others before then. Now all I have is a hard copy
listing of the version used in my Pascal system, and possibly
faulty copies typed by a French gentleman in comp.os.cpm.

Whether DDJ has it available in anything other than scans I do not
know. There is no great demand for 8080 assembly code today,
especially since the Rabbit doesn't even implement the full 8080
instruction set. I believe one of the critical things it misses is
the XTHL instruction. So does the 8086, thus making it impossible
to preserve all registers at all times.

Talk in this thread of emulating FP processors seems ridiculous.
The FP processors themselves were attempts to speed up the FP
routines. Other methods included hardware instructions to ease
justification, multiplication, division, etc. Some systems even
broke up division by having a dividestep instruction. For example,
as eventually (not in the DDJ issue) implemented in my system,
16x16 -> 32 multiplication was done by two 8 x 16 -> 24 bit
operations, and a summation. This was about 50% faster.
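
The decomposition can be sketched in C as below. This is only an illustration of the arithmetic for unsigned operands, not the original 8080 code, and the function names are made up.

```c
#include <stdint.h>

/* 8 x 16 -> 24-bit partial product (held in a 32-bit variable). */
static uint32_t mul8x16(uint8_t a, uint16_t b)
{
    return (uint32_t)a * b;
}

/* Unsigned 16 x 16 -> 32 multiply built from two 8 x 16 partial
   products: a * b = (a_hi * b) * 256 + (a_lo * b). */
uint32_t mul16x16(uint16_t a, uint16_t b)
{
    uint32_t lo = mul8x16((uint8_t)(a & 0xFF), b);   /* low byte of a times b */
    uint32_t hi = mul8x16((uint8_t)(a >> 8), b);     /* high byte of a times b */
    return (hi << 8) + lo;                           /* the summation */
}
```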

If there is any real demand I can put it up for download on my
page, in the form I received it from Arobase. i.e. totally
unverified. I do not have facilities for scanning my hard copy.

bungalow_steve

unread,

Dec 15, 2004, 10:06:32 PM12/15/04

to

Yes it seems we are both talking about different topics, limitations of
newsgroup communication I suppose. See ya.

bungalow_steve

unread,

Dec 15, 2004, 10:21:14 PM12/15/04

to

Fixed point assembly vs floating point C code is what I am comparing.
I suppose your own floating point routine can beat the 100:1 quite
easily, but I need IEEE compliance (so simulations I run on a PC are
identical to the results when run in the embedded processor). Blackfins
are nice, I moved up to the SHARC recently, a pleasure to code in
assembly.

Paul Keinanen

unread,

Dec 16, 2004, 2:09:45 AM12/16/04

to

On 15 Dec 2004 19:21:14 -0800, "bungalow_steve"
<bungalo...@yahoo.com> wrote:

>I suppose your own floating point routine can beat the 100:1 quite
>easily, but I need IEEE compliance (so simulations I run on a PC are
>identical to the results when run in the embedded processor).

If that is your problem, why do you use IEEE floats on the PC
simulations ?

Using C++ it would be quite easy to overload the +, -, *, / etc.
operators using your own floating point routines using your own
floating point format.

Paul

bungalow_steve

unread,

Dec 16, 2004, 11:57:17 AM12/16/04

to

Because I'm not the only one running or controlling the PC simulation.
Frequently a customer will come to me asking to implement an industrial
controller whose control logic has been tweaked for the last ten years
on a simulation done on a PC using IEEE floats. They also give me a set
of test vectors generated from the simulation that I must use to verify
the operation of the controller. I have two options: take their
existing code and cross compile it to a target processor that
supports IEEE, or cross compile it to a target processor that doesn't
and hope for the best. It's a risk reduction decision.

dalai lamah

unread,

Dec 16, 2004, 2:28:56 PM12/16/04

to

One fine day bungalow_steve typed:

It wasn't by mistake that I wrote "32-bit fixed point" and "generic
operations" and "C code". It's unfair to compare 16-bit fixed point with
float (float gives you much more resolution), and it's unfair to make
comparisons just by using sum operations.

I've just made a benchmark with a 2810 (100 MHz clock, code executed in
RAM):

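long a;
float f = 150.3;      /* floating point operand */
long l = 150300L;     /* 150.3 scaled by 1000 */

/* float loop: one divide and one add per iteration */
for (a = 0; a < 10000000L; a++)
{
    f += f / 10.5;
}

/* long loop: one 32-bit divide and one add per iteration */
for (a = 0; a < 10000000L; a++)
{
    l += l / 10500;
}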

The long loop duration was 8 seconds; the float loop duration was 35
seconds.

--
asd

ChrisClearman

unread,

Feb 18, 2005, 2:59:37 AM2/18/05

to

For those of you wondering how the C28x (F2812) from TI handles
floating point so efficiently you can view a .ppt training on the
topic for free here:

http://ti-training.com/courses/CourseDescription.asp?iCSID=46234&DCMP=dsp_c2000_iqmath&HQS=On-LineTraining+BA+iqmatholtrellink

There are benchmarks and application examples.
Here is the free IQMath Library Itself
http://focus.ti.com/docs/toolsw/folders/print/sprc087.html

Chris Clearman
Texas Instruments
C2000 Digital Signal Controllers

joep

unread,

Feb 18, 2005, 1:04:45 PM2/18/05

to

IQMath is not floating point; it's a set of fixed point routines that
handle some of the alignment issues that are present when using raw
fixed point math. You still have some dirty work, like picking an
appropriate global "IQ" as a function of your signal dynamic range or
using a specific IQ for each operation. Most of the discussion here was
comparing ANSI C float vs fixed point.
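
To make the "global IQ" idea concrete, here is a generic sketch of 32-bit fixed point with one project-wide Q value. It is not the TI IQmath API: GLOBAL_Q, fix_t and the fix_* functions are hypothetical names, and Q24 is just an example choice. It assumes >> on a negative value is an arithmetic shift.

```c
#include <stdint.h>

#define GLOBAL_Q 24          /* fractional bits; picked from the signal's range */
typedef int32_t fix_t;       /* value = raw / 2^GLOBAL_Q */

/* Convert a double to fixed point (truncating toward zero). */
fix_t fix_from_double(double x)
{
    return (fix_t)(x * (double)(INT32_C(1) << GLOBAL_Q));
}

/* Convert fixed point back to a double. */
double fix_to_double(fix_t x)
{
    return (double)x / (double)(INT32_C(1) << GLOBAL_Q);
}

/* Add needs no alignment when both operands share the same Q. */
fix_t fix_add(fix_t a, fix_t b)
{
    return a + b;
}

/* Multiply: the 64-bit product has 2*GLOBAL_Q fractional bits,
   so shift right by GLOBAL_Q to get back to the common format. */
fix_t fix_mul(fix_t a, fix_t b)
{
    return (fix_t)(((int64_t)a * (int64_t)b) >> GLOBAL_Q);
}
```

With Q24 the representable range is -128 to just under +128, with a resolution of 2^-24. Choosing a larger Q buys resolution at the cost of range, and that tradeoff is the "dirty work" mentioned above.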