hard vs. soft data

Mark Thorson

Dec 26, 2012, 5:43:56 PM

Let's say we make a distinction between hard and
soft data. Hard data is anything whose integrity
must be preserved at all times because the
consequences of errors are likely to be fatal,
such as the contents of the program counter, stack
pointer, etc. Soft data is data where nobody will
care if it's a little bit off, and it's not fatal
if it's completely wrong as long as that doesn't
happen very often. Examples include audio and video
data, anything in the graphics pipeline, etc.

Would it make sense to make this distinction at the
hardware level? The idea is that soft data could be
handled more cheaply than hard data. For example,
arithmetic could be performed by analog circuits.
Analog would be much slower than digital, but adding
two analog quantities would consume much less power
than adding two binary integers.

Soft data could also use cheaper memory, such as
unprotected DRAM or somewhat worn blocks of flash
storage. If an alpha particle hits your data,
maybe the user will hear a tiny pop or see a
one-frame glitch, but usually they won't even see
or hear anything like that. If it occurs at a
rate of one error per million hours of operation,
you're more likely to encounter a software bug.

Another possibility is cheaper digital circuits.
At first, I thought of eliminating certain trickier
cases of carry that don't occur very often, but
that would introduce a bias that the user might
see or hear. In a rendering application, you might
see bands on curved surfaces corresponding to the
unsupported cases of carry. That won't do. Any
optimization for cheaper circuits must be unbiased,
preferably completely random.
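
To make the bias concern concrete, here is a minimal Python sketch. The function name segmented_add and the 8/8 bit split are assumptions chosen for illustration, not a real circuit design. It models a 16-bit adder that never propagates the carry from the low byte into the high byte, then measures the average error over random inputs.

```python
import random

def segmented_add(a: int, b: int, width: int = 16, split: int = 8) -> int:
    """Add two unsigned integers but drop the carry across the `split` boundary."""
    low_mask = (1 << split) - 1
    low = (a & low_mask) + (b & low_mask)      # the carry out of this sum is discarded
    high = (a >> split) + (b >> split)
    return ((high << split) | (low & low_mask)) & ((1 << width) - 1)

def mean_error(trials: int = 10_000, seed: int = 0) -> float:
    """Average (approximate - exact) over random operands."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        a = rng.randrange(1 << 15)             # keep the exact sum inside 16 bits
        b = rng.randrange(1 << 15)
        total += segmented_add(a, b) - (a + b)
    return total / trials
```

Every error in this model is exactly -256 (a missing carry), so the mean error is strongly negative rather than zero. That one-sided error is the kind of systematic bias that could become visible as banding.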

You could make digital circuits cheaper and use
less power by making them smaller and lower voltage,
essentially removing some of the safety margin and
running closer to the noise floor. If you got an
error due to a thermal electron once a month, who
would notice that? You might be able to apply a
filter to remove pops and glitches that only occur
in one audio sample or one video frame, if it
happens frequently enough to matter.
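
One simple candidate for such a filter is a 3-sample median test. The sketch below is only an illustration: the name despike and the threshold parameter are assumptions, and a real audio or video path would need something tuned to the signal.

```python
def despike(samples, threshold):
    """Replace isolated outliers with the median of their 3-sample neighbourhood."""
    out = list(samples)
    for i in range(1, len(samples) - 1):
        window = sorted(samples[i - 1:i + 2])
        median = window[1]
        # Only touch a sample that is far from its local median, so clean
        # data passes through unchanged.
        if abs(samples[i] - median) > threshold:
            out[i] = median
    return out
```

For example, despike([10, 11, 9000, 12, 13], 100) replaces only the 9000 with its local median and leaves the other samples alone.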

There might be other techniques I haven't thought of
where you could make a multiplier, divider, or FPU
more cheaply if the designer has the option of
returning bad data in rare cases that are hard to
handle. Maybe it would return a NaN, though I'd
guess it would make more sense to return zero or
one of the operands. Being unbiased would probably
rule out many tempting circuit optimization techniques.

Paul A. Clayton

Dec 26, 2012, 7:55:21 PM

On Wednesday, December 26, 2012 5:43:56 PM UTC-5, Mark Thorson wrote:
> Let's say we make a distinction between hard and
> soft data.

[snip]

> Would it make sense to make this distinction at the
> hardware level?

I am posting a quick response now, but I hope
to provide a bit more detail later.

The basic ideas have been looked at. A search
of the academic literature for "stochastic
computation" and "approximate computation" would
probably provide a number of papers proposing
similar optimizations.

Lyric Semiconductor used analog circuits for
ECC and (ISTR) proposed similar circuits for
Bayesian filtering/analysis.

ISTR, there was a proposal for embedded systems
that reduced error vulnerability by assigning
less critical data to less reliable memories
(to reduce costs or power/energy use). This
might have used variable refresh requirements
for DRAM or some other aspect. (Similar
power saving has been proposed for caches where
voltages could be lowered beyond the guaranteed
safe level and EDC/ECC used to detect [for clean
data] or correct [for dirty data] errors caused
by decreased voltage.)

Obviously predictors present some potential for
such optimizations (since they are already only
probably correct). (There has been at least one
paper on analog perceptron predictors.)

ISTR skimming a paper on the architectural
vulnerability of an Itanium implementation
(some errors are more critical than others).
One could use such information to tune
devices to make desired tradeoffs of reliability,
energy efficiency, etc.

Reduced precision computation is a similar
optimization.

One of the problems with analog computation
seems to be the weakness of the design and
validation tools.

I hope to do a little looking around at
some of the papers I have downloaded (and
perhaps do some googling) to provide at
least a few concrete items, but I hope the
above suffices as a "Yes, this idea has
been confirmed as an area of interest,
academically and even commercially."

Mark Thorson

Dec 26, 2012, 9:51:11 PM

"Paul A. Clayton" wrote:
>
> On Wednesday, December 26, 2012 5:43:56 PM UTC-5, Mark Thorson wrote:
> > Let's say we make a distinction between hard and
> > soft data.
> [snip]
> > Would it make sense to make this distinction at the
> > hardware level?
>
> I am posting a quick response now, but I hope
> to provide a bit more detail later.

I meant to say "at the architecture level".
That is, you might have functional units,
memory regions, and maybe even registers
that are less reliable than those used
for hard data. This might appear in the
instruction set, at least when referring
to operations on soft data. Alternatively,
data might be tagged as soft, for example
if it is read from a region of the address
space assigned to soft memory. If one of
the operands is tagged as soft, it's okay
to use a soft FU on it.

Thanks for your thoughts -- I hadn't thought
about branch predictors. I see this raises
an issue with regard to deterministic
behavior -- could pose problems for device
test.

"You want to do WHAT?"

Quadibloc

Dec 27, 2012, 6:00:15 AM

Most techniques of admitting error to save money aren't worth the
trouble. However, most existing microcomputers don't use all available
techniques to avoid error; for example, ECC memory is not common
outside of expensive servers or mainframes.

So we haven't forgotten that reliability is a tradeoff, it's just that
one can't save much in most cases even when throwing quite a bit of
reliability away - and so we avoid doing something which would have
the result of limiting the flexibility of what we could now use most
of our memory for.

Originally, analog arithmetic was much faster than digital, and it
could still be - however, doing it even with modest precision requires
different chip fabrication techniques, and that would eat up any
savings. It's easier to use lower precision.

John Savard

Michael Engel

Dec 27, 2012, 7:46:21 AM

to nos...@sonic.net

On Wednesday, December 26, 2012 23:43:56 UTC+1, Mark Thorson wrote:
> Let's say we make a distinction between hard and
> soft data. Hard data is anything whose integrity
> must be preserved at all times because the
> consequences of errors are likely to be fatal,
> such as the contents of the program counter, stack
> pointer, etc. Soft data is data where nobody will
> care if it's a little bit off, and it's not fatal
> if it's completely wrong as long as that doesn't
> happen very often. Examples include audio and video
> data, anything in the graphics pipeline, etc.

As Paul A. Clayton already mentioned, there is quite a bit of research in that area already and it's quite an intriguing idea IMHO. As my team is doing research in that area, I thought I might throw in my 2 cents here. In fact, we consider most signal processing applications as an interesting target for investigating the possible reliability (or precision) vs. "x" tradeoffs.

> Would it make sense to make this distinction at the
> hardware level? The idea is that soft data could be
> handled more cheaply than hard data.

One idea that would be especially appropriate for embedded systems is not to restrict this relaxation of reliability/precision requirements to the hardware level. The fundamental problem is how to distinguish between what you call "hard" and "soft" data. One approach several research groups have adopted is to introduce methods for describing these data semantics at the source-code level.

In our project on flexible error handling (sorry for the shameless plug - see http://ls12-www.cs.tu-dortmund.de/daes/en/forschung/dependable-embedded-real-time-systems.html for more information), we call the categories "reliable" and "unreliable". Since for embedded systems we assume that the source code is available, we extended our C compiler to provide type qualifiers for reliable and unreliable data. Of course, no one expects a programmer to manually annotate each and every declaration of a data object. Thus, we use static analysis methods to propagate the qualifiers (safely) to unannotated objects and check the consistency of annotations.
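
As a rough illustration of qualifier propagation (this is a toy fixpoint in Python, not the algorithm of the compiler mentioned above; the names propagate, annotations, and assignments are hypothetical), the sketch below spreads "unreliable" forward through assignments and reports a conflict when unreliable data would flow into a variable annotated "reliable".

```python
def propagate(annotations, assignments):
    """annotations: {var: "reliable" | "unreliable"}
    assignments: list of (target, [source variables]).
    Returns (qualifiers for every variable, list of conflicting targets)."""
    qual = dict(annotations)
    conflicts = []
    changed = True
    while changed:                       # iterate until nothing new is learned
        changed = False
        for target, sources in assignments:
            tainted = any(qual.get(s) == "unreliable" for s in sources)
            if not tainted:
                continue
            if annotations.get(target) == "reliable":
                if target not in conflicts:
                    conflicts.append(target)   # unreliable data reaches reliable data
            elif qual.get(target) != "unreliable":
                qual[target] = "unreliable"
                changed = True
    # Anything still unlabelled defaults to reliable, the conservative choice.
    for target, sources in assignments:
        for var in [target, *sources]:
            qual.setdefault(var, "reliable")
    return qual, conflicts
```

With pixel annotated unreliable and index annotated reliable, the chain tmp = f(pixel, gain); out = g(tmp); index = h(out) marks tmp and out as unreliable, leaves gain reliable, and flags index as a conflict.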

Other projects follow a very similar approach, e.g. the EnerJ project by Adrian Sampson et al. at U Washington uses @Approximate type qualifiers for Java (http://sampa.cs.washington.edu/sampa/EnerJ).

Another idea is to annotate code sections instead of data objects. De Kruijf's Relax framework (http://pages.cs.wisc.edu/~karu/wiki/index.php/Pubs/B2hd-isca09relax) uses a mechanism similar to try/catch-blocks to provide specific error handling for the given code block, e.g., retry an operation or return a default result.
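
The retry-or-default behaviour of such a code-block annotation can be sketched in Python as follows. This is only an analogy to the idea described above, not the Relax framework's actual interface; HardwareFault, relaxed, and make_flaky are hypothetical names.

```python
class HardwareFault(Exception):
    """Hypothetical signal that an unreliable unit detected an error."""

def relaxed(operation, retries=3, default=None):
    """Run `operation`; on a detected fault retry, then fall back to `default`."""
    for _ in range(retries):
        try:
            return operation()
        except HardwareFault:
            continue                     # discard the faulty attempt and try again
    return default

def make_flaky(failures, value):
    """Test helper: an operation that faults `failures` times, then returns `value`."""
    state = {"left": failures}
    def op():
        if state["left"] > 0:
            state["left"] -= 1
            raise HardwareFault()
        return value
    return op
```

Here relaxed(make_flaky(2, 42)) succeeds on the third attempt, while an operation that keeps faulting yields the default instead of stalling the program.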

This list is by no means exhaustive, but it might give you an insight into some of the current research in this area. It will be interesting to see whether data or code annotations (or a combination of both) will be the more attractive solution.

> For example,
> arithmetic could be performed by analog circuits.
> Analog would be much slower than digital, but the
> power consumption of adding two analog quantities
> would be much smaller than two binary integers.

Calculations using analog components seem interesting - however, converting data to and from the analog domain seems to incur a large overhead (D/A and A/D converters, building hybrid digital/analog chips, etc.). Based on a similar idea, there are approaches to building digital probabilistic arithmetic units, e.g., by our collaborators Krishna Palem and Vincent J. Mooney (http://news.rice.edu/2012/05/17/computing-experts-unveil-superefficient-inexact-chip/). Their PCMOS technology uses biased voltage scaling to control the error probabilities for given bits in an operation (in short, ensuring a significantly lower error probability for MSBs than for LSBs).

> Soft data could also use cheaper memory, such as
> unprotected DRAM or somewhat worn blocks of flash
> storage. If an alpha particle hits your data,
> maybe the user will hear a tiny pop or see a
> one-frame glitch, but usually they won't even see
> or hear anything like that. If it occurs at a
> rate of one error per million hours of operation,
> you're more likely to encounter a software bug.

This mapping approach is one possible implementation. Another one (which we investigate) is to decide at run time which errors have to be corrected (and by which methods) in order to adhere to given constraints of a system. Of course, this approach is more suitable for software-based error handling.

> Another possibility is cheaper digital circuits.
> At first, I thought of eliminating certain trickier
> cases of carry that don't occur very often, but
> that would introduce a bias that the user might
> see or hear. In a rendering application, you might
> see bands on curved surfaces corresponding to the
> unsupported cases of carry. That won't do. Any
> optimization for cheaper circuits must be unbiased,
> preferably completely random.

PCMOS actually follows the opposite approach by ensuring that errors show up in less significant bits only. Of course, this approach works best for data which ensures that the MSBs actually contain relevant information - like floating point data (in fact, we showed that using PCMOS is difficult for embedded systems that rely on integer operations - see our ARCS2012 publication "Classification-based Improvement of Application Robustness and Quality of Service in Probabilistic Computer Systems").
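
The effect of concentrating errors in the low-order bits can be illustrated with a small Python simulation. The error model here is an assumption made up for the example (bit i flips with probability p_lsb * decay**i), not measured PCMOS behaviour.

```python
import random

def noisy(value, width=16, p_lsb=0.2, decay=0.5, rng=random):
    """Flip bit i with probability p_lsb * decay**i (bit 0 is the LSB)."""
    for bit in range(width):
        if rng.random() < p_lsb * decay ** bit:
            value ^= 1 << bit
    return value

def mean_abs_error(trials=20_000, seed=1, **kwargs):
    """Average |noisy(v) - v| over random 16-bit values."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        v = rng.randrange(1 << 16)
        total += abs(noisy(v, rng=rng, **kwargs) - v)
    return total / trials
```

Comparing mean_abs_error() with mean_abs_error(decay=1.0) (the same LSB error rate applied uniformly to every bit) shows how much smaller the numerical damage is when the MSBs are protected.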

> You could make digital circuits cheaper and use
> less power by making them smaller and lower voltage,
> essentially removing some of the safety margin and
> running closer to the noise floor. If you got an
> error due to a thermal electron once a month, who
> would notice that? You might be able to apply a
> filter to remove pops and glitches that only occur
> in one audio sample or one video frame, if it
> happens frequently enough to matter.

One of the interesting questions is in which cases the techniques discussed above really pay off. Cost tradeoffs may be interesting for (low-end) embedded applications, e.g. to save on cost for ECC memory. Other tradeoffs considered are energy (which IMHO would require this "probabilistic" approach to be implemented in significant parts of the architecture in order to provide significant gains) and real-time (if some errors can be ignored or handled quickly, it's easier to keep deadlines).

The techniques discussed should be of increasing relevance if the ITRS predictions for error rates of future semiconductors (with smaller feature sizes and near-threshold supply voltages) develop as expected - essentially, an exponential growth of error rates, leading to errors being the normal case instead of being a rare exception.

Best regards,
Michael Engel (TU Dortmund, Germany)

Paul A. Clayton

Dec 27, 2012, 11:51:40 AM

On Wednesday, December 26, 2012 9:51:11 PM UTC-5, Mark Thorson wrote:
> "Paul A. Clayton" wrote:
>>
>> On Wednesday, December 26, 2012 5:43:56 PM UTC-5, Mark Thorson wrote:
>>> Let's say we make a distinction between hard and
>>> soft data.
>> [snip]
>>> Would it make sense to make this distinction at the
>>> hardware level?
>>
>> I am posting a quick response now, but I hope
>> to provide a bit more detail later.
>
> I meant to say "at the architecture level".
> That is, you might have functional units,
> memory regions, and maybe even registers
> that are less reliable than those used
> for hard data.

I suspect that in many cases, reliability requirements
will correlate with functional activity such that more
specialized hardware would be used. The dark silicon
idea--that there will be more transistors than can be
active and so specialization for lower utilization at
higher efficiency makes sense--would mesh well with
such cases.

However, I suspect that significant work will be done
to increase the flexibility of hardware because of the
costs of communication. I.e., the greater efficiency
of specialized hardware can be countered by the
greater cost of communicating data and control between
different units.

The _functional_ division between architecture and
microarchitecture is, of course, based less on
obvious distinctive aspects and more on availability
of information (compiler, software runtime,
hardware), sophistication and maturity of design
tools, etc. In (extremely abstract) theory, a
system similar to transactional memory with predictors
could handle assigning reliability aspects dynamically
without software assistance. The static and dynamic
hardware overhead of the predictors and rollback
mechanisms and the design complexity (particularly for
such an immature area) presumably makes such very
unattractive as an exclusive option. (However, in
cooperation with software, such might be made
useful eventually.)

> This might appear in the
> instruction set, at least when referring
> to operations on soft data. Alternatively,
> data might be tagged as soft, for example
> if it is read from a region of the address
> space assigned to soft memory. If one of
> the operands is tagged as soft, it's okay
> to use a soft FU on it.

For FU approximate computation, there seem to
be two approaches: deterministic and dependent
on arbitrary conditions (temperature, minor
voltage fluctuation, process variability, etc.).
A deterministic approach might be more friendly
to device testing (though residue techniques
and checker cores could provide reliability in
the presence of unreliable components); it
effectively just adjusts the logic table slightly
to allow a more energy-efficient, sufficiently
approximate implementation.

(For storage one can adjust cell reliability [in
hardware design and/or with voltage and temperature
considerations] and ECC/EDC coverage.)

Michael Engel gave some links to some additional
information. I will add the following:

"Flikker: Saving DRAM Refresh-power through
Critical Data Partitioning" (2011; Song Liu et al.)
This paper proposes allocating less critical data
to DRAM that is refreshed less frequently and so
has a higher error rate. This idea could be
combined with Ravi K. Venkatesan et al.'s
Retention-Aware Placement in DRAM (RAPID), which
exploits DRAM retention variability. (Their 2006
paper only used the variability to support a very
low refresh rate when DRAM is not fully used.)

Reduced precision is similar in concept to
significance compression ("Very Low Power
Pipelines using Significance Compression", 2000,
Ramon Canal et al.). Significance compression
compresses redundancy in MSbits and is not lossy
while reduced precision tends to remove LSbits
and is lossy.
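
The difference can be shown with a small Python sketch. It is a software analogy only, with byte granularity and function names chosen for illustration; it does not reproduce the pipeline design of the cited paper.

```python
def significant_bytes(value: int, width_bytes: int = 4) -> int:
    """Lossless: how many low-order bytes are needed so that sign
    extension restores `value` exactly."""
    raw = value.to_bytes(width_bytes, "little", signed=True)
    n = width_bytes
    while n > 1:
        top, below = raw[n - 1], raw[n - 2]
        # The top byte is redundant if it is pure sign fill of the byte below.
        if top != (0xFF if below & 0x80 else 0x00):
            break
        n -= 1
    return n

def expand(value: int, n: int, width_bytes: int = 4) -> int:
    """Rebuild the value from its n low-order bytes by sign extension."""
    raw = value.to_bytes(width_bytes, "little", signed=True)[:n]
    return int.from_bytes(raw, "little", signed=True)

def drop_lsbs(value: int, bits: int) -> int:
    """Lossy: zero the `bits` least significant bits."""
    return (value >> bits) << bits
```

Round-tripping through significant_bytes and expand returns the original value for any 32-bit input, whereas drop_lsbs(1001, 4) yields 992 and the discarded low bits cannot be recovered.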

(Other lossless compression schemes have been
proposed, e.g., "Eliminating Energy of Same-
Content-Cell-Columns of On-Chip SRAM Arrays" (2011,
Bushra Ahsan et al.). For some uses, lossy
compression techniques might be applied--reducing
precision is a relatively simple and effective form
of lossy compression.)

I seem to recall that an earlier AMD processor
used reduced precision multiply for the first step
in iterative refinement for division/square root,
though this was (IIRC) for performance not energy
efficiency. Series calculations could perhaps
likewise exploit reduced precision for certain
terms.
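
The general principle (a cheap low-precision first estimate followed by refinement) can be sketched in Python with Newton-Raphson reciprocal iteration. This is a textbook illustration, not a description of any particular AMD design. The helper truncate_mantissa is hypothetical, and the seed is derived from an exact division purely for convenience, where hardware would typically use a small table or narrow multiplier.

```python
import math

def truncate_mantissa(x: float, bits: int) -> float:
    """Keep only `bits` mantissa bits of x (stand-in for a narrow datapath)."""
    m, e = math.frexp(x)                 # x = m * 2**e with 0.5 <= |m| < 1
    scale = 1 << bits
    return math.ldexp(math.trunc(m * scale) / scale, e)

def reciprocal(d: float, seed_bits: int = 8, steps: int = 3) -> float:
    """Approximate 1/d from a low-precision seed plus Newton-Raphson steps."""
    x = truncate_mantissa(1.0 / d, seed_bits)   # low-precision starting estimate
    for _ in range(steps):
        x = x * (2.0 - d * x)            # the relative error roughly squares each step
    return x
```

Because the error roughly squares on each step, an 8-bit seed reaches the limit of double precision after about three iterations, so the early steps do not need full-precision arithmetic.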

Here is a list of some other papers somewhat
related to this subject that I happened to
encounter (most of which I have not yet read):

"Power Efficient Motion Estimation Using
Multiple Imprecise Metric Computations" (2007,
In Suk Chong and Antonio Ortega)

"Architecture Support for Disciplined Approximate
Programming" (2012, Hadi Esmaeilzadeh et al.)

"Shoestring: Probabilistic Soft Error Reliability
on the Cheap" (2010, Shuguang Feng et al.)--from
the abstract: "Shoestring is able to focus its
efforts on protecting statistically-vulnerable
portions of program code."

"Measuring Architectural Vulnerability Factors"
(2003, Shubhendu S. Mukherjee et al.)--mainly
interesting for how errors in different
microarchitectural structures have different
visibility in terms of program behavior.

"Software/Hardware Cooperative Approximate
Computation" (2011, Gennady Pekhimenko and
Kun Qian)--"The basic idea is to 1) identify
performance-critical events . . . whose
results can be predicted or ignored without
recovery and without degrading a level of
quality required by the user, and 2) value-
predict or ignore such events during dynamic
execution"

"Energy-Precision Tradeoffs in Mobile Graphics
Processing Units" (2008, Jeff Pool et al.)

"Probabilistic Counter Updates for Predictor
Hysteresis and Stratification" (2006, Nicholas
Riley and Craig Zilles)

"EnerJ: Approximate Data Types for Safe and
General Low-Power Computation" (2011, Adrian
Sampson et al.)--Michael Engel mentioned the
project behind this.

"Stochastic Computation" (2010, Naresh R.
Shanbhag et al.)--"This paper traces the roots
of stochastic computing from the Von Neumann
era into its current form."

"Eliminating Microarchitectural Dependency from
Architectural Vulnerability" (2009, Vilas
Sridharan and David R. Kaeli)--looks at the
program-based variability in error visibility.

"The Art of Deception: Adaptive Precision
Reduction for Area Efficient Physics
Acceleration" (2007, Thomas Y. Yeh et al.)

I do not know if any of the above would be
particularly helpful--I had not realized how
many papers I had downloaded and not read!--,
but added to Michael Engel's links such might
provide a starting place for further research.

> Thanks for your thoughts -- I hadn't thought
> about branch predictors.

Approximation works with a variety of predictors
(cache way predictors, prefetch engines, value
predictors, etc.) and not just branch predictors.

Renée St. Amant et al.'s "Low-Power, High-
Performance Analog Neural Branch Prediction"
(2008) used analog summation for a perceptron-
based branch predictor.

> I see this raises
> an issue with regard to deterministic
> behavior -- could pose problems for device
> test.
>
> "You want to do WHAT?"

Yes, this could be worse than the issues with
asynchronous logic.

(I hope the above was not too long and meandering.)

Paul A. Clayton

Dec 27, 2012, 11:58:41 AM

On Thursday, December 27, 2012 7:46:21 AM UTC-5, Michael Engel wrote:
[snip]

> This list is by no means exhaustive, but it might
> give you an insight into some of the current research
> in this area. It will be interesting to see whether
> data or code annotations (or a combination of both)
> will be the more attractive solution.

Thanks for providing the links, particularly for
Probabilistic CMOS (of which I had read mentions but
I do not seem to have downloaded any papers yet).

There is *FAR* too much interesting stuff on the
Internet!

Joe keane

Dec 27, 2012, 7:27:01 PM

In article <50DB7DAC...@sonic.net>,

Mark Thorson <nos...@sonic.net> wrote:
>Would it make sense to make this distinction at the hardware level?

''I put in "4 / 2" and it says "1.82"!''

''It *is* a phone. If you want a calculator, it's going to cost more.''

Robert Wessel

Dec 27, 2012, 7:54:57 PM

On Wed, 26 Dec 2012 14:43:56 -0800, Mark Thorson <nos...@sonic.net>
wrote:

For certain applications, it's possible to increase the reliability of
the hardware through redundancy. So this may be more a performance
trade-off. Store your media files (which are largely accessed
sequentially anyhow) on low-cost, mediocre-reliability media, and then
apply a significant amount of forward error correction to the stored
stream. To some extent, that's one major difference between consumer
and enterprise disk drives - there's substantially more error
correction stored in the latter. That, plus reduced recording
densities, leads to a substantially lower uncorrected read error rate.
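
As a small illustration of forward error correction, here is the classic Hamming(7,4) code in Python. It is far weaker than the codes used in real disk drives and is shown only to make the principle concrete: three parity bits per four data bits let the reader correct any single flipped bit without a re-read.

```python
def encode(nibble):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit Hamming codeword."""
    d1, d2, d3, d4 = nibble
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]  # codeword positions 1..7

def decode(word):
    """Correct up to one flipped bit and return the 4 data bits."""
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]       # parity over positions 1, 3, 5, 7
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]       # parity over positions 2, 3, 6, 7
    s3 = w[3] ^ w[4] ^ w[5] ^ w[6]       # parity over positions 4, 5, 6, 7
    syndrome = s1 + 2 * s2 + 4 * s3      # 1-based position of the bad bit, 0 if none
    if syndrome:
        w[syndrome - 1] ^= 1
    return [w[2], w[4], w[5], w[6]]
```

The trade-off is the one described above: the code spends 75% extra storage to turn an unreliable medium into one whose single-bit errors are invisible to the reader.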

IBM did something similar with "expanded storage" on mainframe systems
in the 80s and 90s. ES was accessible mainly in page increments
(there were, in fact, "Page In" and "Page Out" instructions to copy ES
pages to and from main memory), and was, at least on the larger*
boxes, implemented as slower and cheaper DRAM than main storage, but
with substantial additional error correction. This worked well for
paging and caching.

*On smaller boxes you could carve off a piece of the normal main
memory as ES, if you were so inclined.

MitchAlsup

Dec 30, 2012, 2:07:48 PM

to nos...@sonic.net

On Wednesday, December 26, 2012 4:43:56 PM UTC-6, Mark Thorson wrote:
> Let's say we make a distinction between hard and
> soft data. Hard data is anything whose integrity
> must be preserved at all times because the
> consequences of errors are likely to be fatal,
> such as the contents of the program counter, stack
> pointer, etc. Soft data is data where nobody will
> care if it's a little bit off, and it's not fatal
> if it's completely wrong as long as that doesn't
> happen very often. Examples include audio and video
> data, anything in the graphics pipeline, etc.

I suspect this will go nowhere.

On the floating point side, even after a long string of
floating point arithmetic operations on a semiprecise
starting value, the argument to transcendental functions
must be considered exact. Thus, even if the input to the
sine function is 1E+15 (i.e. no significance wrt pi/4)
argument reductions must use 25+ decimal digits of Pi to
generate the starting point for the polynomial evaluation.
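
The size of the effect can be checked with a short Python sketch. The function names are illustrative, and the 50-digit value of pi is simply more digits than this input needs. It reduces 1E+15 modulo 2*pi once with the double-precision constant and once with the extended constant.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 60
# First 50 decimal places of pi.
PI = Decimal("3.14159265358979323846264338327950288419716939937510")

def reduce_naive(x: float) -> float:
    """Reduce modulo 2*pi using only the 53-bit double approximation of pi."""
    return math.fmod(x, 2.0 * math.pi)

def reduce_precise(x: float) -> float:
    """Reduce modulo 2*pi with the long constant, rounding to double at the end."""
    return float(Decimal(x) % (2 * PI))
```

For x = 1E+15 the two reduced arguments differ by roughly 0.04 radians, because the tiny error in the double-precision constant is multiplied by about 1.6E+14 periods. That is why the reduction needs so many more digits of pi than the result will carry.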

So, while I am generally agreeable to the thought that the
underlying argument is sound, those who think more deeply
will find all sorts of things that will fail that should not.
Just ask anyone on the original IEEE754 committee.

Mitch

nm...@cam.ac.uk

Dec 30, 2012, 3:45:05 PM

In article <9655b8f3-3bcd-4770...@googlegroups.com>,

And then ask someone like me, who will explain why they were
completely cuckoo, and why the objective of 'exact' function
evaluation is at best misguided and at worst actively harmful.

Regards,
Nick Maclaren.

Terje Mathisen

Jan 1, 2013, 11:32:50 AM

nm...@cam.ac.uk wrote:
> In article <9655b8f3-3bcd-4770...@googlegroups.com>,
> MitchAlsup <Mitch...@aol.com> wrote:
>> I suspect this will go nowhere.

I agree and I'd even be willing to put it a lot more strongly.

>>
>> On the floating point side, even after a long string of
>> floating point arithmetic operations on a semiprecise
>> starting value, the argument to transcendental functions
>> must be considered exact. Thus, even if the input to the
>> sine function is 1E+15 (i.e. no significance wrt pi/4)
>> argument reductions must use 25+ decimal digits of Pi to
>> generate the starting point for the polynomial evaluation.
>>
>> So, while I am generally agreeable to the thought that the
>> underlying argument is sound, those who think more deeply
>> will find all sorts of things that will fail that should not.
>> Just ask anyone on the original IEEE754 committee.
>
> And then ask someone like me, who will explain why they were
> completely cuckoo, and why the objective of 'exact' function
> evaluation is at best misguided and at worst actively harmful.

I'd be perfectly willing to accept a specification that says sin(1e15)
== 0 and cos(1e15) == 1.

Even better would of course be to specify the result as NaN...

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Paul A. Clayton

Jan 1, 2013, 1:44:01 PM

On Tuesday, January 1, 2013 11:32:50 AM UTC-5, Terje Mathisen wrote:
> nm...@cam.ac.uk wrote:
>> In article <9655b8f3-3bcd-4770...@googlegroups.com>,
>> MitchAlsup <Mitch...@aol.com> wrote:
>>> I suspect this will go nowhere.
>
> I agree and I'd even be willing to put it a lot more strongly.

I disagree. While approximate computation may never
be adopted in general purpose computing, the use
of analog aspects in Lyric Semiconductor's error
correcting logic seems sufficient evidence to
demonstrate that there is real potential here.

For some embedded devices which can get away with
non-standard precisions, approximate computation
might be worthwhile. E.g., filtering sensor data
might require extremely low power and have better
results with approximate computation than with
simple reduced precision (and even reduced
precision would violate the expectations of many
programmers).

I could imagine approximate computation applying
well to a number of filtering problems. Even
for security-related filtering an approximate
method that generates more false negatives than
an exact implementation of the same method
might be preferred when an *implementable* (in
terms of cost, performance, power use, etc.)
exact method would be less effective.

There are certainly problems with use of approximate
computation (including the reduced importance of
the cost of computation itself vs. storage and
especially communication/control/synchronization),
but "go nowhere" seems a bit harsh.

Andy (Super) Glew

Jan 2, 2013, 1:14:46 PM

On 1/1/2013 10:44 AM, Paul A. Clayton wrote:
> On Tuesday, January 1, 2013 11:32:50 AM UTC-5, Terje Mathisen wrote:
>> nm...@cam.ac.uk wrote:
>>> In article <9655b8f3-3bcd-4770...@googlegroups.com>,
>>> MitchAlsup <Mitch...@aol.com> wrote:
>>>> I suspect this will go nowhere.
>>
>> I agree and I'd even be willing to put it a lot more strongly.

I'm not so sure.

BRIEF

There is fairly good evidence that varying the PRECISION of
calculations, in both integer and floating point, is quite fruitful.

Varying precision necessarily involves varying accuracy.

The question then is whether there might ever be advantages in having
reduced accuracy, potentially randomized, probabilistic inaccuracy, in
calculations performed in higher precision formats.

DETAIL

Evidence that varying FP PRECISION is useful.

E.g. I think I saw a paper that said that more than half of the putative
advantage of custom and semi-custom logic, like FPGAs, lies in using
only the minimum precision needed.

E.g. the proliferation of integer sizes (should not need explanation)

E.g. the proliferation of floating point sizes - the old standards 64-bit
and 32-bit, but also 40-bit, 24-bit, 16-bit, and even 8-bit. Yes, 8-bit and
16-bit seem mainly to be used in memory, and are converted to 32-bit... but
I cannot help but wonder if an FP16*FP32+FP32 -> FP32 FMAC might have
advantages.
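
A software emulation of such a mixed-precision operation might look like the Python sketch below. The function names are hypothetical, and the emulation computes in double precision and then rounds to FP32, so it is not guaranteed to be bit-exact with a true single-rounding hardware FMAC.

```python
import struct

def to_fp16(x: float) -> float:
    """Round to IEEE 754 binary16 and back (struct format 'e')."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_fp32(x: float) -> float:
    """Round to IEEE 754 binary32 and back (struct format 'f')."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def fmac_16x32(a: float, b: float, c: float) -> float:
    """Treat a as FP16 and b, c as FP32; return a*b + c rounded to FP32."""
    # An 11-bit by 24-bit significand product fits exactly in a double,
    # so only the addition and the final conversion round.
    return to_fp32(to_fp16(a) * to_fp32(b) + to_fp32(c))
```

The attraction is that one multiplier input is only 11 bits wide (including the implicit bit), so the multiplier array would be less than half the size of a full FP32 one.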

http://en.wikipedia.org/wiki/Half-precision_floating-point_format
http://en.wikipedia.org/wiki/Minifloat

http://semipublic.comp-arch.net/wiki/Collection_of_Unusual_Datatypes_and_Formats
http://semipublic.comp-arch.net/wiki/Floating_Point_Formats

--
The content of this message is my personal opinion only. Although I am
an employee (currently of MIPS Technologies; in the past of companies
such as Intellectual Ventures and QIPS, Intel, AMD, Motorola, and
Gould), I reveal this only so that the reader may account for any
possible bias I may have towards my employer's products. The statements
I make here in no way represent my employers' positions on the issue,
nor am I authorized to speak on behalf of my employers, past or present.

Michael Engel

Jan 2, 2013, 7:07:48 PM

On Wednesday, January 2, 2013 19:14:46 UTC+1, Andy (Super) Glew wrote:
> On 1/1/2013 10:44 AM, Paul A. Clayton wrote:
> > On Tuesday, January 1, 2013 11:32:50 AM UTC-5, Terje Mathisen wrote:
> >> nm...@cam.ac.uk wrote:
> >>> In article <9655b8f3-3bcd-4770...@googlegroups.com>,
> >>> MitchAlsup <Mitch...@aol.com> wrote:
> >>>> I suspect this will go nowhere.
>
> >> I agree and I'd even be willing to put it a lot more strongly.
>
> I'm not so sure.

Great to see that there are quite a few contrasting opinions on this
topic by respected readers of comp.arch - for me, this is an indication
that there might still be some interesting opportunities for research
:-).

> BRIEF
>
> There is fairly good evidence that varying the PRECISION of
> calculations, in both integer and floating point, is quite fruitful.
>
> Varying precision necessarily involves varying accuracy.
>
> The question then is whether there might ever be advantages in having
> reduced accuracy, potentially randomized, probabilistic inaccuracy, in
> calculations performed in higher precision formats.

Perhaps one interesting parameter to consider is the granularity with
which the precision-vs.-whatever trade-off works. I suppose most readers
here know that in hardware synthesis, the use of bit width analysis
(with safe static as well as probabilistic approaches, again) is a
standard method to save on hardware resources, since buses, registers,
etc. can be implemented using fewer bits.
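
The safe static variant of bit width analysis can be sketched in Python as simple interval arithmetic. The function names are illustrative, and real synthesis tools do considerably more: each value carries a (min, max) range, each operation derives the range of its result, and the range determines how many bits the bus or register needs.

```python
def bits_for_range(lo: int, hi: int) -> int:
    """Minimum width covering [lo, hi]: unsigned if lo >= 0, else two's complement."""
    if lo >= 0:
        return max(1, hi.bit_length())
    return max((-lo - 1).bit_length(), hi.bit_length()) + 1

def add_range(a, b):
    """Range of x + y for x in a and y in b."""
    return (a[0] + b[0], a[1] + b[1])

def mul_range(a, b):
    """Range of x * y for x in a and y in b (extremes occur at the corners)."""
    products = [x * y for x in a for y in b]
    return (min(products), max(products))
```

For an 8-bit unsigned sample times a 4-bit unsigned coefficient plus another 8-bit sample, the result range is (0, 4080). A 12-bit datapath therefore suffices, where a naive design might allocate 16 bits.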

Of course, in hardware we have the liberty to vary the bit width in
increments of single bits. Exposing this to software in a general
purpose CPU seems difficult to me. Perhaps it would be interesting to
evaluate bit-serial CPU architectures here?

I suspect there are still valid reasons for providing different
precisions for operations in general purpose architectures. Most FPUs
today implement IEEE754 single as well as double precision - I suspect
this is not only to maintain standards conformance. A trade-off analysis
of precision in the given granularities vs. typical optimization objectives
should be interesting. I suspect there are already some publications in
this area, but haven't looked for related work in this specific area so far.

Another topic to consider here is IMHO the amount that the component(s)
to be optimized by reducing precision contributes to the overall value
(e.g., for energy). If a system has to consider the worst case for all
bit widths, only a few components might actually be affected by the
optimizations - e.g., a CPU that implements a variable-precision ALU but
nevertheless has to provide buses, registers, etc., that cover the
worst case bit width required by software - and the achievable optimizations
might be minimal on a system-level scale. Thus, I expect larger benefits of
applying variable precision techniques in the embedded area, provided we
can apply worst-case analysis techniques to well-known software components.

-- Michael