Russ
> Native Formats:
> Binary32
> Binary64
> Binary128
>
> Decimal64
> Decimal128
>
> Additional Interchange formats:
> Binary16
>
> Decimal32
I think it's really too early to add decimal floating point types as
native types in Go. Few existing processors support them natively.
Most of the rest of this is library functions, most of which Go
already provides for the existing floating point types (I'm not sure
that Go has all the rounding functions they ask for).
Ian
Decimal64 : 64 bit decimal Floating point
Decimal128 : 128 bit decimal Floating point

These are useful in financial applications, on x86 have an efficient
software implementation by Intel, and on Power 6 and above have a hardware
implementation.

Float 128, for high precision calculations, will soon be available for
hardware implementations.

Complex256, Complex number made of two float 64's

Interval64
Interval128
Interval256

Interval arithmetic types useful for scientific computing.

On Sun, Mar 28, 2010 at 16:29, Alexis Shaw <alexi...@gmail.com> wrote:
yes, so you do not use IEEE 754 for floating point?
> The thing is that you want to have people that should be using a decimal
> type using it from the start, so that when decimal hardware becomes
> available, all the old software just requires a re-compile, not a re-write,
This assumes that decimal types are going to be widely used. Although
there are drafts adding them to C and C++, I'm not aware that they have
been adopted into either language standard as yet. I don't think
there is any need for Go to be a leader in this area.
> Also there are a few things here that you cannot do via software libraries
> currently they are:
> Specify the rounding direction,
This one is a portability issue since not all processors permit it.
I suppose we could have a function which sets the rounding direction
and returns an error if it could not be set.
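A purely hypothetical sketch of what such a function might look like; none
of these names exist in the math package or anywhere else in Go, they are
invented here only to illustrate the idea:

package main

import (
	"errors"
	"fmt"
)

// RoundingMode enumerates the four IEEE 754 rounding directions.
type RoundingMode int

const (
	ToNearestEven RoundingMode = iota // the hardware default
	TowardZero
	TowardPositive
	TowardNegative
)

// SetRoundingMode would ask the hardware to use the given direction for
// subsequent floating-point operations, returning an error on platforms
// where the direction cannot be changed. This stub always fails; a real
// version would need compiler and runtime cooperation.
func SetRoundingMode(m RoundingMode) error {
	return errors.New("changing the rounding direction is not supported")
}

func main() {
	if err := SetRoundingMode(TowardZero); err != nil {
		fmt.Println(err)
	}
}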
> Specify a programmer defined exception handler for Floating point
> exceptions.
At some point floating point exceptions will turn into a call to panic
with an argument which meets the runtime.Error interface.
> Fused multiply add operations.
Another portability issue. The interesting case with fused
multiply-add is when it should happen. Should the compiler
automatically generate it? Or should it only be available via a call
to an explicit math function?
> Defining Working precision. ie preferred width
In Go the types that you use imply the working precision. I don't see
a compelling need to provide any other mechanism.
I don't think anybody on the Google Go team is likely to work on any
of these issues any time soon. If anybody would like to write a
patch, please send a message with your proposal.
Ian
>> Decimal64 : 64 bit decimal Floating point
>> Decimal128 : 128 bit decimal Floating point
>>
>> These are useful in financial applications, on x86 have an efficient
>> software implementation by Intel, and on Power 6 and above have a hardware
>> implementation.
For those who are curious, the reason why financial applications in
particular are singled out as a place where this is useful is that the
high mucky-mucks of finance do get irritated when, instead of printing
out the "10.000000000 billion dollars" that the code was told to print,
it instead prints "9.999999999 billion dollars". That's the one and only
advantage of decimal floating-point: it doesn't do that. It's generally
no more accurate in computations; it's just conversion to and from
human-readable form that it does exactly. Financial applications are the
only ones where people care about this enough to pay for it; everyone
else who uses floating point just accepts that it is always going to be
slightly inaccurate, even if all you do is to input a variable and then
print it out.
I'm not sure why financial applications don't just use 64-bit fixed
point, calculating everything in numbers of cents, but maybe there's a
good reason.
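For what it's worth, the fixed-point-cents idea is only a few lines in Go.
This is just a sketch -- the Cents type and its methods are invented here
for illustration:

package main

import "fmt"

// Cents holds a money amount as an exact integer count of cents.
type Cents int64

// Add is exact; nothing is rounded until the int64 range overflows.
func (c Cents) Add(d Cents) Cents { return c + d }

// String puts the decimal point back in for display only.
func (c Cents) String() string {
	return fmt.Sprintf("%d.%02d", c/100, c%100)
}

func main() {
	price := Cents(1010) // $10.10, stored as 1010 cents
	total := price.Add(Cents(5))
	fmt.Println(total) // prints 10.15, exactly
}

The arithmetic itself is exact; only the display step involves a decimal
point (and only positive amounts are handled by this toy String method).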
Of course, as computation gets cheaper, what once only the big shots of
Wall Street could afford may find its way into everyone's spreadsheet:
even though it is only a very superficial improvement, it would reduce
tech support calls.
>> Float 128, for high precision calculations, will soon be available for
>> hardware implementations.
Now, that one is very useful -- even when hundreds of times slower than
double precision (as it has often been).
>> Complex256, Complex number made of two float 64's
Likewise useful, except of course it's two float 128s.
>> Interval64
>> Interval128
>> Interval256
>> Interval arithmetic types useful for scientific computing.
Uh, actually next-to-useless. Interval arithmetic is one of those things
that sounds good at first but blows up when you actually try it.
In any case, new basic types like decimal floats could be added at any
time, without having to change anything else in the language; there's no
hurry.
There are actually a couple of different variants of the decimal
floating-point types, which is another reason to wait-and-see on them.
(In one variant, the mantissa is a binary number; in the other, a
"densely packed decimal" number.)
--
Norman Yarvin http://yarchive.net
>> Fused multiply add operations.
>
>Another portability issue. The interesting case with fused
>multiply-add is when it should happen. Should the compiler
>automatically generate it? Or should it only be available via a call
>to an explicit math function?
Do those whenever you can. It is often a big speedup, and it increases
precision rather than decreasing it. If it makes a big difference in the
result of a floating-point program, then that program is doing an
ill-conditioned computation, and both results are wrong; printing out the
same number everywhere just gives the user a false sense of security.
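As it happens, the explicit-function route is the one Go eventually took:
math.FMA, added to the math package in Go 1.14 (long after this thread),
computes x*y + z with a single rounding. A minimal sketch of the precision
difference being discussed:

package main

import (
	"fmt"
	"math"
)

func main() {
	x, y, z := 0.1, 0.1, -0.01

	separate := x*y + z        // product rounded to float64, then the add rounded again
	fused := math.FMA(x, y, z) // exact product, one rounding at the very end

	// The two results typically differ, and the fused one is the more
	// accurate of the pair.
	fmt.Println(separate)
	fmt.Println(fused)
	fmt.Println(separate == fused)
}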
By the way, even with binary floating point, doing that only prints
something wrong if you print out so many digits that you're pushing the
limits of your accuracy. If you print out all the digits in a
floating-point number, you might get 9.999999999999999 rather than
10.000000000000000; but if you print out only seven digits, the print
routine will round to the nearest seven-digit number, which is 10.00000.
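A small Go illustration of that:

package main

import "fmt"

func main() {
	// Add one tenth a hundred times; in binary floating point the result
	// is not exactly 10.
	sum := 0.0
	for i := 0; i < 100; i++ {
		sum += 0.1
	}

	fmt.Printf("%.16g\n", sum) // all the digits: something like 9.99999999999998
	fmt.Printf("%.7g\n", sum)  // only seven digits: rounds back to 10
}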
> I'm not sure why financial applications don't just use
> 64-bit fixed point, calculating everything in numbers of
> cents, but maybe there's a good reason.
Or not. I had a financial programmer complain to me about
rounding in awk. He had trouble with the idea that the
rounding was done in _binary_ and what he was printing was
_decimal_.
I shan't name the large multinational bank involved in that
little exercise, but he can't have been a completely isolated
example if he was being _allowed_ to write in awk in the
first place.
(No, I don't, and won't, bank with that bank. I want one
where I am in happy ignorance about the state of their IT
department, and that wasn't the only problem I sorted for
them by a long shot.)
Seriously, the reason may be that for some currencies 64 bits
might not be enough. Although, ignoring any country with
hyperinflation, even 2^63 yen is a lot of money, so I don't know
that that reason holds up very well.
> In any case, new basic types like decimal floats could be added at any
> time, without having to change anything else in the language; there's no
> hurry.
Especially as Go already insists that all numeric types are
distinct: no problem (as C would/will have) trying to work out
and squeeze in the "usual conversions" with their little
surprises.
I have no opinion whether "now" is the time (if someone's
willing to contribute code and the Go team approve their
proposal and eventual code perhaps it is) but it doesn't look
like it _needs_ to be now. IMHO, of course.
Contrary opinions cordially invited. :)
Regards,
Giles
So it may be a while before we can test the speed, accuracy and
precision of that opcode on our laptops.
Actually, the reason that they get narky is because $10.10 is ten
dollars and ten cents (both integers) and NOT a real number. I.e. we're
talking about countable things, not points on the real line. To make
matters worse, floating-point numbers in computers are only a rough
approximation of real numbers. So you're asking them to use an imprecise
representation of real numbers to represent integers.
Peter
I agree that if we have only one option, we should do fused
multiply-adds whenever possible, but it would also be beautiful to be
able to have actual IEEE arithmetic--which has strong guarantees about
what answers you get--at least as an option.
You're right that it shouldn't make a big difference in stable
algorithms, but it's still useful. For instance, it allows you to
check whether two different programs are doing *exactly* the same
computations, since they should get *exactly* the same answer. This
can be very useful when parallelizing (or optimizing) a program, but
alas, it's very hard to convince gcc (or, from what I hear, modern
CPUs) to abide by IEEE rules. I don't know if go could achieve this,
but if it could, that would be wonderful. The problem with testing in
the absence of reproducible arithmetic is that any change you make
could change your answers, so you have to distinguish between roundoff
error and bugs, which means that if there are small "edge effects" due
to bugs, they have to be smaller than whatever the "roundoff error"
is, which is well-nigh impossible to quantify.
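The check itself is cheap to express in Go. A sketch, with computeA and
computeB standing in as placeholders for the two implementations being
compared (they are not real functions from any library):

package main

import (
	"fmt"
	"math"
)

// computeA and computeB stand in for, say, a serial and a parallelized
// version of the same calculation, which are supposed to perform exactly
// the same floating-point operations in exactly the same order.
func computeA(x float64) float64 { return (x + 1.0) * 0.5 }
func computeB(x float64) float64 { return (x + 1.0) * 0.5 }

func main() {
	x := 0.1
	a, b := computeA(x), computeB(x)

	// Compare bit patterns rather than using a tolerance: if the two
	// versions really do the same computation, every bit must agree.
	if math.Float64bits(a) == math.Float64bits(b) {
		fmt.Println("bitwise identical")
	} else {
		fmt.Printf("differ: %v vs %v\n", a, b)
	}
}

The hard part, as the rest of this thread shows, is getting the two
versions to perform the operations in the same order in the first place.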
Of course, aiming for IEEE arithmetic would also enable testing of the
correctness of the compiler on one platform against another, which
would also be nice. Users could then also test the correctness of
their code on one platform against another.
--
David Roundy
Those guarantees are exceedingly minor. To achieve them, IEEE arithmetic
has the misfeature of denormalized arithmetic, which in many
implementations acts to put a bunch of little performance bombs in your
code: if your code happens to do a few computations on numbers which are
close to underflow (normally those numbers being insignificant in the
larger computation, such that you don't care if they did underflow), it
suddenly runs half as fast, because those few computations each triggered
a trap into a software routine which did denormalized numbers, spending
hundreds of times more on each computation than is normal.
The people who designed those processors didn't implement denormalized
numbers that way out of stupidity, either; they knew that to implement
them in the fast path would require adding a couple of barrel shifters to
it, which would slow down all computations. To the question of "which
would you rather have: certain minor guarantees about the behavior of
arithmetic on values below 10^-308, or significantly-faster math on
numbers of ordinary size", the answer from people who do scientific
computation is almost always "Are you crazy?". The software traps to fix
up denormalized numbers were then added so that marketing could put a
check mark in the "IEEE arithmetic" box, and with the expectation that
people who cared about speed would turn them off.
>You're right that it shouldn't make a big difference in stable
>algorithms, but it's still useful. For instance, it allows you to
>check whether two different programs are doing *exactly* the same
>computations, since they should get *exactly* the same answer. This
>can be very useful when parallelizing (or optimizing) a program, but
>alas, it's very hard to convince gcc (or, from what I hear, modern
>CPUs) to abide by IEEE rules.
Have you actually seen examples of this being useful? I'd have thought
that it wouldn't be at all useful in practice, since even a case where
one programmer wrote
x = b + c + d
and the other wrote
x = d + c + b
would ruin it. Given the number of opportunities to write such lines
differently, the chance of a program *not* having at least one of them
written differently seems minuscule.
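For example, with ordinary binary64 arithmetic even that trivial
reordering changes the last bit:

package main

import "fmt"

func main() {
	b, c, d := 0.1, 0.2, 0.3

	x1 := b + c + d // evaluated left to right as (b + c) + d
	x2 := d + c + b // evaluated left to right as (d + c) + b

	fmt.Println(x1)       // 0.6000000000000001
	fmt.Println(x2)       // 0.6
	fmt.Println(x1 == x2) // false
}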
In any case, the usual way to deal with this is to just accept the
inaccuracy, subtract the results of the two implementations, and check
that their difference is within bounds. (For that, you have to know
error bounds for the algorithm; but if you don't know them, what are you
doing re-implementing it?)
> I don't know if go could achieve this,
>but if it could, that would be wonderful. The problem with testing in
>the absence of reproducible arithmetic is that any change you make
>could change your answers, so you have to distinguish between roundoff
>error and bugs, which means that if there are small "edge effects" due
>to bugs, they have to be smaller than whatever the "roundoff error"
>is, which is well-nigh impossible to quantify.
In the floating-point world, bugs that only produce errors of the same
size as roundoff errors aren't bugs.
(There's a rare exception to that for things like the library
implementations of sine and cosine, which are used so often that it's
worth sweating out the last tiny bit of accuracy.)
It certainly requires care, if you want to get identical output. But
the payoff is that you can guarantee that (on a given tested input)
your program gives the same output regardless of how it's been
parallelized (for all tested numbers of processors).
> In any case, the usual way to deal with this is to just accept the
> inaccuracy, subtract the results of the two implementations, and check
> that their difference is within bounds. (For that, you have to know
> error bounds for the algorithm; but if you don't know them, what are you
> doing re-implementing it?)
I know the usual way of doing this, and it's a royal pain, and it
doesn't work very well, since for non-trivial algorithms, you never
know the error bounds. At least, I've not heard of an algorithm with
known error bounds. As a concrete example, I don't have any idea how
you'd compute an error bound on the sum of an array of doubles. At
least, there's no way to compute this error bound without knowing the
values of those doubles.
>> I don't know if go could achieve this,
>>but if it could, that would be wonderful. The problem with testing in
>>the absence of reproducible arithmetic is that any change you make
>>could change your answers, so you have to distinguish between roundoff
>>error and bugs, which means that if there are small "edge effects" due
>>to bugs, they have to be smaller than whatever the "roundoff error"
>>is, which is well-nigh impossible to quantify.
>
> In the floating-point world, bugs that only produce errors of the same
> size as roundoff errors aren't bugs.
But the problem is that you can't tell whether on some other input,
the errors will be much larger than roundoff error. It's a *lot*
faster if you don't have to run your test cases for a long time, in
order to see if the errors get large.
In case you're wondering, the code I've used this approach on is meep,
a finite-difference time-domain program:
http://ab-initio.mit.edu/wiki/index.php/Meep
--
David Roundy
That's a pretty high degree of care. You can't even do things like the
usual way of adding up an array that has been distributed over
processors, which is to have each processor add up its own piece, and
then gather those partial sums up and total them -- because when you
change the number of processors, the pieces change, and thus so does the
order of summation. Instead you'd have to do something slower or more
complicated.
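A sketch of the problem in Go (sequentialSum and chunkedSum are invented
names; the chunked version mimics the per-processor partial sums, done
serially here since only the grouping of the additions matters):

package main

import (
	"fmt"
	"math"
)

// sequentialSum adds the values in index order, as a single processor would.
func sequentialSum(xs []float64) float64 {
	s := 0.0
	for _, x := range xs {
		s += x
	}
	return s
}

// chunkedSum mimics the usual parallel strategy: each of n workers sums its
// own contiguous piece, and the partial sums are then totalled.
func chunkedSum(xs []float64, n int) float64 {
	partial := make([]float64, n)
	size := (len(xs) + n - 1) / n
	for i := 0; i < n; i++ {
		lo, hi := i*size, (i+1)*size
		if lo > len(xs) {
			lo = len(xs)
		}
		if hi > len(xs) {
			hi = len(xs)
		}
		for _, x := range xs[lo:hi] {
			partial[i] += x
		}
	}
	s := 0.0
	for _, p := range partial {
		s += p
	}
	return s
}

func main() {
	xs := make([]float64, 1000)
	for i := range xs {
		xs[i] = 1.0 / float64(i+1) // arbitrary values; most inputs behave this way
	}

	a := sequentialSum(xs)
	b := chunkedSum(xs, 4)

	// The grand totals are typically close but not bit-identical, because
	// the additions are grouped differently.
	fmt.Println(a, b, math.Float64bits(a) == math.Float64bits(b))
}

Change the 4 to a different worker count and the low-order bits generally
move again.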
As for the more sophisticated forms of parallelization, in which the
algorithm actually gets altered to suit the number of processors, those
are entirely outlawed if you insist on identical output.
>> In any case, the usual way to deal with this is to just accept the
>> inaccuracy, subtract the results of the two implementations, and check
>> that their difference is within bounds. (For that, you have to know
>> error bounds for the algorithm; but if you don't know them, what are you
>> doing re-implementing it?)
>
>I know the usual way of doing this, and it's a royal pain, and it
>doesn't work very well, since for non-trivial algorithms, you never
>know the error bounds. At least, I've not heard of an algorithm with
>known error bounds. As a concrete example, I don't have any idea how
>you'd compute an error bound on the sum of an array of doubles. At
>least, there's no way to compute this error bound without knowing the
>values of those doubles.
You don't need a tight error bound, just one that is tight enough to
catch all realistic bugs. In that case, I'd print out the error divided
by the RMS value of the input array, and wouldn't worry if it were below
10^-12. Yes, that's a shoot-from-the-hip error bound; but this is
debugging, not theorem proving; absolute rigor isn't necessary.
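In Go terms, that check is only a few lines; the helper names here are
invented for the sketch, and 1e-12 is the same shoot-from-the-hip threshold:

package main

import (
	"fmt"
	"math"
)

// rms returns the root-mean-square value of xs.
func rms(xs []float64) float64 {
	var sq float64
	for _, x := range xs {
		sq += x * x
	}
	return math.Sqrt(sq / float64(len(xs)))
}

// closeEnough reports whether two computed sums of the same input array
// agree to within a roundoff-sized bound: the difference is divided by the
// RMS value of the input and compared against 1e-12.
func closeEnough(sumA, sumB float64, input []float64) bool {
	return math.Abs(sumA-sumB)/rms(input) < 1e-12
}

func main() {
	input := []float64{0.1, 0.2, 0.3, 0.4}
	fmt.Println(closeEnough(1.0, 1.0000000000000002, input)) // true: roundoff-sized
}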
>>> I don't know if go could achieve this,
>>>but if it could, that would be wonderful. The problem with testing in
>>>the absence of reproducible arithmetic is that any change you make
>>>could change your answers, so you have to distinguish between roundoff
>>>error and bugs, which means that if there are small "edge effects" due
>>>to bugs, they have to be smaller than whatever the "roundoff error"
>>>is, which is well-nigh impossible to quantify.
>>
>> In the floating-point world, bugs that only produce errors of the same
>> size as roundoff errors aren't bugs.
>
>But the problem is that you can't tell whether on some other input,
>the errors will be much larger than roundoff error. It's a *lot*
>faster if you don't have to run your test cases for a long time, in
>order to see if the errors get large.
Even the computational results coming out identical won't necessarily
save you from a bug of that size; it might have been too small to affect
even a single bit of the end result. So even if you take all the trouble
to do all your additions and multiplications in exactly the same order,
you still need test cases that exercise the full functionality of the
program, and make bugs really show themselves.
>In case you're wondering, the code I've used this approach on is meep,
>a finite-difference time-domain program:
>
>http://ab-initio.mit.edu/wiki/index.php/Meep
With codes like that, the worst part is validating them against physical
reality, or at least validating them against the theory they implement.
Validating one version of the program against another version is a piece
of cake in comparison.
Yes indeed. So you don't do this kind of check on things like that.
Fortunately, for most things that you might want to sum up, you can
check that the things that you were summing were correct. In cases
where you're summing only a few numbers, sorting them before summing
is a good idea anyhow. And, of course, you can instead simply check
the values of the numbers that are being summed.
> As for the more sophisticated forms of parallelization, in which the
> algorithm actually gets altered to suit the number of processors, those
> are entirely outlawed if you insist on identical output.
Right, I'm not arguing that it's a common situation, but it is
something I've run into, and I don't think it's unreasonable to want
to be able to predict the exact answers of my arithmetic. In integer
arithmetic, we'd never put up with unreproducible output, and it would
be beautiful to be able to avoid it with floating point numbers.
> You don't need a tight error bound, just one that is tight enough to
> catch all realistic bugs. In that case, I'd print out the error divided
> by the RMS value of the input array, and wouldn't worry if it were below
> 10^-12. Yes, that's a shoot-from-the-hip error bound; but this is
> debugging, not theorem proving; absolute rigor isn't necessary.
The trouble is that this shoot-from-the-hip error bound monotonically
increases with time. You find that under unusual circumstances you'll
find an error that is 2e-12, so you increase it, and so on. Or if you
run the test case longer, you have to increase it. And in order to
ensure that any reasonable bug gives an error bigger than 2e-12, you
have to run with more different initial conditions (or sources), since
you need to ensure that the fractional contribution from the erroneous
condition (which often will be a corner case, literally, and therefore
tends to involve relatively small contributions).
You're right that you can test a code against itself even when its
results are stochastic, and with care you can get reasonable
confidence in the output. But for some codes (like meep) it's a whole
lot easier if you can just check that the answer is exactly the same.
When propagating the fields in FDTD, each field value depends on less
than a dozen other values, so it's not hard to sum them in the same
order.
>>But the problem is that you can't tell whether on some other input,
>>the errors will be much larger than roundoff error. It's a *lot*
>>faster if you don't have to run your test cases for a long time, in
>>order to see if the errors get large.
>
> Even the computational results coming out identical won't necessarily
> save you from a bug of that size; it might have been too small to affect
> even a single bit of the end result. So even if you take all the trouble
> to do all your additions and multiplications in exactly the same order,
> you still need test cases that exercise the full functionality of the
> program, and make bugs really show themselves.
True, you certainly need (in the case of FDTD) to run with sources
that break sufficient symmetry, and run long enough that your volume
fills up with non-zero values. But that's pretty easy, and it's
sufficient that you'd need something akin to a miracle to end up with
buggy code that gives bitwise-correct results.
>>In case you're wondering, the code I've used this approach on is meep,
>>a finite-difference time-domain program:
>>
>>http://ab-initio.mit.edu/wiki/index.php/Meep
>
> With codes like that, the worst part is validating them against physical
> reality, or at least validating them against the theory they implement.
> Validating one version of the program against another version is a piece
> of cake in comparison.
Validating against physical reality is irrelevant here, as Maxwell's
equations have been pretty well verified experimentally. Validating
that we're integrating Maxwell's equations properly is a bit of a
problem, since meep handles several dimensionalities and types of
materials, but it's not particularly hard, since Maxwell's equations
are a particularly simple set of equations. Validating that every
boundary condition is correct, that we handle symmetries properly,
that we do our communication just right, etc, is much harder.
I've been told by folks who are experienced in the field that there is
no such thing as a bug-free non-trivially-parallel code. I believe
that meep is bug-free in its parallelization, precisely because I did
this kind of checking, with this kind of precision.
--
David Roundy
>On Tue, Mar 30, 2010 at 3:48 PM, Norman Yarvin <yar...@yarchive.net> wrote:
>
>> That's a pretty high degree of care. You can't even do things like the
>> usual way of adding up an array that has been distributed over
>> processors, which is to have each processor add up its own piece, and
>> then gather those partial sums up and total them -- because when you
>> change the number of processors, the pieces change, and thus so does the
>> order of summation. Instead you'd have to do something slower or more
>> complicated.
>
>Yes indeed. So you don't do this kind of check on things like that.
>Fortunately, for most things that you might want to sum up, you can
>check that the things that you were summing were correct. In cases
>where you're summing only a few numbers, sorting them before summing
>is a good idea anyhow.
You've got to be kidding. Sorting before summing is a good idea even if
you aren't worried about bitwise-identical results?
>> As for the more sophisticated forms of parallelization, in which the
>> algorithm actually gets altered to suit the number of processors, those
>> are entirely outlawed if you insist on identical output.
>
>Right, I'm not arguing that it's a common situation, but it is
>something I've run into, and I don't think it's unreasonable to want
>to be able to predict the exact answers of my arithmetic. In integer
>arithmetic, we'd never put up with unreproducible output, and it would
>be beautiful to be able to avoid it with floating point numbers.
The difference is that people normally care about every last bit of
integers being exact. With floating point, the last few bits are
normally garbage. Trying to make that garbage be the same garbage in
every implementation goes against human nature: even if the standard
dictates it, implementations are going to be careless.
>> You don't need a tight error bound, just one that is tight enough to
>> catch all realistic bugs. In that case, I'd print out the error divided
>> by the RMS value of the input array, and wouldn't worry if it were below
>> 10^-12. Yes, that's a shoot-from-the-hip error bound; but this is
>> debugging, not theorem proving; absolute rigor isn't necessary.
>
>The trouble is that this shoot-from-the-hip error bound monotonically
>increases with time. You find that under unusual circumstances you'll
>find an error that is 2e-12, so you increase it, and so on. Or if you
>run the test case longer, you have to increase it. And in order to
>ensure that any reasonable bug gives an error bigger than 2e-12, you
>have to run with more different initial conditions (or sources), since
>you need to ensure that the fractional contribution from the erroneous
>condition (which often will be a corner case, literally, and therefore
>tends to involve relatively small contributions).
In that situation, I'd be injecting a big pulse of energy right next to
the corner anyway -- not only to thoroughly expose bugs hiding there, but
also so that I didn't have to wait for waves to propagate into the
corners, but rather could test the corner code with a very brief test
run. (The injection could be done in a very crude,
physically-unrealistic way, since the goal would just be to compare two
versions of the code.)
So the way you did it still seems to me like the hard way.
>I've been told by folks who are experienced in the field that there is
>no such thing as a bug-free non-trivially-parallel code. I believe
>that meep is bug-free in its parallelization, precisely because I did
>this kind of checking, with this kind of precision.
That seems overly optimistic. Saying that there's no such thing as a
bug-free code doesn't mean that every time you run one of the codes that
do exist, you'll hit a bug. Indeed, the reason such bugs continue to
exist is that they only get triggered by rare cases, and often aren't
repeatable. In parallel computing, there are all sorts of ways for
communication or synchronization to be done slightly wrong, such that
almost all of the time it works, but there are some rare sequences of
events which trigger a bug. The timing of those events depends not only
on the partitioning of the problem, but also on random events like which
processor got which interrupt. If you got all those places in the code
completely correct, it'd be due to the communication/synchronization
model you used, not the numerics -- because bitwise-exact testing doesn't
help when the bug isn't triggered by your test suite.
In any case, I'm not trying to argue that exactly-repeatable
floating-point calculations are provably useless, just that they're a lot
less important than other issues. For instance, I'd put having a good
syntax for multidimensional arrays as being about a million times more
important.