I can use
floats with a special rounding after each operation
scaled integers with a normalisation after each multiplication/division
ratios with a special rounding after each multiplication/division
a CLOS class
???
any ideas on the subject?
Marc Battyani
> What CL types are best suited for computing with fixed decimal values with
> exactly n places after the decimal point?
>
> I can use
> floats with a special rounding after each operation
floats are out since finite-length decimal fractions often have
non-terminating binary expansions. 0.1, for example, cannot be
represented exactly in binary floating point.
> scaled integers with a normalisation after each multiplication/division
> ratios with a special rounding after each
> multiplication/division
either of these can be made to work.
> a CLOS class
> ???
a clos class might make a good wrapper.
but then again, just defining a bunch of functions might be enough.
> any ideas on the subject?
just define your "unit" to be the smallest quantum and use integers.
e.g., if you want to calculate dollars to two decimal places, forget
dollars and just use cents. translate units on input and output but
do all internal math with cents.
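a minimal sketch of the idea (untested; positive amounts only, and the
names are made up):

  (defun dollars->cents (dollars)       ; input boundary only
    (round (* dollars 100)))

  (defun cents->string (cents)          ; output boundary only
    (multiple-value-bind (d c) (floor cents 100)
      (format nil "$~D.~2,'0D" d c)))

  ;; all internal math stays in exact integer cents
  (cents->string (+ (dollars->cents 19.99) (dollars->cents 0.02)))
  => "$20.01"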
--
J o h a n K u l l s t a m
[kull...@ne.mediaone.net]
Don't Fear the Penguin!
* Johan Kullstam <kull...@ne.mediaone.net>
| floats are out since finite-length decimal fractions often have
| non-terminating binary expansions. 0.1, for example, cannot be
| represented exactly in binary floating point.
but if he knows he has N places after the decimal point, that translates
directly to a factor of (expt 10 N), which means he could do well with the
extra precision and speed of a double-float value relative to bignums.
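for example, with N = 2 (an illustrative sketch, not anyone's
production code):

  (defconstant +scale+ 100)             ; (expt 10 2)

  (defun dec* (a b)
    ;; a and b are double-floats holding scaled integral values
    (declare (double-float a b))
    (fround (/ (* a b) +scale+)))       ; renormalize, stay a float

  (dec* 150.0d0 250.0d0)                ; 1.50 * 2.50
  => 375.0d0                            ; i.e., 3.75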
#:Erik
> any ideas on the subject?
1. Integers
You can do all your calculations with integers, and you may only have
to normalize _after_ all your computations are done. Sometimes you
need to use functions that necessitate normalization (e.g., LOG), but
it's typical to use fixed-point decimals (e.g., currency amounts) only
with linear operations, such as the multiplications and divisions
(with ROUND) that you ask about. For example, a server app I'm working
on puts in the decimal point as late as HTML output rendering (even
that is done by FORMAT, so the numbers stay integers all the time).
Performance notes: ideally you should use fixnums or [UN]SIGNED-BYTE to
obtain compilation to native assembly operations. Bignum operations
tend to cons a lot, so if performance matters, try to avoid them.
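A sketch of the multiplication case (hypothetical names; the scale is
hard-wired to two digits):

  (defparameter *scale* (expt 10 2))

  (defun fx* (a b)
    ;; A and B are integers scaled by *SCALE*; ROUND renormalizes the
    ;; doubly-scaled product back to a singly-scaled integer.
    (round (* a b) *scale*))

  (fx* 150 250)   ; 1.50 * 2.50
  => 375          ; i.e., 3.75 (the second value is the remainder)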
2. Using floats (as floats)
The loss of accuracy is not always a problem, so given knowledge of
your functions and the range of your numbers, you may not have to
normalize after every operation. Bear in mind that if your numbers are
sufficiently big and/or you need many digits to the right of the
decimal point, there is _no_ normalization that can preserve accuracy,
simply because floats are fixed-size animals. Precision != accuracy.
There was a long thread about this on comp.objects several months ago.
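If you do normalize, a helper along these lines may be all you need
(illustrative only):

  (defun round-to (x n)
    "Round X to N decimal places, returning a float."
    (let ((s (expt 10 n)))
      (/ (fround (* x s)) s)))

  (round-to (/ 2.0 3.0) 3)
  => 0.667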
3. Ratios
Ratios are a reasonable alternative to undeclared integers, and I
don't think they're optimized either, unless the numerator and
denominator can somehow be declared as integers separately.
Theoretically it is the cleanest solution. If you test its
performance, let us know.
4. CLOS class
If you have a lot of these numbers, the sheer size overhead of
instances (or structs, for that matter) rules out this solution.
Unfortunately you can't make arrays store unboxed instances. It is a
nice solution if performance does not matter, but then you could just
as well use ratios.
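One possible shape for such a wrapper (a sketch; the class and slot
names are invented):

  (defclass decimal ()
    ((units :initarg :units :accessor units)  ; integer, scaled
     (scale :initarg :scale :reader scale)))  ; digits after the point

  (defun dec+ (a b)
    ;; assumes equal scales, for brevity
    (make-instance 'decimal :units (+ (units a) (units b))
                            :scale (scale a)))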
5. Using floats as MEDNUMs
A MEDNUM is an integer of 50-64 bits. You can use the valuable
50-something useful bits of a double-float for storing **integers**
without losing accuracy. The advantage over BIGNUMs is that it is
_very fast_ when properly declared and does not cons either. The
advantage over native fixed-length integers is that it allows for much
bigger numbers. The usefulness of this representation is
implementation- and version-dependent, as there may be 64- or 80-bit
integer assembly operations on platforms that future CL versions might
utilize. The problem with this method (as with fixed-size integers) is
that it's reliant on the size of the input numbers and the (interim)
results of your calculations, but it is great for things like
currency.
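For example, such integral values can live unboxed in a specialized
array (a small sketch; exact as long as everything stays below 2^53):

  (let ((v (make-array 3 :element-type 'double-float
                         :initial-contents '(1d15 2d15 3d15))))
    (reduce #'+ v))
  => 6.0d15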
6. Bring your own bignum
If your operations are few and simple, you can implement fast,
non-consing bignums (for example, using the MSB to determine whether
the remaining 31 bits make up the integer itself, an address, or a row
number in an array, possibly with the number of bignum chunks). It's a
lot of work (if you use an array, bring your own GC too), and it will
be isolated from the rest of CL (you'll have to convert to a standard
CL type to pass values on to functions you haven't implemented for
your own bignums). Do it only if you have *really* exhausted all your
options - maybe not even then.
Regards,
Robert
I'm already storing them scaled by (expt 10 N). I'm using
automatically generated :around methods on the accessors to do the
scaling. Thanks, MOP!
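A hand-written equivalent of one generated method pair, just to show
the idea (the class and slot names are hypothetical, not my actual
code):

  (defclass item ()
    ((price :initform 0 :accessor price)))    ; stored as integer cents

  (defmethod price :around ((x item))
    (/ (call-next-method) 100))               ; unscale on read

  (defmethod (setf price) :around (value (x item))
    (call-next-method (round (* value 100)) x))  ; scale on write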
Using double-floats is a good idea if the scaled numbers fit in a
double-float. There is even the #'FROUND function, which scales,
rounds, and gives back a float to do just that.
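For example:

  (fround 12345.678d0 100)
  => 123.0d0   ; and the remainder as a second value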
Common Lisp is so great!
Marc Battyani
> 1. Integers
>
> You can do all your calculations with integers, and you may only have
> to normalize _after_ all your computations are done. Sometimes you
> need to use functions that necessitate normalization (e.g., LOG), but
> it's typical to use fixed-point decimals (e.g., currency amounts) only
> with linear operations, such as the multiplications and divisions
> (with ROUND) that you ask about. For example, a server app I'm working
> on puts in the decimal point as late as HTML output rendering (even
> that is done by FORMAT, so the numbers stay integers all the time).
You also have to normalize after multiplying by another scaled
integer, and this normalization can be tedious when you can have
arbitrary expressions.
> Performance notes: ideally you should use fixnums or [UN]SIGNED-BYTE to
> obtain compilation to native assembly operations. Bignum operations
> tend to cons a lot, so if performance matters, try to avoid them.
>
> 2. Using floats (as floats)
>
> The loss of accuracy is not always a problem, so given knowledge of
> your functions and the range of your numbers, you may not have to
> normalize after every operation. Bear in mind that if your numbers are
> sufficiently big and/or you need many digits to the right of the
> decimal point, there is _no_ normalization that can preserve accuracy,
> simply because floats are fixed-size animals. Precision != accuracy.
> There was a long thread about this on comp.objects several months ago.
You also have to round the result to the nth decimal digit, i.e., 2/3
=> 0.667 if n = 3, and that's more time-consuming than rounding to an
integer value. But it's the easiest representation to use.
> 3. Ratios
>
> Ratios are a reasonable alternative to undeclared integers, and I
> don't think they're optimized either, unless the numerator and
> denominator can somehow be declared as integers separately.
> Theoretically it is the cleanest solution. If you test its
> performance, let us know.
The trouble with ratios is that they will be converted to floats when
used with floats. And as you say, the performance has to be tested.
Anyway, as with floats, I will have to round them to the nth decimal
digit, i.e., 22/7 => 314/100 => 157/50, which will impact performance.
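Something like this (a throwaway sketch):

  (defun round-ratio (r n)
    (let ((s (expt 10 n)))
      (/ (round (* r s)) s)))

  (round-ratio 22/7 2)
  => 157/50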
> 4. CLOS class
>
> If you have a lot of these numbers, the sheer size overhead of
> instances (or structs, for that matter) rules out this solution.
> Unfortunately you can't make arrays store unboxed instances. It is a
> nice solution if performance does not matter, but then you could just
> as well use ratios.
Yes, I will also have to shadow the CL arithmetic functions like cl:+.
> 5. Using floats as MEDNUMs
>
> A MEDNUM is an integer of 50-64 bits. You can use the valuable
> 50-something useful bits of a double-float for storing **integers**
> without losing accuracy. The advantage over BIGNUMs is that it is
> _very fast_ when properly declared and does not cons either. The
> advantage over native fixed-length integers is that it allows for much
> bigger numbers. The usefulness of this representation is
> implementation- and version-dependent, as there may be 64- or 80-bit
> integer assembly operations on platforms that future CL versions might
> utilize. The problem with this method (as with fixed-size integers) is
> that it's reliant on the size of the input numbers and the (interim)
> results of your calculations, but it is great for things like
> currency.
This one, also suggested by Erik Naggum, seems to be the way to go in my case.
> 6. Bring your own bignum
>
> If your operations are few and simple, you can implement fast,
> non-consing bignums (for example, using the MSB to determine whether
> the remaining 31 bits make up the integer itself, an address, or a row
> number in an array, possibly with the number of bignum chunks). It's a
> lot of work (if you use an array, bring your own GC too), and it will
> be isolated from the rest of CL (you'll have to convert to a standard
> CL type to pass values on to functions you haven't implemented for
> your own bignums). Do it only if you have *really* exhausted all your
> options - maybe not even then.
This one seems a little bit too much for my use.
Is MEDNUM an official term or did you just invent it?
Marc Battyani
> Erik Naggum <er...@naggum.no> wrote in message
> news:31610812...@naggum.no...
> > * Marc Battyani
> > | What CL types are best suited for computing with fixed decimal values
> > | with exactly n places after the decimal point?
> > |
> > | I can use
> > | floats with a special rounding after each operation
> >
> > * Johan Kullstam <kull...@ne.mediaone.net>
> > | floats are out since finite-length decimal fractions often have
> > | non-terminating binary expansions. 0.1, for example, cannot be
> > | represented exactly in binary floating point.
> >
> > but if he knows he has N places after the decimal point, that translates
> > directly to a factor of (expt 10 N), which means he could do well with the
> > extra precision and speed of a double-float value relative to
> > bignums.
yes. you are correct. i stand corrected. thanks.
> I'm already storing them scaled by (expt 10 N). I'm using
> automatically generated :around methods on the accessors to do the
> scaling. Thanks, MOP!
sweet!
> Using double-floats is a good idea if the scaled numbers fit in a
> double-float. There is even the #'FROUND function, which scales,
> rounds, and gives back a float to do just that.
in addition, i have found that floating point math can be faster than
integer math in some cases. in a turbo-code implementation i did for
work, i had to do a lot of multiply-accumulate operations in a bunch
of tight loops. the intel x86 cpu has very few (assembly-accessible)
integer registers, and the floating point unit can run somewhat in
parallel with the integer unit. this meant that by keeping the loop
counters in integer registers and doing the multiply-accumulates in
floating point, the code turned out to be faster (as benchmarks
showed) than doing it all with integers.
the important part about fixnums is not how big they are, but how they
interact with other implementation limits in the language. e.g., array
indices are all fixnums. allowing bignums for array indices is a bad
idea. it is therefore valuable to have a type which is already necessary
in the language explicitly available for similarly constrained operations.
#:Erik
> Is MEDNUM an official term or did you just invent it?
I'd guess that 70-90% of bignum use fits in 64 or somewhat fewer bits,
and that integers beyond 64 or 80 bits are mainly used for specialized
areas such as physics or math research. Even the numerical methods of
analysis (integration etc.) have to be happy with fixed-width
floating-point numbers.
It is rare that we want to measure things with a very high level of
accuracy (e.g., measuring body weight as a number of neutrons and
protons). I think 64 bits are enough to express the entire amount of
money in the world in Italian lire and still allow a few tag bits.
OTOH, a lot of real-life, "mainstream" problems cannot be solved with
the fixnums of some implementations or, even worse, the ANSI lower
limit of 16 bits.
The word MEDNUM (or, for that matter, FIXNUM, except for historical
reasons) should not show up in the ANSI CL specification, because it
is sufficient to
- declare integers with (INTEGER X Y)
- query the supported native and immediate integer size limits the way
  the MOST-*-FIXNUM constants do
- query upgraded types (e.g., to exploit room above the bare minimum)
  -> UPGRADED-ARRAY-ELEMENT-TYPE
- rely on subtypes of (INTEGER -32768 32767) being immediate
In turn, the implementation would choose whatever representation it
wants (be it a 30-bit fixnum, a 64-bit integer, an integerized float,
a bignum, etc.), as it does now.
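E.g., a portable declaration in that spirit (a sketch; the type name
just reuses this thread's coinage):

  (deftype mednum ()
    '(integer #.(- (expt 2 53)) #.(expt 2 53)))

  (defun scaled+ (a b)
    (declare (type mednum a b))
    ;; the implementation is free to back this with fixnums, 64-bit
    ;; integers, integerized floats, or bignums
    (+ a b))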
Maybe people who need more than 50-64 bits for what they are doing
would care to comment. We can make a separate category for nominations
like representing a 30MB text file as a bignum :-)
Robert
> It is rare that we want to measure things with a very high level of
> accuracy (e.g., measuring body weight as a number of neutrons and
> protons). I think 64 bits are enough to express the entire amount of
> money in the world in Italian lire and still allow a few tag bits.
(Slightly off-topic) However, that's not the only reason for high
precision. Algorithmic stability is another (and, I'd assume, an
enormously dominant one for FP calculations).
--tim
The issues numerical analysts typically deal with are
1. numerical conditioning
2. numerical stability
The *problem* you are trying to solve may or may not be well conditioned.
The *algorithm* you use may or may not be stable.
For example, you may need to solve a system of linear equations
represented by A*x=y: find x given A and y where x and y are n-vectors
and A is an nxn matrix. If the condition number for A wrt inversion
is 10^p then you should expect to have about p digits of error
in the answer (in the Euclidean-norm sense), provided you are using
a numerically stable algorithm (e.g., Gaussian elimination w/ partial
pivoting). So if A has a condition number of 10^5 and you use
single precision w/ 8 digits, your answer will be good to ~3 digits.
If you use double precision w/ 16 digits, your answer will be good
to ~13 digits.
Matt
--
Matthew.R.Wette at jpl.nasa.gov -- I speak for myself, not for JPL.
> the important part about fixnums is not how big they are, but how they
> interact with other implementation limits in the language. e.g., array
> indices are all fixnums. allowing bignums for array indices is a bad
> idea. it is therefore valuable to have a type which is already necessary
> in the language explicitly available for similarly constrained operations.
Yes, and the similarity of constraints comes from the size of the
address space - i.e., it does not make sense to allow larger in-memory
arrays than the whole address space itself.
Maximum array sizes are limited by things other than the maximum image
size and the range of fixnums, though - except maybe in CMUCL.
Robert