Unum in Clojure/ClojureScript?

Richard Davies

Nov 6, 2015, 6:07:32 PM
to Numerical Clojure
I posted this on the Clojure group but it's more likely to be of interest here.

Unum is a number representation system that is a superset of both integers and IEEE floats. It avoids many of the problems of floating-point arithmetic: there is no rounding (inexact results are explicitly marked as inexact), no overflow to infinity, no underflow to zero, and it is safe to parallelize.
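
Very roughly, the key idea is the "ubit": instead of silently rounding, an inexact result is marked as lying in the open interval between two adjacent representable values. Here is a toy Clojure sketch of just that flavour. It is not the actual unum bit layout from the book, only an illustration of exact-vs-interval results:

;; Toy illustration only: NOT the real unum encoding, just the ubit flavour.
;; A value is either exact, or known to lie between two adjacent
;; representable numbers in a deliberately coarse decimal format.
(defn approx
  "Represent x with `digits` decimal digits after the point.
   When x is not representable, the true value lies strictly
   between :lo and :hi and :exact? is false."
  [x digits]
  (let [scale (Math/pow 10.0 (double digits))
        lo    (/ (Math/floor (* x scale)) scale)
        hi    (/ (Math/ceil  (* x scale)) scale)]
    {:lo lo :hi hi :exact? (== lo hi)}))

(approx 0.25 2)        ;; => {:lo 0.25, :hi 0.25, :exact? true}
(approx (/ 1.0 3.0) 2) ;; => {:lo 0.33, :hi 0.34, :exact? false}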

I was wondering if anyone has implemented or is interested in implementing this in Clojure/ClojureScript (or Java/JavaScript)?

There is an existing reference implementation written in Mathematica and it appears a port has already been done to Python: https://github.com/jrmuizel/pyunum

Mike Anderson

Nov 7, 2015, 2:46:46 AM
to Numerical Clojure
Looks interesting, thanks for sharing!

I like the concept of unums, particularly:
- The variable-length representation, which might be more efficient than full 8-byte doubles for many computations
- The idea of encoding ... for numbers where rounding has taken place

Problems I see:
- Lack of hardware support means these will be slow (compared to primitive doubles etc.)
- I think the claim of the "end of error" is hyperbole. At best it helps to mitigate some classes of error
- Not as good as other representations for certain cases (e.g. rationals, which we already have in Clojure! See the short comparison just after this list)
- I've always found doubles sufficient (at least for my work). I suspect it is solving a non-problem for users.
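
On the rationals point, here is a quick comparison at the REPL, showing the kind of thing Clojure's built-in ratios already get exactly right where doubles drift:

;; doubles accumulate rounding error; Clojure ratios stay exact
(+ 0.1 0.2)                  ;; => 0.30000000000000004
(+ 1/10 2/10)                ;; => 3/10
(reduce + (repeat 10 0.1))   ;; => 0.9999999999999999
(reduce + (repeat 10 1/10))  ;; => 1N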

I guess unums could be supported fairly quickly in Clojure if there were a Java library that we could wrap.
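
The wrapper itself would be thin. Something like the sketch below, where the Unum class and its methods are entirely hypothetical (no such library exists as far as I know), purely to show the shape of the interop:

;; Everything below assumes a hypothetical Java class org.example.unum.Unum
;; with valueOf/add/multiply/isExact methods; the names are made up purely
;; to illustrate how thin a Clojure wrapper over a Java unum library could be.
(import 'org.example.unum.Unum)

(defn unum   [x]   (Unum/valueOf (double x)))
(defn u+     [a b] (.add ^Unum a ^Unum b))
(defn u*     [a b] (.multiply ^Unum a ^Unum b))
(defn exact? [a]   (.isExact ^Unum a))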

Richard Davies

Nov 7, 2015, 10:23:42 AM
to Numerical Clojure
Hi Mike,

Thanks for responding! :) I highly recommend getting the book and working your way through it, as it will give you a much better answer to your questions than I can. It is 400 pages, and even though I am not a mathematician I am surprised and impressed by the ideas it contains, and find it very accessible and surprisingly entertaining. I think the reviews on Amazon reflect my opinions:


With regards to some of your points:
- Agreed, a hardware implementation will always be faster, but one of the interesting points made in the book is that double-precision functions can in many circumstances over-represent the accuracy of the actual data, which leads to a lot of wasted computation, for example in trig functions. Additionally, IEEE floating-point arithmetic is not associative, so it can't be reliably parallelized (there is a quick REPL demo just after this list); that is another area of potentially significant performance gains for unum-based functions.
- As to the "end of error" claim, I again recommend you read the book. In one sense the claim is true, because the unum representation provides the most accurate representation of a real number that is possible with the given number of bits. What is really interesting is how unums avoid the errors that arise when IEEE floats are propagated through chains of functions.
- Ratios are one solution, but they are quite expensive as they have to maintain a common denominator, which can itself overflow.
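
Here is the associativity point at the REPL, with ordinary doubles (nothing unum-specific):

;; IEEE doubles: (a + b) + c is not always equal to a + (b + c)
(+ (+ 0.1 0.2) 0.3)                  ;; => 0.6000000000000001
(+ 0.1 (+ 0.2 0.3))                  ;; => 0.6
;; so a parallel sum that regroups its terms can change the answer
(reduce + [1.0e100 1.0 -1.0e100])    ;; => 0.0
(reduce + [1.0e100 -1.0e100 1.0])    ;; => 1.0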

Even if you don't think that unums are practical for what you currently do, I'd still recommend you look into them a bit more, because they are very elegant and understandable and solve a variety of problems that you may encounter. For example, have you ever used floats to solve polynomial equations? There is information loss "built in" to the standard approach that is avoided with unums. This is just one example of many. At the very least they will make you think differently about how numbers should be encoded. I probably sound like a shill, but I really do think this work is an important contribution to numerical computing.
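
For a concrete taste of that kind of information loss with plain doubles (no unums involved), evaluate (x - 1)^7 in its expanded form near x = 1; the large terms nearly cancel and the rounding noise swamps the true value:

;; (x - 1)^7 written out as x^7 - 7x^6 + 21x^5 - 35x^4 + 35x^3 - 21x^2 + 7x - 1
(defn p-expanded [x]
  (+ (Math/pow x 7)
     (* -7.0 (Math/pow x 6))
     (* 21.0 (Math/pow x 5))
     (* -35.0 (Math/pow x 4))
     (* 35.0 (Math/pow x 3))
     (* -21.0 (Math/pow x 2))
     (* 7.0 x)
     -1.0))

(Math/pow (- 1.0001 1.0) 7)  ;; about 1e-28, the true value
(p-expanded 1.0001)          ;; rounding noise, many orders of magnitude larger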

I'm hacking away at some Java to see if I can replicate the examples. I'll post anything useful to github.

Regards,
Richard

Mike Anderson

Nov 16, 2015, 1:10:10 AM
to Numerical Clojure


On Saturday, 7 November 2015 23:23:42 UTC+8, Richard Davies wrote:
> Hi Mike,
>
> Thanks for responding! :) I highly recommend getting the book and working your way through it, as it will give you a much better answer to your questions than I can. It is 400 pages, and even though I am not a mathematician I am surprised and impressed by the ideas it contains, and find it very accessible and surprisingly entertaining. I think the reviews on Amazon reflect my opinions:
>
> With regards to some of your points:
> - Agreed, a hardware implementation will always be faster, but one of the interesting points made in the book is that double-precision functions can in many circumstances over-represent the accuracy of the actual data, which leads to a lot of wasted computation, for example in trig functions. Additionally, IEEE floating-point arithmetic is not associative, so it can't be reliably parallelized; that is another area of potentially significant performance gains for unum-based functions.
> - As to the "end of error" claim, I again recommend you read the book. In one sense the claim is true, because the unum representation provides the most accurate representation of a real number that is possible with the given number of bits. What is really interesting is how unums avoid the errors that arise when IEEE floats are propagated through chains of functions.

I somewhat disagree with the above: at best it may be an efficient and accurate way of representing certain subsets of real numbers. It doesn't eliminate numerical error, and it is definitely not always the best that is "possible", since the definition of "best" depends on your problem domain. Unums definitely aren't the "end of error" for people working exclusively with rationals, for example.

There are also problem domains where error doesn't matter much: I do some work with machine learning, for example, where the numerical error in double-precision arithmetic is negligible (by 10+ orders of magnitude) compared to the random noise used in the training algorithms.

My basic argument is that there are many different possible representations of numbers, and they all have different trade-offs. It's nice to have unums as an extra tool, for sure, though personally I can't see many areas where I'd have a practical use case for unums, even though I find the ideas interesting.
 
> - Ratios are one solution, but they are quite expensive as they have to maintain a common denominator, which can itself overflow.

Unums are going to be quite expensive too without hardware support :-). And you can avoid overflow issues with ratios by using BigIntegers etc. 
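
In fact Clojure's Ratio type already stores its numerator and denominator as BigIntegers, so the denominator can grow arbitrarily without overflowing, for example:

;; Ratio arithmetic in Clojure is exact and BigInteger-backed
(reduce * (repeat 40 1/3))
;; => 1/12157665459056928801  (denominator is 3^40, larger than Long/MAX_VALUE)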

Most significantly for the argument though, ratios eliminate important classes of numerical error which unums don't address. Basically I think the "end of error" is a misleading title and an overly broad claim, because it depends on what sort of error you care about.

 
> Even if you don't think that unums are practical for what you currently do, I'd still recommend you look into them a bit more, because they are very elegant and understandable and solve a variety of problems that you may encounter. For example, have you ever used floats to solve polynomial equations? There is information loss "built in" to the standard approach that is avoided with unums. This is just one example of many. At the very least they will make you think differently about how numbers should be encoded. I probably sound like a shill, but I really do think this work is an important contribution to numerical computing.

> I'm hacking away at some Java to see if I can replicate the examples. I'll post anything useful to github.

Would be cool to see! I'm definitely not suggesting that unums are a bad idea; I just think it is important to recognise that they aren't a panacea either. There is in fact no way to represent all reals accurately on a Turing machine. Proof left as an exercise for the reader :-)