Testing for compliance


Diego Masotti

Jun 1, 2023, 9:34:31 AM
to Unum Computing
Hi,
I'm building a posit-emulating library in the Zig programming language.
I've implemented addition, multiplication, square root, division, and conversions between floats and integers, for all posit and quire types, and tested against Cerlane Leong's SoftPosit library.

Now I'm tackling logarithms and trigonometry. How can I test for compliance?
Are there files available to download?

Thank you in advance.

John Gustafson

Jun 1, 2023, 12:37:21 PM
to Diego Masotti, Unum Computing
There is a great need for a compliance test suite. It's possible that Theo Omtzigt has something close to what you want in his Universal Library.

My approach to date is to turn the precision down so far that exhaustive testing of every input bit pattern is possible, and compare the results with Mathematica as the oracle. I build a routine that converts a posit bit pattern p into its mathematical meaning (maintained by Mathematica), pToX(p), and another one (harder to write correctly) that converts any input x into its correctly rounded posit value, xToP(x).
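To make that concrete, here is a minimal C sketch of such a pToX for 16-bit posits (eS = 2). The names and layout are illustrative rather than from any particular library, and a double can hold every finite 16-bit posit exactly:

```c
#include <math.h>
#include <stdint.h>

/* Sketch of pToX for 16-bit posits (eS = 2): decode a bit pattern into
   its mathematical value.  Every finite 16-bit posit fits exactly in a
   double; NaR maps to NAN.  Names are illustrative, not from a library. */
double pToX(uint16_t p) {
    if (p == 0x0000) return 0.0;
    if (p == 0x8000) return NAN;                 /* NaR */
    int sign = p >> 15;
    uint16_t q = sign ? (uint16_t)-p : p;        /* two's-complement negate */
    int r0 = (q >> 14) & 1;                      /* regime: run of r0 bits  */
    int m = 0, i = 14;
    while (i >= 0 && ((q >> i) & 1) == r0) { m++; i--; }
    i--;                                         /* skip the terminating bit */
    int k = r0 ? m - 1 : -m;
    int e = 0, eBits = 0;                        /* up to 2 exponent bits */
    while (i >= 0 && eBits < 2) { e = (e << 1) | ((q >> i) & 1); i--; eBits++; }
    e <<= 2 - eBits;                             /* missing exponent bits are 0 */
    int fBits = (i >= 0) ? i + 1 : 0;            /* whatever remains is fraction */
    double f = (double)(q & ((1u << fBits) - 1)) / (1u << fBits);
    double x = ldexp(1.0 + f, 4 * k + e);        /* 2^(4k+e) * 1.f */
    return sign ? -x : x;
}
```

xToP is the inverse with round-to-nearest-even, and it is the harder one to get right: it has to handle ties, saturation at ±maxPos, and mapping NaN or infinite inputs to NaR.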

Suppose you are testing the exp function. For each possible input bit pattern posit_x, of which there are 2^nBits, use an oracle program like Mathematica to compute

posit_y = xToP(exp(pToX(posit_x)))

and compare it with the bit pattern your library produces. You should be able to do 4 billion cases exhaustively in a tractable amount of time (especially if you use multiple cores to do it in parallel), which will get you up to nBits = 32 (single-precision posits). To go beyond 32 bits, you'll want to test only the hard-to-round cases, and Vincent Lefèvre at INRIA (the French National Institute for Research in Computer Science and Control) has compiled what those are for 64-bit IEEE floats for many transcendental functions; I suspect his approach to finding hard-to-round values can be repurposed for 64-bit posits.
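The driver loop itself is tiny. Here is a C sketch for 16-bit posits, using the pToX/xToP pair sketched above and double-precision exp as a stand-in oracle (adequate except at borderline rounding cases, where a true oracle like Mathematica or MPFR is needed); posit16_exp is a hypothetical name for the function under test:

```c
#include <stdint.h>
#include <stdio.h>
#include <math.h>

double   pToX(uint16_t p);        /* conversions sketched above */
uint16_t xToP(double x);          /* assumed to map NaN to NaR and to
                                     saturate overflow at +/-maxPos   */
uint16_t posit16_exp(uint16_t p); /* hypothetical function under test */

int main(void) {
    long failures = 0;
    for (uint32_t p = 0; p <= 0xFFFF; p++) {   /* every bit pattern */
        uint16_t want = xToP(exp(pToX((uint16_t)p)));
        uint16_t got  = posit16_exp((uint16_t)p);
        if (got != want) {
            if (failures++ < 20)
                printf("0x%04X: got 0x%04X, want 0x%04X\n", p, got, want);
        }
    }
    printf("%ld mismatches out of 65536\n", failures);
    return failures != 0;
}
```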

I'm skeptical that people actually need 64-bit precision for transcendentals, though you do need more than 32-bit precision when using logs and exponentials to compute pow(x, y) for 32-bit posit x and y.
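As a rough illustration with IEEE floats (not posits) of why the intermediate precision matters: exp turns absolute error in the exponent y·log(x) into relative error in the result, so a 32-bit intermediate chain is off by many ulps even for modest inputs:

```c
#include <stdio.h>
#include <math.h>

/* Illustration (IEEE floats rather than posits): computing pow(x, y)
   as exp(y*log(x)) amplifies any rounding of y*log(x) by the magnitude
   of that product, so a 32-bit intermediate is not enough to get a
   correctly rounded 32-bit result. */
int main(void) {
    float  viaSingle = expf(27.5f * logf(3.0f));   /* 32-bit chain */
    double viaDouble = exp(27.5 * log(3.0));       /* 64-bit chain */
    printf("32-bit chain: %.9g\n",  viaSingle);    /* last digits wrong */
    printf("64-bit chain: %.17g\n", viaDouble);
    return 0;
}
```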

John


Diego Masotti

Jun 1, 2023, 4:24:09 PM
to Unum Computing

I'm not familiar with Mathematica or Haskell.
I know it's a stretch, but can you provide a text file with the bit patterns for sin, cos, etc., at least for 16-bit posits?
If not, can you link to a Mathematica program for posits as a starting point?

Diego

John Gustafson

Jun 1, 2023, 10:41:40 PM
to Diego Masotti, Unum Computing
I'm happy to share the Mathematica template I use (as a PDF) in studying both IEEE floats and posits. It's not very well-commented… I'm working on that for my next book.

John

posit-Mathematica.pdf

Nathan Waivio

Jun 2, 2023, 6:10:02 PM
to Unum Computing
I would also benefit from a source of golden data. I made a quick program to calculate the values of `sin`, and it generates a 2.4 MB CSV file, so that type of data would probably have to be stored on posithub.org.

The first two values I got are:
0b1000000000000000,0b1000000000000000
0b1000000000000001,0b0000000000000000
...

This makes me concerned that I have a loss-of-precision bug in the range-reduction phase of my implementation of the sin function.

If I use Wolfram Alpha I would get:
> sin(-72057594037927936)
-0.9669999630612707205376786386263821996011905277375724842...


So, I believe the correct P16 value should be:
0b1100000010000111
That is: -0.967041015625

I'll have to fix some bugs in my implementation.

Thanks,
Nathan.

John Gustafson

Jun 2, 2023, 9:53:41 PM
to Nathan Waivio, Unum Computing
Nathan,

Your first value is correct, since sin(NaR) = NaR.

I can confirm that if your input is 0b1000000000000001 = -maxPos = -2^56 = -72057594037927936, then the sin of that is the number you got from Wolfram Alpha, and the 16-bit posit (eS = 2) for it is 0b1100000010000111 (which I like to color-code by field for legibility). It sounds like you may be new to the problem of argument reduction. You need to store the value of π to enough precision that your angle modulo π is accurate enough for the result to round correctly. This is a classic problem, and some great shortcuts have been found for it, because you only need a range of digits of π, not all of them. Beyond some large magnitude, you're multiplying an integer times 2π and can ignore the effect of that; below some small value, the digits cannot affect the sine of the reduced angle. The window moves as you increase the magnitude.
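For moderate arguments, the classic trick is Cody-Waite reduction: split the constant into pieces with enough trailing zero bits that the leading multiply-subtracts are exact. A rough C sketch (the constants are decimal truncations for illustration; production code like fdlibm chooses binary-exact splits and switches to Payne-Hanek reduction for huge arguments):

```c
#include <math.h>

/* Sketch of Cody-Waite reduction of x modulo 2*pi, for moderate |x|.
   P1 has only a few significand bits, so n*P1 is exact and the first
   subtraction cancels without rounding error; real implementations
   choose P2 the same way.  Constants shown are illustrative truncations
   of 2*pi = P1 + P2 + P3 + ... */
double reduce2pi(double x) {
    const double P1 = 6.28125;              /* exact in binary (201/32) */
    const double P2 = 0.0019353071795864;   /* next chunk of 2*pi */
    const double P3 = 7.6925e-17;           /* tail correction */
    double n = nearbyint(x / 6.283185307179586);
    return ((x - n * P1) - n * P2) - n * P3;
}
```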

It looks like you multiplied that big number by some representation of π, perhaps a double-precision float with only 53 bits of significand, and your modulo logic returned zero, so you got sin(posit) = 0.

While the Posit Standard includes Sin(posit) as a required function, purely for compatibility with just about every existing environment, I strongly recommend that people use SinPi(posit) = Sin(posit × π). Outside some range, all posits are signed integers (no fractional parts) and the SinPi of those values is exactly zero, greatly simplifying the code and making it run faster. Those angle values are the ones you need for radix-2 and radix-4 Fast Fourier Transforms, anyway. It seems ridiculous to go around the unit circle on the order of 2^54 times and then want a precise value of a trig function, any trig function.
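To see why SinPi is so much easier, here is a double-precision sketch (not posit code): the reduction step is a single subtraction that is exact in floating-point arithmetic, with no long-π problem at all.

```c
#include <math.h>

/* Sketch of sinpi(x) = sin(pi*x) in double, to show why the reduction
   is trivial: x minus its nearest integer is computed exactly, so huge
   (hence integer-valued) arguments give exactly zero with no heroics. */
double sinpi_sketch(double x) {
    const double PI = 3.141592653589793;    /* pi rounded once, at the end */
    double n = nearbyint(x);                /* nearest integer */
    double r = x - n;                       /* exact; |r| <= 0.5 */
    double s = sin(PI * r);                 /* one small polynomial range */
    return fmod(n, 2.0) != 0.0 ? -s : s;    /* sin(pi(n+r)) = (-1)^n sin(pi r) */
}
```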

Don't reinvent the wheel… just look up the standard ways to do high-accuracy argument reduction for trig functions and apply them to the input values.

Once you get the reduced argument, I recommend the methods described in "An Approach to Generate Correctly Rounded Math Libraries for New Floating-Point Variants" by Jay Lim et al. They applied my "Minefield Method" with full automation, and they get correctly rounded functions that are uniformly fast for all inputs. While Intel's ICC math library and the gcc math library have many outputs that are not correctly rounded, the method of that paper not only achieved perfect rounding but also was faster than ICC.

If you want to discuss further, maybe we should take this off the shared email and I can share my algorithms for SinPi(posit), written in Mathematica. Unlike the methods of Jay Lim, I use 100% integer operations (no reliance on the availability of fast floating-point arithmetic), so those should be even faster than the ones described in Lim's paper. They might be helpful just to get all the exception regions right, like posit values so small that sin(posit) can return posit itself as the correctly rounded result.

I'll attach a plot of SinPi(posit) where posit ranges from 0 to 32767 as a bit string input (representing 0 to 2^56 as real numbers). You can see how organized and predictable the output becomes, as opposed to the chaotic and usually meaningless scatter plot you get with the Sin function (whether using posits or floats).

Best,
John

SinPi-PositivePosits.pdf

Diego Masotti

Jun 3, 2023, 4:42:44 AM
to Unum Computing
I've run into some difficulties working with Mathematica.
Searching around, I found the GNU MPFR project. It's an arbitrary-precision arithmetic library.
If you can integrate C functions into your framework, I suggest using it for testing; I found it fairly easy to use.
It also provides a PDF that describes the algorithms it uses.
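For example, here is a sketch of checking the -maxPos case from earlier in the thread with MPFR (compile with -lmpfr -lgmp); it should print the same digits Wolfram Alpha gave:

```c
#include <stdio.h>
#include <mpfr.h>

/* Use MPFR as the oracle: compute sin(-2^56) (the -maxPos case) with
   generous working precision, then print enough digits to round a
   16-bit posit by hand or by an xToP routine. */
int main(void) {
    mpfr_t x, y;
    mpfr_init2(x, 256);                       /* 256 bits of precision */
    mpfr_init2(y, 256);
    mpfr_set_si_2exp(x, -1, 56, MPFR_RNDN);   /* x = -1 * 2^56, exact */
    mpfr_sin(y, x, MPFR_RNDN);                /* correctly rounded sin */
    mpfr_printf("sin(-2^56) = %.40Rf\n", y);
    mpfr_clears(x, y, (mpfr_ptr) 0);
    return 0;
}
```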

Diego

Nathan Waivio

Jun 3, 2023, 4:54:19 PM
to Unum Computing
My Haskell implementation is 100% Posit and doesn't use Floats.
The sine and cosine implementation was adapted from MathOnWeb
Step 0: Convert P16 to P32
Step 1: Reduce by Periodicity
Step 2: Use Symmetry
Step 3: Use Co-function
Step 4: Use Polynomial (from Jan J Tuma "Handbook of Numerical Calculations in Engineering")
Step 5: Convert P32 to P16

The issue I spotted was in Step 1 where I was thoughtlessly doing `posit / (2pi)`, and then discarding the integral part and keeping the fractional part in a separate step.
The improvement I made was to combine the previous computations and perform the operation with Rational numbers.
Attached are some images showing some improvement in accuracy on the ends.

The Issue:
Precision Loss Issue sin and cos P16.png

After Fusing the Periodicity step:
Precision Loss Fix sin and cos P16.png

Thanks for pointing out Jay Lim's paper; I hadn't seen it before. I'll review it and see what other improvements I can make.

Another change I'm considering is modifying Step 0 to convert P16 to P256, but that is a big hammer I'd rather not use.

Nathan.

MitchAlsup

Jun 3, 2023, 5:09:26 PM
to Unum Computing
You have reached the point where you need Payne and Hanek argument reduction to deal with the "great big" arguments. This is where you cannot rely on i = int( fp ) to create the right integer from a really big FP number, because in creating the int, the FP number <sometimes> gets rounded up, and the resulting int no longer accurately represents the closest integer to the argument modulo 2×pi, on which the polynomial depends.
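A small C illustration of the scale of the failure (values chosen to match the -maxPos example above):

```c
#include <stdio.h>
#include <math.h>

/* Why i = int(fp) reduction fails for big arguments: at x = 2^56 the
   quotient n is ~1.1e16, so the representation error of twopi alone
   (~2.4e-16) contributes ~2.8 radians of error to n*twopi, and rounding
   the ~7.2e16 product (where 1 ulp = 16) adds even more.  The "reduced"
   angle printed below is meaningless; Payne-Hanek reduction fixes this. */
int main(void) {
    const double twopi = 6.283185307179586;   /* 2*pi rounded to double */
    double x = 0x1p56;                        /* 2^56, i.e. 16-bit maxPos */
    double n = nearbyint(x / twopi);          /* quotient, itself rounded */
    double r = x - n * twopi;                 /* naive "reduced" angle */
    printf("naive remainder: %.17g\n", r);
    return 0;
}
```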

If you plotted in <negative> log-base-2, your graphs would read directly in bits of accuracy instead of digits.
