Ever since a 2-day trip to London, where I read an article on my phone about unums, I have liked posits. Posits allow for generally more precise operations, while also having properly defined comparison operators: only one NaR, and no ±0 or ±Inf. Posits have only 2 exception values, which makes handling them much easier. This blog post will go over Gosit and some of the decisions made during its development.
For the rest of this blog post I will refer to the IEEE 754 representation as float, or traditional float.
A posit is an approximation of a real number, much like a traditional float. There is only a finite amount of precision in computers: you only have a certain amount of RAM available, and ideally you don't want to use all of it for a single number. Programs usually use the 32- or 64-bit versions of IEEE 754, both because hardware supports them and because that is approximately the precision needed. But these floats have some flaws, which I will talk about later. In essence, posits aim to fix or resolve these problems as much as possible. Of course this means they won't be 100% compatible, but as long as you keep yourself to real numbers, most operations should produce similar results.
Posits have a number of parts which define them. From left to right (MSB to LSB) these are:

- the sign bit,
- the regime bits: a run of identical bits terminated by the opposite bit, encoding a power of useed = 2^(2^ES),
- the exponent bits: up to ES of them,
- the fraction bits: whatever is left, with an implicit leading 1.
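To make the layout concrete, here is a minimal sketch (not Gosit's actual code) that decodes a 32-bit, ES=2 posit bit pattern into a float64 by walking those fields from MSB to LSB:

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

const es = 2 // exponent size used by 32-bit posits in the standard

// decode turns a posit32 bit pattern into a float64.
// Illustrative only; it ignores rounding because float64 can hold
// any posit32 value exactly.
func decode(p uint32) float64 {
	if p == 0 {
		return 0
	}
	if p == 0x80000000 { // Not a Real (NaR)
		return math.NaN()
	}
	neg := p&0x80000000 != 0
	if neg {
		p = -p // two's complement gives the encoding of the magnitude
	}
	rest := p << 1 // drop the sign bit
	var k int      // regime value
	if rest&0x80000000 != 0 {
		run := bits.LeadingZeros32(^rest) // length of the run of 1s
		k = run - 1
		rest <<= uint(run + 1) // skip the run and its terminating 0
	} else {
		run := bits.LeadingZeros32(rest) // length of the run of 0s
		k = -run
		rest <<= uint(run + 1) // skip the run and its terminating 1
	}
	exp := int(rest >> (32 - es))               // next es bits
	frac := float64(rest<<es) / (1 << 32)       // remaining fraction bits
	v := math.Ldexp(1+frac, k*(1<<es)+exp)      // (1+f) * 2^(k*2^es + e)
	if neg {
		return -v
	}
	return v
}

func main() {
	fmt.Println(decode(0x40000000)) // 1
	fmt.Println(decode(0x48000000)) // 2
	fmt.Println(decode(0xC0000000)) // -1
}
```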
One of the problems with floats is that there are many representations of NaN. This creates problems for languages and libraries. How do you sort a list of NaNs? Is one NaN bigger than another? Are they equal? You would think that since they are all NaN, of course they should be equal! But this makes checking whether 2 floats are equal harder. For example, 0b1111 1111 1100 0000 0000 0000 0000 0000 is NaN, but 0b1111 1111 1000 0000 0000 0000 0001 is also NaN; checking whether they are equal is more complex than just comparing the bits.
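A small Go example makes the pain point visible: two different bit patterns are both NaN, neither compares equal to anything (including itself), and their bits differ:

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	a := math.Float32frombits(0xFFC00000) // a quiet NaN
	b := math.Float32frombits(0xFF800001) // a signalling NaN, different bits
	fmt.Println(a != a, b != b)           // true true: NaN is never equal to itself
	fmt.Println(a == b)                   // false
	fmt.Println(math.Float32bits(a) == math.Float32bits(b)) // false: the bits differ too
}
```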
A similar problem exists with maps. If I store value a in a map with a NaN as the key, and then retrieve a value from the map with a different NaN key, should I receive a? If so, this would complicate maps just like it did with equality.
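In Go, for instance, a map will happily store an entry under a NaN key, but no lookup can ever find it again, because no NaN key compares equal to the stored one:

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	m := map[float64]string{}
	m[math.NaN()] = "a"

	fmt.Println(m[math.NaN()]) // "": a NaN key never matches
	fmt.Println(len(m))        // 1: yet the entry is stuck in the map
}
```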
There are also some more minor things in floats that most people won't have to deal with when using posits, for example ±0 and the weird rounding modes.
Mostly, yes. There are some issues, like the lack of ±Inf, which can be useful. But when working close to 1, posits have better precision, both in mathematical operations and in representation. When working with the very big and the very small, the opposite is true. There are contrived examples of posits being worse, but the same can be (and has been) done vice versa. Both formats have ranges in which they are superior, and ranges in which they are not.
Just because. I wanted a better understanding of how floating-point numbers work, and this seemed like a good starting point. Writing this library gave me a good understanding of how they work and how you can operate on them.
Posits come in many different forms: for a given size in bits, there is also a number (ES) which determines how many bits are reserved for the exponent. As of the latest official specification this is always 2, but this library can generate code for other ES values as well. I wanted to support all possible ES values, so there is one generic Posit type, Posit. This stores the ES in the type and should work with all values under 32. It is, however, a bit slow, because ES never changes during an operation or a series of operations (there is already a restriction against mixing posits with different ES sizes), yet it is still not a compile-time constant. If ES were constant, the compiler could do a lot of optimisations: there are a lot of shifts by ES, adds, and so on that could all be evaluated ahead of time.

To make this happen there is a go generate script which generates posit types for two predetermined sizes, 32ES2 and 16ES1. If you want to generate more yourself, you can use that program to do so. There is one big template that handles all the sizes, so adding a new size should be as easy as adding it to the list of generated types. There are some limitations: Go doesn't support the 128-bit ints needed for 64-bit operations, and I don't think 8 bits would work. But at least you can specify any ES value for both 16 and 32 bits and it should work. Keep in mind that you may need to update the script with some constants for it to generate all functions correctly.
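As a rough illustration of why a constant ES helps (hypothetical code, not Gosit's actual internals): when the shift amount is a compile-time constant the compiler can fold and combine it, while an ES carried as a run-time value forces a real variable shift on every call.

```go
package main

import "fmt"

const es32 = 2 // ES fixed at compile time, as in a generated 32ES2 type

// scaleConst computes k * 2^ES with a constant ES; the shift amount is
// known at compile time, so the compiler can fold it away.
func scaleConst(k int) int {
	return k << es32
}

// scaleGeneric does the same with ES carried as a value; the shift amount
// is only known at run time, so no such folding is possible.
func scaleGeneric(k, es int) int {
	return k << es
}

func main() {
	fmt.Println(scaleConst(3), scaleGeneric(3, 2)) // 12 12
}
```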
Gosit used to be fuzzed against softposit-rs, and the goal is to always fuzz for at least 24 hours before a stable release, though this may vary depending on the changes made. Many bugs have been found through fuzzing, and these are added to the regular tests. In fact, Gosit has found some bugs in softposit-rs: one is truncate turning 0.00004 into 1, and the other is truncate turning 3.0 (which is exactly representable) into 2.
I have since moved off fuzzing against the mostly broken Rust port and started using the original C version, which is of much higher quality (#13).
In that MR I also improved the performance a bunch, so it should be possible to do exhaustive testing for all 16ES1 functions, and for the 32ES2 functions that take only one argument.
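For flavour, a differential fuzz harness in Go's native fuzzing style looks roughly like this; positAdd and reference are placeholder stand-ins for the operation under test and a trusted implementation (e.g. the C softposit reached through cgo), not Gosit's real API:

```go
package posit_test

import "testing"

// Placeholder stand-ins so the sketch compiles; a real harness would call
// the Gosit operation under test and a trusted reference implementation.
func positAdd(a, b uint32) uint32  { return a + b }
func reference(a, b uint32) uint32 { return a + b }

// FuzzAdd feeds random posit bit patterns to both implementations and
// fails on any mismatch. Run with: go test -fuzz=FuzzAdd
func FuzzAdd(f *testing.F) {
	f.Add(uint32(0x40000000), uint32(0x40000000)) // seed: 1.0 + 1.0
	f.Fuzz(func(t *testing.T, a, b uint32) {
		if got, want := positAdd(a, b), reference(a, b); got != want {
			t.Errorf("add(%#x, %#x) = %#x, want %#x", a, b, got, want)
		}
	})
}
```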
This is mostly a copy-paste of the Gosit README.md.
Semi constant-time benchmarks are run with the exact same bench cases, to eliminate favouring one library over another by coincidence. These are rotated out every iteration. For sqrt, the absolute value of the input is taken to avoid the fast path.
However, exp() and log2() are very dependent on the magnitude of the input. For this reason there are 3 versions of each benchmark, Tiny, Medium, and Big, which lie on semi-natural boundaries where the performance characteristics probably change.
All benches use the recommended ES for the bit size.
# Turbo boost has been disabled
cset shield --exec -- go test --run=X --bench=. -benchtime 30s # cset with 2 threads
BenchmarkAddSlow-10 1000000000 14.33 ns/op
BenchmarkAdd32ESConst-10 1000000000 12.33 ns/op
BenchmarkAdd16ESConst-10 1000000000 11.74 ns/op
BenchmarkAddSlowGoposit-10 11298813 3463 ns/op
BenchmarkMulSlow-10 1000000000 13.28 ns/op
BenchmarkMul32ESConst-10 1000000000 11.40 ns/op
BenchmarkMul16ESConst-10 1000000000 10.04 ns/op
BenchmarkMulSlowGoposit-10 10038568 3466 ns/op
BenchmarkDivSlow-10 1000000000 15.19 ns/op
BenchmarkDiv32ESConst-10 1000000000 13.58 ns/op
BenchmarkDiv16ESConst-10 1000000000 11.89 ns/op
BenchmarkDivSlowGoposit-10 9835690 3661 ns/op
BenchmarkSqrtSlow-10 1000000000 32.94 ns/op
BenchmarkSqrt32ESConst-10 928119480 38.82 ns/op
BenchmarkSqrt16ESConst-10 1000000000 16.54 ns/op
BenchmarkTruncSlow-10 1000000000 3.747 ns/op
BenchmarkTrunc32ESConst-10 1000000000 3.388 ns/op
BenchmarkTrunc16ESConst-10 1000000000 3.739 ns/op
BenchmarkRoundSlow-10 1000000000 6.574 ns/op
BenchmarkRound32ESConst-10 1000000000 5.746 ns/op
BenchmarkRound16ESConst-10 1000000000 5.079 ns/op
BenchmarkStringSlow-10 147758671 251.7 ns/op
BenchmarkExp32Tiny-10 1000000000 8.718 ns/op
BenchmarkExp32Medium-10 257134201 140.9 ns/op
BenchmarkExp32Big-10 1000000000 11.89 ns/op
BenchmarkExp32ConstESTiny-10 1000000000 8.321 ns/op
BenchmarkExp32ConstESMedium-10 262799006 137.3 ns/op
BenchmarkExp32ConstESBig-10 1000000000 11.85 ns/op
BenchmarkExp16ConstESTiny-10 1000000000 7.412 ns/op
BenchmarkExp16ConstESMedium-10 503256609 72.13 ns/op
BenchmarkExp16ConstESBig-10 1000000000 7.098 ns/op
BenchmarkLog232Tiny-10 1000000000 12.77 ns/op
BenchmarkLog232Medium-10 570807181 63.49 ns/op
BenchmarkLog232Big-10 589752162 60.92 ns/op
BenchmarkLog232ConstESTiny-10 1000000000 12.04 ns/op
BenchmarkLog232ConstESMedium-10 606771810 59.79 ns/op
BenchmarkLog232ConstESBig-10 630153366 57.70 ns/op
BenchmarkLog216ConstESTiny-10 1000000000 7.370 ns/op
BenchmarkLog216ConstESMedium-10 1000000000 34.29 ns/op
BenchmarkLog216ConstESBig-10 1000000000 20.42 ns/op
BenchmarkFromInt3232Slow-10 1000000000 10.03 ns/op
BenchmarkFromInt3232ConstES-10 1000000000 7.838 ns/op
BenchmarkFromInt3216ConstES-10 1000000000 5.431 ns/op
BenchmarkFromUint3232Slow-10 1000000000 8.697 ns/op
BenchmarkFromUint3232ConstES-10 1000000000 6.613 ns/op
BenchmarkFromUint3216ConstES-10 1000000000 5.427 ns/op
BenchmarkFromInt6432Slow-10 1000000000 10.10 ns/op
BenchmarkFromInt6432ConstES-10 1000000000 8.242 ns/op
BenchmarkFromInt6416ConstES-10 1000000000 5.619 ns/op
BenchmarkFromUint6432Slow-10 1000000000 8.985 ns/op
BenchmarkFromUint6432ConstES-10 1000000000 7.253 ns/op
BenchmarkFromUint6416ConstES-10 1000000000 5.582 ns/op
BenchmarkFromInt1616ConstES-10 1000000000 11.18 ns/op
BenchmarkFromUint1616ConstES-10 1000000000 7.202 ns/op
These are directly comparable to the Go benchmarks, as they use the same data.
As you can see, they are quite a bit slower than the Rust version.
In the softposit readme it is recommended to enable optimisations in their makefile;
however, this did not bring any significant improvement.
BenchmarkCAdd32ES2-10 1000000000 11.15 ns/op
BenchmarkCMul32ES2-10 1000000000 10.89 ns/op
BenchmarkCDiv32ES2-10 1000000000 17.83 ns/op
BenchmarkCSqrt32ES2-10 1000000000 10.57 ns/op
BenchmarkCTrunc32ES2-10 1000000000 7.333 ns/op
No direct comparisons exist, but here are some averages from cargo bench on my machine (also using cset and with turbo boost disabled):
32P2:

| Operation | ns/op |
| --- | --- |
| Add | 8.4 |
| - | 7.9 |
| * | 7.1 |
| / | 11.1 |
| sqrt | 8.2 |
| trunc | - |
| ⌊x⌉ | 3.8 |
16P1:

| Operation | ns/op |
| --- | --- |
| Add | 10.1 |
| - | 8.5 |
| * | 8.2 |
| / | 11.3 |
| sqrt | 9.0 |
| trunc | - |
| ⌊x⌉ | 5.2 |
I have been working hard on improvements to the exponential function. Hyperbolic functions such as sinh, cosh, and tanh are also coming. Trigonometric functions like sin, cos, and tan are bound to arrive at some point as well.
Another area of improvement is the code generation: generating more complicated code and handling more ES and size values will require it.
Quire support might also come at some point in the future.