Why not int128?

3,035 views
Skip to first unread message

jemeshsu

unread,
Sep 14, 2013, 2:37:54 AM9/14/13
to golan...@googlegroups.com

Is 128-bit integer type being considered or feasible to implement? Does it require some support from processor in order to implement it? Or there is some constraints by current computer architecture such as limited physical memory space?

When C is created, 8bit/16bit processor is the norm. 64bit seems like a realistic upper limit for integer type. Now 64-bit processor is the norm, even smartphone is moving to 64-bit. Isn't time now to move up this upper limit, to prepare for the future? I know most applications do not require int128 and there is big package. Just curious. Not sure if I'm asking a stupid question. 

David Symonds

unread,
Sep 14, 2013, 2:47:37 AM9/14/13
to jemeshsu, golang-nuts
The VAX supported 128-bit integer types, but there are no modern CPUs
that support them, and for good reason. The increase in bit size has
historically been tied to limitations on what the current bit sizes
were capable of. For instance, 16-bit microprocessors were largely
limited to 64 KB address spaces (modulo segmentation, etc.), which we
quickly grew out of. It was only 10-15 years ago that the limit of
32-bit addressing modes became a problem, since 4 GB address space was
plenty for most users. We have a long way to go before 64-bit address
spaces are too small (I would guess decades), and there really isn't a
huge benefit to having larger integer types beyond that (vector
instructions have proven to be much more effective).

Michael Jones

unread,
Sep 14, 2013, 11:44:21 AM9/14/13
to David Symonds, jemeshsu, golang-nuts
I have many programs that use multi-word integers for intensive computation. 

For my Go programs, I wrote a little library for Numbers that can be compile-time configured as native int64, multi-word but fixed precision, or unlimited precision using the math/big library. I have the math in assembler for the multiword cases. Generally the sizes needed are 2 to 5 ganged 64-bit words.

The resulting overall application speed is about 3x big for add/subtract, often more, though the advantage declines as the sizes get longer and longer or when the operations (divide, say) are much more complex. The actual operations are faster than this, though, given the large number of shared instructions for general tasks (loop management, tests).

I also wrote doubled precision (quad precision not finished) for floating point math. This uses two float64's and some judicious manipulations to get bit-exact representations of nearly doubled precision at about 1/4 the speed. 

Here's an example from a float128 test that computes powers of 3:

=== RUN TestPower
  0: 3**  0 = 3ff0000000000000 0000000000000000, +1.0000000000000000e+00 +0.0000000000000000e+00 +1.0000000000000000000000000000000e+00
  1: 3**  1 = 4008000000000000 0000000000000000, +3.0000000000000000e+00 +0.0000000000000000e+00 +3.0000000000000000000000000000000e+00
  2: 3**  2 = 4022000000000000 0000000000000000, +9.0000000000000000e+00 +0.0000000000000000e+00 +9.0000000000000000000000000000000e+00
  3: 3**  3 = 403b000000000000 0000000000000000, +2.7000000000000000e+01 +0.0000000000000000e+00 +2.7000000000000000000000000000000e+01
  4: 3**  4 = 4054400000000000 0000000000000000, +8.1000000000000000e+01 +0.0000000000000000e+00 +8.1000000000000000000000000000000e+01
  5: 3**  5 = 406e600000000000 0000000000000000, +2.4300000000000000e+02 +0.0000000000000000e+00 +2.4300000000000000000000000000000e+02
  6: 3**  6 = 4086c80000000000 0000000000000000, +7.2900000000000000e+02 +0.0000000000000000e+00 +7.2900000000000000000000000000000e+02
  7: 3**  7 = 40a1160000000000 0000000000000000, +2.1870000000000000e+03 +0.0000000000000000e+00 +2.1870000000000000000000000000000e+03
  8: 3**  8 = 40b9a10000000000 0000000000000000, +6.5610000000000000e+03 +0.0000000000000000e+00 +6.5610000000000000000000000000000e+03
  9: 3**  9 = 40d338c000000000 0000000000000000, +1.9683000000000000e+04 +0.0000000000000000e+00 +1.9683000000000000000000000000000e+04
 10: 3** 10 = 40ecd52000000000 0000000000000000, +5.9049000000000000e+04 +0.0000000000000000e+00 +5.9049000000000000000000000000000e+04
 11: 3** 11 = 41059fd800000000 0000000000000000, +1.7714700000000000e+05 +0.0000000000000000e+00 +1.7714700000000000000000000000000e+05
 12: 3** 12 = 412037e200000000 0000000000000000, +5.3144100000000000e+05 +0.0000000000000000e+00 +5.3144100000000000000000000000000e+05
 13: 3** 13 = 413853d300000000 0000000000000000, +1.5943230000000000e+06 +0.0000000000000000e+00 +1.5943230000000000000000000000000e+06
 14: 3** 14 = 41523ede40000000 0000000000000000, +4.7829690000000000e+06 +0.0000000000000000e+00 +4.7829690000000000000000000000000e+06
 15: 3** 15 = 416b5e4d60000000 0000000000000000, +1.4348907000000000e+07 +0.0000000000000000e+00 +1.4348907000000000000000000000000e+07
 16: 3** 16 = 418486ba08000000 0000000000000000, +4.3046721000000000e+07 +0.0000000000000000e+00 +4.3046721000000000000000000000000e+07
 17: 3** 17 = 419eca170c000000 0000000000000000, +1.2914016300000000e+08 +0.0000000000000000e+00 +1.2914016300000000000000000000000e+08
 18: 3** 18 = 41b7179149000000 0000000000000000, +3.8742048900000000e+08 +0.0000000000000000e+00 +3.8742048900000000000000000000000e+08
 19: 3** 19 = 41d151acf6c00000 0000000000000000, +1.1622614670000000e+09 +0.0000000000000000e+00 +1.1622614670000000000000000000000e+09
 20: 3** 20 = 41e9fa8372200000 0000000000000000, +3.4867844010000000e+09 +0.0000000000000000e+00 +3.4867844010000000000000000000000e+09
 21: 3** 21 = 42037be295980000 0000000000000000, +1.0460353203000000e+10 +0.0000000000000000e+00 +1.0460353203000000000000000000000e+10
 22: 3** 22 = 421d39d3e0640000 0000000000000000, +3.1381059609000000e+10 +0.0000000000000000e+00 +3.1381059609000000000000000000000e+10
 23: 3** 23 = 4235eb5ee84b0000 0000000000000000, +9.4143178827000000e+10 +0.0000000000000000e+00 +9.4143178827000000000000000000000e+10
 24: 3** 24 = 425070872e384000 0000000000000000, +2.8242953648100000e+11 +0.0000000000000000e+00 +2.8242953648100000000000000000000e+11
 25: 3** 25 = 4268a8cac5546000 0000000000000000, +8.4728860944300000e+11 +0.0000000000000000e+00 +8.4728860944300000000000000000000e+11
 26: 3** 26 = 42827e9813ff4800 0000000000000000, +2.5418658283290000e+12 +0.0000000000000000e+00 +2.5418658283290000000000000000000e+12
 27: 3** 27 = 429bbde41dfeec00 0000000000000000, +7.6255974849870000e+12 +0.0000000000000000e+00 +7.6255974849870000000000000000000e+12
 28: 3** 28 = 42b4ce6b167f3100 0000000000000000, +2.2876792454961000e+13 +0.0000000000000000e+00 +2.2876792454961000000000000000000e+13
 29: 3** 29 = 42cf35a0a1bec980 0000000000000000, +6.8630377364883000e+13 +0.0000000000000000e+00 +6.8630377364883000000000000000000e+13
 30: 3** 30 = 42e76838794f1720 0000000000000000, +2.0589113209464900e+14 +0.0000000000000000e+00 +2.0589113209464900000000000000000e+14
 31: 3** 31 = 43018e2a5afb5158 0000000000000000, +6.1767339628394700e+14 +0.0000000000000000e+00 +6.1767339628394700000000000000000e+14
 32: 3** 32 = 431a553f8878fa04 0000000000000000, +1.8530201888518410e+15 +0.0000000000000000e+00 +1.8530201888518410000000000000000e+15
 33: 3** 33 = 4333bfefa65abb83 0000000000000000, +5.5590605665555230e+15 +0.0000000000000000e+00 +5.5590605665555230000000000000000e+15
 34: 3** 34 = 434d9fe779881944 3ff0000000000000, +1.6677181699666568e+16 +1.0000000000000000e+00 +1.6677181699666569000000000000000e+16
 35: 3** 35 = 436637ed9b2612f3 4008000000000000, +5.0031545098999704e+16 +3.0000000000000000e+00 +5.0031545098999707000000000000000e+16
 36: 3** 36 = 4380a9f2345c8e37 c02e000000000000, +1.5009463529699914e+17 -1.5000000000000000e+01 +1.5009463529699912100000000000000e+17
 37: 3** 37 = 4398feeb4e8ad552 c02a000000000000, +4.5028390589099738e+17 -1.3000000000000000e+01 +4.5028390589099736300000000000000e+17
 38: 3** 38 = 43b2bf307ae81ffd 4056400000000000, +1.3508517176729920e+18 +8.9000000000000000e+01 +1.3508517176729920890000000000000e+18
 39: 3** 39 = 43cc1ec8b85c2ffc 4026000000000000, +4.0525551530189763e+18 +1.1000000000000000e+01 +4.0525551530189762670000000000000e+18
 40: 3** 40 = 43e517168a4523fd 4040800000000000, +1.2157665459056929e+19 +3.3000000000000000e+01 +1.2157665459056928801000000000000e+19
 41: 3** 41 = 43ffa2a1cf67b5fc c09e740000000000, +3.6472996377170788e+19 -1.9490000000000000e+03 +3.6472996377170786403000000000000e+19
 42: 3** 42 = 4417b9f95b8dc87d c0b6d70000000000, +1.0941898913151237e+20 -5.8470000000000000e+03 +1.0941898913151235920900000000000e+20
 43: 3** 43 = 4431cb7b04aa565d 40dedec000000000, +3.2825696739453705e+20 +3.1611000000000000e+04 +3.2825696739453707762700000000000e+20
 44: 3** 44 = 444ab13886ff818c 40dc9c4000000000, +9.8477090218361120e+20 +2.9297000000000000e+04 +9.8477090218361123288100000000000e+20
 45: 3** 45 = 446404ea653fa129 40f5753000000000, +2.9543127065508336e+21 +8.7891000000000000e+04 +2.9543127065508336986430000000000e+21
 46: 3** 46 = 447e075f97df71be c10fd03800000000, +8.8629381196525014e+21 -2.6061500000000000e+05 +8.8629381196525010959290000000000e+21
 47: 3** 47 = 44968587b1e7954e 413411eb00000000, +2.6588814358957502e+22 +1.3153070000000000e+06 +2.6588814358957503287787000000000e+22
 48: 3** 48 = 44b0e425c56daffb c150f28fc0000000, +7.9766443076872514e+22 -4.4426870000000000e+06 +7.9766443076872509863361000000000e+22
 49: 3** 49 = 44c95638a82487f8 414a50a180000000, +2.3929932923061753e+23 +3.4491550000000000e+06 +2.3929932923061752959008300000000e+23
 50: 3** 50 = 44e300aa7e1b65fa 4163bc7920000000, +7.1789798769185258e+23 +1.0347465000000000e+07 +7.1789798769185258877024900000000e+23
 51: 3** 51 = 44fc80ffbd2918f7 417d9ab5b0000000, +2.1536939630755577e+24 +3.1042395000000000e+07 +2.1536939630755577663107470000000e+24
 52: 3** 52 = 451560bfcdded2b9 41b58d0211000000, +6.4610818892266729e+24 +3.6156264100000000e+08 +6.4610818892266732989322410000000e+24
 53: 3** 53 = 4530088fda671e0b 4164e0c660000000, +1.9383245667680020e+25 +1.0946099000000000e+07 +1.9383245667680019896796723000000e+25
 54: 3** 54 = 45480cd7c79aad11 c1efc15dace00000, +5.8149737003040064e+25 -4.2621289990000000e+09 +5.8149737003040059690390169000000e+25
 55: 3** 55 = 456209a1d5b401cc 42082ef9be580000, +1.7444921100912017e+26 +1.2983416779000000e+10 +1.7444921100912017907117050700000e+26
 56: 3** 56 = 457b0e72c08e02b3 c21bb989627c0000, +5.2334763302736057e+26 -2.9769226399000000e+10 +5.2334763302736053721351152100000e+26
 57: 3** 57 = 45944ad6106a8206 c2132c9c27740000, +1.5700428990820816e+27 -2.0588202461000000e+10 +1.5700428990820816116405345630000e+27
 58: 3** 58 = 45ae7041189fc309 c22cc2ea3b2e0000, +4.7101286972462449e+27 -6.1764607383000000e+10 +4.7101286972462448349216036890000e+27
 59: 3** 59 = 45c6d430d277d247 c265648beb18a000, +1.4130386091738735e+28 -7.3504963603700000e+11 +1.4130386091738734504764811067000e+28
 60: 3** 60 = 45e11f249dd9ddb5 c1f6d1e0a4f00000, +4.2391158275216204e+28 -6.1256525590000000e+09 +4.2391158275216203514294433201000e+28
 61: 3** 61 = 45f9aeb6ecc6cc8f 429feee297844c00, +1.2717347482564860e+29 +8.7777160645310000e+12 +1.2717347482564861054288329960300e+29
 62: 3** 62 = 461343093195196c c2b80cd60e5cc700, +3.8152042447694586e+29 -2.6443409939655000e+13 +3.8152042447694583162864989880900e+29
 63: 3** 63 = 462ce48dca5fa621 42cbecbeea74d580, +1.1445612734308374e+30 +6.1407258536363000e+13 +1.1445612734308374948859496964270e+30
 64: 3** 64 = 4645ab6a57c7bc99 42c3c63cbf5e8080, +3.4336838202925124e+30 +4.3484287253761000e+13 +3.4336838202925124846578490892810e+30
 65: 3** 65 = 4660408fc1d5cd73 c2f895a9383c8fd0, +1.0301051460877538e+31 -4.3249709166002900e+14 +1.0301051460877537453973547267843e+31
--- PASS: TestPower (0.00 seconds)

The columns after the equals sign are hex values for the two int64s, their floating point values, and the combined Float128 value. You can see by direct inspection that a float64 is exact up to 3**33 (5,559,060,566,555,523) but cannot represent 3**34 precisely. The Float128 does represent this one by storing the error in the second value. This perfection continues all the way to 3**65 (10,301,051,460,877,537,453,973,547,267,843) but becomes approximate after that. This is typical of the 32 decimal digits that two doubles provide when used as a pair.

Because I find "a little more than the biggest int" and "a little more than the biggest float" so valuable, I've ALWAYS imagined an ideal programming language would provide direct support for all hardware types and integrated software support for basic operations on ganged versions of the largest int and float types. If I ever build my language, it will do this. As it happens though, the selection/winnowing/voting process for new things usually looks at what people do at present, builds a histogram, and uses that to understand where to put emphasis. That's why such power is not native in Go or any other base language. The ability to "hijack" C++ to do this magically, or overlay Python with NumPy to do the same is why these tools are used by so many researchers for such tasks. Alas...

Michael

P.S. One good reason for double the largest int and double the largest float is that it makes writing some library routines, and some tests as well, incredibly easier.



--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.



--
Michael T. Jones | Chief Technology Advocate  | m...@google.com |  +1 650-335-5765

Alexei Sholik

unread,
Sep 14, 2013, 12:58:17 PM9/14/13
to Michael Jones, golang-nuts
Because I find "a little more than the biggest int" and "a little more than the biggest float" so valuable, I've ALWAYS imagined an ideal programming language would provide direct support for all hardware types and integrated software support for basic operations on ganged versions of the largest int and float types. If I ever build my language, it will do this.

Could you please clarify the meaning of "ganged"? Are you saying that having a "middle int" type (like bigint, but fixed number of digits) allows for more performance because the algorithms can be tuned based on the fixed size of those ints?

Btw, you might know this, but Julia[1] provides native Int128 and Uint128 types (in addition to native types and big ints).

Best regards
Alexei Sholik

Michael Jones

unread,
Sep 14, 2013, 1:59:45 PM9/14/13
to Alexei Sholik, golang-nuts
Hi Alexi,

Yes, I mean just what you say. Big-Int systems spend time in allocation, time/space/pure hell finding temporary working storage, and then time doing math that inherently must iterate along the length of the operands. A Fixed-Int system will know from the start what the size of the values are, can safely allocate temporary storage on the stack as needed for temporary values, and for smaller sizes can use many CPU-assists for precisely the tasks at hand.

Did not know about Julia and its int128 types. Good job there!

Michael

Alexei Sholik

unread,
Sep 14, 2013, 2:31:40 PM9/14/13
to Michael Jones, golang-nuts
Very nicely explained. Thanks, Michael!

David DENG

unread,
Sep 14, 2013, 8:11:02 PM9/14/13
to golan...@googlegroups.com
The type system of Go make you the ability to define you own Int128 type:

type Int128 struct {
H int64
L uint64
}

Methods like (String/Add/Sub/Mul/Div/Mod) can be defined. Though not so cool as a built-in type, it does not make the language risking support something not necessary.

David

Nigel Tao

unread,
Sep 14, 2013, 8:13:12 PM9/14/13
to Michael Jones, David Symonds, jemeshsu, golang-nuts
On Sun, Sep 15, 2013 at 1:44 AM, Michael Jones <m...@google.com> wrote:
> Because I find "a little more than the biggest int" and "a little more than
> the biggest float" so valuable...
>
> P.S. One good reason for double the largest int and double the largest float
> is that it makes writing some library routines, and some tests as well,
> incredibly easier.

Out of curiousity, can you give some examples? Or am I being dumb, and
is the powers of 3 thing an example?

Michael Jones

unread,
Sep 14, 2013, 8:27:26 PM9/14/13
to David DENG, golang-nuts

I essentially do that. My "complaint" is that 128 bit add is two instructions for the CPU. The overhead of the function calls and the rest of the baggage are far greater in execution time than the productive work. That means that a 1 month program can easily take 3 months. I don't like two months of dumb. That said, I never expect it but I do wish for it. :-)

--

Michael Jones

unread,
Sep 14, 2013, 8:46:44 PM9/14/13
to Nigel Tao, golang-nuts, jemeshsu, David Symonds

No, the power of three thing is just a test that's easy to validate with DC or Mathematica. :-)

Examples for float128 abound in evaluating mathematical functions (logs, trigs, ...) to +/-1 unit in the last place (ULP), int128 is good for dancing around overflow in integer functions such as "a*b/c", when a*b would overflow yet the result is OK. Try to figure out a way to do this all with same-sized ints! (See PS)

Float128 via doubled precision is no panacea though, since the exponent range is unchanged so the float64 library function hypot() can't be trivially done using my library and computing in high precision. Using IEEE 128s would make it easy...

P.S. If you want to gain a deep respect for these issues, here is my homework question. Part one: For various sizes of int, evaluate c=a*b for all a,b in 0..max int and determine the percentage that overflow. Part two: what percentage of your own code thinks about overflow. Part three: if you had a handy int128 type, how would that change things?

Reply all
Reply to author
Forward
0 new messages