Transition to Buffers

13 views
Skip to first unread message

Brooks Boyd

unread,
Jan 27, 2014, 4:34:27 PM1/27/14
to Bitcoin...@googlegroups.com
As we've decided to use Buffers as the core binary manipulation method, one fundamental piece looks like it needs to change first, before all others. JP, you created the "bigi" module for dealing with big integers, but it looks like you designed it to work with byte arrays rather than buffers, both for internal manipulation and for input/output. Should we update that module to use Buffers, or is there another generic biginteger class we should move to? JP could you update that library to use buffers, since you're most familiar with it's inner workings (which are pretty abstract), or should I take a crack at it?

Brooks

Sidney Zhang

unread,
Jan 30, 2014, 2:34:48 AM1/30/14
to Bitcoin...@googlegroups.com
+1

JP Richardson

unread,
Feb 3, 2014, 1:05:46 PM2/3/14
to Bitcoin...@googlegroups.com
I'm definitely in favor of moving to Buffers, that's for sure. A few things:

1. The "bigi" module is code that was mostly written by Tom Wu, see here http://www-cs-students.stanford.edu/~tjw/jsbn/ for more details.

2. With that being stated, does it make sense for us to work our way up i.e. convert eckey, ecdsa, big, etc first? Or, work our way down: start with the interface to CryptoCoin? I'm thinking work our way up. Another major concern is the lack of tests everywhere. Before the conversion of any library, I think it makes sense to put tests in place. I guess we just really need to establish our priorities. Finding a replacement BigInteger library that uses Buffers might be the easiest. But since CryptoCoin depends upon on the current BigInteger library almost everywhere, it may not be the easiest.

No easy answers on this one. Thoughts?


--
You received this message because you are subscribed to the Google Groups "Bitcoin Hackers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to BitcoinHacker...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.



--
Follow me on Twitter: http://twitter.com/jprichardson

Sidney Zhang

unread,
Feb 3, 2014, 6:27:36 PM2/3/14
to Bitcoin...@googlegroups.com
A thought I have on this is that we should be writing some integration tests the goes across multiple modules.

Brooks Boyd

unread,
Feb 3, 2014, 9:46:57 PM2/3/14
to Bitcoin...@googlegroups.com
I'd be in favor of working our way up; that way new versions could be published as binary compatibility breaks, and the higher-up libraries will continue to work, since they'll still be linked to the prior version until they get upgraded too.

I agree with creating tests first, then test the new implementation to ensure it still works. I'll do some looking at other biginteger libraries (or just the concept of biginteger math in javascript) and see what other options are out there that don't use a C library (which would break with browserify.

Brooks

Sidney Zhang

unread,
Feb 3, 2014, 10:09:16 PM2/3/14
to Bitcoin...@googlegroups.com
another thought is that we should also set up continuous integration so that tests can run whenever someone pushes code.

Brooks Boyd

unread,
Feb 4, 2014, 10:05:33 AM2/4/14
to Bitcoin...@googlegroups.com
I took a look at a few discussions about big integers in Javascript, and found that most are based on older foundations of big integer math (namely Tom Wu’s BigInteger, which was written in 2005, so doesn't have Typed Arrays or any other optimizations for modern Javascript). There's a writeup of a few common libraries over at http://www.joseprio.com/blog/2013/04/27/biginteger-libraries-for-js/.

Thinking mainly in the NodeJS world, I looked for Node modules that use Buffers for big integers, namely doing big integer math, and found to my surprise found that Dominic (who recently found us) has a Buffer math module he created! Dominic's module does a bunch of math functions on buffers.

The current "bigi" library has these public methods: https://github.com/cryptocoinjs/bigi/blob/master/lib/bigi.js#L1149-1181 but are all of them needed? 

The "ecurve" library looks like it uses:

add
divide
equals
mod
modInverse
multiply
negate
shiftLeft
signum
square
subtract
testBit

Dominic's library is missing negate and testBit, and generally can't deal with negative numbers. Dominic, care to chime in about your library and whether you've thought about supporting negative numbers? Negative numbers would be the big requirement for ecurve math, I think.

For AES (JP's suggested he was looking into in the near future), XOR operations are needed, which aren't in that library either.

So, can Dominic's library be extended to add the missing functionality needed, or can anyone else find any other Buffer-based math libraries?

Brooks

Brooks Boyd

unread,
Feb 4, 2014, 10:28:06 AM2/4/14
to Bitcoin...@googlegroups.com
As I refreshed myself on how signed integers are often stored, I ran across the note about "two's compliment" signed number representation that the fundamental addition, subtraction, and multiplication operations are the same as working with unsigned numbers, as long as overflow is wrapped rather than extends. So, if the buffer math module were extended to take an additional parameter for the math functions, to indicate whether the user wants the results to wrap (inputs will be padded to be the same length, and if output goes above or below that length it wraps around), or overflow (output could be more bits than the input; subtraction resulting in negative numbers fails), then the math module doesn't have to care about negative numbers and can leave it up to the implementation to deal with two's compliment math to deal with signed numbers. How does that sound?

Brooks

JP Richardson

unread,
Feb 5, 2014, 12:12:16 AM2/5/14
to Bitcoin...@googlegroups.com
Nice work on isolating the methods. Before we proceed, I think we should identify what are our own priorities for the libraries and find out which overlap and maybe we can then establish development priorities for the whole?


--
You received this message because you are subscribed to the Google Groups "Bitcoin Hackers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to BitcoinHacker...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Brooks Boyd

unread,
Feb 5, 2014, 12:25:25 AM2/5/14
to JP Richardson, Bitcoin...@googlegroups.com
Which are you referring to as "the libraries" (biginteger libraries, or the whole cryptocoinjs set of libraries) we need to prioritize?

JP Richardson

unread,
Feb 8, 2014, 1:12:51 AM2/8/14
to Bitcoin...@googlegroups.com
Yes. Also, list our own priorities and perhaps identify any overlap.

Brooks Boyd

unread,
Feb 8, 2014, 9:12:38 AM2/8/14
to Bitcoin...@googlegroups.com
My priorities are working toward a working network node, one that doesn't have wallet or mining capabilities (would be good for a web services back-end, like blockexplorer/blockchain). I've got the network components built up (p2p-node, p2p-manager, btc-p2p), and the next step would be getting a blockchain manager, but that requires validating blocks/transactions, which requires Elliptic Curve stuff. So, my priority is getting that chain of libraries (ecurve > ecurve-names > ecdsa > eckey) moved to Buffers, so they work with the P2P modules that are already using buffers.

Sidney Zhang

unread,
Feb 10, 2014, 1:36:48 PM2/10/14
to Bitcoin...@googlegroups.com
Hey Brooks,

I have already made a first pass at validating blocks + validating transactions and (scriptinterpreter is also maintained)

Brooks Boyd

unread,
Feb 10, 2014, 6:28:41 PM2/10/14
to Bitcoin...@googlegroups.com
Yes, but you're relying on "bigi" and "ecdsa", which aren't buffer-based yet, which is why I'm starting at the bottom to get those converted, so we don't have to work in two systems at once. Make sense?

Brooks

Sidney Zhang

unread,
Feb 10, 2014, 6:38:49 PM2/10/14
to Bitcoin...@googlegroups.com
Oh absolutely.

The reason I mentioned it is because you mentioned you were going to write a module for validations. 

Definitely replace those lower libs. and once we find replacements, I will update the validation module + scriptinterpreter accordingly

Brooks Boyd

unread,
May 20, 2014, 9:46:07 AM5/20/14
to Bitcoin...@googlegroups.com
I'm circling back to this after a few months on other projects, and delving deeper into the math-buffer library by Dominic, and found while it works in most use-cases that cryptocoinjs needs, the performance is way slower than the bigi module. I put together a basic test at https://gist.github.com/MidnightLightning/34441e726413aaef1767

Starting with an arbitrary large number (0x4A96B5688EF57328 from the ecurve test suite) and multiplying/dividing it several times (100 rounds, outputting a timestamp every ten rounds). At the final iterations, math-buffer is taking ~30ms to do the last ten "multiply" iterations, while bigi only takes less than 1ms. Dividing is the most inefficient, with ~130ms for the last ten iterations with math-buffer, while bigi still only takes 1ms. 

If there's a way to learn from the bigi implementation how to make those (much) more efficient, I think it could be a good library to use. In thinking further about the negative numbers, I realized all subtraction in the ecurve module is "mod Q", so should wrap around and never be negative anyway, so a little convenience function (subMod()) can handle that (if result would be negative, add Q before subtracting).

One place that might allow for some easy performance gains is checking for trailing zeroes: I noticed that at the end of the multiplication tests, the final number has a whole bunch of zero-bytes padded on the end of it (stored as 108 bytes, with only 50 bytes needed to store the number, so 58 empty bytes on the end of it). If those extra bytes are causing needless processor cycling, that should probably be eliminated.

Brooks

JP Richardson

unread,
May 20, 2014, 11:26:41 AM5/20/14
to Bitcoin...@googlegroups.com
Hey Brooks, glad you're back!

A few weeks ago, I spent some time looking at the internals of bigi. I think the performance gains come from leveraging the fact that JavaScript can handle integers up to 53 bits (because of IEEE doubles). Since buffers are backed by a byte array, any arithmetic is using 8 bit numbers as opposed to the much larger 53 bits. This in turn will cause the processor to do more arithmetic operations for the same computation. At least that's the hypothesis that I can come up with.

Overall, I think it makes sense to not transition bigi to buffers. It helps that bigi now has the methods "fromBuffer" and "toBuffer". But for the rest of libraries, where it makes sense, transition to buffers. Internally, some libraries like the crypto ones it will not make sense to convert to buffers because of the same reason stated above. However, I think it definitely makes sense to transition almost all object and method interfaces to use buffers.

-JP


--
You received this message because you are subscribed to the Google Groups "Bitcoin Hackers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to BitcoinHacker...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Simple & Secure Bitcoin Wallet: https://www.coinbolt.com 

Brooks Boyd

unread,
May 20, 2014, 12:09:27 PM5/20/14
to Bitcoin...@googlegroups.com
On Tuesday, May 20, 2014 10:26:41 AM UTC-5, JP Richardson wrote:
Hey Brooks, glad you're back!

A few weeks ago, I spent some time looking at the internals of bigi. I think the performance gains come from leveraging the fact that JavaScript can handle integers up to 53 bits (because of IEEE doubles). Since buffers are backed by a byte array, any arithmetic is using 8 bit numbers as opposed to the much larger 53 bits. This in turn will cause the processor to do more arithmetic operations for the same computation. At least that's the hypothesis that I can come up with.

Overall, I think it makes sense to not transition bigi to buffers. It helps that bigi now has the methods "fromBuffer" and "toBuffer". But for the rest of libraries, where it makes sense, transition to buffers. Internally, some libraries like the crypto ones it will not make sense to convert to buffers because of the same reason stated above. However, I think it definitely makes sense to transition almost all object and method interfaces to use buffers.

-JP


Well, buffer math doesn't have to use 8-bit numbers; it has read/write methods for working with larger numbers, so possibly instead of iterating over the byte array as:

for (var i = 0; i < buff.length; i++) {
  var operand = buff[i];
  // Do math
  output[i] = outbyte;
}

Doing something like:

for (var i = 0; i < buff.length; i += 4) {
  var operand = buff.readInt32LE(i, true);
  // Do math
  output.writeUInt32LE(outvalue, i);
}

I know we did some speed comparisons between the array-access methods of Buffers (buff[i]) vs readInt8(), and array-access was faster, but I wonder if the slightly slower access speed for getting a 32-bit number out of the buffer will be offset by the speed savings in doing math on 32-bit numbers instead of 8-bit. 

The built-in read/write methods only go in powers-of-two increments, so the only thing above 32-bits is 64-bits, which is more than that 53-bit limit. However, I think you could make a custom function to grab a 32-bit and a 16-bit in order, to make a 48-bit read/write:

function readInt48LE(buf, offset) {
  return buff.readInt32LE(offset) + (buff.readInt16LE(offset+4) << 0xffffffff);
}

which would get closer to that ideal 53-bit?

Brooks 

JP Richardson

unread,
May 22, 2014, 5:06:12 PM5/22/14
to Bitcoin...@googlegroups.com
Sorry, I don't think that I expressed my thoughts as clearly as I wanted.

Ok, so we know that `bigi` must be backed by something to store the data that represents a big integer. Currently, it's backed by a regular JS Array of integers whose max size is 53 bits (err, might be 52 in this case). The alternative, and sounds like what you may be proposing is to back `bigi` with a Buffer. 

I'm positing that backing `bigi` with a Buffer will be slower in both Node.js and the browser. This is because in both cases, Buffer is backed by an 8-bit byte array (https://github.com/joyent/node/blob/master/lib/buffer.js) and (https://github.com/feross/buffer/blob/master/index.js). Yes, you can use Buffer and access 32 bit integers, but as you can see in both cases, you're doing four sequential reads (negligible on speed), bit shifting (maybe not a huge impact), and math on 32 bit integers as opposed to just doing math on 53 bit integers. Now JITs may play a role, but I'm still betting that the Array of 53 bit numbers will outperform in every case.

I could be wrong though, but after looking at it from this perspective, it seemed to me that my time would be better spent integrating Buffers elsewhere. That's all.

Thoughts?


--
You received this message because you are subscribed to the Google Groups "Bitcoin Hackers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to BitcoinHacker...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Brooks Boyd

unread,
May 22, 2014, 6:12:12 PM5/22/14
to Bitcoin...@googlegroups.com
Ah, good point; I thought Node Buffers were backed by Typed Arrays instead of vanilla ones, but those links show otherwise, and my initial forays into trying that method don't show any speed increase. So yes, I agree in keeping bigi based on custom arrays-of-53-bit-numbers, and encouraging "toBuffer"/"fromBuffer" as the main way to create one.

Brooks


On Thursday, May 22, 2014 4:06:12 PM UTC-5, JP Richardson wrote:
Sorry, I don't think that I expressed my thoughts as clearly as I wanted.

Ok, so we know that `bigi` must be backed by something to store the data that represents a big integer. Currently, it's backed by a regular JS Array of integers whose max size is 53 bits (err, might be 52 in this case). The alternative, and sounds like what you may be proposing is to back `bigi` with a Buffer. 

I'm positing that backing `bigi` with a Buffer will be slower in both Node.js and the browser. This is because in both cases, Buffer is backed by an 8-bit byte array (https://github.com/joyent/node/blob/master/lib/buffer.js) and (https://github.com/feross/buffer/blob/master/index.js). Yes, you can use Buffer and access 32 bit integers, but as you can see in both cases, you're doing four sequential reads (negligible on speed), bit shifting (maybe not a huge impact), and math on 32 bit integers as opposed to just doing math on 53 bit integers. Now JITs may play a role, but I'm still betting that the Array of 53 bit numbers will outperform in every case.

I could be wrong though, but after looking at it from this perspective, it seemed to me that my time would be better spent integrating Buffers elsewhere. That's all.

Thoughts?
Reply all
Reply to author
Forward
0 new messages