Two's compliment encoding in the standard library

1,779 views
Skip to first unread message

Ben Hood

unread,
Feb 26, 2014, 7:16:37 PM2/26/14
to golang-nuts
Hi all,

I've recently made use of

func parseBigInt(bytes []byte) *big.Int

from encoding/asn1/asn1.go and its counterpart

func marshalBigInt(out *forkableWriter, n *big.Int) (err error)

from encoding/asn1/marshal.go

to do some two's compliment packing/unpacking.

Maybe I've overlooked something, but is anybody aware of exported
variant of these routines in the standard library?

Cheers,

Ben

egon

unread,
Feb 27, 2014, 12:10:58 AM2/27/14
to golan...@googlegroups.com


On Thursday, February 27, 2014 2:16:37 AM UTC+2, Ben Hood wrote:
Hi all,

I've recently made use of

func parseBigInt(bytes []byte) *big.Int

v := big.NewInt(0)
v.SetBytes(bytes)
bytes := v.Bytes()

Not sure why you would want to use those from asn1 instead?

+ egon

Ben Hood

unread,
Feb 27, 2014, 2:20:37 AM2/27/14
to egon, golang-nuts
On Thu, Feb 27, 2014 at 5:10 AM, egon <egon...@gmail.com> wrote:
> v := big.NewInt(0)
> v.SetBytes(bytes)
> bytes := v.Bytes()

Are you saying that this is how to read a two's complimented encoded
varint into a big.Int?

> Not sure why you would want to use those from asn1 instead?

For encoding I started out by using the standard library:

http://golang.org/pkg/math/big/#Int.Bytes

But then found out that this doesn't encode a varint in the same way
that other languages do, so I needed to find an interoperable
solution.

For decoding I started out by using:

http://golang.org/pkg/math/big/#Int.SetBytes

But found that this didn't produce the same value as when other
languages deserialize the same bytes.

So I began to look around for a two's compliment encoding for a
variant in Golang, and the only thing I found that worked
interoperably was the routines used internally in the asn1 module. So
because this worked, I decided to run with this solution for now.
However, if you're telling me that I'm missing a trick, please do let
me know.

egon

unread,
Feb 27, 2014, 2:42:51 AM2/27/14
to golan...@googlegroups.com, egon


On Thursday, February 27, 2014 9:20:37 AM UTC+2, Ben Hood wrote:
On Thu, Feb 27, 2014 at 5:10 AM, egon <egon...@gmail.com> wrote:
> v := big.NewInt(0)
> v.SetBytes(bytes)
> bytes := v.Bytes()

Are you saying that this is how to read a two's complimented encoded
varint into a big.Int?


That is one way to do it. If you just want to serialize to disk and later load it then it's quite useful, but indeed for inter-op it's probably not the best solution.
 
> Not sure why you would want to use those from asn1 instead?

For encoding I started out by using the standard library:

http://golang.org/pkg/math/big/#Int.Bytes

But then found out that this doesn't encode a varint in the same way
that other languages do, so I needed to find an interoperable
solution.

For decoding I started out by using:

http://golang.org/pkg/math/big/#Int.SetBytes

But found that this didn't produce the same value as when other
languages deserialize the same bytes.

So I began to look around for a two's compliment encoding for a
variant in Golang, and the only thing I found that worked
interoperably was the routines used internally in the asn1 module. So
because this worked, I decided to run with this solution for now.
However, if you're telling me that I'm missing a trick, please do let
me know.

I guess you are missing the specific "encoding". As you've probably noticed that http://golang.org/pkg/encoding/ contains several different encodings... there are many different ways to convert data into bytes. So, you should select some specific encoding and use that in both languages. For example, with asn1:

when serializing
v := big.NewInt(2041)
bytes, err := asn1.Marshal(v)
if err != nil { return err }

when unserializing
v := big.NewInt(0)
err := asn1.Unmarshal(bytes, v)
if err != nil { return err }

And when using the bytes in an other language, use an ASN1 package, also.

Of course, I'm not sure what are you transferring or what is the intended purpose, so some other format like json, xml, protobuf could be more suitable.

+ egon

Ben Hood

unread,
Feb 27, 2014, 4:03:20 AM2/27/14
to egon, golang-nuts
On Thu, Feb 27, 2014 at 7:42 AM, egon <egon...@gmail.com> wrote:
> Of course, I'm not sure what are you transferring or what is the intended
> purpose, so some other format like json, xml, protobuf could be more
> suitable.

My use case is to implement encoding support for decimal types in the
Go driver for Cassandra CQL: https://github.com/gocql/gocql

The wire format for decimal type is loosely defined by Cassandra in
the sense that there is no spec, but the server uses Java BigDecimal
and all other language drivers are interoperable with this. So no
concrete spec, but I have a bunch of integration tests in different
languages.

To be clear, I believe that I have solved this issue using inf.Dec -
my running code that passes all of the integration tests is on this
branch: https://github.com/relops/gocql/compare/dec_inf

So under the assumption that the other maintainers of gocql are happy
with the dependency on inf.Dec, the issue is solved.

The reason for this particular thread on go-nuts was to question a
portion of the approach that I took to achieve compatible binary
encoding. Basically I lifted the implementation of some internal
functions from the asn1 module, although my CQL driver has nothing to
do with asn1.

So from the perspective of this discussion thread, I was just
interested in a library routine to do two's compliment
encoding/decoding for the unscaled value of a big.Inf.

Michael Jones

unread,
Feb 27, 2014, 9:02:25 AM2/27/14
to Ben Hood, golang-nuts, egon

I still don't understand your goal. A quick check of Cassandra documents suggests its BIGINT is a 64 bit int, and its VARINT is just an unlimited precision integer. Why do you need scale factors and various decimal libraries?

If you will show us the incoming bytestream format and describe your processing goals, you'll get better answers.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Ben Hood

unread,
Feb 27, 2014, 12:04:53 PM2/27/14
to Michael Jones, golang-nuts, egon
Hi Michael,

To clarify my position, I've actually managed to knock out a working
implementation of the decimal type in CQL (as per this pull request
https://github.com/gocql/gocql/pull/120/files) using inf.Dec and some
code lifted from the asn1 module.

The purpose of this particular discussion thread was to ask whether
hijacking some un-exported code from the standard library is a good
thing to do or not.

On Thu, Feb 27, 2014 at 2:02 PM, Michael Jones <m...@google.com> wrote:
> I still don't understand your goal. A quick check of Cassandra documents
> suggests its BIGINT is a 64 bit int, and its VARINT is just an unlimited
> precision integer. Why do you need scale factors and various decimal
> libraries?

The type that I'm implementing is called decimal in Cassandra. The
wire format of this type is not specified explicitly, so the nearest
thing to a specification is this comment from one of the Cassandra
team:

http://mail-archives.apache.org/mod_mbox/cassandra-user/201402.mbox/%3CCAKkz8Q2afoH307ZRucox1cXFiG89DYWJV1jVxkeR18wPUxQ%3DGg%40mail.gmail.com%3E

Basically this says you should be compatible with Java BigDecimal:

"Regarding the decimal, it does uses 4 bytes for the scale and the rest for
the bytes of the unscaled value (i.e. a variable length integer) as Peter
mentioned. the actual value being the "unscaled value" * 10^-"scale""


> If you will show us the incoming bytestream format and describe your
> processing goals, you'll get better answers.

My goal is to provide marshaling and unmarshaling of CQL decimal type
columns in Cassandra for the gocql client driver in Go.

Functionally speaking, I have already achieved my goal, albeit with
the use of some unexported code from the asn1 module to get two's
compliment encoding.

So the reason why I started this discussion was to ask whether I was
missing a trick and whether there is a better way to do two's
compliment encoding.

Here are some bytestream examples taken from the test cases of other
CQL drivers that implement the decimal type (all of which I managed to
include into a test case for gocql):

1. Ruby: https://github.com/iconara/cql-rb/blob/master/spec/cql/protocol/decoding_spec.rb

it 'decodes a negative decimal' do
buffer = ByteBuffer.new("\x00\x00\x00\x12\xF2\xD8\x02\xB6R\x7F\x99\xEE\x98#\x99\xA9V")
Decoding.read_decimal!(buffer).should ==
BigDecimal.new('-1042342234234.123423435647768234')
end

it 'decodes a positive decimal with only fractions' do
buffer = ByteBuffer.new("\x00\x00\x00\x13*\xF8\xC4\xDF\xEB]o")
Decoding.read_decimal!(buffer).should ==
BigDecimal.new('0.0012095473475870063')
end

it 'decodes a negative decimal with only fractions' do
buffer = ByteBuffer.new("\x00\x00\x00\x13\xD5\a;\x20\x14\xA2\x91")
Decoding.read_decimal!(buffer).should ==
BigDecimal.new('-0.0012095473475870063')
end

2. Python: https://github.com/datastax/python-driver/blob/ce9d2c8bde208f44427242fb92fef182f23a6643/tests/unit/test_marshalling.py

('\x00\x00\x00\r\nJ\x04"^\x91\x04\x8a\xb1\x18\xfe', 'DecimalType',
Decimal('1243878957943.1234124191998')),
('\x00\x00\x00\x06\xe5\xde]\x98Y', 'DecimalType', Decimal('-112233.441191')),
('\x00\x00\x00\x14\x00\xfa\xce', 'DecimalType',
Decimal('0.00000000000000064206')),
('\x00\x00\x00\x14\xff\x052', 'DecimalType',
Decimal('-0.00000000000000064206')),
('\xff\xff\xff\x9c\x00\xfa\xce', 'DecimalType', Decimal('64206e100')),


Does this make sense?

Cheers,

Ben

Donovan Hide

unread,
Feb 27, 2014, 12:22:53 PM2/27/14
to Ben Hood, Michael Jones, golang-nuts, egon
Two's complement decoding was included in the example I gave in the example in the other thread. You just need to cast a uint32 to a int32:


It should be very easy to convert the above into the BinaryMarshaler and BinaryUnmarshaler interfaces, no need to rip out bits of other packages:



Donovan Hide

unread,
Feb 27, 2014, 12:25:45 PM2/27/14
to Ben Hood, Michael Jones, golang-nuts, egon
Before any spec quoter gets on my case, I meant convert a uint32 to a int32, not cast :-)

Ben Hood

unread,
Feb 27, 2014, 1:32:05 PM2/27/14
to Donovan Hide, Michael Jones, golang-nuts, egon
On Thu, Feb 27, 2014 at 5:22 PM, Donovan Hide <donov...@gmail.com> wrote:
> Two's complement decoding was included in the example I gave in the example
> in the other thread. You just need to cast a uint32 to a int32:
>
> http://play.golang.org/p/mEp1iNBLWt
>
> It should be very easy to convert the above into the BinaryMarshaler and
> BinaryUnmarshaler interfaces, no need to rip out bits of other packages:

OK, so I've modified your play to put it side by side with the code
that produces the correct encoding as a sepcification:

http://play.golang.org/p/IsoWJ0IVHZ

I've put your code in func roundTrip2().

However, the way that I have done this has caused an issue with the
method you are describing - is it clear what my mistake is?

Donovan Hide

unread,
Feb 27, 2014, 1:45:21 PM2/27/14
to Ben Hood, Michael Jones, golang-nuts, egon
Your code is very confusing. Write a simple test case for a set of numbers that you want to round trip. Don't write the code that implements the test. Post that set of tests.

Before you were talking about rational numbers, i.e. fractional/decimal, but now you seem to be dealing with integers. A two's complement encoded int64 is never going to be comparable to a possibly infinite length integer.

You can't expect everybody to start reading github repositories to help you, just ask a simple question with tests and the answer will be a lot simpler than you are making it :-)

Donovan Hide

unread,
Feb 27, 2014, 3:24:29 PM2/27/14
to Ben Hood, Michael Jones, golang-nuts, egon
I take it back, it is more complex than you'd imagine :-)

https://github.com/datastax/python-driver/search?q=varint_pack&ref=cmdform

Based on the Python code (including pseudo-tests):


Needs more tests; don't use this without more thorough testing :-)

Ben Hood

unread,
Feb 27, 2014, 3:52:24 PM2/27/14
to Donovan Hide, Michael Jones, golang-nuts, egon
On Thu, Feb 27, 2014 at 6:45 PM, Donovan Hide <donov...@gmail.com> wrote:
> Your code is very confusing. Write a simple test case for a set of numbers
> that you want to round trip. Don't write the code that implements the test.
> Post that set of tests.

OK, here's a test case with just the specification, leaving the
implementation blank: http://play.golang.org/p/3YWOtrO49x

> Before you were talking about rational numbers, i.e. fractional/decimal, but
> now you seem to be dealing with integers. A two's complement encoded int64
> is never going to be comparable to a possibly infinite length integer.

That's a fair point and I'm sorry for being so confusing. When I
started this discussion thread I tried to reduce my question down to
the crux of the matter so that people didn't need to filter out the
surrounding context. But the way I've tried to do that has made it
more confusing. Sorry about that.

> You can't expect everybody to start reading github repositories to help you,
> just ask a simple question with tests and the answer will be a lot simpler
> than you are making it :-)

I do appreciate that people's time is valuable and that they can't
devote lots of it to answering random questions like mine - I'm sorry
if the impression arose that I expected people to go out of their way
to help me.

Ben Hood

unread,
Feb 27, 2014, 3:59:06 PM2/27/14
to Donovan Hide, Michael Jones, golang-nuts, egon
On Thu, Feb 27, 2014 at 8:24 PM, Donovan Hide <donov...@gmail.com> wrote:
> I take it back, it is more complex than you'd imagine :-)
>
> https://github.com/datastax/python-driver/search?q=varint_pack&ref=cmdform
> https://github.com/datastax/csharp-driver/blob/master/BigDecimalSamples/IKVMDecimalTypeAdapter.cs
>
> Based on the Python code (including pseudo-tests):
>
> http://play.golang.org/p/Rsy_gNNz2p
>
> Needs more tests; don't use this without more thorough testing :-)

I didn't see that you had posted this before my last response - so
please ignore my last post.

Thanks very much for this - it looks like a compact implementation,
more compact than the one I lifted from the asn1 module. Hopefully we
can get this to work properly with all of the other test cases in the
gocql driver.

So I guess that the answer to my very first question on this thread is
that there isn't an exported implementation of two's compliment in the
standard library - you'll need to use your own.

Thanks a lot for taking the time to knock this example up.

egon

unread,
Feb 27, 2014, 11:42:01 PM2/27/14
to golan...@googlegroups.com, Donovan Hide, Michael Jones, egon


On Thursday, February 27, 2014 10:52:24 PM UTC+2, Ben Hood wrote:
On Thu, Feb 27, 2014 at 6:45 PM, Donovan Hide <donov...@gmail.com> wrote:
> Your code is very confusing. Write a simple test case for a set of numbers
> that you want to round trip. Don't write the code that implements the test.
> Post that set of tests.

OK, here's a test case with just the specification, leaving the
implementation blank: http://play.golang.org/p/3YWOtrO49x

> Before you were talking about rational numbers, i.e. fractional/decimal, but
> now you seem to be dealing with integers. A two's complement encoded int64
> is never going to be comparable to a possibly infinite length integer.

That's a fair point and I'm sorry for being so confusing. When I
started this discussion thread I tried to reduce my question down to
the crux of the matter so that people didn't need to filter out the
surrounding context. But the way I've tried to do that has made it
more confusing. Sorry about that.

Just as a general tip, don't reduce the problem... but do give the TL;DR; version. http://stackoverflow.com/help/how-to-ask http://msmvps.com/blogs/jon_skeet/archive/2010/08/29/writing-the-perfect-question.aspx

The main problem with just trying to answer the immediate problem is that the actual problem might be several levels highers. For example, let's say you run into some problem when using the reflect package. So the immediate problem would be "how do I do X in reflect package"... whereas the actual problem might be that you don't know how to use the language and using the reflect package can be avoided.

As a general rule, at least have answers for the "5 whys" for the immediate problem. For example, why do you need the reflect package... because then I can do generic data structure X. Why do you need generic data structure X, because I need to do this with Y and Z. Why .... etc. And eventually the answer to the problem could just be a simple "use a map instead".
 
I know how easy it is do fall into the trap of "I know better" and "I'm taking the right approach"... so also assume that the people answering are 1000x smarter than you and can provide solutions that you never imagined (even if it may not be true).

+ egon

Ben Hood

unread,
Feb 28, 2014, 2:45:48 AM2/28/14
to egon, golang-nuts, Donovan Hide, Michael Jones
Hey Egon,

On Fri, Feb 28, 2014 at 4:42 AM, egon <egon...@gmail.com> wrote:

> Just as a general tip, don't reduce the problem... but do give the TL;DR;

Thanks very much for the feedback - much appreciated.

So in this instance I maybe I could have formulated it is:

---

TL;DR; Is there a better way to encode a big.Int so that it is binary
compatible with BigDecimal in Java than copying an internal function
from the asn1 module?

The context is I'm trying to implement support for the Cassandra CQL
decimal type with the gocql CQL driver for Go. Despite being not
formally defined, other language clients have taken the approach of
providing an encoding that is compatible with the way Java BigDecimal
serializes itself. A BigDecimal in Java is represented by a
(BigInteger * 10 ^ - scale). The wire encoding of this is 32 bit
signed integer following by a signed variable length integer using
two's compliment encoding. On the Go side I have elected to use
inf.Dec (http://godoc.org/speter.net/go/exp/math/dec/inf) because I
believe it provides an equivalent representation of BigDecimal in Go.
inf.Dec has two components: an unscaled value represented by a big.Int
and an int32 scale field. To encode this, I'm encoding the int32 into
a byte representation and then appending the byte representation of
the big.Int. The big.Int needs to be encoded using two's compliment.
By default, big.Int does not use two's compliment. So what I have done
is to re-use some internal functions I found in the asn1 module to
encode a signed variable length integer using two's compliment. So
I've solved the functional issue and all of my integration tests pass,
but copying code from a standard library feels wrong. Is there an
idiomatic way to achieve the same goal? Am I doing something wrong?

---

Does that sound like better formulation than the my original question?

> I know how easy it is do fall into the trap of "I know better" and "I'm
> taking the right approach"... so also assume that the people answering are
> 1000x smarter than you and can provide solutions that you never imagined
> (even if it may not be true).

I agree that humility is always a good personal characteristic to have.

I thought my original formulation of "Maybe I've overlooked something,
..." in the OP indicated some form of humility on by behalf. On
reflection, maybe that statement comes across as "I know better, but
I've still got a problem". Coming across as being not very sympathetic
and doesn't encourage people to help you.

In any case, what I've managed to do is to confuse and annoy people
who are just trying to help. I'm sorry for being so clumsy.

Ben

egon

unread,
Feb 28, 2014, 3:10:40 AM2/28/14
to golan...@googlegroups.com, egon, Donovan Hide, Michael Jones
Yes. 

(some newlines in the second paragraph would make it easier to read :) )


> I know how easy it is do fall into the trap of "I know better" and "I'm
> taking the right approach"... so also assume that the people answering are
> 1000x smarter than you and can provide solutions that you never imagined
> (even if it may not be true).

I agree that humility is always a good personal characteristic to have.

I thought my original formulation of "Maybe I've overlooked something,
..." in the OP indicated some form of humility on by behalf. On
reflection, maybe that statement comes across as "I know better, but
I've still got a problem". Coming across as being not very sympathetic
and doesn't encourage people to help you.

I didn't mean that about the humility... it's about assumptions leading to information removal. Sometimes this can happen because you have solved the problem N times already in an other language, which means that it's easy to assume that it is the correct way to do things... because you assume that it's the correct approach you don't even consider that it might be the actual cause of the problem and it may seem irrelevant to the problem.


In any case, what I've managed to do is to confuse and annoy people
who are just trying to help. I'm sorry for being so clumsy.

We all live and learn :)... 

I usually mention these kinds of points so that the next time they ask better questions, which means they get better answers, they get answers faster and it's also easier to answer them.


Ben

roger peppe

unread,
Feb 28, 2014, 5:23:36 AM2/28/14
to Ben Hood, Donovan Hide, Michael Jones, golang-nuts, egon
On 27 February 2014 20:59, Ben Hood <0x6e...@gmail.com> wrote:
> On Thu, Feb 27, 2014 at 8:24 PM, Donovan Hide <donov...@gmail.com> wrote:
>> I take it back, it is more complex than you'd imagine :-)
>>
>> https://github.com/datastax/python-driver/search?q=varint_pack&ref=cmdform
>> https://github.com/datastax/csharp-driver/blob/master/BigDecimalSamples/IKVMDecimalTypeAdapter.cs
>>
>> Based on the Python code (including pseudo-tests):
>>
>> http://play.golang.org/p/Rsy_gNNz2p
>>
>> Needs more tests; don't use this without more thorough testing :-)
>
> I didn't see that you had posted this before my last response - so
> please ignore my last post.
>
> Thanks very much for this - it looks like a compact implementation,
> more compact than the one I lifted from the asn1 module. Hopefully we
> can get this to work properly with all of the other test cases in the
> gocql driver.

I think it can be a little more compact (and efficient) still.
The bit length calculation seems to be doing quite a lot
of work that's not necessary.

Something like this, perhaps?

http://play.golang.org/p/ADlfKNZYfZ

I think it works better without the error returns from the
Marshal and Unmarshal functions too - interpreting an
empty slice as zero seems a reasonable trade-off.

roger peppe

unread,
Feb 28, 2014, 6:32:50 AM2/28/14
to Ben Hood, Donovan Hide, Michael Jones, golang-nuts, egon
Actually, I think it's nicer without the CassandraInteger type
at all - just as two functions:

http://play.golang.org/p/9_jvuqAR7j

Donovan Hide

unread,
Feb 28, 2014, 7:00:06 AM2/28/14
to roger peppe, Ben Hood, Michael Jones, golang-nuts, egon
> Something like this, perhaps?
>
> http://play.golang.org/p/ADlfKNZYfZ
>
> I think it works better without the error returns from the
> Marshal and Unmarshal functions too - interpreting an
> empty slice as zero seems a reasonable trade-off.

LGTM :-)

Ben Hood

unread,
Feb 28, 2014, 8:11:14 AM2/28/14
to roger peppe, Donovan Hide, Michael Jones, golang-nuts, egon
On Fri, Feb 28, 2014 at 11:32 AM, roger peppe <rogp...@gmail.com> wrote:
> Actually, I think it's nicer without the CassandraInteger type
> at all - just as two functions:
>
> http://play.golang.org/p/9_jvuqAR7j

Very cool!

For completeness' sake I've added all of the spec data points from my
test suite to the top of your test and they all pass:
http://play.golang.org/p/nJZgCOuu9H

Thanks for the example :-)

roger peppe

unread,
Feb 28, 2014, 8:43:00 AM2/28/14
to Ben Hood, Donovan Hide, Michael Jones, golang-nuts, egon
You're welcome. BTW if performance is an issue, this version might well
be faster: http://play.golang.org/p/t3CTllKpDJ

Ben Hood

unread,
Feb 28, 2014, 8:51:08 AM2/28/14
to roger peppe, Donovan Hide, Michael Jones, golang-nuts, egon
On Fri, Feb 28, 2014 at 1:43 PM, roger peppe <rogp...@gmail.com> wrote:
> You're welcome. BTW if performance is an issue, this version might well
> be faster: http://play.golang.org/p/t3CTllKpDJ

Very cool - why is that faster BTW?

roger peppe

unread,
Feb 28, 2014, 9:07:02 AM2/28/14
to Ben Hood, Donovan Hide, Michael Jones, golang-nuts, egon
It's not, as it happens (or only 5% if we remove the Set call).
Bad intuition :-)

Michael Jones

unread,
Feb 28, 2014, 10:48:20 AM2/28/14
to roger peppe, Ben Hood, Donovan Hide, golang-nuts, egon
This is a great outcome. Clear statement of the goal, and quick iteration to a fast and simple implementation.
--
Michael T. Jones | Chief Technology Advocate  | m...@google.com |  +1 650-335-5765
Reply all
Reply to author
Forward
0 new messages