Question about Uvarint


R. Men

Oct 3, 2025, 11:01:30 AM (4 days ago) Oct 3
to golang-nuts
Hello, I'm new to the language and I have a question about uvarint decoding that I hope someone can clarify. Looking at the source code, it seems to assume that the data is little-endian and converts it to big-endian. What if my data is already big-endian?

Here's the problem I'm currently trying to resolve:

buf{0x81, 0x47}    ----> 1000 0001 , 0100 0111

I'm expecting      ----> x000 0001 , x100 0111
                   ----> 0000 0000 , 1100 0111
                   ----> 0xC7
but instead I'm getting
                   ----> x100 0111 , x000 0001
                   ----> 0010 0011 , 1000 0001
                   ----> 0x2381
Is there a variant of uvarint that preserves the order or some solution to this?
Thank you for the help!

Jan Mercl

Oct 3, 2025, 11:10:33 AM (4 days ago) Oct 3
to R. Men, golang-nuts
On Fri, Oct 3, 2025 at 5:01 PM R. Men <rafael.m...@gmail.com> wrote:

> Hello, I'm new to the language and I have a question about uvarint decoding that I hope someone can clarify. Looking at the source code, it seems to assume that the data is little-endian and converts it to big-endian. What if my data is already big-endian?

Uvarint encoding is a variable-length byte sequence. It has no endianness.

peterGo

Oct 3, 2025, 12:23:14 PM (4 days ago) Oct 3
to golang-nuts

Jan Mercl

Oct 3, 2025, 1:47:43 PM (4 days ago) Oct 3
to Rafael Mendoza, golang-nuts
On Fri, Oct 3, 2025 at 7:13 PM Rafael Mendoza
<rafael.m...@gmail.com> wrote:

> Hi Jan, thanks for the reply, but perhaps I didn't explain correctly. I didn't mean to say the varint itself had any endianness, but rather the decoded uint when the uint is larger than 1 byte. For uints of 2 or more bytes, the Uvarint function takes the most significant byte and places it as the least significant byte before the concatenation step. So the resulting uint (in this case a uint16) is different from the expected result.

It's hard to say much without seeing your code. Here's some code and
what it produces on LE/BE targets:

package main

import (
	"encoding/binary"
	"fmt"
	"runtime"
	"unsafe"
)

func main() {
	fmt.Printf("goos=%s goarch=%s\n", runtime.GOOS, runtime.GOARCH)
	buf := [...]byte{0x11, 0x22, 0x33, 0x44}
	fmt.Printf("buf=|% x|\n", buf)
	fmt.Printf("unsafe uint32 at buf=%#0x\n",
		*(*uint32)(unsafe.Pointer(&buf[0])))
	s := buf[:]
	num := uint64(0x1234)
	s = s[:binary.PutUvarint(s, num)]
	fmt.Printf("s=|% x|\n", s)
	u, n := binary.Uvarint(s)
	fmt.Printf("u=%#0x n=%v\n", u, n)
}

----
jnml@3900x:~/tmp/uvarint$ go run main.go
goos=linux goarch=amd64
buf=|11 22 33 44|
unsafe uint32 at buf=0x44332211
s=|b4 24|
u=0x1234 n=2
jnml@3900x:~/tmp/uvarint$
----
jnml@linux-s390x:~/tmp/uvarint$ go run main.go
goos=linux goarch=s390x
buf=|11 22 33 44|
unsafe uint32 at buf=0x11223344
s=|b4 24|
u=0x1234 n=2
jnml@linux-s390x:~/tmp/uvarint$
----

The encoding buffer `s` always contains the same byte sequence and
that sequence always decodes to the same number.
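
To see where the bytes `b4 24` come from, the encoding can be reproduced by hand. The following is only a minimal sketch: the helper name `encodeUvarint` is made up for illustration and is not part of encoding/binary; only `binary.PutUvarint` is the standard library's own function.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// encodeUvarint is a hypothetical helper (not part of encoding/binary).
// It emits the value 7 bits at a time, least significant group first,
// and sets the high bit on every byte except the last one.
func encodeUvarint(v uint64) []byte {
	var out []byte
	for v >= 0x80 {
		out = append(out, byte(v)|0x80)
		v >>= 7
	}
	return append(out, byte(v))
}

func main() {
	manual := encodeUvarint(0x1234)

	buf := make([]byte, binary.MaxVarintLen64)
	std := buf[:binary.PutUvarint(buf, 0x1234)]

	// 0x1234: low 7 bits are 0x34 (plus the continuation bit = 0xb4),
	// the remaining bits are 0x24.
	fmt.Printf("manual=|% x| std=|% x| equal=%v\n", manual, std, bytes.Equal(manual, std))
}
```

Nothing in this loop depends on the byte order of the machine, which is consistent with the identical output on amd64 and s390x above.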

HTH

-j

Brian Candler

Oct 3, 2025, 3:09:13 PM (4 days ago) Oct 3
to golang-nuts
> So the resulting uint (in this case uint16) is different than the expected result.

Can you give the input, the output you get, and the output that you expected to get instead?

peterGo

Oct 3, 2025, 5:06:52 PM (4 days ago) Oct 3
to golang-nuts
The encoding/binary functions Varint and Uvarint use the protobuf varint wire format, in which the 7-bit payloads are stored in little-endian order (least significant group first).

package encoding/binary       
https://pkg.go.dev/encoding/binary@latest        

Encoding: Explains how Protocol Buffers encodes data to files or to the wire.        
https://protobuf.dev/programming-guides/encoding/        

"These 7-bit payloads are in little-endian order."    

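Applied to the two bytes from the original question, the group order is the whole difference between the two results. This is a small sketch for illustration; `leGroups` and `beGroups` are hypothetical helper names, and they only handle the two-byte case.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// leGroups combines two 7-bit payloads with the first byte least
// significant, which is the order binary.Uvarint uses.
func leGroups(b0, b1 byte) uint64 {
	return uint64(b1&0x7f)<<7 | uint64(b0&0x7f)
}

// beGroups combines the same payloads with the first byte most significant.
func beGroups(b0, b1 byte) uint64 {
	return uint64(b0&0x7f)<<7 | uint64(b1&0x7f)
}

func main() {
	buf := []byte{0x81, 0x47}

	v, n := binary.Uvarint(buf)
	fmt.Println(v, n)                     // 9089 2
	fmt.Println(leGroups(buf[0], buf[1])) // 9089 (0x2381)
	fmt.Println(beGroups(buf[0], buf[1])) // 199 (0xC7)
}
```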


R. Men

Oct 3, 2025, 8:31:25 PM (3 days ago) Oct 3
to golang-nuts
Sure, I'll share my code and what I'm trying to do. Thank you all for the help so far. My program reads the SQL table's metadata to determine the type and length of each column in the table. These values are encoded as varints of unsigned big-endian integers. I have already validated that the expected values match the table's actual data types and sizes.

package main

import (
	"encoding/binary"
	"fmt"
)

func main() {
	// SQLite format 3, sample DB file record header
	// Expected:     7     23    27    27    1     199
	//               |--|  |--|  |--|  |--|  |--|  |--------|
	inputs := []byte{0x07, 0x17, 0x1b, 0x1b, 0x01, 0x81, 0x47}
	offset := 0
	for remaining := len(inputs); remaining > 0; {
		d, n := binary.Uvarint(inputs[offset:])
		if n <= 0 {
			break
		}

		remaining -= n
		offset += n
		fmt.Println(d, n)

		// Actual output
		// 7 1
		// 23 1
		// 27 1
		// 27 1
		// 1 1
		// 9089 2
	}
}

I now see why I get the 9089 figure after looking at Uvarint source code (https://cs.opensource.google/go/go/+/refs/tags/go1.25.1:src/encoding/binary/varint.go):

func Uvarint(buf []byte) (uint64, int) {
	var x uint64
	var s uint
	for i, b := range buf {
		if i == MaxVarintLen64 {
			// Catch byte reads past MaxVarintLen64.
			// See issue https://golang.org/issues/41185
			return 0, -(i + 1) // overflow
		}
		if b < 0x80 {
			if i == MaxVarintLen64-1 && b > 1 {
				return 0, -(i + 1) // overflow
			}
			return x | uint64(b)<<s, i + 1
		}
		x |= uint64(b&0x7f) << s
		s += 7
	}
	return 0, 0
}

Here I see that the 7 payload bits of each byte after the first are left-shifted by a further 7 bits before being OR-ed into the result, so the first byte ends up as the least significant group.
My solution so far has been to create a custom uvarint function that left-shifts the accumulated value before OR-ing in the next byte, preserving the byte order.

func Uvarint(buf []byte) (uint64, int) {
	var x uint64
	var s uint
	for i, b := range buf {
		if i == MaxVarintLen64 {
			// Catch byte reads past MaxVarintLen64.
			// See issue https://golang.org/issues/41185
			return 0, -(i + 1) // overflow
		}
		if b < 0x80 {
			if i == MaxVarintLen64-1 && b > 1 {
				return 0, -(i + 1) // overflow
			}
			x <<= s
			return x | uint64(b), i + 1
		}
		x <<= s
		x |= uint64(b&0x7f)
		s += 7
	}
	return 0, 0
}

I would prefer to use the Go standard library's functions if at all possible rather than write my own, but so far I haven't found alternatives or even discussions on this topic. If anything's unclear, let me know. Cheers.

Brian Candler

Oct 4, 2025, 3:45:26 AM (3 days ago) Oct 4
to golang-nuts
So in short, you are saying that the byte sequence 0x81, 0x47 written by SQLite is decoded by binary.Uvarint as 9089, but you wanted it to decode to 199.

What this means is: the encoding that SQLite has chosen to use is *not* the varint as defined by protobuf (and implemented by the Go standard library). And therefore, you do indeed need to write your own custom decoding function.

The SQLite file format is defined here: https://www.sqlite.org/fileformat.html

A variable-length integer or "varint" is a static Huffman encoding of 64-bit twos-complement integers that uses less space for small positive values. A varint is between 1 and 9 bytes in length. The varint consists of either zero or more bytes which have the high-order bit set followed by a single byte with the high-order bit clear, or nine bytes, whichever is shorter. The lower seven bits of each of the first eight bytes and all 8 bits of the ninth byte are used to reconstruct the 64-bit twos-complement integer. Varints are big-endian: bits taken from the earlier byte of the varint are more significant than bits taken from the later bytes.

Brian Candler

Oct 4, 2025, 3:57:39 AM (3 days ago) Oct 4
to golang-nuts
Also note: your code doesn't work for values more than 2 bytes anyway. Here are a few test cases.

R. Men

Oct 4, 2025, 5:42:59 AM (3 days ago) Oct 4
to golang-nuts
Hi Brian,

Yes, it seems I'll have to go the custom function route, given their non-standard encoding. Thanks for confirming, and I really appreciate those tests. I fixed my code to handle ints longer than 2 bytes and added a special case for the 9th byte (for which the SQLite encoding treats all bits as data). Leaving it here in case anyone else is interested. Have a good weekend!

package main

import "fmt"

const MaxVarintLen64 = 9

func Uvarint(buf []byte) (uint64, int) {
	var x uint64
	var s uint = 7

	for i, b := range buf {
		if i == MaxVarintLen64 {
			// Catch byte reads past MaxVarintLen64.
			// See issue https://golang.org/issues/41185
			return 0, -(i + 1) // overflow
		}
		if i == MaxVarintLen64-1 && b > 1 {
			x <<= s + 1

			return x | uint64(b), i + 1
		}

		if b < 0x80 {
			x <<= s
			return x | uint64(b), i + 1
		}
		x <<= s
		x |= uint64(b & 0x7f)
	}
	return 0, 0
}

func main() {
	fmt.Println(Uvarint([]byte{0x81, 0x47}))                                           // should return 199, 2
	fmt.Println(Uvarint([]byte{0xff, 0xff, 0x7f}))                                     // should return 2097151 (=0x1fffff), 3
	fmt.Println(Uvarint([]byte{0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f}))       // should return 72057594037927935 (=0xffffffffffffff), 8
	fmt.Println(Uvarint([]byte{0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff})) // should return 18446744073709551615 (=0xffffffffffffffff), 9
}

Brian Candler

Oct 4, 2025, 7:29:03 AM (3 days ago) Oct 4
to golang-nuts
Here is a test case for which your function still doesn't work:
fmt.Println(Uvarint([]byte{0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00})) // should return 18446744073709551360 (=0xffffffffffffff00), 9

If you have read 8 bytes with the top bit set, then you must *unconditionally* consume all 8 bits of the 9th byte, regardless of its value.
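
One way to express that rule, as a minimal sketch rather than a definitive implementation: the name `sqliteUvarint` is made up here, the function follows only the format description quoted from the SQLite file format page above, and it reports a truncated buffer as (0, 0).

```go
package main

import "fmt"

// sqliteUvarint is a hypothetical helper that decodes one SQLite-style
// varint from the start of buf. It returns the value and the number of
// bytes consumed, or (0, 0) if buf ends before the varint is complete.
func sqliteUvarint(buf []byte) (uint64, int) {
	var x uint64
	for i, b := range buf {
		if i == 8 {
			// Ninth byte: all 8 bits are data, whatever the value of b.
			return x<<8 | uint64(b), 9
		}
		// Earlier groups are more significant, so shift before OR-ing.
		x = x<<7 | uint64(b&0x7f)
		if b < 0x80 {
			// High bit clear: this is the last byte of the varint.
			return x, i + 1
		}
	}
	return 0, 0 // buffer too short
}

func main() {
	tests := [][]byte{
		{0x81, 0x47}, // 199, 2
		{0xff, 0xff, 0x7f}, // 2097151, 3
		{0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f},       // 72057594037927935, 8
		{0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff}, // 18446744073709551615, 9
		{0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00}, // 18446744073709551360, 9
	}
	for _, t := range tests {
		v, n := sqliteUvarint(t)
		fmt.Printf("% x -> %d (%d bytes)\n", t, v, n)
	}
}
```

Because the ninth byte is handled before the high-bit test, no comparison against the byte's value is needed for that case.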

R. Men

Oct 4, 2025, 1:09:15 PM (3 days ago) Oct 4
to golang-nuts
You're quite right. I need to remove the b > 1 conditional. I will need to create thorough test cases to make sure it complies with the SQLite format and handles these cases. I can see now that even if there were a big-endian Uvarint implementation, I would still need to write my own, given SQLite's 9-byte optimisation.
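
For generating test vectors, a matching encoder can help. The following is a sketch under assumptions: `putSQLiteUvarint` is a hypothetical name, the logic is derived only from the format description quoted earlier in this thread, and it has not been checked against SQLite's own source.

```go
package main

import "fmt"

// putSQLiteUvarint is a hypothetical helper that encodes v as a
// SQLite-style varint: big-endian 7-bit groups, with a 9-byte form whose
// last byte carries 8 data bits for values that do not fit in 56 bits.
func putSQLiteUvarint(v uint64) []byte {
	if v>>56 != 0 {
		// 9-byte form: the low 8 bits go into the last byte verbatim.
		out := make([]byte, 9)
		out[8] = byte(v)
		v >>= 8
		for i := 7; i >= 0; i-- {
			out[i] = byte(v&0x7f) | 0x80
			v >>= 7
		}
		return out
	}
	// Count the 7-bit groups needed (at least one).
	n := 1
	for t := v >> 7; t != 0; t >>= 7 {
		n++
	}
	out := make([]byte, n)
	for i := n - 1; i >= 0; i-- {
		out[i] = byte(v&0x7f) | 0x80
		v >>= 7
	}
	out[n-1] &= 0x7f // the last byte has the high bit clear
	return out
}

func main() {
	for _, v := range []uint64{0, 199, 0x1fffff, 0xffffffffffffff, 0xffffffffffffff00} {
		fmt.Printf("%d -> |% x|\n", v, putSQLiteUvarint(v))
	}
}
```

Feeding these byte sequences back into the decoder under test gives a simple round-trip check, including the boundary between the 8-byte and 9-byte forms.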

Robert Engels

Oct 6, 2025, 1:11:34 PM (16 hours ago) Oct 6
to R. Men, golang-nuts

The SQLite varint is different from the Go varint. This is an issue that would normally be handled by the driver.

Robert Engels

Oct 6, 2025, 1:17:27 PM (16 hours ago) Oct 6
to XinG XinG, R. Men, golang-nuts
Sending an RFC link with no background isn’t worthwhile.

On Oct 6, 2025, at 12:14 PM, XinG XinG <ultra...@gmail.com> wrote:




