
[Python-announce] tencdec: A number array to/from bytes high performance encoder/decoder


Facundo Batista

Jan 2, 2023, 3:01:41 PM
# What is tencdec

A number array to/from bytes high performance encoder/decoder.

It takes a list of monotonically increasing integers and encodes it
very quickly into a bytes object, in a compressed form using deltas.

Then you may store that byte object in a DB or whatever, and when you
need the list of integers back, you just decode it.
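
As an illustration of the storage step, here is a minimal sketch that
keeps the encoded bytes in a SQLite BLOB column. The table and column
names are made up for the example, and the bytes literal is copied from
the example below so the snippet runs without tencdec installed; in
real use it would come from `tencdec.encode(numbers)`.

```python
import sqlite3

# Encoded bytes copied from the example in this post; with the library
# installed this would be the result of tencdec.encode(numbers).
enc = b'\x00\x01\x01\x01\x01\x18;\x00\x9d\x03\x01\x06\x8e\x0e'

# Hypothetical schema: one row per named series, payload stored as a BLOB.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE series (name TEXT PRIMARY KEY, payload BLOB)")
conn.execute("INSERT INTO series VALUES (?, ?)", ("demo", enc))

# Reading it back returns a bytes object, ready to be decoded.
(stored,) = conn.execute(
    "SELECT payload FROM series WHERE name = ?", ("demo",)
).fetchone()
conn.close()
```

The `stored` value can then be passed to `tencdec.decode` to get the
list of integers back.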


https://github.com/facundobatista/tencdec


Example:

```
>>> import tencdec
>>> numbers = [0, 1, 2, 3, 4, 28, 87, 87, 500, 501, 507, 2313]
>>> enc = tencdec.encode(numbers)
>>> enc
b'\x00\x01\x01\x01\x01\x18;\x00\x9d\x03\x01\x06\x8e\x0e'
>>> dec = tencdec.decode(enc)
>>> numbers == dec
True
```
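
To make the "deltas" idea concrete, here is a minimal sketch of one way
such an encoding can work: store each number as its difference from the
previous one, and write each difference as a variable-length integer
(7 bits per byte, high bit set when more bytes follow). This is an
assumption based on the bytes shown in the example above, not
tencdec's actual code; `encode_sketch` and `decode_sketch` are
hypothetical names.

```python
def encode_sketch(numbers):
    """Delta-encode a non-decreasing list of non-negative ints as varints."""
    out = bytearray()
    prev = 0
    for n in numbers:
        delta = n - prev
        if delta < 0:
            raise ValueError("numbers must be increasing and not negative")
        prev = n
        # Emit 7 bits at a time, low bits first; the high bit of a byte
        # means that more bytes follow for the same delta.
        while delta > 0x7F:
            out.append((delta & 0x7F) | 0x80)
            delta >>= 7
        out.append(delta)
    return bytes(out)


def decode_sketch(data):
    """Rebuild the list of ints by summing the decoded deltas."""
    numbers = []
    total = 0
    delta = 0
    shift = 0
    for byte in data:
        delta |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7  # continuation byte: keep accumulating this delta
        else:
            total += delta
            numbers.append(total)
            delta = 0
            shift = 0
    return numbers


numbers = [0, 1, 2, 3, 4, 28, 87, 87, 500, 501, 507, 2313]
enc = encode_sketch(numbers)
print(enc)
print(decode_sketch(enc) == numbers)
```

For this input the sketch produces the same 14 bytes as the example
above, which is why small gaps between numbers cost a single byte each.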

And it's very fast!

Using the numbers from the example above, `timeit` shows around 2
microseconds to encode or decode (on an AMD Ryzen 7 PRO 4750U CPU):

```
$ python3 -m timeit -s "import tencdec; numbers = [0, 1, 2, 3, 4, 28, 87, 87, 500, 501, 507, 2313]" "tencdec.encode(numbers)"
100000 loops, best of 5: 2.28 usec per loop
$ python3 -m timeit -s "import tencdec; e = tencdec.encode([0, 1, 2, 3, 4, 28, 87, 87, 500, 501, 507, 2313])" "tencdec.decode(e)"
100000 loops, best of 5: 2.42 usec per loop
```

The restrictions are that the numbers need to be integers (else
encoding will crash with `TypeError`) and monotonically increasing and
positive (repeated values and zero are fine, as in the example above).
The ordering is verified, because otherwise encoding gets into an
infinite loop. The verification is done with an `assert`, so you may
disable it by running Python with `-O` if you are already sure that
the list of numbers is OK.
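
If you do run with `-O`, one option is to validate any untrusted input
yourself before encoding. A minimal sketch follows; `check_numbers` is
a hypothetical helper, not part of tencdec, and it assumes that zero
and repeated values are acceptable.

```python
def check_numbers(numbers):
    """Raise if numbers has non-integers, negatives, or a decrease."""
    prev = 0
    for i, n in enumerate(numbers):
        if not isinstance(n, int):
            raise TypeError(f"item {i} is not an integer: {n!r}")
        # Starting prev at 0 means this also rejects a negative first item.
        if n < prev:
            raise ValueError(
                f"item {i} ({n}) is negative or smaller than the previous value"
            )
        prev = n


check_numbers([0, 1, 2, 3, 4, 28, 87, 87, 500, 501, 507, 2313])  # passes silently
```

Unlike an `assert`, these explicit `raise` statements are not removed
by `-O`, so the check keeps working in optimized mode.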

Note that there are no external dependencies for this. It's just
Python 3 and its standard library.


--
. Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org.ar/
Twitter: @facundobatista