ANN: Python-Blosc2 4.0 is out!

Francesc Alted

to Blosc, pyd...@googlegroups.com, pytabl...@googlegroups.com
Announcing Python-Blosc2 4.0.0
===============================

This is a major version release where we have accelerated computation via multithreading using the [miniexpr library](https://github.com/Blosc/miniexpr/tree/main). We have also changed the wheel layout to comply with PEP 427 and added support for the [blosc2-openzl plugin](https://github.com/Blosc/blosc2-openzl).

You can think of Python-Blosc2 4.x as an extension of NumPy/numexpr that:

- Can deal with NDArray compressed objects using first-class codecs & filters.
- Performs many kinds of math expressions, including reductions, indexing and more.
- Supports multi-threading and SIMD acceleration (via numexpr/miniexpr).
- Can operate with data from other libraries (like PyTables, h5py, Zarr, Dask, etc.).
- Supports NumPy ufunc mechanism: mix and match NumPy and Blosc2 computations.
- Integrates with Numba and Cython via UDFs (User Defined Functions).
- Adheres to modern array API standard conventions (https://data-apis.org/array-api/).
- Can perform linear algebra operations (like ``blosc2.tensordot()``).

Here is a glimpse of the kind of acceleration the new miniexpr engine can deliver (on an Ubuntu 24.04 box with an i9-13900K CPU and 32 GB of RAM):

In [1]: import numpy as np
In [2]: import blosc2
In [3]: import numexpr as ne

In [4]: %time a = np.linspace(0., 1., int(1e9), dtype=np.float32)
CPU times: user 1.41 s, sys: 1.14 s, total: 2.54 s
Wall time: 2.54 s
In [5]: %time b2a = blosc2.linspace(0., 1., int(1e9), dtype=np.float32)
CPU times: user 6.89 s, sys: 776 ms, total: 7.67 s
Wall time: 2.2 s

In [6]: %time np.sum(np.sin(a + 0.5))
CPU times: user 1.05 s, sys: 435 ms, total: 1.48 s
Wall time: 1.48 s
Out[6]: np.float32(8.068454e+08)
In [7]: %time ne.evaluate("sum(sin(a + 0.5))")
CPU times: user 3.79 s, sys: 41.9 ms, total: 3.83 s
Wall time: 3.83 s
Out[7]: array(8.0684536e+08)
In [8]: %time blosc2.evaluate("sum(sin(a + 0.5))")
CPU times: user 2.96 s, sys: 459 ms, total: 3.42 s
Wall time: 683 ms   # 2.2x faster than NumPy and 6.8x faster than NumExpr
Out[8]: np.float32(8.068454e+08)

In [9]: %time blosc2.evaluate("sum(sin(b2a + 0.5))")
CPU times: user 3.55 s, sys: 31.4 ms, total: 3.58 s
Wall time: 176 ms   # 8.4x faster than NumPy and 21.7x faster than NumExpr
Out[9]: np.float32(8.068453e+08)

In [10]: %time np.sum(np.sin(b2a + 0.5))    # blosc2 arrays support the NumPy ufunc/array interfaces
CPU times: user 3.46 s, sys: 53.2 ms, total: 3.52 s
Wall time: 174 ms
Out[10]: np.float32(8.068453e+08)

Here you can see blosc2.evaluate() acting as a drop-in replacement for numexpr.evaluate(); it supports parallelized reductions and works transparently with both native NumPy and Blosc2 arrays.  It achieves much better performance by making more effective use of the cache hierarchy in modern CPUs.  See the rationale at: https://ironarray.io/blog/miniexpr-powered-blosc2.

In addition, Python-Blosc2 can work transparently with data in memory, on disk, or over the network.  In these days when memory prices have skyrocketed, compression is an important subject, especially when it does not mean a drop in performance --and we are committed to expanding the scenarios where this is the case.

More info: https://blosc.org/python-blosc2/

Cheers,

--
Francesc Alted, on behalf of the Blosc2 development team