python-blosc assumptions while reading compressed data


Eli Stevens

Jun 11, 2020, 4:41:31 PM
to blosc
Hello,

I am writing compressed data via the C API. I have a fixed block size (256k) and a buffer size of 256k + min header size.  When I send in 384k of random data, I take the first 256k, compress it, and write it plus the header out, then do the same with the remaining 128k. That all seems to round-trip nicely.

I then write out that stream of data to a file, and would like to be able to read this data using the python API. It seems that the python API requires that the bytestring passed in contain only one chunk of data, and will not do any sort of iterative "read the header; decompress, read the next header, repeat" behavior. Is that correct?

In order to read the data, will I have to implement the "read header, check cbytes, slice bytes to length; pass to python-blosc" loop myself? Or is there some way to do this that I'm not seeing?

Thanks,
Eli


Eli Stevens

Jun 11, 2020, 6:13:02 PM
to blosc
Follow up: from http://python-blosc.blosc.org/reference.html it doesn't seem like there are python bindings for blosc_cbuffer_sizes, so it doesn't seem like it's possible to parse this data from Python.

Am I missing something?

Cheers,
Eli

Francesc Alted

Jun 12, 2020, 6:44:14 AM
to Blosc
Hi Eli,

The Blosc 1.x series only supports compressing and decompressing a single chunk, i.e. no frame format is defined. This will change in the Blosc 2.x series, where a proper frame format is being defined (https://github.com/Blosc/c-blosc2/blob/master/README_FRAME_FORMAT.rst). It is still in beta, though.

A wrapper for blosc_cbuffer_sizes() in python-blosc should be straightforward to implement. Feel free to file a ticket about it (PRs are welcome too).
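
In the meantime, the "read header, check cbytes, slice" loop you describe can be sketched in pure Python by reading the sizes straight from the 16-byte Blosc 1.x chunk header. This is only a rough sketch. It assumes the file is a plain concatenation of Blosc 1.x chunks with nothing in between, and that cbytes counts the header as well as the payload. The helper names cbuffer_sizes and iter_chunks are made up for illustration and are not part of python-blosc.

```python
import struct

BLOSC_HEADER_LENGTH = 16  # size of a Blosc 1.x chunk header, in bytes


def cbuffer_sizes(header):
    """Return (nbytes, cbytes, blocksize) parsed from a Blosc 1.x header.

    Hypothetical pure-Python stand-in for the C function
    blosc_cbuffer_sizes(); not part of python-blosc.
    """
    if len(header) < BLOSC_HEADER_LENGTH:
        raise ValueError("truncated Blosc header")
    # Bytes 0-3: version, versionlz, flags, typesize (one byte each).
    # Bytes 4-15: nbytes, blocksize, cbytes as little-endian uint32.
    nbytes, blocksize, cbytes = struct.unpack_from("<III", header, 4)
    return nbytes, cbytes, blocksize


def iter_chunks(stream):
    """Yield each compressed chunk (header included) found in `stream`.

    Assumes `stream` is a bytes-like object holding back-to-back chunks.
    """
    view = memoryview(stream)
    offset = 0
    while offset < len(view):
        header = view[offset:offset + BLOSC_HEADER_LENGTH]
        _, cbytes, _ = cbuffer_sizes(header)
        # cbytes is the full compressed size of the chunk, header included.
        if cbytes < BLOSC_HEADER_LENGTH or offset + cbytes > len(view):
            raise ValueError(f"corrupt or truncated chunk at offset {offset}")
        yield bytes(view[offset:offset + cbytes])
        offset += cbytes
```

With python-blosc installed, each yielded chunk can then be passed to blosc.decompress() one at a time, for example b"".join(blosc.decompress(c) for c in iter_chunks(raw)), where raw is the content of the file read in binary mode.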

Cheers



--
Francesc Alted