Issue regarding LZ4 Compression

Alphin Thomas

Feb 1, 2024, 11:48:22 AM
to LZ4c

I have a 32 KB input (.bin) file that I want to compress using the LZ4_compress_fast(source, destination, source_size, destination_size, acceleration) function. I've passed the arguments to the function accordingly. The actual size of the input data is 32768 bytes. After compression, I get a compressed size of 32898 bytes, so the output is larger than the input. I then passed this 32898-byte compressed output to decompression. The decompression was successful and I got back the original 32768 bytes.

Why was this data not compressed, and why did its size increase after compression?

ashra ivy

Feb 1, 2024, 11:52:10 PM
to LZ4c
Not all data can be compressed. Just think of "12345678". LZ4 cannot compress it because there are no repeating patterns, so it stores the bytes as literals.
If it is not clear why, you should read up on how compression works.
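The size increase reported above can be reproduced by hand. The following is a minimal sketch, assuming the compressor found no matches at all and emitted the whole input as a single run of literals in the LZ4 block format (one token byte, then extra length bytes, then the literals). The function name lz4_literal_only_size is hypothetical and is not part of the LZ4 library.

```c
#include <stddef.h>

/* Size of an LZ4 block that stores n bytes as one literal run (no matches).
 * Block format: 1 token byte whose high nibble holds the literal length
 * up to 15. For lengths >= 15, extra length bytes follow; each adds up
 * to 255, and the run of length bytes ends with a byte below 255.
 * NOTE: hypothetical helper for illustration, not an LZ4 API. */
size_t lz4_literal_only_size(size_t n)
{
    size_t size = 1;                 /* token byte */
    if (n >= 15) {
        size += (n - 15) / 255 + 1;  /* extra literal-length bytes */
    }
    return size + n;                 /* plus the literals themselves */
}
```

For n = 32768 this gives 1 + 129 + 32768 = 32898 bytes, which matches the compressed size reported in the first post: the 130 extra bytes are only the bookkeeping needed to say "32768 literals follow".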

Alphin Thomas

Feb 2, 2024, 12:02:07 AM
to lz...@googlegroups.com
Hi 

I've given some data to compress. The data is shown below:
unsigned char input_data_hex[] = { 0xd2, 0xb2, 0xd2, 0xb3, 0xd2, 0xb4, 0xd2, 0xb5, 0xd2, 0xb6, 0xd2, 0xb7, 0xd2, 0xb8, 0xd2, 0xb9, 0xd2, 0xba, 0xd2, 0xbb, 0xd2, 0xbc, 0xd2, 0xbd, 0xd2, 0xbe, 0xd2, 0xbf, 0xd2, 0xc0, 0xd2, 0xc1, 0xd2, 0xc2, 0xd2, 0xc3, 0xd2, 0xc4, 0xd2, 0xc5, 0xd2, 0xc6, 0xd2, 0xc7, 0xd2, 0xc8, 0xd2, 0xc9, 0xd2, 0xca, 0xd2, 0xcb, 0xd2, 0xcc, 0xd2, 0xcd, 0xd2, 0xce, 0xd2, 0xcf, 0xd2, 0xd0, 0xd2, 0xd1, 0xd2, 0xd2, 0xd2, 0xd3, 0xd2, 0xd4, 0xd2, 0xd5, 0xd2, 0xd6, 0xd2, 0xd7, 0xd2, 0xd8, 0xd2, 0xd9 };
As you can see, the above data contains repetitive patterns. Still, I can't compress it using LZ4.
What could be the problem?
Is there any other way to compress this data?

ashra ivy

Feb 2, 2024, 1:17:55 AM
to LZ4c
I see no repetition. It is always 0xd2, <any>. If you want to compress this, do your own "compression". For example, do not store 0xd2.
Or try another packer. There are plenty around.
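One possible shape for such a custom step is a delta transform. This is a minimal sketch, assuming (the thread does not say so) that the data is a sequence of big-endian 16-bit values that mostly increase by 1. The function names delta16_encode and delta16_decode are hypothetical, and n_bytes is assumed to be even.

```c
#include <stddef.h>
#include <stdint.h>

/* Replace each big-endian 16-bit value with its difference from the
 * previous value (mod 2^16). The first value is stored as its difference
 * from 0, i.e. unchanged. Hypothetical helper, n_bytes assumed even. */
void delta16_encode(const uint8_t *in, uint8_t *out, size_t n_bytes)
{
    uint16_t prev = 0;
    for (size_t i = 0; i + 1 < n_bytes; i += 2) {
        uint16_t cur = (uint16_t)((in[i] << 8) | in[i + 1]);
        uint16_t d = (uint16_t)(cur - prev);
        out[i] = (uint8_t)(d >> 8);
        out[i + 1] = (uint8_t)(d & 0xff);
        prev = cur;
    }
}

/* Inverse of delta16_encode: rebuild each value by adding the stored
 * difference to the previously decoded value. */
void delta16_decode(const uint8_t *in, uint8_t *out, size_t n_bytes)
{
    uint16_t prev = 0;
    for (size_t i = 0; i + 1 < n_bytes; i += 2) {
        uint16_t d = (uint16_t)((in[i] << 8) | in[i + 1]);
        uint16_t cur = (uint16_t)(prev + d);
        out[i] = (uint8_t)(cur >> 8);
        out[i + 1] = (uint8_t)(cur & 0xff);
        prev = cur;
    }
}
```

Applied to the array posted above, the encoded output would be 0xd2, 0xb2 followed by repeated 0x00, 0x01 pairs, which is the kind of long repetition a match-based compressor can exploit. Unlike simply dropping 0xd2, this also survives the high byte changing.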

ashra ivy

Feb 2, 2024, 1:23:23 AM
to LZ4c


alphin...@gmail.com wrote on Friday, February 2, 2024 at 06:02:07 UTC+1:

Alphin Thomas

Feb 2, 2024, 1:34:51 AM
to lz...@googlegroups.com
Hi
uint8_t data_ptr[32 * 1024];

for (size_t i = 0; i < BUFFER_SIZE; ++i) {
    data_ptr[i] = (uint8_t)rand();  /* keeps only the low 8 bits of rand() */
}
int compressed_size = LZ4_compress_fast((const char*)data_ptr, compressed_data, input_size, max_compressed_size, acceleration);
Also, why can't I compress the data generated by this program?
I've stored the output of rand() in the data_ptr buffer. Obviously there will be repetition, because the data type of data_ptr is uint8_t, i.e. values from 0 to 255.
Still, that data has not been compressed.
What could be the reason for this?


ashra ivy

Feb 2, 2024, 1:45:21 AM
to LZ4c
Please read how LZ4 works, then you will understand.
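The key point is that repeated byte values are not enough: LZ4 only encodes a match when a sequence of at least 4 consecutive bytes has appeared earlier. The sketch below counts how many positions in a buffer start such a repeated 4-byte sequence. The function name count_repeated_4grams is hypothetical, and the quadratic search is chosen for clarity rather than speed; it does not model LZ4's hash table or its 64 KB window.

```c
#include <stddef.h>
#include <string.h>

/* Count positions whose 4-byte sequence already occurred at an earlier
 * position in the buffer. A count near zero means a match-based
 * compressor has almost nothing to work with.
 * Hypothetical helper: naive O(n^2) scan for illustration only. */
size_t count_repeated_4grams(const unsigned char *buf, size_t n)
{
    size_t repeats = 0;
    for (size_t i = 1; i + 4 <= n; ++i) {
        for (size_t j = 0; j < i; ++j) {
            if (memcmp(buf + i, buf + j, 4) == 0) {
                ++repeats;
                break;  /* one earlier occurrence is enough */
            }
        }
    }
    return repeats;
}
```

For the 0xd2, <any> data posted earlier, every 4-byte window is distinct, so the count is 0 even though 0xd2 occurs in every other byte. For 32 KB of uniformly random bytes the count would also be expected to stay close to 0, since there are 2^32 possible 4-byte sequences and only about 2^15 positions.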

Takayuki Matsuoka

Feb 2, 2024, 3:53:58 PM
to LZ4c
Hi Alphin,

I think the following r/ELI5 post may help you build an intuition for
what you are trying to do:

ELI5: Why is it mathematically impossible to compress random numbers?
https://www.reddit.com/r/explainlikeimfive/comments/avojnw/


But yes, you can compress the output of rand(), a.k.a. a pseudo-random number generator.
Since rand() is a (most likely tiny) program, you can transmit the code itself
and its seed instead of transmitting the pseudo-random numbers verbatim.
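As a minimal sketch of that idea: if sender and receiver run the same generator, the whole buffer is determined by the seed alone. The function name fill_from_seed is hypothetical. The C standard guarantees that srand() with the same seed yields the same rand() sequence within one implementation, but different C libraries may produce different sequences, so this assumes both sides use the same one.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Regenerate n pseudo-random bytes from a seed. The "compressed" form
 * of the buffer is then just (seed, n). Hypothetical helper; assumes
 * both sides use the same rand() implementation. */
void fill_from_seed(uint8_t *buf, size_t n, unsigned seed)
{
    srand(seed);
    for (size_t i = 0; i < n; ++i) {
        buf[i] = (uint8_t)rand();  /* keep the low 8 bits, as in the question */
    }
}
```

Calling fill_from_seed twice with the same seed and length reproduces the same bytes, so only the seed and length need to be transmitted. This works only because the data came from a known program; it says nothing about compressing arbitrary random-looking bytes.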


And in the big picture, there are many (im)possibilities when it comes to compressing data:
https://mattmahoney.net/dc/dce.html#Section_14


But again, as ashra said, the main intention of lz4 (and LZ77 variants)
is not to implement a general/meta/invincible compression algorithm.  It exploits the
structure (redundancy) of artificial/human-made data.


Finally, just an FYI, we recently had a similar but different discussion:
https://github.com/lz4/lz4/discussions/1354


Hope this helps,