LZ4 tool with sparse files and block discards


Neil Wilson

Feb 10, 2015, 5:46:36 AM2/10/15
to lz...@googlegroups.com
I'm using the lz4 cli tool to compress and decompress raw virtual machine images. One of the features of such files is that they tend to have a lot of zeros in them.

At the moment the cli tool writes out those zeros as it does with any other data. However it would be useful if there was some way of detecting a block of zeros in the compression stream and changing that into either a seek on a normal file to create a sparse file, or a block discard or block zero out call for a block device [1] (or continue to write out the zeros if we're writing to a stream or character device). 

At the moment I'm working around the lack of zero detection by piping the cli tool into 'cp', i.e. lz4 -cd <source file> | cp --sparse=always /dev/stdin <target file>, but of course that means the zeros still have to be written and read, which seems a bit of a waste of time if you're after high speed.
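(For reference, whether a file actually ended up sparse can be checked by comparing its apparent size with the blocks actually allocated; a sketch assuming GNU coreutils on a filesystem with sparse-file support, demo.img being an arbitrary name:)

```shell
# Create a file with a large apparent size but no allocated data blocks.
truncate --size 1G demo.img

# Apparent size in bytes vs. 512-byte blocks actually allocated on disk.
stat -c 'apparent: %s bytes, allocated: %b blocks' demo.img
```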

Is there a way to detect a zero block so that it can be treated as a special case and would such a facility be considered useful for inclusion in the lz4 cli tool? I'm happy to try and write the patch to the code if so. 


Rgs

NeilW


Yann Collet

Feb 10, 2015, 8:37:42 AM2/10/15
to lz...@googlegroups.com
Hi Neil


This looks like a file-system specific approach.

The frame format doesn't define a "zero-filled block" signal.
So, currently, detecting that a block is "filled with zeroes" is an external process, which is not performed by lz4 cli.

I've no experience regarding sparse file techniques.
My guess is that it is likely FS specific, with potentially some portability issues.
It is always difficult to balance between optimisation, portability and maintenance.

Of course, you can prove me wrong, and show that it can be done in an efficient manner, without jeopardising portability.

If it can be done, the important question seems to be: does this complexity improve performance enough?


Regards

Takayuki Matsuoka

Feb 10, 2015, 9:11:52 AM2/10/15
to lz...@googlegroups.com
Hi guys.
Personally, I always rely on "tar -S". But it seems like an interesting problem to me.

@Neil, could you provide your own benchmark?
I'm not sure about your operations, but they may look like the following commands:

dd if=/dev/zero of=my-4gb-file seek=4G bs=1 count=1
lz4 -9 -f my-4gb-file my-4gb-file.lz4
time lz4 -d -f my-4gb-file.lz4 my-4gb-file.lz4.out
time lz4 -d -c my-4gb-file.lz4 | cp /dev/stdin my-4gb-file.lz4.cp.out
time lz4 -d -c my-4gb-file.lz4 | cp --sparse=always /dev/stdin my-4gb-file.lz4.sparse.out


@Yann, just an FYI, aside from this topic.

These are simple and generic examples to make sparse file with standard C library which works both on *nix and Windows :
http://www.unixguide.net/unix/sparse_file.shtml
http://stackoverflow.com/a/11909465/2132223

Also, xz has fairly simple and generic method to support sparse file
in src/xz/file_io.c, see io_write(), is_sparse()
http://git.tukaani.org/?p=xz.git;a=blob_plain;f=src/xz/file_io.c;hb=HEAD

I'm not sure about the performance gain, but it is platform independent and works well on platforms which have sparse file capability.


Hope this helps.

Yann Collet

Feb 10, 2015, 9:29:34 AM2/10/15
to lz...@googlegroups.com
This is very interesting.
Thanks for the excellent informative links Takayuki.

So, an fseek() is enough to transparently trigger a "sparse write" on file systems which do support it, without bothering those which do not?
Impressive; this is trivial, and indeed very portable.

I'm just a bit curious: is fseek() guaranteed to always produce zero-filled space, with no risk of "garbage-filled" space instead?


Rgds

Neil Wilson

Feb 10, 2015, 9:35:34 AM2/10/15
to lz...@googlegroups.com
From POSIX.

"The fseek() function shall allow the file-position indicator to be set beyond the end of existing data in the file. If data is later written at this point, subsequent reads of data in the gap shall return bytes with the value 0 until data is actually written into the gap."

Neil Wilson

Feb 10, 2015, 9:49:59 AM2/10/15
to lz...@googlegroups.com
A slightly faster 'is_empty', perhaps, if there is no easier way of checking for blanks than scanning the decompression buffer:

#include <string.h>

/* Non-zero when all 'size' bytes of 'buf' are zero: buf[0] must be zero
   and every byte must equal its successor (overlapping memcmp trick). */
int is_empty(const char *buf, size_t size)
{
    return size > 0 && buf[0] == 0 && !memcmp(buf, buf + 1, size - 1);
}

Yann Collet

Feb 10, 2015, 9:58:31 AM2/10/15
to lz...@googlegroups.com
Excellent. This looks great.
I'll try a few benchmarking exercises too, to optimise the detector; this is already a great start.

Neil Wilson

Feb 10, 2015, 10:35:59 AM2/10/15
to lz...@googlegroups.com
The operations I'm doing are on large VM images. Let me describe what I have and then do the benchmarks on that. 

First take the latest Ubuntu Trusty image (trusty-server-cloudimg-amd64-disk1.img) which is in QEMU qcow format, convert to raw and expand to 80G. 

qemu-img convert -O raw trusty-server-cloudimg-amd64-disk1.img testy.img
truncate --size 80G testy.img

which gives this on the filesystem

# ls -lash testy.img
807M -rw-r--r--. 1 root root 80G Feb 10 14:51 testy.img

After compressing with lz4 as you suggest I get the following results when decompressing to new images. 

# time lz4 -d -f testy.img.lz4 new-image
Successfully decoded 85899345920 bytes                                         

real 3m34.750s
user 1m48.547s
sys 1m21.370s

# time lz4 -c -d testy.img.lz4 | cp /dev/stdin new-image.cp

real 5m13.574s
user 2m30.285s
sys 3m12.671s

# time lz4 -c -d testy.img.lz4 | cp --sparse=always /dev/stdin new-image.sparse

real 3m21.534s
user 2m42.877s
sys 1m15.336s

Compare that with the sparse aware conversion from qemu-img

# time qemu-img convert -O raw ~/trusty-server-cloudimg-amd64-disk1.img new-image.qemu && truncate --size 80G new-image.qemu

real 0m9.174s
user 0m7.576s
sys 0m1.314s

Even if you read the entire 80G raw file, it's faster, because the writes are optimised:

# time qemu-img convert -O raw testy.img new-image.qemu2

real 1m19.341s
user 0m15.849s
sys 1m1.690s

Now qemu-img is of course designed for doing that sort of thing from its own image format, but it cannot stream in the way lz4 can. I'm hoping that we can get lz4 down to somewhere near the qemu-img times for this sort of use case. That would save a lot of write bandwidth on the servers which helps save wear on the SSDs.



Yann Collet

Feb 10, 2015, 4:45:32 PM2/10/15
to lz...@googlegroups.com

Takayuki Matsuoka

Feb 10, 2015, 9:13:29 PM2/10/15
to lz...@googlegroups.com
Thanks Neil!
I'm moving this to issue 155: https://code.google.com/p/lz4/issues/detail?id=155

Yann Collet

Feb 11, 2015, 1:00:47 AM2/11/15
to lz...@googlegroups.com
Oh yes, sorry, my previous link was incorrect. It's issue 155, as described by Takayuki.