What can the static library do in terms of compression without Zlib or other external libraries?


Johnathan Baird

Aug 29, 2018, 12:04:38 PM
to libarchive-discuss
Hi,

I'm trying to write a simple program that wraps up some files into an archive and compresses the archive.

I don't really care what format is used as long as the compression is relatively effective. What formats does the library support without the help of external libraries? I've tried each call of the form archive_write_set_compression_* one by one, but the only one that doesn't return an error code and produces a file smaller than the original files is archive_write_set_compression_compress. Are any others supposed to work without external library support? Does it matter that these functions seem to be marked as deprecated?

One thing the GitHub wiki documentation doesn't explain very well is the difference between the archive_write_set_compression_*, archive_write_add_filter_*, and archive_write_set_format_* calls... so I'm kind of just calling things at random and observing the output's file size, which is pretty time-consuming.

I'm not experienced with compression/decompression so a lot of this is new to me. Any guidance would be appreciated.

Thanks,
Johnathan

Tim Kientzle

Sep 1, 2018, 6:34:42 PM
to Johnathan Baird, libarchiv...@googlegroups.com

> On Aug 29, 2018, at 9:04 AM, Johnathan Baird <janim...@gmail.com> wrote:
>
> Hi,
>
> I'm trying to write a simple program that wraps up some files into an archive and compresses the archive.
>
> I don't really care what format is used as long as the compression is relatively effective. What formats does the library support without the help of external libraries? I've tried each call of the form archive_write_set_compression_* one by one, but the only one that doesn't return an error code and produces a file smaller than the original files is archive_write_set_compression_compress. Are any others supposed to work without external library support?

Writing good compression code requires very specialized expertise. For that reason, libarchive relies primarily on external library implementations developed by experts in those formats. At a minimum, you should try to get the Zlib compression library, which is used for "deflate" and "gzip" compression and is also used within the "zip" format. (Note, we've actually discussed making libarchive require Zlib, since it's so common.)

> Does it matter that these functions seem to be marked as deprecated?

The "set_compression" versions exist only for compatibility with libarchive 2.x. New code should use the "add_filter" equivalents; for example, archive_write_add_filter_gzip() replaces archive_write_set_compression_gzip().

>
> One thing the GitHub wiki documentation doesn't explain very well is the difference between the archive_write_set_compression_*, archive_write_add_filter_*, and archive_write_set_format_* calls... so I'm kind of just calling things at random and observing the output's file size, which is pretty time-consuming.

Generally, libarchive creates archives by:
* Combining "entries" into a single "archive" (the archive_write_set_format_* calls choose the archive format)
* Applying one or more "filters" to process the result (the archive_write_add_filter_* calls add them)

There are a variety of filters that can be combined for different purposes. Some are for compression; others serve other purposes.
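
The two stages described above can be sketched outside libarchive. The example below is only an illustration of the concept, assuming Python's standard library (tarfile, gzip, bz2) as a stand-in; it does not call libarchive, and the helper names build_tar and list_tar and the sample data are made up for this sketch. Stage 1 plays the role of choosing a format, and stage 2 plays the role of adding a filter.

```python
import bz2
import gzip
import io
import tarfile


def build_tar(entries):
    """Stage 1: combine named entries into one uncompressed tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in entries.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_tar(raw_tar):
    """Read back the entry names from an uncompressed tar archive."""
    with tarfile.open(fileobj=io.BytesIO(raw_tar), mode="r:") as tar:
        return tar.getnames()


# Hypothetical sample data, chosen only to be compressible.
entries = {"a.txt": b"hello " * 1000, "b.txt": b"world " * 1000}
raw_tar = build_tar(entries)

# Stage 2: apply a filter to the finished archive. The archive bytes are
# the same either way; only the filter differs.
as_gzip = gzip.compress(raw_tar)
as_bzip2 = bz2.compress(raw_tar)

print(len(raw_tar), len(as_gzip), len(as_bzip2))
print(list_tar(gzip.decompress(as_gzip)))
```

Because the format and the filter are independent choices, swapping gzip for bzip2 here changes one line and leaves the archive-building code untouched.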

>
> I'm not experienced with compression/decompression so a lot of this is new to me. Any guidance would be appreciated.

Unless you have pretty specialized requirements, I recommend people use either:

* Zip format without any additional libarchive filters. By default, Zip compresses each entry individually (using Zlib's "deflate" algorithm), which usually provides reasonable compression. Zip is a widely supported, well-documented, and very stable format. If you intend to share the resulting archives with other people, there is a very good chance they already have the software they'll need to unpack Zip archives. (macOS and Windows can both expand a Zip archive by just double-clicking it.)

* Tar format with a Gzip or Bzip2 compression filter. Tar combines entries without compression; the subsequent compression works over all the data together, which usually yields a slightly smaller result. The resulting tar.gz or tar.bz2 output is also standard and well-supported on almost every platform.
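
The size difference between per-entry and whole-stream compression can be seen in a small experiment. This is a sketch only, assuming Python's standard zipfile and tarfile modules rather than libarchive; the helper names and the 50 synthetic files are assumptions made for illustration, and the result applies to this input, not to archives in general.

```python
import io
import tarfile
import zipfile

# Synthetic data (an assumption for illustration): 50 small, similar text files.
files = {
    f"notes/file{i:02d}.txt": (
        f"Record {i}: the quick brown fox jumps over the lazy dog.\n" * 4
    ).encode()
    for i in range(50)
}


def zip_bytes(entries):
    """Zip: each entry is deflated on its own."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in entries.items():
            zf.writestr(name, data)
    return buf.getvalue()


def tar_gz_bytes(entries):
    """Tar + gzip: entries are combined first, then compressed as one stream."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in entries.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


zip_size = len(zip_bytes(files))
tar_gz_size = len(tar_gz_bytes(files))
print(f"zip: {zip_size} bytes, tar.gz: {tar_gz_size} bytes")
```

With many small, similar files the gap is larger than usual, because the whole-stream compressor can reuse text it has already seen in earlier entries. With a few large, unrelated files the two sizes tend to be much closer.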

Libarchive supports many other combinations, of course. Some provide much better compression than the above but are less well-known.

Tim

P.S. Tar format with the Compress filter is about the best you can do with libarchive without any additional libraries.

Joerg Sonnenberger

Sep 1, 2018, 6:54:31 PM
to janim...@gmail.com, libarchiv...@googlegroups.com
On Wed, Aug 29, 2018 at 6:04 PM Johnathan Baird <janim...@gmail.com> wrote:
> I don't really care what format is used as long as the compression is relatively effective. What formats does the library support without the help of external libraries? I've tried each call of the form archive_write_set_compression_* one by one, but the only one that doesn't return an error code and produces a file smaller than the original files is archive_write_set_compression_compress. Are any others supposed to work without external library support? Does it matter that these functions seem to be marked as deprecated?

Besides the answer from Tim, let me slightly expand on why compress(1) support exists in libarchive in its current form. The reason is simply that there is no standalone compress(1) library to depend on. As such, the only way to create the output is to include the code here. There is very little reason not to depend on zlib; it is actually quite small.

Joerg