ANNOUNCE: shorten-1.00 - lossless compression for waveform files

0 views
Skip to first unread message

Tony Robinson

unread,
Mar 30, 1993, 10:30:39 AM3/30/93
to
Shorten is a lossless compressor for waveform files such as speech. It is
available from svr-ftp.eng.cam.ac.uk in directory misc as shorten-1.00.shar.

The following changes have been made in version 1.00

Compression and decompression speed are several times faster

Multiple channel files are better supported

Full linear predictive coding is supported which gives slightly
better compression but slower execution times

Support is provided for embedded applications

It has been tested on many UNIX machines and has even been known to
compile under DOS.

The man page is appended for further details.


Tony Robinson

------------------------------------------------------------------------------
Tony Robinson Phone +44 223 332815 Fax +44 223 332662 Email a...@eng.cam.ac.uk
Cambridge University Engineering Department, Trumpington Street, Cambridge, UK
--------"Steve and Mary run and play in the park and pool respectively"-------


SHORTEN(1) USER COMMANDS SHORTEN(1)

NAME
shorten - lossless compression for audio files

SYNOPSIS
shorten [ -xh ] [ -a align bytes ] [ -b block size ] [ -c
channels ] [ -m blocks ] [ -p prediction order ] [ -t file
type ] [ -v version ] [ input file ] [ output file ]

DESCRIPTION
shorten reduces the size of audio files using Huffman coding
of prediction residuals. If only one file name is speci-
fied, the compressed file is written to standard output. If
no file names are specified, standard input is read.
Filenames can be replaced by "-".

The amount of compression obtained depends on the nature of
the audio signal. Those composing of low frequencies and
low amplitudes give the best compression. Compression is
generally better than that obtained by general purpose
compression utilities applied to audio files.

OPTIONS
-a align bytes
Specify the number of bytes needed to be copied verba-
tim before compression begins according to the speci-
fied file type. This is useful if header information
is not a multiple of the number bytes per sample.

-b block size
Specify the number of samples to be grouped into a
block for processing. Within a block the signal ele-
ments are expected to have the same spectral charac-
teristics. The default option works reasonably well.

-c channels
Specify the number of independent interwoven channels.
For two signals, a(t) and b(t) the original data format
is assumed to be a(0),b(0),a(1),b(1)...

-h Give a short message specifying usage options.

-m blocks
Specify the number of past blocks to be used to esti-
mate the mean of the signal. The default value of zero
disables this prediction and the mean is assumed to lie
in the middle of the range of the relevant data type
(i.e. at zero for signed quantities).

-p prediction order
Specify the order of the linear predictive filter. The
default value of zero disables the use of linear pred-
iction and a polynomial interpolation method is used
instead. The use of the linear predictive filter gen-
erally results in a small improvement in compression
ratio at the expense of execution time. Compression
time is linear in the specified order whilst decompres-
sion time is about twice that of the default polynomial
interpolation.

-t file type
Gives the type of the sound sample file as one of
{au,s8,u8,s16,u16,s16x,u16x,s16hl,u16hl,s16lh,u16lh}.
au is the file type of ulaw encoded files. All the
other types have initial s or u for signed or unsigned
data, followed by 8 or 16 as the number of bits per
sample. No further extension means the data is in the
natural byte order, a trailing x specfies byte swapped
data, hl explitly states the byte order as high byte
followed by low byte and hl the converse.

-v version
Specify the binary format version number. At the
moment, version 1 can write version 0 files, although
continuation of this feature is not guarenteed. It
will always be possible to unpack files packed with a
lower version number.

-x Extract. Reconstruct the original file. All other
command line options are ignored.

METHODOLOGY
shorten works by blocking the signal, making a model of each
block in order to remove temporal redundancy, then Huffman
coding the prediction residual.

Blocking
The signal is read in a block of about 128 or 256 samples,
and converted to ints with expected mean of zero. Sample-
wise-interleaved data is converted to separate channels,
which are assumed independent.

Decorrelation
Four functions are computed, corresponding to the signal,
difference signal, second and third order differences. The
one with the lowest variance is coded. The variance is
measured by summing absolute values for speed and to avoid
overflow.

Compression
It is assumed the signal has the Lapaclian probability den-
sity function of exp(-abs(x)). There is a computationally
efficient way of mapping this density to huffman codes, The
code is in two parts, a run of zeros a bounding one and a
fixed number of bits mantissa. The number of leading zeros
gives the offset from zero. Signed numbers are stored by
calling the function for unsigned numbers with the sign in
the lowest bit. Some examples for a 2 bit mantissa:

100 0
101 1
110 2
111 3
0100 4
0111 7
00100 8
0000100 16

INTERNAL FILE STRUCTURE
The structure of a compressed file is:

1) four bytes for a magic number

2) one byte for a format version

3) The command line options

4) a repeating sequence of <command> <args> ...

SEE ALSO
compress(1),pack(1).

DIAGNOSTICS
Exit status is normally 0. A warning is issued if the file
is not properly aligned, i.e. a whole number of records
could not be read at the end of the file.

BUGS
No check is made for inceasing file size, but valid speech
files generally achieve some compression. Even compressing
a file of random bytes (which represents the worst case
audio file, that of maximum amplitude white noise) only
results in a small increase in the file length (about 6%).

In versions prior to 1.0, not enough space was allocated
when the specified blocksize was more than the default.
This could have caused data corruption problems. This has
now been fixed.

Future enchancements that are likely to improve the compres-
sion are pitch period tracking and arithmetic coding.
Please mail me with bugs, bug fixes and any other sugges-
tions.

AVAILABILITY
The latest version can be obtained by anonymous FTP from
svr-ftp.eng.cam.ac.uk, directory misc.

AUTHOR
Copyright (C) 1992,1993 by Tony Robinson (a...@eng.cam.ac.uk)

The aim of the copying and usage restrictions is that is I
want people to be able to use the software freely, modify if
they like, but not sell it or claim it is their own work.

Thanks to Dolf Grunbauer, Kris Huber and the NIST gang for
providing motivation and valuable feedback.

Reply all
Reply to author
Forward
0 new messages