New audio compression technique?


G. Orme

Apr 30, 2000, 3:00:00 AM
Hi,
I've been working on a way of compressing music files, and wonder if
anyone has heard of a technique like this. Also I'd be interested in getting
a quote for building a prototype if it's feasible.
As you know, music contains a lot of redundant data. For example, the
frequencies are usually restricted to a musical scale, and we can make a fair
copy of a song using MIDI that is much smaller than an MP3. It is also
common that the instruments and voices rarely change much, and effects
like reverb rarely change either.
The idea then is to make a MIDI-like transcription of the song, and
record samples of the instruments, and perhaps different sounds of the
singers, approximately the same reverb, etc. One then plays this back on a
sampler program, one that plays back samples according to MIDI song scores.
One then has a reasonable imitation of a popular song which, instead of being
say 4 MB, is say 400K including samples and MIDI data. One then compares the
two waveforms and subtracts one from the other, placing the result in a
third waveform file. If one then added this difference waveform to the
played MIDI file, one should hear a close approximation to the original.
One then saves this difference file as an MP3.
When someone downloads a song they download the samples, the MIDI, and this
difference waveform. The sampler program plays the samples with the MIDI
data, adds the difference waveform and plays a possibly good copy of the
original. It's hard to estimate how well this should work, but here goes. A
MIDI file with samples should be 10% of the size of the MP3, from my
experience. Say then the MIDI song is consistently about 90% faithful to the
original (add any figure you think is more accurate here); then one would
expect the difference waveform to be about 10% of the size of the original,
also sent as an MP3 (perhaps less compressible). Together that is about 20%
of the MP3's size, so one might then have a system that makes a song five
times smaller than an MP3.
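
A rough sketch of the scheme described above, under stated assumptions: the function `synth_approximation` is a hypothetical stand-in for the sampler playing the MIDI score (here it is just a scaled copy of the original, which a real sampler would not give you), and the 10% figures are the guesses from the post, not measurements. It only shows the subtract-then-add-back step and the size arithmetic.

```python
import math

def synth_approximation(original, gain=0.9):
    """Hypothetical stand-in for the sampler output (assumption):
    a scaled copy of the original instead of a real MIDI rendering."""
    return [gain * s for s in original]

def difference(original, approx):
    """Difference waveform: what the approximation gets wrong."""
    return [o - a for o, a in zip(original, approx)]

def reconstruct(approx, diff):
    """Playback side: sampler output plus the difference waveform."""
    return [a + d for a, d in zip(approx, diff)]

def estimated_total_fraction(midi_fraction=0.10, diff_fraction=0.10):
    """Size of MIDI + samples + difference file, as a fraction of the MP3.
    Both defaults are the guesses from the post."""
    return midi_fraction + diff_fraction

# A short 440 Hz test tone at an 8 kHz sample rate (illustrative values).
original = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(64)]
approx = synth_approximation(original)
diff = difference(original, approx)
restored = reconstruct(approx, diff)
```

The round trip is only exact here because the difference is kept as raw numbers. Saving it as an MP3, as proposed, would make the reconstruction approximate.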

Errol Smith

Apr 30, 2000, 3:00:00 AM
On Sun, 30 Apr 2000 06:14:31 GMT, "G. Orme" <ano...@marsattack.com>
wrote:

> The idea then is to make a MIDI like transcription of the song, and
>record samples of the instruments, and perhaps different sounds of the
>singers, approximately the same reverb, etc. One then plays this back on a
>sampler program, one that plays back samples according to MIDI song scores.
>One then has a reasonable imitation of a popular song which instead of being
>say 4 meg is say 400K including samples and MIDI data. One then compares the

This sounds like the Amiga-originated MOD format and its more
recent PC-based variations like S3M, XM, IT, etc. They are essentially
samples with sequencing/effects information. Find yourself a module
player and get yourself some MODs from www.hornet.org or a similar
repository.

Errol Smith
errol <at> ros (dot) com [period] au

G. Orme

Apr 30, 2000, 3:00:00 AM

Errol Smith <em...@see.signature.com> wrote in message
news:390bf85c...@news.ros.com.au...

G. These programs seem to be sampler programs that play MIDI-like scores, but
they don't produce a difference file between the playback and the original.

D.A.Kopf

Apr 30, 2000, 3:00:00 AM
to G. Orme
G. Orme wrote:
<snip>

> MIDI file with samples should be 10% of the size of the MP3 from my
> experience. Say then the MIDI song is consistently about 90% faithful to the
> original (add any figure you think is more accurate here), then one would
> expect the subtracted waveform to be about 10% of the size of the original,
> also sent as an MP3 (perhaps less compressible). One might then have a
> system to make a song five times smaller than an MP3.

A 90% faithful-in-amplitude waveform probably involves average real-space
differences of 10 bits per sample with a flattish, incompressible distribution.
The correction would end up being larger than the MP3 file! It might work
decently in Fourier space, but you'd need quite a bit more processing compared
to MP3 playback.
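
To illustrate why a sample-by-sample difference stays large, here is a minimal sketch under a simple assumed model: the approximation is a unit sine wave that matches the original exactly except for a small phase (timing) error. The function names and the 0.1 radian figure are illustrative choices, not anything measured.

```python
import math

def residual_peak(phase_error_rad, n=1000):
    """Peak of sin(x) - sin(x + phase_error) over one cycle, sampled at n points."""
    peak = 0.0
    for k in range(n):
        x = 2 * math.pi * k / n
        peak = max(peak, abs(math.sin(x) - math.sin(x + phase_error_rad)))
    return peak

def bits_saved(residual_ratio):
    """Bits per sample saved when the residual's amplitude is
    residual_ratio times the original's (before any entropy coding)."""
    return -math.log2(residual_ratio)

# A 0.1 rad phase error leaves a residual of about 10% of full amplitude,
# which saves only about 3.3 bits out of each 16-bit sample.
peak = residual_peak(0.1)
saved = bits_saved(0.1)
```

At 440 Hz a 0.1 radian error corresponds to a timing offset of roughly 36 microseconds, so under this model even a very well-aligned imitation leaves most of the bits in the time-domain difference.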


G. Orme

Apr 30, 2000, 3:00:00 AM

D.A.Kopf <d...@dakx.com> wrote in message news:390C3903...@dakx.com...

G. It's hard to say how accurate it might be. Instruments might be very
close, and voices less so. The subtracted file should still sound like a
song, as it is still the same notes but with slightly different overtones
and much softer. Because the method is so different from MP3, shouldn't the
subtracted file still compress?


Robert Sefton

Apr 30, 2000, 3:00:00 AM
I have a feeling your difference file would not compress any better
in MP3 than any other musical track. You need to find a way to take
advantage of the correlation with the MIDI file. MP3 won't do that.
You might get better results using some kind of difference coding
combined with run-length + Huffman coding. But that method would not
have any perceptual coding capability to take advantage of dropping
the information you don't need. Keep thinking - it's possible you
could come up with a scheme with decent compression, but likely it
will require more encoding and decoding horsepower than a comparably
sized MP3 file.
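
A minimal sketch of the difference-plus-Huffman idea mentioned above, assuming integer samples. Run-length coding is left out for brevity, the function names are hypothetical, and the bit count ignores the cost of storing the code table, so this only shows why delta coding can help an entropy coder.

```python
import heapq
from collections import Counter

def delta_encode(samples):
    """Replace each sample by its difference from the previous one."""
    prev = 0
    out = []
    for s in samples:
        out.append(s - prev)
        prev = s
    return out

def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code over symbols."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    # Heap entries: (count, tie-breaker, symbols in this subtree).
    heap = [(count, i, [sym]) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freq, 0)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, s1 = heapq.heappop(heap)
        c2, _, s2 = heapq.heappop(heap)
        for sym in s1 + s2:
            lengths[sym] += 1  # each merge adds one bit to these codes
        heapq.heappush(heap, (c1 + c2, tie, s1 + s2))
        tie += 1
    return lengths

def coded_bits(symbols):
    """Total bits to send symbols with their Huffman code (table not counted)."""
    lengths = huffman_code_lengths(symbols)
    return sum(lengths[s] for s in symbols)

ramp = [10, 11, 12, 13, 14, 15, 16, 17]   # smooth, highly correlated signal
raw_bits = coded_bits(ramp)                # 8 distinct symbols, 3 bits each
delta_bits = coded_bits(delta_encode(ramp))  # mostly the symbol 1
```

The smooth ramp codes far more compactly after delta coding because almost every difference is the same symbol. A noisy difference file would not show this gain, which is the concern raised above.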


"G. Orme" <ano...@marsattack.com> wrote in message
news:CzZO4.8290$v85....@news-server.bigpond.net.au...

G. Orme

May 1, 2000, 3:00:00 AM

Robert Sefton <rse...@nextstate.com> wrote in message
news:8eiqjo$1l1b$1...@thoth.cts.com...

> I have a feeling your difference file would not compress any better
> in MP3 than any other musical track.

G. I'm not sure what you mean here. I think if the difference file only
compressed the same as MP3, say 12:1, then that would be OK. Another
possibility is to compress it harder to get it to a manageable size, making
up for the difficulty in compressing it; the quality loss would not be as
noticeable, since it is only the difference file.

> You need to find a way to take
> advantage of the correlation with the midi file. MP3 won't do that.
> You might get better results using some kind of difference coding
> combined with run-length + Huffman coding. But that method would not
> have any perceptual coding capability to take advantage of dropping
> the information you don't need. Keep thinking - it's possible you
> could come up with a scheme with decent compression, but likely it
> will require more encoding and decoding horsepower than a comparably
> sized MP3 file.

G. One can do this, I think, with a principle similar to Huffman encoding.
Take common samples of a good size to play with MIDI, then more unusual
samples of smaller size, also played with MIDI, to cover for example parts of
distortion guitar, or vowels or consonants the singer uses, then smaller again
to cover say various overtones, or a pattern of claps approximating
applause. The total might have 100 different samples ranging from the most
common sounds to the most rare, each additional sample cutting down the size
the difference file has to be. A program may be able to automate this in a
similar way to normal compression, in that it searches for repetitive or
similar waveforms, and for waveforms that are similar but at certain
frequencies, that is, on the musical scale. It then takes these samples and
builds something like a Huffman tree to determine which sample sizes are most
economical to send with the MIDI file to make the difference file smaller.

In a way this format is highly compressible because songs by their nature
have the same voices and instruments throughout, which are unable to change
much. For example, special effects such as reverb, chorus, and overdrive
distortion usually don't change, and these can be simulated by PC sampler
programs reasonably closely in real time, again making the difference file
smaller. MIDI has a lot of unused controller numbers, and of course one could
design a custom MIDI for the job. For example, portamento makes notes run
into each other like a musical saw or Hawaiian guitar, can also mimic human
voice note changes, and can be sent as part of MIDI. Bending notes as
guitarists like to do is also in MIDI.
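
For reference, a small sketch of how the pitch bend and portamento controls mentioned above look as raw MIDI 1.0 bytes. The helper names are hypothetical; the status bytes and controller numbers (5 for portamento time, 65 for the portamento switch) are from the standard MIDI specification.

```python
PORTAMENTO_TIME = 5      # controller number: glide time
PORTAMENTO_SWITCH = 65   # controller number: value 64 or above turns it on

def pitch_bend(channel, value):
    """Pitch bend message. value is 14-bit: 0..16383, with 8192 meaning no bend."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0..15")
    if not 0 <= value <= 16383:
        raise ValueError("value must be 0..16383")
    # Status byte 0xE0 plus channel, then low 7 bits, then high 7 bits.
    return bytes([0xE0 | channel, value & 0x7F, (value >> 7) & 0x7F])

def control_change(channel, controller, value):
    """Control change message: status 0xB0 plus channel, controller, value."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0..15")
    if not (0 <= controller <= 127 and 0 <= value <= 127):
        raise ValueError("controller and value must be 0..127")
    return bytes([0xB0 | channel, controller, value])

centre = pitch_bend(0, 8192)                           # no bend on channel 0
glide_on = control_change(0, PORTAMENTO_SWITCH, 127)   # portamento on
glide_time = control_change(0, PORTAMENTO_TIME, 40)    # illustrative glide time
```

Each message is three bytes, which is part of why the score side of the scheme is so small compared with the audio.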
