
Transcoding - why so bad?


Nick Jeffery

Mar 11, 2003, 9:09:09 PM

Now then, I've always just accepted on face value that transcoding is
bad. But *why*?

You apply a psychoacoustic model to some audio, it gets rid of the
bits it can safely discard, and you continue with your encoding.

Then you extract it, and wish to subsequently encode it again.

You apply the same psychoacoustic model to the audio -- but the bits it
could throw away are already gone, so it should continue with the
encoding and come out with an identical result.


What've I missed?

--
Nick Jeffery.

Aztech

Mar 11, 2003, 9:36:37 PM

"Nick Jeffery" <n.s.j...@durham.ac.uk> wrote in message
news:b4m4o3$80o$2...@sirius.dur.ac.uk...

Print this message, photocopy said message, photocopy the photocopy... etc, even
try it on a digital copier ;)

You assume that everything is equal, whilst in reality there are different
encoders, psychoacoustic models, parameters (e.g. bitrate), codecs and frame
boundaries, any combination of which can mismatch very easily.

Each cascade magnifies the shortcomings of the last one: if you're throwing
things away (over 10:1), then approximating it back and doing the same over
again, it causes problems. Try encoding something toolame > lame > lame, and
throw Xing into the mix if you're really sadistic.

Az.

Nick Jeffery

Mar 11, 2003, 9:51:27 PM

What about if you were doing:

lame 192 -> lame 192 -> lame 192 -> lame 192 -> lame 192 using the same
psymodel, etc?
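
If you want to actually run that chain, here's a rough sketch in Python.
It's just a convenience wrapper, not anything clever: it assumes the lame
command-line encoder is installed and on your PATH, and that "-b 192" and
"--decode" behave as in the LAME builds I've used; the file names are
placeholders.

# Sketch: push a WAV through five generations of 192 kbit/s MP3 encoding.
# Each pass decodes back to WAV so the next encode sees audio, not an MP3.
import subprocess

src = "input.wav"                      # placeholder: any PCM WAV file
for generation in range(1, 6):
    mp3 = f"gen{generation}.mp3"
    wav = f"gen{generation}.wav"
    subprocess.run(["lame", "-b", "192", src, mp3], check=True)
    subprocess.run(["lame", "--decode", mp3, wav], check=True)
    src = wav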

Does Xing do really mental stuff over 15kHz still? I've not used it
since 1999...

--
Nick Jeffery.

Mikeapollo

Mar 11, 2003, 10:06:01 PM

"Nick Jeffery" <n.s.j...@durham.ac.uk> wrote in message
news:b4m77d$8mc$1...@sirius.dur.ac.uk...

The result may surprise you... It will sound extremely "harsh" - the techies
will explain why ;-)

Think of it like the difference between cloning a datasource and copying a
datasource.

You can easily clone a clone - nothing happens, but when you transcode
something you are effectively making a copy of it... Hence weaknesses in the
original "copy" will be highlighted, and these will be highlighted again and
again the more generations you create...

Mikeapollo

Nick Granger-Brown

Mar 12, 2003, 3:24:48 AM

--> Try this - the semi-techy answer. (BTW, I have no knowledge of the
compression used; this is a general description.)

There are three broad categories of transformation. I will use the
analogy of a journey to describe them.

One to one - you move from one place to another and can move back to
exactly where you came from. If you wander away from where you land,
moving back can take you to a totally different place from where you
started. The transformation is not bad in itself, but adding noise is a
big problem.

One to many - you move from one place to a general area. Knowing
approximately where you moved to is all you need to know exactly where
you came from. This is really useful for error correction, but as every
point you can start from needs to become a whole area in the place you
are going to, you have to move somewhere a whole lot bigger than where
you came from for this to work. I.e. this is expansion, not compression.

Many to one - you move from a general area to a specific place. Knowing
where you are now only tells you approximately where you came from; you
cannot get back there with certainty. This is compression, the kind of
processing used to get more music into a digital channel. In the
process you lose information and the process is irreversible.

It may be possible to define a compression method mathematically which is
many-to-one the first time you use it and then one-to-one when applied
subsequently. That would allow you to apply the coding over and over
again without gross distortion. This is real life though, and those
mathematical toys rely on infinite bandwidth, infinite sampling rates
and zero noise. It does not work in practice.
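
As a toy numeric illustration of that "many-to-one the first time,
one-to-one afterwards" idea, a bare uniform quantiser (Python with numpy,
not any real codec) behaves exactly that way - and it only stays that way
as long as nothing touches the values between passes:

import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 10_000)       # stand-in for some audio samples

step = 1.0 / 64                          # toy quantiser step size
q = lambda v: np.round(v / step) * step

y1 = q(x)                                # first pass: many-to-one, information is lost
y2 = q(y1)                               # second pass on untouched data: changes nothing
print(np.array_equal(y1, y2))            # True

# But disturb the values between passes (filtering, band/block overlap...)
# and the next pass has to quantise fresh values, adding fresh error:
mixed = 0.5 * (y1 + np.roll(y1, 1))      # crude stand-in for leakage between samples
print(np.max(np.abs(q(mixed) - mixed)))  # new quantisation error, up to step/2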

The reality of any real-world compression is that however it is defined
*only* to affect the non-audible parts of the signal, there is always
some change to the audible part. The assumption of the designers is
that it is applied once and the signal going in is perfect. Applying a
compression many times erodes the signal, and the changes made outside
the audible range move in and become more evident each time the transform
is applied.

There *is* lossless compression in the digital world, but that is
completely different from the kind of transformation used to make music
fit the ear.

--> Or this - a cheesy analogy
I often put the cheese back in the fridge with the end unwrapped and it
dries up. My wife hates this as the end becomes completely inedible
after a couple of days.
Now suppose I visit the fridge every evening, take the cheese out and
think - nobody is going to eat the rind - so I cut it off very thinly
and put the cheese back. The cheese is never going to have that inedible
hard end, but after a few days it is going to be noticeably smaller,
however thinly I take the cut, and eventually all I will have to show for
the cheese are the pieces of rind I threw away.

David Robinson

Mar 12, 2003, 4:53:05 AM

Nick Jeffery <n.s.j...@durham.ac.uk> wrote in message news:<b4m4o3$80o$2...@sirius.dur.ac.uk>...

What you've missed is a real explanation of psychoacoustic-based coding
- what you quote is the commonly accepted "noddy" version, which has
little to do with reality.


It's true that mp2 encoders will sometimes dump entire frequency bands
(usually high ones), and many other encoders will apply a low pass
filter, which effectively removes several frequency bands. But apart
from these two exceptions, audio codecs DO NOT REMOVE anything.

In mp2 you have a bank of 32 linearly spaced band-pass filters. We
look at the output of these filters in time-domain blocks.

The output of each filter is decimated. This means that only every nth
sample is kept. In digital sampling, Nyquist theory says that you need
a sampling rate greater than twice the bandwidth of the signal.
Nyquist's limit is usually quoted as "twice the highest frequency you
wish to store" - but this is only true if the lowest frequency is near
DC - in general, twice the bandwidth is correct. So, with 32 filters,
the bandwidth at the output of each is vastly reduced, and we can dump
most of the samples at the output.

This hasn't gained us anything yet, because, even in a perfect system,
you still need as many samples as you started with to represent the
audio - just in 32 bands instead of 1.
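
If it helps, here's a rough sketch of that bookkeeping in Python with
numpy/scipy. It uses naive FIR band-pass filters - nothing like the real
MPEG polyphase filterbank - purely to show that 32 bands, each decimated
by 32, leave you with exactly as many samples as you started with:

import numpy as np
from scipy.signal import firwin, lfilter

n = 32 * 1024                            # a multiple of 32 keeps the counting tidy
rng = np.random.default_rng(0)
x = rng.standard_normal(n)               # stand-in for a stretch of audio samples

bands = []
for k in range(32):
    # Band edges, normalised so that 1.0 is the Nyquist frequency (firwin's default).
    lo, hi = k / 32, (k + 1) / 32
    if k == 0:
        h = firwin(255, hi)                           # lowest band: plain low-pass
    elif k == 31:
        h = firwin(255, lo, pass_zero=False)          # highest band: high-pass
    else:
        h = firwin(255, [lo, hi], pass_zero=False)    # all other bands: band-pass
    sub = lfilter(h, 1.0, x)
    # Each band holds ~1/32 of the bandwidth, so (per Nyquist) 1/32 of the
    # sample rate is enough: keep every 32nd sample.
    bands.append(sub[::32])

print(sum(len(b) for b in bands), len(x))   # 32 * (n/32) == n: nothing saved yet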

Where the "lossy" part comes in is in what we do with those samples.
The importance of each band is calculated by the psychoacoustic model,
and then the bit allocation algorithm determines how many bits are to
be allocated to each band. So, each sample in a given band will be
re-quantized down to (say) 2 bits, or 4 bits, or whatever, depending on
how audible that band is predicted to be, and hence how much noise can
be allowed. Fewer bits = more noise. A scale factor is applied to each
band, so that the amplitude of the samples is scaled before
quantization to use all the bits allocated. This scale factor is sent
with the quantised bits.
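
A minimal numeric sketch of that scale-and-requantise step (Python with
numpy; a toy, not the actual layer II quantiser - the function names and
bit counts are made up for illustration):

import numpy as np

def encode_band(samples, bits):
    # Scale the band so it uses the full range of the allocated bits,
    # then round to integers; the scalefactor travels with the bits.
    scale = float(np.max(np.abs(samples))) or 1.0
    levels = 2 ** (bits - 1) - 1
    return np.round(samples / scale * levels).astype(int), scale, levels

def decode_band(q, scale, levels):
    return q / levels * scale               # un-scale back to sample values

rng = np.random.default_rng(0)
band = 0.3 * rng.standard_normal(1024)      # pretend this is one subband's samples

for bits in (2, 4, 8):                      # what the bit allocation might hand this band
    q, scale, levels = encode_band(band, bits)
    noise = decode_band(q, scale, levels) - band
    snr = 10 * np.log10(np.mean(band ** 2) / np.mean(noise ** 2))
    print(f"{bits} bits -> SNR {snr:5.1f} dB")   # fewer bits = more noise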

You multiplex the bits and scalefactors, and send them on their way.
The decoder reverses the job, unscaling, padding, filtering, and
finally adding all 32 bands together, giving you back a complete audio
signal.

BUT in that audio signal, there's lots of quantisation noise;
different amounts in each of the 32 bands. It should be inaudible, but
it's still there. If you send this through an audio codec again (even
the same codec), you have two problems:
1. You add more noise. Even without any clever psychoacoustics,
conceptually, if you add some (inaudible) noise to a signal, and then
add more and more noise (via repeated transcoding), there will come a
point where this noise becomes audible.
2. The quantisation noise doesn't fall back into the frequency bands
and time-domain blocks where it came from. The filter bands overlap,
so some of the noise from each band will break through into the higher
and lower frequency bands. The time-domain blocks also overlap, and
there's no mechanism to make sure that they will line up in the same
way when re-encoded. (A toy sketch of both effects follows below.)
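
Here's that toy sketch (Python with numpy): a bare quantiser plus a crude
smoothing step standing in for band/block leakage, nothing like a real
MP2/MP3 encoder. With nothing moving between passes, only the first
generation adds noise; with the leakage, every generation adds fresh noise:

import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 50_000)

step = 1.0 / 128                                   # toy stand-in for the bit allocation
quantise = lambda v: np.round(v / step) * step

def cascade(signal, leakage):
    for gen in range(1, 6):
        if leakage:
            # Crude stand-in for band/block overlap: a little of each neighbour
            # leaks in, knocking samples off the previous generation's grid.
            signal = 0.98 * signal + 0.01 * np.roll(signal, 1) + 0.01 * np.roll(signal, -1)
        fresh = quantise(signal) - signal          # noise added by THIS generation
        signal = quantise(signal)
        print(f"  gen {gen}: fresh noise RMS = {np.sqrt(np.mean(fresh ** 2)):.6f}")

print("ideal cascade (nothing moves between passes):")
cascade(x, leakage=False)
print("with leakage between passes:")
cascade(x, leakage=True)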

In the second encoding cycle, you might expect that, after the filter
bank, you would find that all the samples were neatly quantised -
since they were at this point in the first encoding cycle, and in a
nice digital system, the mathematics should simply get us back to the
same place. If this worked, we wouldn't have to re-quantise - we could
just use the quantisation from the first encode, and avoid adding any
more noise.

But the overlap of bands and blocks means that they have lost this
neat quantisation, and they'll have to be quantised again. Which adds
more noise.

Also, the psychoacoustic model in the encoder assumes that the input is
clean - if it's running out of bits, it will add noise right up to the
limit where it believes it will be inaudible. However, the input
already includes noise. It's adding noise to noise. Worse still, it's
adding noise to noise which has leaked through from other frequency
bands and time domain blocks. So not only does the noise add up, but
it spreads away from the signal which was supposedly hiding it in the
first place.

I hope this helps - I think it's a good explanation (it's also fairly
accurate, and very close to what actually happens!), but every time I
give it, people tell me it's too complicated. Well, reality often is!

Cheers,
David.

DAB sounds worse than FM, in the UK

Mar 12, 2003, 5:39:45 AM


Good explanation David. Einstein said something like "things should be
explained in as simple a manner as it is possible to do so, but no simpler".
I've been accused of showing off when I explain things to do with digital
transmission, but there's a limit to how simply you can explain things that
are pretty complex in the first place.


--
DAB sounds worse than FM - So the BBC *needs* a 2nd DAB multiplex

Radios 1-4 now on Freeview at 192kbps, but BBC DAB is a national disgrace
www.digitalradiotech.co.uk -- Subscribe for free to the Digital Radio
Listeners' Group Newsletter and join the campaign to get the BBC a 2nd DAB
multiplex


jesseg

Mar 12, 2003, 8:00:37 AM

hah yeah Xing... Xing encodes your mp3 twice in one shot!!! now
you're talkin... hehehjejeje0j0j0j0


"Aztech" <a...@tech.com> wrote in message
news:V2xba.785$3R1.8...@news-text.cableinet.net...

jesseg

Mar 12, 2003, 8:03:39 AM

> But apart from these two exceptions, audio codecs DO NOT REMOVE anything.


Wow... the guys at Hydrogenaudio would have a real heyday with your post,
Dave. That's the worst and most untrue blanket statement I've heard in a
while... heee Thanks for a laugh =)


ff123

Mar 13, 2003, 4:06:13 AM

I can't tell if you're funning with David or not, so just to keep the
record straight...

2Bdecided (aka David Robinson) is one of the most respected members of
hydrogenaudio.org (at least IMO).

Mark Taylor (erstwhile lead lame developer) has complained in the past
about how the mp3 encoding process is most often characterized as
removing information when, from an algorithmic point of view, it could
more accurately be said to be adding (inaudible) noise.

ff123

Gabriel Bouvigne

Mar 13, 2003, 9:49:05 AM

"jesseg" <so...@thing.com> a écrit dans le message news:
LeGba.60444$qi4.39166@rwcrnsc54...

David is right. Except for the (optional) bandpass filtering, the codec
does not remove signal. It reduces the signal quality by introducing
quantization noise, so you can use fewer bits.

Regards,


----
Gabriel Bouvigne
www.mp3-tech.org
personal page: http://gabriel.mp3-tech.org


jesseg

Mar 20, 2003, 11:35:19 PM

Not to be anal or anything... but I believe "bits" falls into the category
of "anything"... and so does noise, however inaudible.

It's just a really bad blanket statement to make about ALL codecs, see what
I'm saying??? I don't think he said "mp3", I think he said "all codecs"


nuff said

"Gabriel Bouvigne" <gbou...@netquartz.com> wrote in message
news:3e709a97$0$29962$4d4e...@read.news.fr.uu.net...

David Robinson

Mar 21, 2003, 10:31:54 AM

"jesseg" <so...@thing.com> wrote in message news:<bEwea.187810$qi4.83818@rwcrnsc54>...

> Not to be anal or anything... but I believe "bits" falls into the category
> of "anything"... and so does noise, however inaudible.
>
> It's just a really bad blanket statement to make about ALL codecs, see what
> I'm saying??? I don't think he said "mp3", I think he said "all codecs"

Now you're just being picky Jesseg!

However, a codec doesn't remove "bits". An audio "codec" is, by
definition, an encoder and a decoder. In between the two, we have the
advantage of requiring fewer bits to represent the audio signal.
However, we should have exactly the same number of bits at the output
of the decoder as we had at the input of the encoder - unless the
codec shortens the file, or downsamples it. In reality, many codecs
add a little silence to the start and end of files, so you actually
get more bits out than you put in.

;-)


"...Audio codecs don't remove anything..." - as you say, noise can
count as "anything"; but codecs (by definition) add it, rather than
remove it.

I suppose you could say that in some small way, an audio codec may
change some electrical energy into heat which we can no longer harness
using current technology - and hence it has removed the possibility of
using that electrical energy to do something else. But to use this as
an argument against the statement "audio codecs don't remove anything"
is a little weak!

Cheers,
David.
