When to set the resolution.

e deleflie

Jun 6, 2009, 7:27:06 AM
to ambis...@googlegroups.com
In computer-drawn architectural and engineering drawings, you don't
draw at the scale of 1:100 ... you draw at 1:1. Then once your drawing
is finished, you print it out (or "deliver" it) at the relevant scale
... 1:100, 1:20 or whatever.

In vector drawings (for the graphic arts) you work at no scale, but
then, when the drawing is finished, you can print it out (or
reduce/expand it for screen) at whatever scale is relevant.

Vector graphics has the promise of being deliverable to any scale. It
is end-format agnostic.

Ambisonics has the promise of being deliverable to any speaker array
... it is speaker array agnostic.

Yet when we encode ambisonically, we don't just choose the output
resolution (i.e. the order) _before_ encoding ... we even go so far
as to choose the relevant set of components in line with the target
array. This would be like doing an architecture drawing at the fixed
scale of 1:500 and only using black because your printer doesn't have
red ink and only works with A4. When someone else gets your drawing,
well, they may have a better printer ... but they ain't getting a
better printout. That's what we are doing with ambisonics.

In this light, it doesn't make much sense to have ambisonic
encoders/processors that work at specific orders/component combos.

It would be far easier and more sensible if _all_ ambisonic
encoders/processors worked at a fixed higher order. Something
sufficiently high to satisfy large arrays. Let's say 3P for argument's
sake.

Even if your target array is 4 speakers in a square ... then you
encode at 3P (even height info is there .... at 0m). Think of the
benefits:

- no ambiguous channel counts
- no choosing orders
- no choosing component combos
- no choosing whether you will go horizontal only or not
- no plugins that have to work by changing their input and output
channel counts
- no plugins that only work at a specific order
- only 1 set of plugins (all at the same order and component combo)
- no global settings
- no per-plugin config

The artist/producer does not need to know _ANYTHING_ about ambisonics.
They don't have to have any target array in mind. They just use the
software and position sounds as they want.

All software / plugins etc, all have 16 in and 16 out... you just wire
them up. A bit like stereo.
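
As a rough check on the "16 in and 16 out" figure, the sketch below counts components per order. It assumes that "3P" means third-order periphonic (full sphere) and uses the usual counts of (N+1)^2 components for a full-sphere set of order N and 2N+1 for a horizontal-only set. The function names are made up for illustration.

```python
def periphonic_channels(order: int) -> int:
    """Component count for a full-sphere (periphonic) set of the given order."""
    return (order + 1) ** 2


def horizontal_channels(order: int) -> int:
    """Component count for a horizontal-only set of the given order."""
    return 2 * order + 1


# Print the channel counts a plugin would need at each order.
for order in range(1, 4):
    print(order, periphonic_channels(order), horizontal_channels(order))
```

Under that reading, fixing everything at third-order periphonic means every plugin carries 16 channels, whatever the target array turns out to be.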

Ok ... in the next email, I'll try and list all the negatives of such a scheme.

Etienne

e deleflie

Jun 6, 2009, 7:32:25 AM
to ambis...@googlegroups.com
The challenges of this scheme (that I can think of) are:

- Mixing with 1st order recordings
I know we have already discussed and stated that you can't really
mix low orders into high orders. I didn't quite understand if this was
a mathematical impossibility or just a technical difficulty. Ok, I
know that you can't 'invent' info to fill in higher orders ... but can't
you just "scale up" low order stuff?

- Delivery format
I don't want to download a 16 channel file to play over my 4
speakers. There are 2 ways I can think of to cope with this: a) a 3P
file can be mixed down to 1P; b) a compression format might be capable
of compressing the 3P format sufficiently that the file is small
enough that I don't care. The question here is how much redundant
information there is in a higher order ambisonic signal.
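
To put a number on the download concern, here is a back-of-envelope comparison of uncompressed PCM data rates. The 48 kHz sample rate and 24-bit depth are assumptions chosen for illustration, not figures from this thread, and the function name is hypothetical.

```python
def pcm_rate_kbps(channels: int, sample_rate_hz: int = 48_000,
                  bits_per_sample: int = 24) -> float:
    """Uncompressed PCM data rate in kbit/s for a multichannel stream."""
    return channels * sample_rate_hz * bits_per_sample / 1000


first_order = pcm_rate_kbps(4)    # 4 components (1P)
third_order = pcm_rate_kbps(16)   # 16 components (3P)

print(f"1P: {first_order:.0f} kbit/s")
print(f"3P: {third_order:.0f} kbit/s")
print(f"ratio: {third_order / first_order:.1f}x")
```

Before any compression the 16 channel file is simply four times the size of the 4 channel one, so a compression scheme would need to recover roughly that factor for the file size not to matter.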

Etienne

Oliver Thuns

Jun 6, 2009, 8:51:21 AM
to ambis...@googlegroups.com
On Sat, Jun 6, 2009 at 1:32 PM, e deleflie<edel...@gmail.com> wrote:
>
> The challenges of this scheme (that I can think of) are:
>
> - Mixing with 1st order recordings
>    I know we have already discussed and stated that you can't really
> mix low orders into high orders. I didn't quite understand if this was
> a mathematical impossibility or just a technical difficulty. Ok, I
> know that you can't 'invent' info to fill in higher orders ... but can't
> you just "scale up" low order stuff?

This is also still a mystery to me.

> - Delivery format
>    I don't want to download a 16 channel file to play over my 4
> speakers. There are 2 ways I can think of to cope with this.
> a) A 3P file can be mixed down to 1P

Just discard the HOA components.
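
A minimal sketch of "just discard the HOA components", for one multichannel sample frame. It assumes the channel ordering puts W and the three first-order components in the first four slots, which holds for both Furse-Malham and ACN ordering; the function name is hypothetical.

```python
def truncate_to_first_order(frame):
    """Keep only the order-0 and order-1 components of one sample frame.

    Assumes the first four channels are W plus the three first-order
    components (true for Furse-Malham and ACN channel orderings).
    """
    if len(frame) < 4:
        raise ValueError("need at least 4 channels for a first-order stream")
    return list(frame[:4])


# A dummy 16-channel (3P) frame: the values are just channel indices.
third_order_frame = list(range(16))
first_order_frame = truncate_to_first_order(third_order_frame)
print(first_order_frame)
```

The truncated stream then needs a decoder designed for first order; the higher-order components play no part in it.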

> b) a compression format might be capable
> of compressing the 3P format sufficiently that the file is small
> enough that I don't care. The question here is how much redundant
> information there is in a higher order ambisonic signal.

I doubt that for example 3rd order Vorbis gives better results than
1st order FLAC. I'm more and more convinced that psychoacoustic lossy
compression should be avoided, if possible.

Dave Malham

Jun 8, 2009, 4:30:46 AM
to ambis...@googlegroups.com


On 06/06/2009 12:32, e deleflie wrote:
> The challenges of this scheme (that I can think of) are:
>
> - Mixing with 1st order recordings
> I know we have already discussed and stated that you can't really
> mix low orders into high orders. I didn't quite understand if this was
> a mathematical impossibility or just a technical difficulty. Ok, I
> know that you can't 'invent' info to fill in higher orders ... but can't
> you just "scale up" low order stuff?
>
>

The problem is that whilst encoding is orthogonal, that is, you can
continue adding successively higher orders of components to your basic
stream without these having any effect on the pre-existing lower order
components, decoding is not necessarily orthogonal. There is a very
useful table in Jerome's thesis (table 3.10 on page 184) which
illustrates this. From this we can see that, taking a max rE 2D decode
for instance, the first order components decode at 0.707 (note this does
not include layout specific stuff) when included in a first order only
stream but at 0.861 when part of a third order stream. So, if you have a
third order stream in which some (but not all) of the sound sources are
first order only and you choose the optimum decode equations at third
order level because your speaker array can handle it, then those sources
that are first order only will be decoded with an error of (0.861 -
0.707) / 0.707, or around 22%. Of course, if the rig can only handle
first order, there's no problem since the decode will need to be correct
for first order and the higher order components (of those sources that
have them) will just be discarded.

Afaics, the only solution for this which, at the same time, retains
array agnosticism is to segregate the sources that do not have the
higher order components into a separate stream, so you would have, say,
one set of four signals from a first order Soundfield mic and another
set of 16 from a third order panner combined into a stream of 20
components. This is probably handleable if you have only two variants
(first order Soundfield mic and third order panned) but any more than
that (say first order Soundfield mic, third order mic and 5th order
NFC/HOA panned) and it becomes a positive minefield of potential
screw-ups, even with good metadata. In a non-metadata system, this
really doesn't bear thinking about, except inside a controlled
research-type environment. Unfortunately, the only way to solve this
properly is to develop studio quality higher order microphones.
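
The decode mismatch described above can be worked through directly. The sketch below only evaluates the expression from the two gains quoted in this message (0.707 and 0.861); it does not re-derive them from the thesis table, and the variable names are illustrative.

```python
# First-order component gains as quoted in the message above.
g1_in_first_order_decode = 0.707
g1_in_third_order_decode = 0.861

# Relative error for a first-order-only source decoded with the
# third-order equations: (0.861 - 0.707) / 0.707.
relative_error = (
    (g1_in_third_order_decode - g1_in_first_order_decode)
    / g1_in_first_order_decode
)

print(f"relative error: {relative_error:.1%}")
```

With these two inputs the expression comes out at a little under 22%.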

Best regards,
Dave

--
These are my own views and may or may not be shared by my employer
/*********************************************************************/
/* Dave Malham http://music.york.ac.uk/staff/research/dave_malham/ */
/* Music Research Centre */
/* Department of Music "http://music.york.ac.uk/" */
/* The University of York Phone 01904 432448 */
/* Heslington Fax 01904 432450 */
/* York YO10 5DD */
/* UK 'Ambisonics - Component Imaging for Audio' */
/* "http://www.york.ac.uk/inst/mustech/3d_audio/" */
/*********************************************************************/

e deleflie

Jun 8, 2009, 9:24:18 PM
to ambis...@googlegroups.com
Dave,

thanks for your explanation. I think I get it.

I'm thinking that you wouldn't just put 1st order info in the 1st
order channels ... you'd have to extrapolate (extend?) that 1st-order
info into the higher orders. But I guess the problem remains the same
... different decoding strategies will mean that the extrapolation
into the higher orders will need to be different according to the
decoding strategy taken at the end.

The next obvious (and probably ignorant) thought is: is it not
possible to define/restrict the decoding strategy to avoid this problem?
Or will different orthogonal/horizontal decodes always want to use
different strategies to optimise results?

Etienne

Dave Malham

Jun 9, 2009, 4:21:44 AM
to ambis...@googlegroups.com

On 09/06/2009 02:24, e deleflie wrote:
> Dave,
>
> thanks for your explanation. I think I get it.
>
> I'm thinking that you wouldn't just put 1st order info in the 1st
> order channels ... you'd have to extrapolate (extend?) that 1st-order
> info into the higher orders. But I guess the problem remains the same
> ... different decoding strategies will mean that the extrapolation
> into the higher orders will need to be different according to the
> decoding strategy taken at the end.
>
>

Definitely difficult to "extrapolate (extend?) that 1st-order info into
the higher orders"! You could do blind source extraction of the
principal sources and re-pan them into higher order format, which is a
bit like Ville's SIRR approach but using Ambisonic rather than VBAP
panning. However, I'm no great fan of the kind of frequency domain
processing of audio that would be needed for this, due to the inevitable
artefacts, and prefer to avoid it if possible. Alternatively, if the
Soundfield mic was used just to capture room sound, rather than the
direct sound (which you would do with close mics), you might be able to
use the approach used in Ambiophonics and synthesize the higher order
components based on a room model, but I am even less convinced that this
would produce a convincing simulacrum of a high quality, high order
soundfield mic.


> The next obvious (and probably ignorant) thought is: is it not
> possible to define/restrict the decoding strategy to avoid this problem?
> Or will different orthogonal/horizontal decodes always want to use
> different strategies to optimise results?
>
>

Maybe - and there is one strategy which _can_ be made to work (but it is
messy) and that is to have a separate copy of the W of the 1st order
only components available within the stream. This can be ignored when
the decode is to a purely first order rig, but for higher order capable
rigs, the appropriate amount of this W can be added/subtracted from the
decode to correct for the error. The problem with this is that the 1st
order only stuff may perceptually stand out as different because it will
have a smaller sweet spot than the higher order stuff. Perhaps not a
problem for one person listening, but for large area work, not so good.
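
One possible reading of the separate-W correction, as a sketch only. It assumes the W gain is normalised to 1 in both decodes and reuses the 0.707 and 0.861 first-order gains quoted earlier in the thread; the function names are hypothetical and this is not a tested decoder design.

```python
G1_FIRST_ORDER_DECODE = 0.707   # first-order gain in a first-order decode
G1_THIRD_ORDER_DECODE = 0.861   # first-order gain in a third-order decode


def w_correction_gain(g1_low: float = G1_FIRST_ORDER_DECODE,
                      g1_high: float = G1_THIRD_ORDER_DECODE) -> float:
    """Extra amount of the separate W copy to add in the higher-order decode.

    Chosen so that, for the first-order-only sources, the ratio of
    directional gain to total W gain matches the first-order decode.
    """
    return g1_high / g1_low - 1.0


def corrected_ratio(g1_low: float = G1_FIRST_ORDER_DECODE,
                    g1_high: float = G1_THIRD_ORDER_DECODE) -> float:
    """Directional-to-W gain ratio seen by first-order sources after correction."""
    k = w_correction_gain(g1_low, g1_high)
    return g1_high / (1.0 + k)


print(f"extra W gain: {w_correction_gain():.3f}")
print(f"ratio after correction: {corrected_ratio():.3f}")
```

Under these assumptions the correction restores the first-order balance between W and the directional components, at the cost of the first-order material coming out somewhat louder overall. It does nothing about the smaller sweet spot mentioned above.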

e deleflie

Jun 9, 2009, 7:45:17 PM
to ambis...@googlegroups.com
Dave,

I agree that attempting to extract the locational info from 1st order
would be the wrong approach. It would degrade the information.

> Maybe - and there is one strategy which _can_ be made to work (but it is
> messy) and that is to have a separate copy of the W of the 1st order
> only components available within the stream.

My thoughts now are, perhaps a higher order ambisonics system should
simply _not_ be compatible with first order. But that means that we
are excluding recordings ... which is excluding half the good stuff.

Given that 1st order can not easily be encoded into higher orders,
then perhaps it is valid to see it as 2 things .... 1st order, and
higher orders. That kinda makes sense.

Perhaps it is not that insane to have a 20 channel file .... 16 of
higher orders, and 4 of first order. hmmm. I think that really it all
depends on how a compression algorithm could cope with that.

That is a feeling that I am getting .... that perhaps the solution to
the mixed order thing exists in an ambisonics-centric compression
algorithm ....

A compression strategy specifically designed to reduce data rates for
ambisonics signals .... if this existed, it might have the ability to
significantly simplify higher order ambisonics, because it would
effectively offer an alternative way of reducing data rate ... thereby
removing the complexity of mixed orders, whose aim is also to reduce
data rates. (Of course, I'm assuming that channel counts are not
so much of an issue these days ...)

Etienne