In vector drawing (for the graphic arts) you work at no fixed scale,
but then, when the drawing is finished, you can print it out
(or reduce/expand it for screen) at whatever scale is relevant.
Vector graphics has the promise of being deliverable at any scale. It
is end-format agnostic.
Ambisonics has the promise of being deliverable to any speaker array
... it is speaker array agnostic.
Yet when we encode ambisonically, we don't just choose the output
resolution (i.e. the order) _before_ encoding ... we even go so far
as to choose the relevant set of components in line with the target
array. This would be like doing an architecture drawing at the fixed
scale of 1:500 and only using black because your printer doesn't have
red ink and only works with A4. When someone else gets your drawing,
well, they may have a better printer ... but they ain't getting a
better printout. That's what we are doing with ambisonics.
In this light, it doesn't make much sense to have ambisonic
encoders/processors that work at specific orders/component combos.
It would be far easier and more sensible if _all_ ambisonic
encoders/processors worked at one fixed, higher order. Something
sufficiently high to satisfy large arrays. Let's say 3P (third order
periphonic) for argument's sake.
Even if your target array is 4 speakers in a square ... then you
encode at 3P (even the height info is there ... at 0m). Think of the
benefits:
- no ambiguous channel counts
- no choosing orders
- no choosing component combos
- no choosing whether you will go horizontal only or not
- no plugins that have to work by changing their input and output
channel counts
- no plugins that only work at a specific order
- only 1 set of plugins (all at the same order and component combo)
- no global settings
- no per-plugin config
The artist/producer does not need to know _ANYTHING_ about ambisonics.
They don't have to have any target array in mind. They just use the
software and position sounds as they want.
All software, plugins etc. have 16 in and 16 out ... you just wire
them up. A bit like stereo.
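
To make the "16 in and 16 out" figure concrete, here is a minimal sketch of the standard channel-count formulas: a full-sphere (periphonic) signal of order N has (N+1)^2 components, and a horizontal-only signal has 2N+1. The function names are hypothetical, chosen only for illustration.

```python
def periphonic_channels(order: int) -> int:
    """Component count of a full-sphere ambisonic signal of the given order."""
    return (order + 1) ** 2


def horizontal_channels(order: int) -> int:
    """Component count of a horizontal-only ambisonic signal of the given order."""
    return 2 * order + 1


# Compare the two layouts for orders 0 to 3.
for order in range(4):
    print(order, periphonic_channels(order), horizontal_channels(order))
```

With everything fixed at third order periphonic, every plugin sees the same 16 channels, so none of the per-order or horizontal-only variants in the table above would need to be chosen.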
Ok ... in the next email, I'll try and list all the negatives of such a scheme.
Etienne
- Mixing with 1st order recordings
I know we have already discussed and stated that you can't really
mix low orders into high orders. I didn't quite understand whether this was
a mathematical impossibility or just a technical difficulty. OK, I
know that you can't 'invent' info to fill in the higher orders ... but can't
you just "scale up" the low order stuff?
- Delivery format
I don't want to download a 16 channel file to play over my 4
speakers. There are 2 ways I can think of to cope with this: a) a 3P
file can be mixed down to 1P; b) a compression format might be capable
of compressing the 3P format sufficiently that the file is small
enough that I don't care. The question here is how much redundant
information there is in a higher order ambisonic signal.
Etienne
This is also still a mystery to me.
> - Delivery format
> I don't want to download a 16 channel file to play over my 4
> speakers. There are 2 ways I can think of to cope with this.
> a) A 3P file can be mixed down to 1P
Just discard the HOA components.
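
A minimal sketch of that truncation, assuming the channels are stored grouped by ascending order (as they are in both the Furse-Malham and ACN channel orderings), so the first (N+1)^2 channels are exactly the components up to order N. The function name and the placeholder signal are hypothetical.

```python
def truncate_order(channels, target_order):
    """Keep only the components up to target_order.

    Assumption: `channels` is a list of per-channel sample lists for a
    full-sphere signal, ordered so that lower orders come first.
    """
    keep = (target_order + 1) ** 2
    if keep > len(channels):
        raise ValueError("signal has fewer channels than the target order needs")
    return channels[:keep]


# Placeholder 3rd order signal: 16 channels of 8 silent samples each.
third_order = [[0.0] * 8 for _ in range(16)]
first_order = truncate_order(third_order, 1)
print(len(first_order))  # prints 4
```

The result stays in whatever normalisation convention the 16-channel signal used, so the first order decoder has to expect that same convention.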
> b) a compression format might be capable
> of compressing the 3P format sufficiently that the file is small
> enough that I don't care. The question here is how much redundant
> information there is in a higher order ambisonic signal.
I doubt that, for example, 3rd order Vorbis gives better results than
1st order FLAC. I'm more and more convinced that psychoacoustic lossy
compression should be avoided if possible.
On 09/06/2009 02:24, e deleflie wrote:
> Dave,
>
> thanks for your explanation. I think I get it.
>
> I'm thinking that you wouldn't just put 1st order info in the 1st
> order channels ... you'd have to extrapolate (extend?) that 1st-order
> info into the higher orders. But I guess the problem remains the same
> ... different decoding strategies will mean that the extrapolation
> into the higher orders will need to be different according to the
> decoding strategy taken at the end.
>
>
It is definitely difficult to "extrapolate (extend?) that 1st-order info into
the higher orders"! You could do blind source extraction of the
principal sources and re-pan them into a higher order format, which is a
bit like Ville's SIRR approach but using Ambisonic rather than VBAP
panning. But I'm no great fan of the kind of frequency domain processing
of audio that would be needed for this, due to the inevitable artefacts,
and prefer to avoid it if possible. Alternatively, if the Soundfield mic
was used just to capture room sound, rather than the direct sound (which
you would capture with close mics), you might be able to use the approach used
in Ambiophonics and synthesize the higher order components based on a
room model, but I am even less convinced that this would produce a
convincing simulacrum of a high quality, high order soundfield mic.
> the next obvious (and probably ignorant) thought is: is it not
> possible to define/restrict the decoding strategy to avoid this problem?
> Or will different orthogonal/horizontal decodes always want to use
> different strategies to optimise results?
>
>
Maybe - and there is one strategy which _can_ be made to work (but it is
messy), and that is to have a separate copy of the W of the 1st-order-only
components available within the stream. This can be ignored when
the decode is to a purely first order rig, but for higher order capable
rigs, the appropriate amount of this W can be added/subtracted from the
decode to correct for the error. The problem with this is that the
1st-order-only stuff may perceptually stand out as different because it will
have a smaller sweet spot than the higher order stuff. Perhaps not a
problem for one person listening, but for large area work, not so good.