(I have also posted this to the Ambisonics Association reflector)
For the Ambisonics Symposium, I have now finalized and submitted my
paper on mixed-order schemes. I have also tightened up its conclusions
and recommendations, and adjusted its abstract to match. The adjusted
abstract is viewable at
<http://ambisonics.iem.at/symposium2009/authors/a-new-mixed-order-scheme-for-ambisonic-signals>.
Here's a copy:
>>Traditionally the directional resolution of a 3D Ambisonic signal is
>>uniform over the sphere. It is determined by a single scaling
>>parameter, the periphonic order P. Recently there has been increasing
>>interest in mixed-order schemes that provide higher resolution in the
>>horizontal plane than at the poles. The most widely known is a
>>two-parameter scheme (#H#P) in which the signal is the union of a
>>higher-order horizontal-only component set and a lower-order
>>fully-periphonic component set. We present an alternative
>>two-parameter scheme (#H#V) which truncates the spherical harmonic
>>expansion in a different way. It gives resolution-versus-elevation
>>curves that are flatter, in and near the horizontal plane. The paper
>>includes simulation results for various mixed-order signals and
>>speaker layouts. On the basis of these results the author recommends
>>deprecating #H#P signals with P greater than 1.
Note the final sentence in particular.
If that recommendation is adopted, we will have the following
categories of Ambisonic signal:
- Fully periphonic: (P+1)^2 components
- Mixed order: (H+1)^2 - (H-V)^2 components
- Horizontal plus Z: 2H+2 components
- Horizontal only: 2H+1 components
(Plus of course a "none-of-the-above" category, for custom combinations
of components and matrixed components, e.g. as supported by the
Ambisonics portion of MPEG-4 Part 11.)
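
As a quick check on the component counts in the list above, here is a minimal Python sketch. The function names are hypothetical, chosen only for illustration; the formulas are the ones given in the list.

```python
def fully_periphonic(p: int) -> int:
    """Component count for a fully periphonic signal of order P."""
    return (p + 1) ** 2


def mixed_order(h: int, v: int) -> int:
    """Component count for a #H#V mixed-order signal (assumes 0 <= V <= H)."""
    if not 0 <= v <= h:
        raise ValueError("expected 0 <= V <= H")
    return (h + 1) ** 2 - (h - v) ** 2


def horizontal_plus_z(h: int) -> int:
    """Component count for a horizontal-plus-Z signal of order H."""
    return 2 * h + 2


def horizontal_only(h: int) -> int:
    """Component count for a horizontal-only signal of order H."""
    return 2 * h + 1


print(mixed_order(2, 1))       # 2H1V -> 8
print(mixed_order(4, 1))       # 4H1V -> 16
print(fully_periphonic(4))     # 4P   -> 25
```

Note that mixed_order(H, H) equals fully_periphonic(H) and mixed_order(H, 0) equals horizontal_only(H), so the mixed-order formula already spans the two extremes.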
For an interface spec a three-parameter approach as discussed here
previously would still be workable, in the sense that it would be more
general than the above list. But other approaches might now be more
appealing. For example, one could instead have two parameters and a
switch. The two parameters could be H and V, with V=H covering
fully-periphonic and V=0 covering the horizontal-only and
horizontal-plus-Z cases. The switch would distinguish between
horizontal-only and horizontal-plus-Z. In the terms of the
three-parameter approach, the switch could be regarded as a P-minus-V
field, constrained to take a value of zero or one.
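
The "two parameters and a switch" idea might be sketched as follows. This is only an illustration: the class name AmbiFormat, its field names, and the rule that the switch is valid only when V=0 are my assumptions, not part of any agreed interface spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AmbiFormat:
    """Hypothetical two-parameters-plus-switch signal description."""
    h: int                 # horizontal order H
    v: int                 # vertical order V, with 0 <= V <= H
    plus_z: bool = False   # the switch: adds Z to a horizontal-only signal

    def __post_init__(self):
        if not 0 <= self.v <= self.h:
            raise ValueError("expected 0 <= V <= H")
        if self.plus_z and self.v != 0:
            raise ValueError("assumed here: the switch only applies when V = 0")

    @property
    def p_minus_v(self) -> int:
        """The switch viewed as a P-minus-V field, limited to 0 or 1."""
        return int(self.plus_z)

    def category(self) -> str:
        if self.plus_z:
            return "horizontal plus Z"
        if self.v == self.h:
            return "fully periphonic"
        if self.v == 0:
            return "horizontal only"
        return "mixed order"

    def components(self) -> int:
        if self.plus_z:
            return 2 * self.h + 2
        return (self.h + 1) ** 2 - (self.h - self.v) ** 2


print(AmbiFormat(3, 0, plus_z=True).category())   # horizontal plus Z
print(AmbiFormat(3, 0, plus_z=True).components()) # 8
```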
Discussion of this can wait until the Symposium. But I thought a
"heads-up" might be appropriate now.
Chris Travis
first thoughts:
Do I understand it correctly, that the HV scheme is the subset of the
HVP scheme where P=V?
I think the simplification from HVP to HV is a good thing; fewer
options are better.
I believe HV makes a good (non-psychoacoustic) lossy compression
scheme (= consumer format), but I cannot imagine it as a production
standard.
>[OT] Do I understand it correctly, that the HV scheme is the subset of
>the HVP scheme where P=V?
Yes.
>[OT] I believe HV makes a good (non-psychoacoustic) lossy compression
>scheme (= consumer format), but I cannot imagine it as a production
>standard.
I don't think Ambisonics has been served well by the idea that the
ultimate associated speaker layout is spherical. Dedicating equal
resource to polar sounds as to horizontal-plane sounds, e.g. seeking to
render them with the same directional fidelity, is in fact a bad
move. This is true from multiple different perspectives, including
those of live-music recordists, content constructors/composers,
psychoacousticians and end-users. [It is not true from the point of
view of "big-project academics" or niche architects, but they can look
after themselves in other ways.]
It would be good to wean people off of the idea that 3D Ambisonics is
about regular-polyhedral speaker arrays. Such arrays have some
distinctly bad points. If you render a 1P signal conventionally to a
regular 3D array you get rV=0.58 (quite a lot of blur). But if you
render it to a horizontal array you get rV=0.71 (noticeably less blur)
for all important source directions. So in that sense, going from 2D
to 3D makes things worse! This is really just an illustration that
regular 3D arrays are *not* the ultimate speaker layout. This comment
applies to Ambisonics, but also more broadly. We would do well to look
at the 10.2 and 22.2 proposals, and ask why they differ so much from
e.g. the experimental spherical and hemispherical systems at various
academic institutions dotted around the world. It is because they come
from people unencumbered by a knowledge of the maths! They come from
people who have been placing much more reliance on their real-world
experience of what works and what doesn't, in practice.
The mixed-order thinking relates very much to these points. Yes,
mixed-order schemes have a bit-rate-reduction aspect, but I don't see
that as being particularly important. The exercise is more to do with
changing what Ambisonics is perceived to be about (and as a result,
changing what Ambisonics ends up being about).
---
Taking a different tack.. My background is in broadcasting. My
experience is that the need for streamlined practices and resource
efficiency can be pretty strong on the production side. If a
broadcaster wants to explore an HOA approach to implementing NHK 22.2,
we could try to interest them in e.g. 4H1V (16 channels) or 4P (25
channels). They'd want some reassurance about the achievable
horizontal resolution across the front stage, so might want to look at
5H1V (20 channels) and 5P (36 channels). My experience tells me that
the 4P and 5P options would never fly. Furthermore, you tend to get
one chance only with these things. If you went in emphasizing 4P and
5P, you'd have blown your one chance.
Chris Travis
>[ED] after all these discussions of mixed order schemes, when you
>consider the benefit to those working in ambisonics vs the complexity
>it introduces (especially in software authoring environments), I still
>believe discussion of mixed-order schemes is a misplaced focus.
Sorry that I have not responded before to the details of your
'Universal Ambisonic' essay. With one thing and another I have been
rather short of time recently. But here are some quick comments now..
As I understand it, the scheme requires that modules have a way of
knowing how many active channels have been connected to them. As far
as I can tell at the moment, this is the single biggest problem with
the proposal. It limits its scope to a small minority of DAWs, and
hence means that it fails to meet its aims. I have stated this in very
black-and-white terms so that you can counter it with similar
forthrightness if it is wrong! :-)
I think the better way forward is to accept that modules will need
configuring, even if just once on installation. It would be grand if
the configuration could be via some kind of global(s), so that a single
central change is seen by multiple modules at the same time.
At the very simplest, the configuration could be a single binary switch
selecting fully periphonic or horizontal only. But when putting the
mechanism in place, it would be sensible to make the field larger than
1 bit wide, even if 'Universal Ambisonic' places implementation
requirements on only two of the 2^n codes. In the case of a global, it
might also be a good idea to make the mechanism extendable to cover
e.g. two globals.
So how might this way of doing things mesh with the mixed-order
thinking? Well, the one global could be V. V=0 gives us the
horizontal-only case. If the global is e.g. a four-bit unsigned
integer, V=15 gives us the fully-periphonic case. (Actually, if the
total number of channels the system is capable of is e.g. 64, V being
any value greater than 6 gives us the fully-periphonic case. This
simply falls out of the maths. No need to include catches for special
values.)
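
To illustrate why no special values are needed, here is a minimal sketch. It relies on a selection rule that is not spelled out in this thread: a component of degree l and index m is taken to be active when l - |m| <= V. That rule is an assumption, chosen because it reproduces the (H+1)^2 - (H-V)^2 count; the function name is hypothetical, and the sketch says nothing about the component-to-channel mapping.

```python
def active_components(max_order: int, v: int) -> list:
    """List (l, m) pairs up to max_order that satisfy l - |m| <= v.

    Assumed selection rule; it is not defined in the discussion above.
    """
    return [
        (l, m)
        for l in range(max_order + 1)
        for m in range(-l, l + 1)
        if l - abs(m) <= v
    ]


# A system capable of 64 channels tops out at order 7, since (7+1)^2 = 64.
print(len(active_components(7, 0)))   # horizontal only: 15
print(len(active_components(7, 6)))   # one component still missing: 63
print(len(active_components(7, 7)))   # fully periphonic: 64
print(len(active_components(7, 15)))  # same set as V=7: 64
```

Because l - |m| can never exceed the maximum order, any V at or above that order passes every component, with no explicit clamp in the code.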
The H parameter possibly does not need to be explicitly
communicated. This is because it doesn't affect the
component-to-channel mapping. It only determines how many channels are
active, within that mapping. On the other hand, one can imagine
situations in which it would be valuable to have the H value globally
available. For example, it could be used (with V) to automatically set
the width of inter-module connections.
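
A rough sketch of how one or two shared globals could drive connection widths is given below. Everything here is hypothetical: the AmbiGlobals and Module classes, the method names, and the choice to hold both V and H in one shared object are assumptions for illustration, not a description of any existing DAW or plug-in API.

```python
class AmbiGlobals:
    """Hypothetical shared settings, read by every module that holds a reference."""

    V_BITS = 4  # V stored as a four-bit unsigned integer, as in the example above

    def __init__(self, v: int = 0, h: int = 1):
        self.h = h
        self.set_v(v)

    def set_v(self, v: int) -> None:
        if not 0 <= v < 2 ** self.V_BITS:
            raise ValueError("V must fit in an unsigned 4-bit field")
        self.v = v

    def connection_width(self) -> int:
        """Number of active channels implied by H and V."""
        v = min(self.v, self.h)  # V at or above H means fully periphonic
        return (self.h + 1) ** 2 - (self.h - v) ** 2


class Module:
    """Hypothetical processing module that sizes its ports from the globals."""

    def __init__(self, name: str, settings: AmbiGlobals):
        self.name = name
        self.settings = settings

    def width(self) -> int:
        return self.settings.connection_width()


shared = AmbiGlobals(v=1, h=3)
panner = Module("panner", shared)
rotator = Module("rotator", shared)
print(panner.width(), rotator.width())  # 12 12  (3H1V)

shared.set_v(15)                        # one central change
print(panner.width(), rotator.width())  # 16 16  (fully periphonic, 3P)
```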
In summary, I think 'Universal Ambisonic' is broken and that fixing it
necessarily involves having some kind of configuration mechanism. (Or
supporting e.g. just fully-periphonic signals and processing!) Along
the way, some effort should be put into establishing a framework for
one-or-two Ambisonic globals. Implementation mechanisms for these
could be ad-hoc or more general, on a case-by-case basis. While fixing
Universal Ambisonic, one could quite easily include the hooks for
mixed-order signals. One wouldn't have to make a big deal of this at
present. Just get the hooks in, so that we are better equipped for
what the future will bring.
Chris Travis
On balance, 2H1V.
Chris Travis
I made up yet another Ambisonics "standard" by staring at 3D
renderings of the components for a while. Working title "Ambisonics
for the Living Room":
4 ch: 1H1V (1P)
8 ch: 2H1V
12 ch: 3H1V
16 ch: 4H1V
20 ch: 5H1V
24 ch: 6H1V
and just realized that Chris put exactly this group of mappings in the
following document under the name "H1V family"
http://ambisonics.googlegroups.com/web/Reworked+mapping+table+(two+versions)+V3.PDF
My point is that I would like to see a (consumer) standard that is as
simple as the H1V subset and doesn't waste channels for underfloor
loudspeakers.
Here is a more-recent mapping table.
<http://ambisonics.googlegroups.com/web/Mappings%20for%20all%20%23H%23V%23P%20combinations%20with%20up%20to%2016%20components%20.PDF?>
Actually, this is a figure from my forthcoming paper. The paper
includes simulation results for 2H1V, 3H1V and 4H1V signals played
over dual-ring speaker layouts as illustrated for example in Eric
Benjamin's recent paper on "Ambisonic Loudspeaker Arrays" (AES
Convention Paper 7605, October 2008). It also includes results for
3H1P and 7H1P signals over dual-ring layouts, and for 3P signals
over a 20-speaker icosahedral layout.
Chris Travis