Dear AV1 team,
Looking for an interesting problem in AV1 for a lone theorist, PVQ was a natural choice (are there any others?). Over the past few days I have found and tested a substantial improvement; yesterday I wrote it up and have just submitted it to arXiv: https://www.dropbox.com/s/jgghkeov1swu53x/pvq.pdf
The PVQ “natural choice” of radial projection between the L1 and L2 unit spheres (“just normalize”) seemed highly unlikely to be the optimal one – I suspected that a more subtle transformation should distribute the quantization points more uniformly, reducing MSE.
While finding the truly optimal projection seems an extremely difficult task, a simple parametrized family, the “power projection”:
x_i -> normalize((x_i^p)_i)
with an optimized parameter p (e.g. 1.3), allows for really large improvements – typically more than a 10% reduction in MSE for K > L/2. I see a way to improve it further with a family having more parameters – I will probably squeeze out a bit more there.
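To make this concrete, here is a minimal numpy sketch of the projection (the greedy rounding onto K pulses at the end is only my illustration of how it would sit inside a quantizer, not part of the proposal; p=1 recovers the plain radial projection):

    import numpy as np

    def power_pvq(x, K, p=1.3):
        # "power projection": raise magnitudes to the power p, project onto the
        # L1 sphere of radius K, round down, then add the missing pulses greedily
        # (signs restored at the end); assumes x is a nonzero vector
        s = np.sign(x)
        a = np.abs(x) ** p
        y = K * a / a.sum()           # sum(|y_i|) = K
        q = np.floor(y).astype(int)   # never more than K pulses in total
        while q.sum() < K:
            i = np.argmax(y - q)      # put the next pulse where the deficit is largest
            q[i] += 1
        return s * q

    x = np.random.randn(8)
    x /= np.linalg.norm(x)            # PVQ input: a unit L2 vector
    print(power_pvq(x, K=6))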
For K < L/2, PVQ has just terrible performance – it degenerates to something that can easily be beaten by different quantizers (I have found up to 3 dB of improvement)… but doing it right will need some further work – let me know whether it is worth the effort. What are the typical (L, K) used?
On a related topic, I think the major obstruction to using PVQ in video compression was said to be its extreme computational cost? If so, the main source is probably the costly combinatorial enumerative coding for the L1 sphere?
This should be repairable by replacing it with a fast, accurate entropy coder, which would also handle the issue of fractional bits when storing the final PVQ index (whose range is not bounded by a power of 2) – let me know if it is worth working on.
Alternatively, you could use the L-infinity sphere (instead of L1), which is much cheaper to encode (just the Z^L lattice) but leads to worse MSE (the L1 lattice is closer to the sphere packing).
And more generally, I wanted to ask for a summary of the current status of PVQ in AV1.
Best regards,
Jarek Duda
Interesting. Do you have some images of what the points in your best codebook look like for L=2 and L=3? One minor comment about your
intro... The Householder reflection part of PVQ only applies to
Daala/AV1 because we came up with it after Opus was frozen. It should be
applicable to audio too, but it would require careful application to
avoid "desynchronization" on rounding errors and on packet loss.
It all rests on the definition you use for "optimal". Every codebook is
optimal for some distribution of the unquantized data. Your implicit
assumption (as far as I can tell) is that the data we are quantizing is
Gaussian-distributed, which means that we want uniform codewords on the
surface of the L-sphere.
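As a quick sanity check of that statement (just the math, not code from either codec): i.i.d. Gaussian vectors, once normalized, land uniformly on the sphere, e.g. their longitude is uniform:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.standard_normal((100000, 3))
    u = x / np.linalg.norm(x, axis=1, keepdims=True)   # radial projection to the L2 sphere
    lon = np.arctan2(u[:, 1], u[:, 0])                 # longitude of each projected point
    print(np.histogram(lon, bins=8)[0])                # roughly equal counts per bin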
In the case of audio MDCT coefficients, as far as I can tell the
distribution is closer to being Laplacian. In addition, the regions in
coefficient space where the PVQ codebook has the highest density are the
ones where there are one or two coefficients that are much larger than
the others. That corresponds to tones -- for which we *do* want better
resolution because noise is more perceptible. So while the codebook is
far from being "optimal" in the sense most commonly considered in
coding theory, it's pretty good for audio.
In the case of video, IIRC (it's been a while) the non-predicted video
DCT coefficients were also Laplace-like, but I don't remember really
measuring with the prediction. For video, the main drawback I can see in your current formulation is computing the power itself. While it can easily be done with a lookup table, table lookups cannot be vectorized on most CPUs. This would not be much of an issue for an audio codec like Opus, but for video (where vectorization is critical), it would likely be a big deal.
> While finding the truly optimal projection seems an extremely difficult task,
> a simple parametrized family, the “power projection”:
>
> *x_i -> normalize((x_i^p)_i)*
Do you have a refinement step after the projection? Especially for
Daala/AV1 we place the last few "pulses" using RDO (see below about rate).
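Roughly, the refinement looks like the sketch below (plain MSE greedy over magnitudes, no rate term -- the RDO placement mentioned above adds a rate cost to the same metric; this is only an illustration, not the actual Daala/AV1 code):

    import numpy as np

    def greedy_pulses(x, q0, K):
        # add pulses one at a time, each time picking the position that maximizes
        # the correlation-to-energy ratio (x.y)^2 / (y.y); works on magnitudes,
        # signs are restored at the end
        ax = np.abs(x).astype(float)
        q = np.abs(q0).astype(float)
        xy = np.dot(ax, q)
        yy = np.dot(q, q)
        for _ in range(K - int(q.sum())):
            # a pulse at position i changes xy -> xy + ax[i] and yy -> yy + 2*q[i] + 1
            gain = (xy + ax) ** 2 / (yy + 2.0 * q + 1.0)
            i = int(np.argmax(gain))
            q[i] += 1.0
            xy += ax[i]
            yy += 2.0 * q[i] - 1.0    # q[i] already includes the new pulse
        return np.sign(x) * q

    x = np.random.randn(8)
    x /= np.linalg.norm(x)
    print(greedy_pulses(x, np.zeros(8), K=6))   # here: all 6 pulses placed greedily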
It's really all over the place depending on the encoding quality, the
frequency band, ... In Opus, the value of L ranges from 2 all the way
up to 176, with K ranging from 1 up to 256. When K would be too large,
or if the codebook would take more than 32 bits, we actually split the vector (before quantization) in two so that the indexing never needs more than 32 bits. The corresponding "bit depth" ranges from close to 0 up to around
8 bits/coefficient. For low bit depths, we apply a "spreading rotation"
to avoid the output sounding too tonal.
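For reference, the codebook size that drives the 32-bit split above is the usual PVQ count V(L, K), the number of integer vectors of dimension L with sum(|x_i|) = K; a small sketch:

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def V(L, K):
        # standard PVQ codeword count; V(L, K) <= 2**32 is roughly the point up to
        # which a single 32-bit index works, beyond that the vector gets split
        if K == 0:
            return 1
        if L == 0:
            return 0
        return V(L - 1, K) + V(L, K - 1) + V(L - 1, K - 1)

    print(V(2, 3))              # sanity check: 12 codewords in Z^2 with |a| + |b| = 3
    print(V(16, 32) > 2**32)    # already far too large to index in 32 bits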
For video, L ranges from 8 to 128, while K is theoretically unbounded.
We tend to go up to about 8 bits/coefficient of depth, but < 1 bit/coefficient is more typical.
Actually, the enumeration is only done for Opus because we want
predictable rates and the coefficients we're coding together in a vector
tend to have similar variances. This is not true of video, where the
coefficient with the lowest frequency in the vector tends to have a much
higher variance than the coefficients with the highest frequency in that
band -- even though we already split the coefficients into many bands
(which we also do for audio). This means we had to figure out a way to code the PVQ codeword more efficiently, taking advantage of the difference in variance. The first scheme we used is described here:
https://jmvalin.ca/notes/pvq_encoding.pdf
but it was too slow. What we're currently using is a scheme where, knowing K and L, we split the vector (after quantization this time) in two, code how K is split between the two halves, and continue recursively. The distribution of how K is split is modeled so that we can learn the difference in variance.
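A rough illustration of that recursive structure (my sketch of the idea only, without the entropy coding itself and without the sign coding):

    def split_symbols(q):
        # recursively split the quantized vector in two and record how many of
        # its pulses fall in the left half; the real coder entropy-codes these
        # symbols with distributions adapted to the local (L, K)
        syms = []

        def rec(v):
            k = sum(abs(c) for c in v)
            if len(v) <= 1 or k == 0:
                return
            half = len(v) // 2
            k_left = sum(abs(c) for c in v[:half])
            syms.append((len(v), k, k_left))   # "k_left of the k pulses go left"
            rec(v[:half])
            rec(v[half:])

        rec(list(q))
        return syms

    print(split_symbols([3, 0, -1, 0, 1, 0, 0, 0]))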
Of course, the new coder is also far from perfect -- in part because it
requires many adaptation contexts: theoretically one for each (L, K) pair, though we take shortcuts for the larger values of K.
Of course when it comes to decoder complexity, there's also the issue
that we have to compute the forward DCT of the prediction (which you
don't need for scalar). On the encoder side, the main issue with
complexity is the search, and the fact that AV1 runs many, many codebook searches in its RDO loop (IIRC about 1000x more than Daala for the same image).
So MSE is generally fine even when you're optimizing the codec for
subjective quality because the "perceptual" part is usually handled at a
higher level, e.g. by setting K or by optimizing block size decisions.
The low-level codebook search itself is usually MSE-based.
As I said earlier, the important thing is the distribution of the data.
I think many sources are closer to being Laplace-distributed than
Gaussian-distributed, but there's nothing that beats real data. This is
especially true for video because it's not just the shape of the
distribution, but how the variance changes for different coefficients.
Even the best possible Laplace quantizer would perform significantly
worse than dumb scalar quantization if it doesn't take into account the
difference in variance (we've learned that the hard way).
You might actually have better luck with (low-order) polynomials. While
multiplications aren't cheap, at least they vectorize well. With 16-bit
precision, AVX2 will let you compute 16 multiplies per cycle, while
you'd only be able to perform one lookup in that same cycle.
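If you want to try that, here is a quick sketch of fitting a low-order polynomial to the power function (the exponent 1.3 is just the example value from your mail):

    import numpy as np

    p = 1.3
    x = np.linspace(0.0, 1.0, 1001)         # normalized magnitudes live in [0, 1]
    target = x ** p
    coeffs = np.polyfit(x, target, deg=3)   # cubic: evaluation is multiply-adds only
    max_err = np.max(np.abs(np.polyval(coeffs, x) - target))
    print(coeffs, max_err)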
> My question here is whether you often have the low-K case: K < L/2?
It's really all over the place: sometimes you have pretty large Ks, but small Ks are also very common. One of the most common is actually K=1, which I *believe* is actually optimal with the normal codebook, even if K>1 isn't. Of course, the most common of all is K=0, because the only way to compress video efficiently is to not code anything at all for as many blocks as possible.
Again, be careful about the distribution. Using Gaussian distributions
is nice for developing theoretical calculations and makes for nice papers, but it doesn't give you a better video codec.