Run HEVC's Low-Delay-P (IPPP) in VP9

Hossam Amer

Jun 10, 2018, 6:41:58 PM

to WebM Discussion

Hello Everyone,

My research is on video compression, and I have mainly been working with HEVC (the HM reference software).

Brief Background: In Low-Delay-P of HEVC, the encoder assigns all its frames a P-type except for the first one (I-frame). In addition, there is a predefined referencing and QP structure for these frames.

Objective: Run Low-Delay-P (IPPPP) in VP9 while observing frame types, rates/PSNR per frame, QPs

To achieve my goal, I would like to ask a few questions that will help me understand VP9's code and configurations:

What is the set of configurations to achieve my goal? So far, I am running the following command:

./vpxenc -v -o out/test.webm --ivf --verbose --psnr --codec=vp9 --passes=1 --fpf=out/test.txt --min-q=0 --max-q=51 <file>.y4m > out/<file2>.txt

What do --min-q and --max-q do? How does the encoder decide on the QP of each frame?
Can someone please explain in more detail what CQ mode in VP9 is? Does it mean the encoder assigns a constant QP to all frames?
How can I output an encoding trace file? In HM, it is a file that contains frame types, QPs, and their corresponding bitrates and PSNRs. Should I use ffprobe, or should I write my own code?
In HM, there is a fixed relationship that calculates lambda from the input QP. Is there a similar concept in VP9?

I understand that I can look at the code to know the answers to these questions, but your clarifications will definitely help.

Please note that I have already looked at the following links:

Any other resources to read about VP9 would also be appreciated!

Thanks in advance,

Hossam Amer (Ph.D. candidate at the University of Waterloo)

Hossam Amer

Jun 12, 2018, 12:24:13 PM

to WebM Discussion

This is an example of HM's trace file:

"

POC 0 TId: 0 ( I-SLICE, nQP 15 QP 15 ) 216928 bits [Y 48.1618 dB U 49.2232 dB V 49.5354 dB] [ET 2 ] [L0 ] [L1 ]

POC 1 TId: 0 ( P-SLICE, nQP 20 QP 20 ) 19448 bits [Y 44.9442 dB U 47.4306 dB V 47.1836 dB] [ET 2 ] [L0 0 ] [L1 ]

.

SUMMARY --------------------------------------------------------

Total Frames | Bitrate Y-PSNR U-PSNR V-PSNR YUV-PSNR

10 a 1771.3200 43.2335 46.2793 45.8618 43.4875

I Slices--------------------------------------------------------

Total Frames | Bitrate Y-PSNR U-PSNR V-PSNR YUV-PSNR

1 i 10846.4000 48.1618 49.2232 49.5354 48.5302

P Slices--------------------------------------------------------

Total Frames | Bitrate Y-PSNR U-PSNR V-PSNR YUV-PSNR

9 p 762.9778 42.6859 45.9522 45.4536 43.1681

"

As you see, it contains information about the QP per frame, referencing structure, frame types, bits per frame, PSNR per frame - Is there something like this in libvpx? What's the parameter for it?
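
For anyone who wants to compare encoders on the same footing, the per-frame lines of the HM log above can be turned into records with a small parser. This is only a sketch: the field names and both helper functions are hypothetical, and the 50 fps used in the usage note is inferred from the I-slice numbers (216928 bits for one frame reported as 10846.4 kbps), not stated in the log.

```python
import re

# Pattern for one per-frame line of the HM log quoted above.
FRAME_RE = re.compile(
    r"POC\s+(?P<poc>\d+)\s+TId:\s*(?P<tid>\d+)\s+"
    r"\(\s*(?P<slice>[IPB])-SLICE,\s*nQP\s+(?P<nqp>-?\d+)\s+QP\s+(?P<qp>-?\d+)\s*\)\s+"
    r"(?P<bits>\d+)\s+bits\s+"
    r"\[Y\s+(?P<y>[\d.]+)\s+dB\s+U\s+(?P<u>[\d.]+)\s+dB\s+V\s+(?P<v>[\d.]+)\s+dB\]"
)


def parse_frame_line(line):
    """Return a dict for a per-frame HM log line, or None if it does not match."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return {
        "poc": int(m.group("poc")),
        "slice_type": m.group("slice"),
        "qp": int(m.group("qp")),
        "bits": int(m.group("bits")),
        "psnr_y": float(m.group("y")),
        "psnr_u": float(m.group("u")),
        "psnr_v": float(m.group("v")),
    }


def bitrate_kbps(total_bits, num_frames, fps):
    """Average bitrate in kbps for num_frames frames played at fps."""
    return total_bits * fps / num_frames / 1000.0
```

With these helpers, `bitrate_kbps(216928, 1, 50)` reproduces the 10846.4 kbps shown in the I Slices summary, which is what suggests the sequence was coded at 50 fps.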

Jingning Han

Jun 12, 2018, 1:08:53 PM

to WebM Discussion

Hi Hossam,

The VP9 codec is designed to achieve peak compression performance for the VoD use case with variable bit-rate encoding. It employs an alternate reference frame type, where an additional source frame is generated by temporally filtering several frames.

VP9 then codes this alternate reference frame as the first coded frame in a group of pictures, so as to provide a backward reference for all the remaining regular frames in that GOP.

That said, no equivalent IPPP coding structure is currently implemented in the VP9 encoder that would come anywhere near the typical compression performance of variable bit-rate encoding.

The IPPP coding scheme in VP9 is primarily tuned for real-time encoding, where we don't actually run a full rate-distortion optimization search.

Please see responses below.

Thanks,

Jingning

On Sunday, June 10, 2018 at 3:41:58 PM UTC-7, Hossam Amer wrote:

Hello Everyone,

My research is on video compression, and I have mainly been working with HEVC (the HM reference software).

Brief Background: In Low-Delay-P of HEVC, the encoder assigns all its frames a P-type except for the first one (I-frame). In addition, there is a predefined referencing and QP structure for these frames.
Objective: Run Low-Delay-P (IPPPP) in VP9 while observing frame types, rates/PSNR per frame, QPs

To achieve my goal, I would like to ask a few questions that will help me understand VP9's code and configurations:
What is the set of configurations to achieve my goal? So far, I am running the following command:

./vpxenc -v -o out/test.webm --ivf --verbose --psnr --codec=vp9 --passes=1 --fpf=out/test.txt --min-q=0 --max-q=51 <file>.y4m > out/<file2>.txt

This would result in 1-pass encoding with the variable bit-rate setting.

What do --min-q and --max-q do? How does the encoder decide on the QP of each frame?

They constrain the lower and upper limits of the quantizer. Note that setting them equal or too close together would likely break the rate control and cause an unintended drop in coding performance.

Can someone please explain in further detail what CQ mode in VP9 is? Does that mean the encoder assigns constant QP to all frames?

CQ is constrained variable bit-rate. It sets upper and lower limits that cap the variable bit-rate range. No, it does not assign a constant QP to all frames.

How can I output an encoding trace file? In HM, it is a file that contains frame types, QPs, and their corresponding bitrates and PSNRs. Should I use ffprobe, or should I write my own code?
In HM, there is a fixed relationship that calculates lambda from the input QP. Is there a similar concept in VP9?

Yes, look for cpi->rd.RDMUL
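
For reference, the HM relationship the question refers to can be sketched as below. This is a simplified form that assumes 8-bit video and ignores HM's additional scaling for hierarchy depth; `hm_lambda` and `qp_factor` are illustrative names, with `qp_factor` standing in for the per-frame QP factor from the HM configuration file. It says nothing about how libvpx derives `cpi->rd.RDMUL`, which has to be read from the source.

```python
def hm_lambda(qp, qp_factor):
    """Simplified HM mode-decision lambda: qp_factor * 2^((QP - 12) / 3).

    Assumes 8-bit input and no hierarchy-depth scaling (an illustrative
    simplification, not the complete HM computation).
    """
    return qp_factor * 2.0 ** ((qp - 12) / 3.0)


if __name__ == "__main__":
    # Lambda doubles every 3 QP steps under this model.
    for qp in (20, 23, 26):
        print(qp, hm_lambda(qp, 1.0))
```

The point of the comparison is that in HM lambda is a closed-form function of QP, so any adaptive QP method changes lambda along with it.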

Hossam Amer

Jun 12, 2018, 7:03:06 PM

to WebM Discussion

Hi Jingning,

Thank you very much for your detailed reply!

I understood what you explained about the newly introduced source frame (S-frame). I previously read a similar concept in the literature. Can I infer that the referencing structure that VP9 uses only relies on this S-frame for every GOP?

Clear Goal: Ultimately, I would like to integrate my newly developed adaptive QP method into VP9.

My method significantly improves LD-HEVC, so I am searching for a similar and simple benchmark in VP9 that I can (A) build upon and (B) compare my results against after integration.

I was previously advised to use the CQ mode with --min-q and --max-q. Do you think that setting all of these parameters to the same QP value would be a good starting point? How does the current VP9 encoder decide on the QP in order to meet the rate constraints?

I will search for rd.RDMUL, but do you know what the keyword for the QP is? I searched for QP, but found nothing useful.

About the trace file, does VP9 produce a file that contains the referencing structure, QP per frame, bits per frame, and PSNR per frame? An example of such a file produced by HM is in my previous message.

Thanks!

Hossam

Jingning Han

Jun 20, 2018, 12:09:11 AM

to WebM Discussion

Hi Hossam,

Please see inline response below.

Thanks,

Jingning

On Tuesday, June 12, 2018 at 4:03:06 PM UTC-7, Hossam Amer wrote:

Hi Jingning,

Thank you very much for your detailed reply!

I understood what you explained about the newly introduced source frame (S-frame). I previously read a similar concept in the literature. Can I infer that the referencing structure that VP9 uses only relies on this S-frame for every GOP?

The term S-frame has another meaning that is more commonly seen in error-resilient video coding. Just a reminder not to confuse the two.

Clear Goal: Ultimately, I would like to integrate my newly developed adaptive QP method into VP9.
My method significantly improves LD-HEVC, so I am searching for a similar and simple benchmark in VP9 that I can (A) build upon and (B) compare my results against after integration.

I was previously advised to use the CQ mode with --min-q and --max-q. Do you think that setting all of these parameters to the same QP value would be a good starting point? How does the current VP9 encoder decide on the QP in order to meet the rate constraints?

The libvpx encoder is designed for production, hence its rate control system uses variable bit-rate. The encoder takes a target bit-rate as input. During encoding, we adaptively change the bit allowance per GOP and, within that, per frame.

Setting --min-q and --max-q to the same value would break the rate control system and cause a significant compression performance loss (probably on the order of 20-50%). That's not a good starting point.

You could try --end-usage=q --cq-level=QP. That would give you roughly constant Q in the regular inter frames, and a lower Q in the alternate reference frame (ARF).
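
As a rough sketch of how that suggestion could be scripted for a sweep over several Q values, the helper below only assembles the vpxenc argument list. It combines `--end-usage=q` and `--cq-level` with flags from the command earlier in the thread; the function name and the file paths are hypothetical, and the 0..63 range check reflects the VP9 quantizer range.

```python
def vpxenc_constant_q_args(src, dst, cq_level):
    """Build a vpxenc argument list for roughly constant-Q VP9 encoding."""
    if not 0 <= cq_level <= 63:
        raise ValueError("cq_level must be in the VP9 quantizer range 0..63")
    return [
        "vpxenc",
        "--codec=vp9",
        "--passes=1",
        "--end-usage=q",            # constant quality mode
        f"--cq-level={cq_level}",   # Q used for the regular inter frames
        "--psnr",
        "--ivf",
        "-o", dst,
        src,
    ]


if __name__ == "__main__":
    # Hypothetical paths; print the commands instead of running them.
    for q in (20, 30, 40):
        print(" ".join(vpxenc_constant_q_args("input.y4m", f"out/q{q}.ivf", q)))
```

The list can be passed to `subprocess.run` once the paths point at real files.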

I will search for rd.RDMUL, but do you know what the keyword for the QP is? I searched for QP, but found nothing useful.

base_qindex for QP

About the trace file, does VP9 produce a file that contains the referencing structure, QP per frame, bits per frame, and PSNR per frame?

The GOP structure and rate control system in VP9 are largely different from those of HEVC and the HM model. In libvpx, I don't think we track coding performance per frame type.
