Tony,
Here's my breakdown, based on my tests on the hardware I own:
Let's begin with the actual codecs: VP9 and HEVC/H.265:
In terms of (subjective) visual quality, I have observed that HEVC
retains quality considerably better in low-bitrate, low-resolution
scenarios, whereas VP9 pulls ahead in high-resolution, low-bitrate
scenarios (as in the case of YouTube's VP9 streaming on supported
platforms). Perceptually, the two level off when a similar fixed
bitrate is applied at resolutions of 1080p and above.
For fast-motion scenarios, HEVC retains detail better, yet VP9
adheres to the target bitrate more closely for the same container
specification (Matroska, at the time of testing).
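If you want to reproduce the comparison, the simplest recipe is to
encode the same source at the same fixed bitrate with both encoders
and judge the results side by side. A rough sketch (filenames and the
2500k target are placeholders, not recommendations):

  # same source, same bitrate target, audio dropped so only video differs
  ffmpeg -i source.mkv -c:v libx265    -b:v 2500k -an hevc_test.mkv
  ffmpeg -i source.mkv -c:v libvpx-vp9 -b:v 2500k -an vp9_test.mkv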
Now, on encoders:
libvpx (which handles both VP8 and VP9 encoding in FFmpeg, if
configured as such at build time) is atrociously slow and
computationally expensive, and it is not very well threaded. This
might change in the future, but for now that is the case.
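To squeeze out what threading libvpx does offer, you have to opt into
tiling and frame parallelism explicitly; a sketch of what I mean (the
values are examples, tune them to your core count and resolution):

  # tiling + frame parallelism are what actually let libvpx-vp9 use threads
  ffmpeg -i source.mkv -c:v libvpx-vp9 -threads 8 -tile-columns 2 \
         -frame-parallel 1 -speed 2 -b:v 0 -crf 31 vp9_out.mkv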
x265 (I currently run this as libx265, built into FFmpeg via a
configuration switch) has proven to be significantly faster than
libvpx on the same hardware, has excellent threading and broad
instruction-set support (MMX, SSE, AVX and BMI on supported
processors), and is also highly configurable.
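For reference, an equivalent libx265 run needs no special coaxing to
use all cores (the preset and CRF here are illustrative only):

  # libx265 threads well by default; pools/frame-threads can be tuned via -x265-params
  ffmpeg -i source.mkv -c:v libx265 -preset medium -crf 23 hevc_out.mkv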
However, compared to the standard x264 encoder, both libvpx and
libx265 are way slower, and they'll both munch on delicious CPU
cycles.
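If you want to see the speed gap for yourself, time identical encodes
of one clip; something like the following (the quality targets are
rough, so treat the wall-clock numbers as ballpark only):

  time ffmpeg -i source.mkv -c:v libx264    -preset medium -crf 23 x264_out.mkv
  time ffmpeg -i source.mkv -c:v libx265    -preset medium -crf 23 x265_out.mkv
  time ffmpeg -i source.mkv -c:v libvpx-vp9 -speed 2 -b:v 0 -crf 31 vp9_out.mkv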
And now, let's look at common encoding scenarios you may run into:
At the moment, I use Nvidia's NVENC to encode HEVC content (via
FFmpeg, which needs to be compiled with the --enable-nvenc switch and
provided with the NVENC SDK headers) on Maxwell GM200, GM204 and
Pascal GP104 hardware (a GeForce GTX 980 Ti, a GeForce GTX 980M SLI
combo and a pair of GeForce GTX 1080 GPUs, respectively). With this
form of hardware-accelerated encoding, the quality of the encoded
material is considerably worse at the same settings than that of a
software encoder such as libx265. And by the way, MainConcept's
proprietary encoder is currently inferior to x265 in terms of both
performance and output quality.
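For completeness, my build step looks roughly like this (the exact
switches depend on your FFmpeg version; mine also wanted
--enable-nonfree for NVENC at the time, and libx265 requires
--enable-gpl):

  ./configure --enable-gpl --enable-libx265 --enable-nonfree --enable-nvenc
  make && make install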
When using hardware-accelerated encoders, one must bump up the bitrate
to offset the sub-par bitrate allocation on these encoders, so your
mileage here may vary as more platforms with this feature emerge.
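In practice that means something like the following for NVENC (the
encoder is named hevc_nvenc or nvenc_hevc depending on the FFmpeg
version, and the 10M target is just a placeholder set deliberately
higher than what I would give libx265 for the same source):

  # hardware HEVC encode; generous bitrate to mask NVENC's weaker rate allocation
  ffmpeg -i source.mkv -c:v hevc_nvenc -b:v 10M nvenc_out.mkv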
At the moment, I'm aware that Intel's upcoming Kaby Lake iGPUs,
current-generation Skylake parts and AMD's Carrizo-L APUs have
support for this feature (through VAAPI for Intel and OpenMAX IL for
AMD's VCE hardware) on Linux.
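If you end up testing the Intel path, the VAAPI route in FFmpeg looks
roughly like this (assuming your build has the hevc_vaapi encoder and
your render node is /dev/dri/renderD128; both are assumptions about
your setup):

  # upload frames to the GPU, then hand them to the VAAPI HEVC encoder
  ffmpeg -vaapi_device /dev/dri/renderD128 -i source.mkv \
         -vf format=nv12,hwupload -c:v hevc_vaapi -b:v 8M vaapi_hevc.mkv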
For VP9 hardware-accelerated encoding, Intel's current-generation
Skylake and upcoming Kaby Lake SKUs will support it through VAAPI
(exposed through both GStreamer's and libva's implementations).
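Once an FFmpeg build exposes it, the same VAAPI pattern should carry
over to VP9; I have not run this myself, so treat it purely as a
sketch (the vp9_vaapi encoder may not exist in your build yet):

  # untested on my end: VAAPI VP9 encode on supported Intel hardware
  ffmpeg -vaapi_device /dev/dri/renderD128 -i source.mkv \
         -vf format=nv12,hwupload -c:v vp9_vaapi -b:v 5M vaapi_vp9.mkv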
I hope that the information provided above will be of help to you.
Thanks and regards,
Dennis.