thocont tallys salvestro

0 views

Skip to first unread message

Trudi Miranda

unread,

Aug 2, 2024, 8:47:24 PM8/2/24

to guigrovylul

This guide focuses on the encoder x264. It assumes you have ffmpeg compiled with --enable-libx264. If you need help compiling and installing see one of our compiling guides. See HWAccelIntro for information on supported hardware H.264 encoders.

This method allows the encoder to attempt to achieve a certain output quality for the whole file when output file size is of less importance. This provides maximum compression efficiency with a single pass. By adjusting the so-called quantizer for each frame, it gets the bitrate it needs to keep the requested quality level. The downside is that you can't tell it to get a specific filesize or not go over a specific size or bitrate, which means that this method is not recommended for encoding videos for streaming.

A preset is a collection of options that will provide a certain encoding speed to compression ratio. A slower preset will provide better compression (compression is quality per filesize). This means that, for example, if you target a certain file size or constant bit rate, you will achieve better quality with a slower preset. Similarly, for constant quality encoding, you will simply save bitrate by choosing a slower preset.

For example, if your input is animation then use the animation tuning, or if you want to preserve grain in a film then use the grain tuning. If you are unsure of what to use or your input does not match any of tunings then omit the -tune option. You can see a list of current tunings with -tune help, and what settings they apply with x264 --fullhelp.

The -profile:v option limits the output to a specific H.264 profile. You usually do not need to use this option and the recommendation is to omit setting the profile which will allow x264 to automatically select the appropriate profile.

Some devices (mostly very old or obsolete) only support the more limited Constrained Baseline or Main profiles. You can set these profiles with -profile:v baseline or -profile:v main. Most modern devices support the more advanced High profile.

Use this rate control mode if you are targeting a specific output file size, and if output quality from frame to frame is of less importance. This is best explained with an example. Your video is 10 minutes (600 seconds) long and an output of 200 MiB is desired. Since bitrate = file size / duration:

Note that lossless output files will likely be huge, and most non-FFmpeg based players will not be able to decode lossless. Therefore, if compatibility or file size are an issue, you should not use lossless.

Tip: If you're looking for an output that is roughly "visually lossless" but not technically lossless, use a -crf value of around 17 or 18 (you'll have to experiment to see which value is acceptable for you). It will likely be indistinguishable from the source and not result in a huge, possibly incompatible file like true lossless mode.

While -preset chooses the best possible settings for you, you can overwrite these with the x264-params option, or by using the libx264 private options (see ffmpeg -h encoder=libx264). This is not recommended unless you know what you are doing. The presets were created by the x264 developers and tweaking values to get a better output is usually a waste of time.

In the above example, -bufsize is the "rate control buffer", so it will enforce your requested "average" (1 MBit/s in this case) across each 2 MBit worth of video. Here it is assumed that the receiver / player will buffer that much data, meaning that a fluctuation within that range is acceptable.

Use this mode if you want to constrain the maximum bitrate used, or keep the stream's bitrate within certain bounds. This is particularly useful for online streaming, where the client expects a certain average bitrate, but you still want the encoder to adjust the bitrate per-frame.

This will effectively "target" -crf 23, but if the output were to exceed 1 MBit/s, the encoder would increase the CRF to prevent bitrate spikes. However, be aware that libx264 does not strictly control the maximum bit rate as you specified (the maximum bit rate may be well over 1M for the above file). To reach a perfect maximum bit rate, use two-pass.

Going from medium to slow, the time needed increases by about 40%. Going to slower instead would result in about 100% more time needed (i.e. it will take twice as long). Compared to medium, veryslow requires 280% of the original encoding time, with only minimal improvements over slower in terms of quality.

You may need to use -vf format=yuv420p (or the alias -pix_fmt yuv420p) for your output to work in QuickTime and most other players. These players only support the YUV planar color space with 4:2:0 chroma subsampling for H.264 video. Otherwise, depending on your source, ffmpeg may output to a pixel format that may be incompatible with these players.

FFmpeg now implements a native xHE-AAC decoder. Currently, streams without (e)SBR, USAC or MPEG-H Surround are supported, which means the majority of xHE-AAC streams in use should work. Support for USAC and (e)SBR is coming soon. Work is also ongoing to improve its stability and compatibility. During the process we found several specification issues, which were then submitted back to the authors for discussion and potential inclusion in a future errata.

The FFmpeg community is excited to announce that Germany's Sovereign Tech Fund has become its first governmental sponsor. Their support will help sustain the maintainance of the FFmpeg project, a critical open-source software multimedia component essential to bringing audio and video to billions around the world everyday.

A new major release, FFmpeg 7.0 "Dijkstra", is now available for download. The most noteworthy changes for most users are a native VVC decoder (currently experimental, until more fuzzing is done), IAMF support, or a multi-threaded ffmpeg CLI tool.

This release is not backwards compatible, removing APIs deprecated before 6.0. The biggest change for most library callers will be the removal of the old bitmask-based channel layout API, replaced by the AVChannelLayout API allowing such features as custom channel ordering, or Ambisonics. Certain deprecated ffmpeg CLI options were also removed, and a C11-compliant compiler is now required to build the code.

The libavcodec library now contains a native VVC (Versatile Video Coding) decoder, supporting a large subset of the codec's features. Further optimizations and support for more features are coming soon. The code was written by Nuo Mi, Xu Mu, Frank Plowman, Shaun Loo, and Wu Jianhua.

The libavformat library can now read and write IAMF (Immersive Audio) files. The ffmpeg CLI tool can configure IAMF structure with the new -stream_group option. IAMF support was written by James Almer.

Thanks to a major refactoring of the ffmpeg command-line tool, all the major components of the transcoding pipeline (demuxers, decoders, filters, encodes, muxers) now run in parallel. This should improve throughput and CPU utilization, decrease latency, and open the way to other exciting new features.

This release had been overdue for at least half a year, but due to constant activity in the repository, had to be delayed, and we were finally able to branch off the release recently, before some of the large changes scheduled for 7.0 were merged.

Internally, we have had a number of changes too. The FFT, MDCT, DCT and DST implementation used for codecs and filters has been fully replaced with the faster libavutil/tx (full article about it coming soon).
This also led to a reduction in the the size of the compiled binary, which can be noticeable in small builds.
There was a very large reduction in the total amount of allocations being done on each frame throughout video decoders, reducing overhead.
RISC-V optimizations for many parts of our DSP code have been merged, with mainly the large decoders being left.
There was an effort to improve the correctness of timestamps and frame durations of each packet, increasing the accurracy of variable frame rate video.

A few days ago, Vulkan-powered decoding hardware acceleration code was merged into the codebase. This is the first vendor-generic and platform-generic decode acceleration API, enabling the same code to be used on multiple platforms, with very minimal overhead. This is also the first multi-threaded hardware decoding API, and our code makes full use of this, saturating all available decode engines the hardware exposes.

Those wishing to test the code can read our documentation page. For those who would like to integrate FFmpeg's Vulkan code to demux, parse, decode, and receive a VkImage to present or manipulate, documentation and examples are available in our source tree. Currently, using the latest available git checkout of our repository is required. The functionality will be included in stable branches with the release of version 6.1, due to be released soon.

As this is also the first practical implementation of the specifications, bugs may be present, particularly in drivers, and, although passing verification, the implementation itself. New codecs, and encoding support are also being worked on, by both the Khronos organization for standardizing, and us as implementing it, and giving feedback on improving.

A new major release, FFmpeg 6.0 "Von Neumann", is now available for download. This release has many new encoders and decoders, filters, ffmpeg CLI tool improvements, and also, changes the way releases are done. All major releases will now bump the version of the ABI. We plan to have a new major release each year. Another release-specific change is that deprecated APIs will be removed after 3 releases, upon the next major bump. This means that releases will be done more often and will be more organized.

New decoders featured are Bonk, RKA, Radiance, SC-4, APAC, VQC, WavArc and a few ADPCM formats. QSV and NVenc now support AV1 encoding. The FFmpeg CLI (we usually reffer to it as ffmpeg.c to avoid confusion) has speed-up improvements due to threading, as well as statistics options, and the ability to pass option values for filters from a file. There are quite a few new audio and video filters, such as adrc, showcwt, backgroundkey and ssim360, with a few hardware ones too. Finally, the release features many behind-the-scenes changes, including a new FFT and MDCT implementation used in codecs (expect a blog post about this soon), numerous bugfixes, better ICC profile handling and colorspace signalling improvement, introduction of a number of RISC-V vector and scalar assembly optimized routines, and a few new improved APIs, which can be viewed in the doc/APIchanges file in our tree. A few submitted features, such as the Vulkan improvements and more FFT optimizations will be in the next minor release, 6.1, which we plan to release soon, in line with our new release schedule. Some highlights are: