kalmglen sibley stavros


Flaviano Bada


Aug 3, 2024, 1:12:36 AM

to fargelinonp

raw digital audio is called PCM (pulse-code modulation), the format fundamental to any audio processing system ... it's uncompressed ... just a series of integers representing the height of the audio curve at each sample point (the Y axis, where time is the X axis along this curve)
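To make the "series of integers" idea concrete, here is a minimal sketch that samples a sine wave into signed 16-bit PCM values. The function name `sine_pcm16` and its defaults (44100 Hz sample rate, half-scale amplitude) are assumptions chosen for illustration only.

```python
import math

def sine_pcm16(freq_hz, duration_s, sample_rate=44100, amplitude=0.5):
    """Return a list of signed 16-bit integers sampling a sine wave.

    Each integer is the height of the curve (Y axis) at one sample
    instant; the list index divided by sample_rate is the time (X axis).
    """
    peak = int(amplitude * 32767)            # 32767 is the largest 16-bit sample
    n = round(duration_s * sample_rate)      # total number of samples
    return [
        round(peak * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n)
    ]

# 10 ms of a 440 Hz tone
samples = sine_pcm16(440.0, 0.01)
print(len(samples), samples[:5])
```

Nothing here is compressed: storing more audio just means storing more integers.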

... this PCM audio can be compressed using some codec then bundled inside a container, often together with video or metadata channels ... so to convert audio from A to B you first need to understand the container spec as well as the compressed audio codec so you can decompress audio A into PCM format ... then do the reverse ... compress the PCM into the codec of B then bundle it into the container of B

Before venturing further into this I suggest you master the art of WAVE audio files ... the beauty of WAVE is that in its simplest form it's just a 44 byte header followed by the uncompressed integers of the audio curve (some files carry extra chunks, so the header can be longer) ... write some code to read a WAVE file then parse the header (identify bit depth, sample rate, channel count, endianness) to enable you to iterate across each audio sample for each channel ... prove that it's working by sending your bytes into an output WAVE file ... diff input WAVE against output WAVE as they should be identical ... once mastered you are ready to venture into your above stated goal ... do not skip over grokking the notion of interleaving stereo audio, as well as spreading out a single audio sample which has a bit depth of 16 bits across two bytes of storage, and the reverse, namely stitching together multiple bytes into a single integer with a bit depth of 16, 24 or even 32 bits while keeping endianness squared away ... this may sound scary at first however all necessary details are on the net, which is how I taught myself this level of detail
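The exercise above can be sketched in a few lines. This is a hedged illustration, not a full WAVE reader: it assumes the canonical 44 byte header, little-endian (RIFF) storage and signed integer PCM at 16, 24 or 32 bits, and it rejects files with extra chunks. The helper names `build_wav` and `parse_wav` are made up for this example.

```python
import struct

HEADER = "<4sI4s4sIHHIIHH4sI"   # canonical 44-byte RIFF/WAVE header, little-endian

def build_wav(samples_per_channel, sample_rate=44100, bits=16):
    """Interleave per-channel samples and prepend a canonical header."""
    channels = len(samples_per_channel)
    width = bits // 8
    # interleave frames (L0 R0 L1 R1 ...) and spread each sample across `width` bytes
    data = b"".join(
        s.to_bytes(width, "little", signed=True)
        for frame in zip(*samples_per_channel)
        for s in frame
    )
    block_align = channels * width
    header = struct.pack(
        HEADER,
        b"RIFF", 36 + len(data), b"WAVE",
        b"fmt ", 16, 1, channels, sample_rate,
        sample_rate * block_align, block_align, bits,
        b"data", len(data),
    )
    return header + data

def parse_wav(blob):
    """Return (info dict, list of per-channel sample lists)."""
    (riff, _size, wave_tag, fmt_tag, _fmt_len, audio_format, channels, rate,
     _byte_rate, _block_align, bits, data_tag, data_len) = struct.unpack(HEADER, blob[:44])
    if (riff, wave_tag, fmt_tag, data_tag) != (b"RIFF", b"WAVE", b"fmt ", b"data"):
        raise ValueError("not a canonical 44-byte-header WAVE file")
    if audio_format != 1:
        raise ValueError("only integer PCM is handled in this sketch")
    width = bits // 8
    payload = blob[44:44 + data_len]
    # stitch `width` little-endian bytes back into one signed integer per sample
    flat = [
        int.from_bytes(payload[i:i + width], "little", signed=True)
        for i in range(0, len(payload), width)
    ]
    # de-interleave: every `channels`-th sample belongs to the same channel
    per_channel = [flat[c::channels] for c in range(channels)]
    return {"channels": channels, "sample_rate": rate, "bits": bits}, per_channel

# round trip: parse, rebuild, and check the bytes are identical
left, right = [0, 1000, -1000, 32767], [5, -5, -32768, 12]
blob = build_wav([left, right])
info, decoded = parse_wav(blob)
rebuilt = build_wav(decoded, info["sample_rate"], info["bits"])
print(info, rebuilt == blob)
```

The final comparison plays the role of the diff between input and output WAVE files.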

lossy audio compression algorithms leverage knowledge of how people perceive sound to discard information which is indiscernible, as opposed to lossless algorithms which retain all the information of the source ... opus ( -codec.org/) is a current favorite codec, open source and royalty-free

I have some old music in a lossless format. Now that I am constantly jumping between computers, I wanted it converted to a more universal format such as mp3 so that I can play it with the simplest of players. I also wanted to avoid having to stream my music on cloud platforms. After a cursory and naive scan of the web, I found that existing scripts were defunct (again, cursory) or were not as simple as I would like. I did not want to download a GUI for a one-time use, or upload a directory of music to be converted on some server and then download it again. Instead, I wrote this quick CLI to do it for me.

This will recursively search the INPUT_DIRECTORY for files with music extensions. Each file found will then be converted to the TARGET_FORMAT and placed in the OUTPUT_DIRECTORY with the same name but updated extension.
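The search-and-rename step described above could look roughly like the sketch below. The extension set, the function name `plan_conversions`, and the choice to flatten everything into OUTPUT_DIRECTORY (rather than mirror subdirectories) are assumptions for illustration, not the tool's actual code.

```python
from pathlib import Path

# Assumed set of "music extensions"; the real tool may use a different list.
MUSIC_EXTENSIONS = {".flac", ".wav", ".ogg", ".m4a", ".mp3"}

def plan_conversions(input_dir, output_dir, target_format):
    """Return (source, destination) pairs for every music file under input_dir.

    The destination keeps the file name but swaps the extension for
    target_format and lives directly in output_dir.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    suffix = "." + target_format.lstrip(".")
    plan = []
    for src in sorted(input_dir.rglob("*")):          # recursive walk
        if src.is_file() and src.suffix.lower() in MUSIC_EXTENSIONS:
            plan.append((src, output_dir / src.with_suffix(suffix).name))
    return plan
```

With a flat output directory, two files with the same name in different subfolders would collide, so a real tool would need to decide how to handle that.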

A specific codec can be passed for the conversion. This is an experimental feature for now, as it has no error checking that a given codec is compatible with your desired output audio format. Depending on ffmpeg and/or pydub, there may or may not be error logging.
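As a rough idea of what the missing compatibility check might look like, the sketch below validates a codec against a small lookup table before building an ffmpeg command line. The table is illustrative and far from exhaustive, and `ffmpeg_command` is a hypothetical helper, not part of this tool; only the `-i` and `-c:a` ffmpeg options are relied on.

```python
# Illustrative, non-exhaustive mapping of output format -> ffmpeg audio encoders.
# This table is an assumption for the example, not the tool's real list.
COMPATIBLE_CODECS = {
    "mp3": {"libmp3lame"},
    "ogg": {"libvorbis", "libopus"},
    "flac": {"flac"},
    "wav": {"pcm_s16le", "pcm_s24le"},
}

def ffmpeg_command(src, dst, target_format, codec=None):
    """Build an ffmpeg argument list, rejecting known-bad codec/format pairs."""
    if codec is not None:
        allowed = COMPATIBLE_CODECS.get(target_format)
        # Formats missing from the table are passed through unchecked.
        if allowed is not None and codec not in allowed:
            raise ValueError(
                f"codec {codec!r} is not listed as compatible with {target_format!r}"
            )
    cmd = ["ffmpeg", "-i", str(src)]
    if codec is not None:
        cmd += ["-c:a", codec]        # select the audio encoder explicitly
    cmd.append(str(dst))
    return cmd

print(ffmpeg_command("song.flac", "song.mp3", "mp3", "libmp3lame"))
```

The returned list can be handed to `subprocess.run`, which avoids shell quoting problems with file names that contain spaces.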

This tool converts your typed text into audio Morse code. It generates a downloadable audio file (in .wav format) so you can hear the result, and also displays the dots and dashes (dits and dahs). To use this tool, type the text you would like to convert to Morse code below and click the Convert to Morse Code button. A link to your downloadable file will then be provided below. With this converter tool, you can also adjust the speed and frequency of the generated audible Morse code.
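For readers curious how such a .wav file can be produced, here is a minimal sketch using only the Python standard library. It is not this tool's implementation. It assumes the input is already a dot/dash pattern (letters separated by a space, words by " / "), uses the common PARIS timing convention where one dot lasts 1.2 / WPM seconds, and the function name and default values are hypothetical.

```python
import io
import math
import struct
import wave

def morse_to_wav_bytes(pattern, wpm=20, freq_hz=600.0, sample_rate=8000):
    """Render a dot/dash pattern as a mono 16-bit WAV file held in memory."""
    unit = 1.2 / wpm                       # seconds per dot (PARIS convention)

    def tone(units):
        n = round(units * unit * sample_rate)
        return [round(16383 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
                for i in range(n)]

    def silence(units):
        return [0] * round(units * unit * sample_rate)

    samples = []
    for ch in pattern:
        if ch == ".":
            samples += tone(1) + silence(1)    # dot, then 1-unit gap
        elif ch == "-":
            samples += tone(3) + silence(1)    # dash is three units long
        elif ch in " /":
            samples += silence(2)              # tops up the gap: 3 units between
                                               # letters, 7 between words (" / ")
        else:
            raise ValueError(f"unexpected symbol {ch!r}")

    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)                      # 2 bytes = 16-bit samples
        w.setframerate(sample_rate)
        w.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buf.getvalue()

wav_bytes = morse_to_wav_bytes("... --- ...", wpm=20, freq_hz=600.0)
```

Raising `wpm` shortens every element proportionally, and `freq_hz` changes the pitch of the tone, which corresponds to the speed and frequency controls.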

You may download and use the audible Morse code files for personal, business, or educational purposes, provided you include a publicly accessible and clickable link to this page with your use. Please read our Terms of Use for warranty information.

Morse code is a method of transmitting text information as a series of on-off audible tones or light pulses. There are two different signal durations, called dots and dashes (or dits and dahs). International Morse Code standardizes each digit and each letter of the alphabet as a unique sequence of dots and dashes.
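The mapping from characters to dots and dashes is a plain lookup table. The sketch below covers the letters A to Z and digits 0 to 9 of International Morse Code; the output convention (a space between letters, " / " between words) and the function name `text_to_morse` are choices made for this example.

```python
MORSE = {
    "A": ".-",    "B": "-...",  "C": "-.-.",  "D": "-..",   "E": ".",
    "F": "..-.",  "G": "--.",   "H": "....",  "I": "..",    "J": ".---",
    "K": "-.-",   "L": ".-..",  "M": "--",    "N": "-.",    "O": "---",
    "P": ".--.",  "Q": "--.-",  "R": ".-.",   "S": "...",   "T": "-",
    "U": "..-",   "V": "...-",  "W": ".--",   "X": "-..-",  "Y": "-.--",
    "Z": "--..",
    "0": "-----", "1": ".----", "2": "..---", "3": "...--", "4": "....-",
    "5": ".....", "6": "-....", "7": "--...", "8": "---..", "9": "----.",
}

def text_to_morse(text):
    """Encode letters and digits as dots and dashes.

    Letters are separated by a single space and words by " / ".
    Characters outside the table (punctuation etc.) raise ValueError.
    """
    encoded_words = []
    for word in text.upper().split():
        try:
            encoded_words.append(" ".join(MORSE[ch] for ch in word))
        except KeyError as exc:
            raise ValueError(f"no Morse code in table for {exc.args[0]!r}") from None
    return " / ".join(encoded_words)

print(text_to_morse("SOS"))        # ... --- ...
```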

Morse code was used in the 1890s as a form of radio communication, before it was possible to transmit voice. During World War II it became a primary means of communication for various countries, used to send messages about enemy activities. Today, most militaries have stopped using Morse code. However, it is still commonly used in aviation to identify navigational stations, and among amateur radio operators to identify radio repeaters.

FFmpeg now implements a native xHE-AAC decoder. Currently, streams without (e)SBR, USAC or MPEG-H Surround are supported, which means the majority of xHE-AAC streams in use should work. Support for USAC and (e)SBR is coming soon. Work is also ongoing to improve its stability and compatibility. During the process we found several specification issues, which were then submitted back to the authors for discussion and potential inclusion in a future errata.

The FFmpeg community is excited to announce that Germany's Sovereign Tech Fund has become its first governmental sponsor. Their support will help sustain the maintenance of the FFmpeg project, a critical open-source software multimedia component essential to bringing audio and video to billions around the world every day.

A new major release, FFmpeg 7.0 "Dijkstra", is now available for download. The most noteworthy changes for most users are a native VVC decoder (currently experimental, until more fuzzing is done), IAMF support, and a multi-threaded ffmpeg CLI tool.

This release is not backwards compatible, removing APIs deprecated before 6.0. The biggest change for most library callers will be the removal of the old bitmask-based channel layout API, replaced by the AVChannelLayout API allowing such features as custom channel ordering, or Ambisonics. Certain deprecated ffmpeg CLI options were also removed, and a C11-compliant compiler is now required to build the code.

The libavcodec library now contains a native VVC (Versatile Video Coding) decoder, supporting a large subset of the codec's features. Further optimizations and support for more features are coming soon. The code was written by Nuo Mi, Xu Mu, Frank Plowman, Shaun Loo, and Wu Jianhua.

The libavformat library can now read and write IAMF (Immersive Audio) files. The ffmpeg CLI tool can configure IAMF structure with the new -stream_group option. IAMF support was written by James Almer.

Thanks to a major refactoring of the ffmpeg command-line tool, all the major components of the transcoding pipeline (demuxers, decoders, filters, encoders, muxers) now run in parallel. This should improve throughput and CPU utilization, decrease latency, and open the way to other exciting new features.

This release had been overdue for at least half a year, but due to constant activity in the repository, had to be delayed, and we were finally able to branch off the release recently, before some of the large changes scheduled for 7.0 were merged.

Internally, we have had a number of changes too. The FFT, MDCT, DCT and DST implementation used for codecs and filters has been fully replaced with the faster libavutil/tx (full article about it coming soon).
This also led to a reduction in the size of the compiled binary, which can be noticeable in small builds.
There was a very large reduction in the total amount of allocations being done on each frame throughout video decoders, reducing overhead.
RISC-V optimizations for many parts of our DSP code have been merged, with mainly the large decoders being left.
There was an effort to improve the correctness of timestamps and frame durations of each packet, increasing the accuracy of variable frame rate video.

A few days ago, Vulkan-powered decoding hardware acceleration code was merged into the codebase. This is the first vendor-generic and platform-generic decode acceleration API, enabling the same code to be used on multiple platforms, with very minimal overhead. This is also the first multi-threaded hardware decoding API, and our code makes full use of this, saturating all available decode engines the hardware exposes.

Those wishing to test the code can read our documentation page. For those who would like to integrate FFmpeg's Vulkan code to demux, parse, decode, and receive a VkImage to present or manipulate, documentation and examples are available in our source tree. Currently, using the latest available git checkout of our repository is required. The functionality will be included in stable branches with the release of version 6.1, due to be released soon.

As this is also the first practical implementation of the specifications, bugs may be present, particularly in drivers and, although it passes verification, in the implementation itself. New codecs and encoding support are also being worked on, both by the Khronos organization, which standardizes them, and by us, who implement them and give feedback on improvements.

A new major release, FFmpeg 6.0 "Von Neumann", is now available for download. This release has many new encoders and decoders, filters, ffmpeg CLI tool improvements, and also, changes the way releases are done. All major releases will now bump the version of the ABI. We plan to have a new major release each year. Another release-specific change is that deprecated APIs will be removed after 3 releases, upon the next major bump. This means that releases will be done more often and will be more organized.

New decoders featured are Bonk, RKA, Radiance, SC-4, APAC, VQC, WavArc and a few ADPCM formats. QSV and NVenc now support AV1 encoding. The FFmpeg CLI (we usually refer to it as ffmpeg.c to avoid confusion) has speed-up improvements due to threading, as well as statistics options, and the ability to pass option values for filters from a file. There are quite a few new audio and video filters, such as adrc, showcwt, backgroundkey and ssim360, with a few hardware ones too. Finally, the release features many behind-the-scenes changes, including a new FFT and MDCT implementation used in codecs (expect a blog post about this soon), numerous bugfixes, better ICC profile handling and colorspace signalling improvement, introduction of a number of RISC-V vector and scalar assembly optimized routines, and a few new improved APIs, which can be viewed in the doc/APIchanges file in our tree. A few submitted features, such as the Vulkan improvements and more FFT optimizations will be in the next minor release, 6.1, which we plan to release soon, in line with our new release schedule. Some highlights are: