Any interest in adding audio support to the std library?


alexande...@gmail.com

Jun 3, 2016, 5:37:31 AM
to ISO C++ Standard - Future Proposals
I have drafted some ideas on how I think the C++ std library could support audio functionality.

I know that audio functionality is a very operating-system-specific problem, but with the recent trend towards implementing a file-system library and possibly a graphics library, I believe that audio would not be too much of a reach anymore.

Here are some of the ideas I have so far. I have both some code examples of the intended usage and a list of the types needed to implement the given examples.

Please keep in mind my drafts are still very rough.


CODE EXAMPLES

//std::audio example 1 "single process"
void example_1(){
    double sample_rate = 44100;
    std::size_t frame_size = 2;
    std::size_t buffer_size = 128;

    std::audio_context<float> ctx{sample_rate,buffer_size,frame_size};//construct from values

    std::astream_process<float> proc(ctx,[&ctx](std::iastream const& input, std::oastream& output){
        std::frame_buffer<float>& buff = ctx.borrow_buffer();//borrow a buffer from the context for usage
        //prevents the need for dynamic allocation of a temporary buffer
        input>>buff;//stream data into buffer for manipulation
        for(auto&& frame: buff){
            frame = 0.0;//do something with audio
        }
        output<<buff;//stream to output
    });//dsp object
    //uses implied routing equivalent to
    //std::aout<<proc<<std::ain;

    proc.start();
    //do other stuff
    proc.stop();
}

//std::audio example 2 "process group"
void example_2(){

    std::audio_context<float> ctx;//default context created with std::default_* values

    //version 1: capture context via lambda
    std::astream_process<float> proc1(ctx,[&ctx](std::iastream const& input, std::oastream& output){
        std::frame_buffer<float>& buff = ctx.borrow_buffer();
        input>>buff;
        for(auto&& frame: buff){
            frame *= 0.5;
        }
        output<<buff;
    });//dsp object

    //version 2: have context passed as argument
    std::astream_process<float> proc2(ctx,[](std::iastream const& input, std::oastream& output, std::audio_context<float> const& context){
        std::frame_buffer<float>& buff = context.borrow_buffer();
        input>>buff;
        for(auto&& frame: buff){
            frame *= 2.0;
        }
        output<<buff;
    });

    std::process_group<float> pgroup;//a group of processes that will happen consecutively
    pgroup.push(proc1);//add to group
    pgroup.push(proc2);//add to group

    //configure stream relationships in terms of std::ain / std::aout manually
    //std::ain/std::aout are std::astream globals that refer to the default audio inputs and outputs supplied by the context in use
    //std::ain/std::aout will route the audio to the endpoint specified by the context reference held by the process that is streaming the data
    std::aout<<proc1<<proc2<<std::ain;//method 1
    //std::ain>>proc2>>proc1>>std::aout;//method 2

    pgroup.start();
    //do other stuff
    pgroup.stop();

}


//std::audio example 3 "audio files"
void example_3(){

    std::audio_context<float> ctx;

    std::astream_process<float> proc(ctx,[&ctx](std::iafstream const& input, std::oafstream& output){
        std::frame_buffer<float>& buff = ctx.borrow_buffer();
        input>>buff;
        for(auto&& frame: buff){
            frame = 0.0;
        }
        output<<buff;
    });//dsp object

    std::iafstream audio_file1(ctx,"filename1.extension");//an audio file handle
    std::oafstream audio_file2(ctx,"filename2.extension");//an audio file handle

    //routing
    audio_file2<<proc<<audio_file1;//take input from file and write to file
    //audio_file1>>proc>>audio_file2;//equivalent syntax
    proc.start();
    //do other stuff
    proc.stop();
}


//std::audio example 4 "combination routing"
void example_4(){

    std::audio_context<float> ctx;
    //manually select hardware endpoints
    std::size_t device_id = ctx.default_device_id();
    std::iastream input_device = ctx.get_device<std::input_device>(device_id);
    std::oastream output_device = ctx.get_device<std::output_device>(device_id);

    std::astream_process<float> proc(ctx,[&ctx](std::iastream const& input,
                                                std::oastream& output,
                                                std::iafstream const& input_file,
                                                std::oafstream& output_file){
        std::frame_buffer<float>& buff = ctx.borrow_buffer();
        (input + input_file)>>buff;//add streams to perform a sum before writing to buffer
        //or you could use separate buffers
        //like this
        /*
            std::frame_buffer<float> buff1;
            std::frame_buffer<float> buff2;

            input>>buff1;
            input_file>>buff2;
            buff1+=buff2;//buffer arithmetic
        */

        output<<buff;//send the contents of buff to the hardware out and the file out
        output_file<<buff;
    });

    std::iafstream audio_file1(ctx,"filename1.extension");//the actual files to be used above
    std::oafstream audio_file2(ctx,"filename2.extension");

    //connect the files to the process
    //connect the hardware device to the process
    audio_file2<<proc<<audio_file1;//take input from file
    output_device<<proc<<input_device;//also take from hardware
    proc.start();
    //do other stuff
    proc.stop();
}



REQUIRED LIBRARY MEMBERS


namespace std{
    inline namespace audio{
        //working context for audio flow
        template<typename>
        class audio_context;
        /*
        *The context in which all audio data is centered.
        *Contains: sampling rate, buffer size, frame size, etc...
        *The values of ain,aout,afin,afout refer to the endpoints defined by the context, when applied to routing on a process tied to the context
        *think of a context as the program level driver object
        */

        //audio streams (think like std::fstream and its friends)
        class astream;//audio stream
        class oastream;//output audio stream
        class iastream;//input audio stream
        class oafstream;//output audio file stream
        class iafstream;//input audio file stream

        //stream endpoints
        class ain;//audio input endpoint
        class aout;//audio output endpoint
        class afin;//audio file input endpoint
        class afout;//audio file output endpoint

        //stream processing
        template<typename>
        class astream_process;//a dsp process applied to a stream

        template<typename>
        class process_group;//a group of processes that will act as one

        //containers
        template<typename>
        class frame_buffer;//a sequence container that is resizeable at runtime, but only with explicit resize calls. contains frames (see below)
        /*Implementation note on frame_buffer
         *frame_buffer is intended to hold N frames, which themselves can hold M samples,
         *meaning that the total size in samples of frame_buffer = N * M
         *ideally frame_buffer's representation of its sample data will be contiguous in memory
        */

        template<typename>
        class frame;//a container that holds samples, thin array wrapper

        //hardware representation
        class device;//an audio device as recognized by the OS
        class input_device;//an input device
        class output_device;//an output device

        // audio file formats
        enum class afformat{
            raw,//raw headerless audio bytes, interpreted only by the settings of the context.
            //best used for temporary storage within the life of a context
            wav,
            flac//etc...
        };
    }
}




Klaim - Joël Lamotte

Jun 3, 2016, 6:46:33 AM
to std-pr...@isocpp.org
You might want to forward this at least to SG13 (HMI group).




Jeffrey Yasskin

Jun 3, 2016, 11:43:42 AM
to std-pr...@isocpp.org
Is this based on an existing library? We're much more likely to adopt
a proposal that's been used widely than one that was invented for the
standard.

Bjorn Reese

Jun 3, 2016, 12:26:28 PM
to std-pr...@isocpp.org
I would approach an audio API differently.

The basic audio primitives are playing and recording. These are
essentially I/O (write and read) operations. Therefore I would base
them on io_context from the Networking TS, so we can have both synchronous
and asynchronous audio operations. That way we avoid imposing an
audio pipeline architecture.
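
For illustration only: assuming a Networking-TS-style io_context and a hypothetical audio_stream I/O object (none of these names are real proposals), the read/write shape of that idea might look like:

#include <cstddef>
#include <experimental/io_context> //Networking TS header (assumed available)

namespace net = std::experimental::net;

//hypothetical I/O object modeled on the TS's socket classes
class audio_stream {
public:
    explicit audio_stream(net::io_context& ctx);

    std::size_t read_some(float* frames, std::size_t count);        //synchronous record
    std::size_t write_some(const float* frames, std::size_t count); //synchronous play

    template<typename Handler> //Handler(std::error_code, std::size_t)
    void async_read_some(float* frames, std::size_t count, Handler handler);
};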

Such a low-level audio API should not include things like decoding,
demuxing, resampling, or signal processing. If needed, they can be
added later as part of a higher level audio API.

alexande...@gmail.com

Jun 3, 2016, 12:35:08 PM
to std-pr...@isocpp.org
Not currently. I am trying to gather syntactical usage, library feature ideas and design considerations so that I can implement a version of this that is more complete than what I present here. I understand that something like this is most likely to be considered only if it is in common practice or use; unfortunately,
I have not found an audio library based on modern C++. Most of the ones I know well are either pure C or very thin wrappers over an existing C library. My goal is to create a library with standardization down the line in sight, but to let it mature enough first for possible standardization.

Andrey Semashev

Jun 3, 2016, 1:10:09 PM
to std-pr...@isocpp.org

On Friday, 3 June 2016 20:04:59 MSK Bjorn Reese wrote:
> I would approach an audio API differently.
>
> The basic audio primitives are playing and recording.

I disagree. I would say the basic primitives are a sample and a frame. Samples can be obtained and processed in different ways, including recording, reading from a file or generating algorithmically.

The processing is often represented as a pipeline or graph with producer, filter and consumer nodes. This is a higher-level framework that builds upon the basic blocks of frames and digital signal processing algorithms.


Tony V E

Jun 3, 2016, 5:13:44 PM
to Andrey Semashev
The pipeline/graph model also works for images and other data. 

A movie or a sound, etc., is a function f(t) that returns data at a given time.

Should we build a library that does more than sound?

(and yes, I know (20+ years of audio/video) it is not as simple as it sounds and the devil is in the details, but at the high level it is and should be simple)

Sent from my BlackBerry portable Babbage Device
From: Andrey Semashev
Sent: Friday, June 3, 2016 1:10 PM
Subject: Re: [std-proposals] Any interest in adding audio support to the std library?


ron novy

Jun 3, 2016, 5:32:07 PM
to std-pr...@isocpp.org
I had some thoughts on this last year and started throwing down ideas.  I decided to start with DSP instead of standardizing ways of opening streams to/from devices.  Without a standard way of processing the data going to/from these devices, how can you create an interface that looks like it really belongs?

Here is a quick slide show presentation of some things I worked on.  The presentation is incomplete, but you should get the basic idea.  There are a lot of templates here.


The dsparray and dspvector classes inherit from their respective standard library components.  The project is on GitHub and I'm sure it's got some code rot by now.  I've since moved on to something else, so I haven't been working on it lately.


The basic concept was to take the idea of valarray and expand it to be something much more usable and capable of being processed in parallel.  Being able to automatically have parallel processing by using these classes was one of the key things I wanted to achieve here.

Also, I believe removing the _Native component from the templates would make things easier, since it would rarely be needed and its function can be accomplished in a better way, like through a function in each container class.

Hopefully someone can grab some ideas from there.

alexande...@gmail.com

Jun 3, 2016, 5:50:18 PM
to ISO C++ Standard - Future Proposals
That was my original plan. It looked fairly similar, but abstracted in a way that allowed for connection to various types of hardware devices, software devices and files in a generic way. My thought on a DSP model was to organize processing nodes in a directed acyclic graph that connected to endpoints representing various kinds of inputs and outputs.

Thiago Macieira

Jun 3, 2016, 5:58:50 PM
to std-pr...@isocpp.org
Considering the domain of media codecs is a patent minefield, the API should be
designed around never having access to the actual contents of the frames, only
that they exist.

More than likely, just "here's a file or URL, go play it", plus some time-sync
API and some hints on what type of audio it is, so the system can route it to the
correct speaker (headphones? bluetooth? loudspeaker?). Plus the converse
recording.

IMHO, just like the 2D graphics proposal, I think this is ill-advised as
content for the C++ Standard Library.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

alexande...@gmail.com

Jun 3, 2016, 6:30:03 PM
to ISO C++ Standard - Future Proposals
I feel that you may be missing the point of the library: in order to do any "real" audio work you need direct access to the audio data. And yes, I agree that multimedia is a patent minefield, but there are open-source versions of most media formats that could be used if needed. That is beside the point here, though: the library would only provide a standardized interface for interacting with audio (or other signal types) and place only the needed constraints on the implementors of the library. The only guarantee that needs to be made is that data will be delivered to and from the application in the requested format if possible; the how will be OS-specific and implementation-defined.

alexande...@gmail.com

Jun 3, 2016, 6:32:10 PM
to ISO C++ Standard - Future Proposals
That was the basis of my original idea for the library, but I found it to be too broad and constrained the idea to an audio context for the time being. I am planning on working up an alternative version that does what you are suggesting. The same concepts that I use in my library can be applied to any type of signal, so long as the program can receive that data from the OS.

ron novy

Jun 3, 2016, 6:32:51 PM
to std-pr...@isocpp.org
> Considering the domain of media codecs is a patent minefield, the API should be
> designed around never having access to the actual contents of the frames, only
> that they exist.

I disagree with this concept.  Frames contain information to be processed.  If someone wants/needs to create a closed system then they can do that on their own in their own proprietary system.  And it's not just about playing or recording audio; there are mixing, transcoding, interleaving, deinterleaving, etc.  The system driver should be capable of giving you access to a buffer with properties that you choose, and the driver itself should then handle getting it to the speakers, but these processing features should still be accessible to C++ in a standard way.

I honestly don't think reading specific types of files should be a part of the standard but, as someone who develops audio applications, I think there could be a better standard for getting the data to or from an audio file or device.  And like actual hardware, an audio file or codec could be just another device that you open.  If the system supports it and its driver is installed properly, then you can enumerate codecs like devices and use them to open file streams.


Andrey Semashev

Jun 3, 2016, 6:37:37 PM
to ISO C++ Standard - Future Proposals
On Sat, Jun 4, 2016 at 12:13 AM, Tony V E <tvan...@gmail.com> wrote:
>
> The pipeline/graph model also works for images and other data.
>
> A movie or a sound, etc, is a function f(t) that returns data at a given time.

It's not always time-dependent. Standalone image processing, for
example, is not.

My understanding is that the different abstraction layers such as data
structures (pixel, sample, audio/video frame), algorithms (conversion,
resampling, scaling, etc) and higher level tools (timing,
pipeline/graph, multithreading, I/O) should be as decoupled as
possible.

> Should we build a library that does more than sound?

I think the pipeline/graph framework is orthogonal to audio or
whatever media processing, although it can be employed for these tasks as
well. Think of a generic framework like in TBB.

That, however, does not immediately help audio/video processing as the
sample/pixel/frame and DSP layers are still missing and, IMHO, would
be a very useful addition even without the graph framework.

alexande...@gmail.com

Jun 3, 2016, 6:39:10 PM
to ISO C++ Standard - Future Proposals
My overall goal is to provide access to real-time data streams in a uniform manner; these streams could be from hardware, software or a file. The types of streams that are supported would have to be up to the system that the program is running upon. There is a need for a standard interface for processing and routing sampled data.

Andrey Semashev

Jun 3, 2016, 6:44:44 PM
to ISO C++ Standard - Future Proposals
On Sat, Jun 4, 2016 at 12:50 AM, <alexande...@gmail.com> wrote:
> That was my original plan. My original plan looked fairly similar but abstracted in a way that allowed for connection to various types of hardware devices, software devices and files in a generic way. My thought on a dsp model was to organize processing nodes in a directed a cyclic graph that connected to endpoints that represented various kinds of inputs and outputs

Could you please keep the relevant quotation you reply to? It's
difficult to read a message not really knowing what it responds to.

Andrey Semashev

Jun 3, 2016, 6:54:57 PM
to ISO C++ Standard - Future Proposals
On Sat, Jun 4, 2016 at 12:58 AM, Thiago Macieira <thi...@macieira.org> wrote:
> On sexta-feira, 3 de junho de 2016 20:10:05 BRT Andrey Semashev wrote:
>> On Friday, 3 June 2016 20:04:59 MSK Bjorn Reese wrote:
>> > I would approach an audio API differently.
>> >
>> > The basic audio primitives are playing and recording.
>>
>> I disagree. I would say the basic primitives are a sample and a
>> frame. Samples can be obtained and processed in different ways,
>> including recording, reading from file or generating algorithmically.
>>
>> The processing is often represented as a pipeline or graph with
>> producer, filter and consumer nodes. This is a higher level
>> framework that builds upon the basic blocks of frames and digital
>> signal processing algorithms.
>
> Considering the domain of media codecs is a patent minefield, the API should be
> designed around never having access to the actual contents of the frames, only
> that they exist.

Well, regarding codecs, there are different ones, including the
unencumbered ones. But the proposal is not so much aimed at codecs
as at raw media processing.

> More than likely, just "here's a file or URL, go play it", plus some time-sync
> API and some hints on what type of audio it is, so system can route it to the
> correct speaker (headphones? bluetooth? loudspeaker?). Plus the converse
> recording.
>
> IMHO, just like the 2D graphics proposal, I think this is ill-advised as
> content for the C++ Standard Library.

The way you described it, it would be a proposal of fairly limited
use. And if that were the case, I would agree it's not suitable for the
standard library.

However, what the OP described is different, and something like it is
what I implemented to a certain degree in a closed project. I would
really have liked it if something like that were in the standard library.

Thiago Macieira

Jun 3, 2016, 6:56:25 PM
to std-pr...@isocpp.org
On sexta-feira, 3 de junho de 2016 15:32:49 BRT ron novy wrote:
> I honestly don't think reading specific types of files should be a part of
> the standard, but as someone who develops audio applications, there could
> be a better standard for getting the data to or from an audio file or
> device. And like actual hardware, an audio file or codec could be just
> another device that you open. If the system supports it and its driver is
> installed properly then you can enumerate codecs like devices and use them
> to open file streams.

I would say that playing sounds from a file and recording a microphone into a
file are the two most common uses of an audio API.

Thiago Macieira

Jun 3, 2016, 7:02:49 PM
to std-pr...@isocpp.org
On sexta-feira, 3 de junho de 2016 15:30:03 BRT alexande...@gmail.com
wrote:
> I feel that you may be missing the point of the library, in order to do any
> "real" audio work you need direct access to the audio data.

I might be missing your original intention, but nonetheless it is a valid
concern and a very common (probably the most common) use case for an audio
API.

> And yes I agree
> that multimedia is a patent minefield but there are open source versions of
> most media formats that could be used if needed.

That's not how patents work. Please design your system without ever writing a
line of code that reads directly from or writes directly to a file. I would say
that uncompressed PCM is probably safe, but I am not a lawyer and I can't even
say that (note: .wav files are not just plain PCM).

Instead, write it so that it will work on a heavily locked-down system like
iOS.

> But that is besides the
> point here, the library would only provide a standardized interface of how
> to interact with audio(or other signal types) and place only the needed
> constraints on the implementors of the library. The only guarantee that
> needs to be made is that data will be delivered to and from the application
> in the requested format if possible, the how will be OS specific and
> implementation defined

That's orthogonal to what I was asking. And somewhat useless if you can't play
or record audio, nor load it from a file or save it after processing.

Thiago Macieira

Jun 3, 2016, 7:04:08 PM
to std-pr...@isocpp.org
On sábado, 4 de junho de 2016 01:54:55 BRT Andrey Semashev wrote:
> > Considering the domain of media codecs is a patent minefield, the API
> > should be designed around never having access to the actual contents of
> > the frames, only that they exist.
>
> Well, regarding codecs, there are different ones, including the
> unencumbered ones. But the point of the proposal is not so much aimed
> at codecs but rather at raw media processing.

As replied to Alexander: that's not how patents work and raw processing is not
useful if you can't load a file or save your work.

Andrey Semashev

Jun 3, 2016, 7:05:10 PM
to ISO C++ Standard - Future Proposals
On Sat, Jun 4, 2016 at 1:56 AM, Thiago Macieira <thi...@macieira.org> wrote:
> On sexta-feira, 3 de junho de 2016 15:32:49 BRT ron novy wrote:
>> I honestly don't think reading specific types of files should be a part of
>> the standard, but as someone who develops audio applications, there could
>> be a better standard for getting the data to or from an audio file or
>> device. And like actual hardware, an audio file or codec could be just
>> another device that you open. If the system supports it and its driver is
>> installed properly then you can enumerate codecs like devices and use them
>> to open file streams.
>
> I would say that playing sounds from a file and recording a microphone into a
> file are the two most common uses of an audio API.

I would say real-time communication is the most common use case for
audio recording nowadays. And it's not just about recording a
microphone and throwing it to the network, unfortunately. There is a fair
bit of media processing going on under the hood.

Andrey Semashev

Jun 3, 2016, 7:20:17 PM
to ISO C++ Standard - Future Proposals
On Sat, Jun 4, 2016 at 2:04 AM, Thiago Macieira <thi...@macieira.org> wrote:
> On sábado, 4 de junho de 2016 01:54:55 BRT Andrey Semashev wrote:
>> > Considering the domain of media codecs is a patent minefield, the API
>> > should be designed around never having access to the actual contents of
>> > the frames, only that they exist.
>>
>> Well, regarding codecs, there are different ones, including the
>> unencumbered ones. But the point of the proposal is not so much aimed
>> at codecs but rather at raw media processing.
>
> As replied to Alexander: that's not how patents work and raw processing is not
> useful if you can't load a file or save your work.

I think patents are irrelevant to this proposal. I don't think anyone
has patented the concept of audio samples or pixel representation, at
least not the conventional ones.

How you save and load these samples and which codecs you use is a
question of the available modules you have - whether these modules are
provided by your compiler or OS vendor or a third party or even your
hardware. For all I care, all this stuff could be handled by ffmpeg
internally. The codecs themselves need not be in the C++ standard.
What is needed is a standardized interface for these different modules
to work with each other. The interface should also allow me, the
developer, to work with the media (e.g. create my own audio or image
filter or a new codec or a new device driver).

alexande...@gmail.com

Jun 3, 2016, 7:29:48 PM
to ISO C++ Standard - Future Proposals
That is exactly what I intended.

alexande...@gmail.com

Jun 3, 2016, 8:59:38 PM
to ISO C++ Standard - Future Proposals
I will work up a revised version in the next day or so attempting to take into account some of the concerns expressed thus far.

This time I will skip the usage examples and write a specification of the types and behaviors needed for the library, as well as the library's relation to the platform that it is being used on. Hopefully that will make my intent for the library clearer and easier to develop.

Thiago Macieira

Jun 3, 2016, 9:05:34 PM
to std-pr...@isocpp.org
On sábado, 4 de junho de 2016 02:05:08 BRT Andrey Semashev wrote:
> > I would say that playing sounds from a file and recording a microphone into
> > a file are the two most common uses of an audio API.
>
> I would say real time communication is the most common use case for
> audio recording nowdays. And it's not just about recording a
> microphone and throwing it to network, unfortunately. There is a fair
> bit of media processing going on under the hood.

No doubt. But you can't process if you can't record. And even if you do
process some data and send it over the network with proper time sync'ing, you
will want to play it.

Thiago Macieira

Jun 3, 2016, 9:09:11 PM
to std-pr...@isocpp.org
On sábado, 4 de junho de 2016 02:20:15 BRT Andrey Semashev wrote:
> What is needed is a standardized interface for these different modules
> to work with each other. The interface should also allow me, the
> developer, to work with the media (e.g. create my own audio or image
> filter or a new codec or a new device driver).

That I agree with.

But should the standard mandate that there should be at least one? How does
someone write a plugin to the C++ Standard Library? Will we now mandate this
kind of ability?

If not, then will there be a requirement that the C++ library vendor provide
it? If so, please think carefully how Apple should code libc++ to work on iOS.

Again, I think this is not material for the C++ Standard Library.

alexande...@gmail.com

Jun 3, 2016, 9:26:05 PM
to ISO C++ Standard - Future Proposals
I do not think that this is as big an issue as you portray it; each major platform provides its own mechanism for interfacing with audio hardware: Core Audio on OS X, ALSA on many Linux systems, etc. The standard would supply a specification of an interface that the compiler vendor could use to wrap the platform-specific code needed to interact with the audio hardware. If the target platform provides a platform-specific library for audio, it should be reasonable for a compiler vendor for that platform to integrate that library into its version of libc++.

As far as how someone would write a plugin for the std library: you can write custom allocators, deleters and such, so why couldn't you write a device driver given an interface specification?

Should at least one driver be available? In an ideal world, yes; at minimum you should be able to load a file, right? But an audio driver for the library? If one is available from the system, the compiler vendor would be allowed to create a driver around that library. Rather than specifying a minimum required number of drivers, why not just specify that, in the case that your vendor does not supply one, an interface is provided for you to create your own? At the same time, it would not be very difficult for the vendor to supply at least one for major platforms. The only issue I could see is for an embedded system which may not run an OS, but even then the developer could just create one for that target...

Jeffrey Yasskin

Jun 3, 2016, 9:30:49 PM
to std-pr...@isocpp.org
On Fri, Jun 3, 2016 at 6:09 PM, Thiago Macieira <thi...@macieira.org> wrote:
> On sábado, 4 de junho de 2016 02:20:15 BRT Andrey Semashev wrote:
> > What is needed is a standardized interface for these different modules
> > to work with each other. The interface should also allow me, the
> > developer, to work with the media (e.g. create my own audio or image
> > filter or a new codec or a new device driver).
>
> That I agree with.
>
> But should the standard mandate that there should be at least one? How does
> someone write a plugin to the C++ Standard Library? Will we now mandate this
> kind of ability?

Yes. We already do this with things like the random engines.

> If not, then will there be a requirement that the C++ library vendor provide
> it? If so, please think carefully how Apple should code libc++ to work on iOS.

WebCrypto has an example of navigating a legal minefield for implementers.

Also remember to leave the legal speculation to lawyers. It's good to have the heads-up that "hey, lawyers should look at this", but anything more is unwise.

> Again, I think this is not material for the C++ Standard Library.

I think it's a great area for the C++ library to expand into. I'm definitely worried about the expertise of the people writing the proposal (e.g. if I proposed an audio library for C++, I should be laughed off the stage), but we can overcome that by asking well-known audio users for their opinions before standardizing the proposal.

Jeffrey

alexande...@gmail.com

Jun 3, 2016, 9:47:29 PM
to ISO C++ Standard - Future Proposals
I agree about the issue of expertise. By no means would I call myself an expert, but I have a degree in audio engineering and a pretty solid grasp of the C++ language. Are there people out there more skilled than me at each topic? Of course, but that's why I am here: to get the opinions of those people if I can and figure out what needs to be done to make this library a reality.

ron novy

Jun 3, 2016, 10:11:26 PM
to std-pr...@isocpp.org
Okay, I've done some thinking here.  The simplest solution should be the correct one.  I think the standard should stick to something very basic here, otherwise we get into an area that is too complicated to implement.  All we should have in terms of a standard audio interface for C++ is:

1) A method for enumerating the interfaces on a machine.
2) A method for getting each interface's capabilities (supported rates, bit depths, etc.).
3) A method for activating a requested device for sending and receiving frames in the requested format.

Valid types for audio samples should be either a signed integer, a float or a double.  And that's pretty much it.  Get audio input, put audio output.  Anything other than that would be a different proposal altogether.  Audio, video or DSP processing in general should certainly be a different proposal.
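
A sketch of how those three points might look (every name here is hypothetical, not a proposed spelling):

#include <cstddef>
#include <functional>
#include <vector>

struct device_caps {
    std::vector<double> supported_rates;  //e.g. 44100, 48000, 96000
    std::vector<std::size_t> bit_depths;  //e.g. 16, 24, 32
    std::size_t max_channels;
};

class audio_device {
public:
    device_caps capabilities() const;  //(2) query what the interface can do
    //(3) activate the device in a requested format; the callback receives
    //interleaved frames once per buffer period
    void activate(double rate, std::size_t channels,
                  std::function<void(float* frames, std::size_t frame_count)> cb);
};

std::vector<audio_device> enumerate_devices();  //(1) enumerate the interfaces on a machine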

But I would propose we also add an int24_t class into the standard for processing 24-bit samples.  Something as simple as this would suffice for now:
#include <cstdint>

class int24_t
{
private:
    int8_t val[3];//24 bits of raw storage; arithmetic operators would go on top
};

Something more fully featured would be desirable, like this: https://github.com/RonNovy/CppDSP/blob/master/src/int24_t.h
But I guess that would be another proposal.


alexande...@gmail.com

Jun 3, 2016, 10:35:36 PM
to ISO C++ Standard - Future Proposals
I think that simplicity is a good way to go, but I believe that what you are proposing may be too basic. As it stands, none of the std containers meet the needs of audio. The closest, I would argue, could be std::array, if you are okay with a buffer length fixed at compile time, or std::vector; but std::vector could cause a dynamic memory allocation on a real-time thread if the user tries to push_back and the vector needs to expand. This allocation could lead to an unknown wait time for the audio thread and possibly result in buffer underflow to the driver. You could use a plain array, but that leaves room for memory leaks and makes copying buffers difficult.

A container designed for real-time usage would be needed.
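
For illustration, a minimal sketch of such a container (hypothetical names; it allocates once up front and only resizes on an explicit call, never on the audio thread):

#include <cstddef>
#include <vector>

template<typename SampleType>
class frame_buffer {
public:
    frame_buffer(std::size_t frames, std::size_t channels)
        : channels_(channels), data_(frames * channels) {}//one allocation, up front

    //explicit resize only; never called from the real-time thread
    void resize(std::size_t frames) { data_.resize(frames * channels_); }

    SampleType* frame(std::size_t i) { return data_.data() + i * channels_; }
    std::size_t frames() const { return channels_ ? data_.size() / channels_ : 0; }

private:
    std::size_t channels_;
    std::vector<SampleType> data_;//contiguous sample storage
};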

I don't see how processing would need to be a different proposal; what is an audio library without a well-organized and efficient means of interacting with the audio data?

If processing does need to be a different proposal, I do believe that the two would need to be designed simultaneously so that they might easily interface with each other, because I feel that they would most likely be used in conjunction in many cases.

ron novy

Jun 3, 2016, 11:51:12 PM
to std-pr...@isocpp.org
Well, it needs to start somewhere.  I think having basic buffers using std::array or plain old arrays might do for now and get some basic audio recording and playback into C++.  It should be as simple as possible to get to the interfaces on the machine and make them work.  The data processing can build onto it with more complex primitives, templates and classes.  This way there is at least something semi-usable with code we have today while the rest of the building blocks are being worked on.  Just a thought though.


alexande...@gmail.com

Jun 4, 2016, 12:04:21 AM
to ISO C++ Standard - Future Proposals
I have drafted up a list of basic definitions of how audio could be represented, based on your comment here. I think, if we are to attempt to move forward, a consensus must be reached about the fundamental representation of audio data the library will take.

C++ std::audio library

theory and definitions:

    sample:
    a single sample of audio data.
    can be represented by a signed integer, float or double.
    must pass the following test to be a valid sample type:
        ((std::is_integral<T>::value && std::is_signed<T>::value) || std::is_floating_point<T>::value) == true
    or could be constrained by:
         std::is_arithmetic<T>::value==true
     if unsigned samples are allowed/desired

    frame:
    a collection of 0-N samples
    each sample represents an individual channel of audio data
    indexable as an array
    (should N be runtime or compile time?)

    buffer:
    a collection of 0-N frames
    indexable as an array
    (should N be runtime or compile time?)
    ideally maintains a contiguous piece of memory to house its frames

    device/interface:
    a device recognized by the system as being capable of reading and writing audio data
    can be polled for information regarding the capabilities of the device/interface


    audio callback:
    a function/lambda/functor registered with the library to be called once per buffer interval

    sampling rate: the number of samples per second of audio data

    frame interval: the length of a frame in seconds (1.0/sample_rate)
    buffer interval: the length of a buffer in seconds (frame_interval* buffer_size)
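
As a non-normative sketch, the definitions above could map onto code roughly like this (names hypothetical):

#include <cstddef>
#include <type_traits>
#include <vector>

template<typename T>
using is_sample = std::is_arithmetic<T>;//or the signed-only test above

template<typename T>
using frame = std::vector<T>;          //0-N samples, one per channel
template<typename T>
using buffer = std::vector<frame<T>>;  //0-N frames (a real version would keep samples contiguous)

constexpr double frame_interval(double sample_rate){ return 1.0 / sample_rate; }
constexpr double buffer_interval(double sample_rate, std::size_t buffer_size){
    return buffer_size / sample_rate;//frame_interval * buffer_size
}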

What should be added/removed/edited or clarified?



alexande...@gmail.com

Jun 4, 2016, 12:08:45 AM
to ISO C++ Standard - Future Proposals
I agree. How do you feel about the concept I proposed in my original post of having std audio stream classes similar to std::fstream? The std::astream/iastream/oastream would represent the default audio device on the system, as determined by some method of polling the system. The user could either choose to use them as-is or manually select a different device.

Ron

Jun 4, 2016, 2:28:01 AM
to ISO C++ Standard - Future Proposals, alexande...@gmail.com
Some thoughts...

'N' should be runtime for channels in a frame and for a frame buffer.  The buffer size might be negotiable with the device.

I like a lot of that code in the original post, but I don't like the comparison to iostreams.  It's really the '>>' operator that I don't like.  When you copy into your frame buffer, an '=' operator should be used, since anyone who sees an '=' operator knows it's a copy.  But really, you shouldn't need to move data around so much.  If you stream from input to 'buff', then process, and then move from 'buff' to output, you are moving a lot of data unnecessarily.

I think what should happen is this: the process chain is triggered when the output signals it is ready for data by calling the first linked process.  Starting at the output, each process kernel calls the previous one until it reaches the input, where the input device/file/etc. copies the data to the chain's frame buffer.  Each process then operates on that single buffer in turn until finally returning to the output, thus altering the buffer instead of moving data from place to place.

I like to think of the processing chain as starting from the output or destination.  When the output is ready for data it calls a linked process for information, that calls the next link, and so on until data begins to get processed.  So it is a bottom-up approach, where the input doesn't officially start until the output is ready.  I know it is the opposite of the way audio really flows through a system, but it makes things easier than moving data like a bucket brigade.  It can also allow easier parallel processing when there are multiple branches on a node.  So if you think of an audio mixer, where there are many inputs and a single output, the output calls many channels/nodes at once so they can all be processed in parallel and mixed together after each of the processes ends.
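
A toy sketch of that bottom-up pull (hypothetical names; a real chain would also carry timing and format information):

#include <functional>
#include <vector>

using frame_buffer = std::vector<float>;

struct node {
    node* upstream = nullptr;                  //the previous link in the chain
    std::function<void(frame_buffer&)> kernel; //in-place processing for this link

    //called by the downstream side when it is ready for data: recurse toward
    //the input first, then alter the one shared buffer in place on the way back
    void pull(frame_buffer& buff){
        if(upstream) upstream->pull(buff);
        if(kernel) kernel(buff);
    }
};

The output endpoint would call pull() on the last node once per buffer period, so the data is altered in place rather than streamed from stage to stage.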

Also, I'm not sure the word 'format' should be used to refer to files.  I know people like to say "file format", but then what format is the audio data that is in the file format?  It's just confusing, so I think 'container' is a better word for files.  That leaves the word 'format' open for use as a way of describing the data format of an audio frame buffer.  So an audio_format class would contain information on sample rate, channels, interleaving, channel routing and other information, and a container class would deal with files.
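
In code, that split might look as simple as this (hypothetical):

#include <cstddef>

struct audio_format {//describes the data in a frame buffer
    double sample_rate;
    std::size_t channels;
    bool interleaved;
    //plus channel routing/layout information
};

class container;//file-level concerns (WAV, FLAC, ...) would live here instead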

I personally would rather have all audio with more than 2 or 3 channels processed in Ambisonic B-format and then have the audio driver figure out how to extract the channels for each speaker, but I know this won't fly with everyone, so we would need a class for channel routing information as well.

All I can think of at the moment...

Andrey Semashev

Jun 4, 2016, 4:50:00 AM
to std-pr...@isocpp.org

On Saturday, 4 June 2016 11:43:17 MSK Thiago Macieira wrote:
> On sábado, 4 de junho de 2016 02:20:15 BRT Andrey Semashev wrote:
> > What is needed is a standardized interface for these different modules
> > to work with each other. The interface should also allow me, the
> > developer, to work with the media (e.g. create my own audio or image
> > filter or a new codec or a new device driver).
>
> That I agree with.
>
> But should the standard mandate that there should be at least one? How does
> someone write a plugin to the C++ Standard Library? Will we now mandate this
> kind of ability?
>
> If not, then will there be a requirement that the C++ library vendor provide
> it? If so, please think carefully how Apple should code libc++ to work on iOS.

I think there should not be a requirement to provide a single module. But there should be a standard way to enumerate the available devices and codecs. The standard should allow an implementation to have no hardware audio capability.


alexande...@gmail.com

Jun 4, 2016, 10:12:05 AM
to ISO C++ Standard - Future Proposals, alexande...@gmail.com
For the most part I agree with you. I think that using the << operator in the callback would be confusing and should be replaced with = to be more clear, and that the callback parameters should be changed to be of type frame_buffer to keep things simple. But I think that using the << operator for routing purposes may still be useful. Consider the situation where you would like to have multiple types of inputs/outputs or connect to multiple devices and need to specify which ones you are routing to and from. My intent there was to provide an interface that allowed you to specify that your program would take input from 0-N selected inputs and send output to 0-N selected outputs if desired.

I think using a pull model for determining order of operations could work in most cases, but it would be more complex if you desired a multi-out situation. The model that worked best in theory for me in that situation was to treat the signal flow as a graph and have each node in the graph keep track of its completion for the buffer period: the outputs would ask the nodes they depend on if they are done and, if so, take the result; in turn those nodes would ask the nodes or inputs they depend on about completion, until the graph reaches all input or non-dependent nodes. A friend of mine likened this to A* pathfinding without weighting certain paths. Each node waits on its dependencies, which eventually will be a device input or a file. This all of course assumes support of a graph model that allows for dynamic connection of I/O, which I find to be desirable.
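
Roughly, each node in such a graph could look like this (hypothetical sketch; a real version would need atomics or a scheduler to exploit the parallel branches):

#include <vector>

using frame_buffer = std::vector<float>;

struct graph_node {
    std::vector<graph_node*> dependencies;//the nodes this one waits on
    bool done = false;                    //completion flag for the current buffer period

    void evaluate(frame_buffer& out){
        for(auto* dep : dependencies)
            if(!dep->done) dep->evaluate(out);//walk back toward inputs/non-dependent nodes
        process(out);                         //combine and process the dependency results
        done = true;
    }
    void process(frame_buffer&){ /*dsp for this node*/ }
};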

Do you think that multiple I/O should be supported?

Do you think that there needs to be some way of routing between the devices and your program?

alexande...@gmail.com

Jun 4, 2016, 10:38:54 AM
to ISO C++ Standard - Future Proposals, alexande...@gmail.com
Is this example more agreeable?

//std::audio example 1 "single process"
void example_1(){
    double sample_rate = 44100;
    std::size_t frame_size = 2;
    std::size_t buffer_size = 128;

    std::audio_context<float> ctx{sample_rate,buffer_size,frame_size};//construct from values

    std::astream_process<float> proc(ctx,[](std::frame_buffer<float> const& input, std::frame_buffer<float>& output){
        for(std::size_t i = 0; i < input.size(); ++i){
            output[i] = input[i] * 1.0;
        }
    });

    proc.start();
    //do other stuff
    proc.stop();
}




alexande...@gmail.com

Jun 4, 2016, 10:41:22 AM
to ISO C++ Standard - Future Proposals, alexande...@gmail.com
Note that the input must be a const&, because in a parallel context it would allow multiple processes to read safely; this has the side effect of disabling the range-for syntax unless you do this:

//std::audio example 1 "single process"
void example_1(){
    double sample_rate = 44100;
    std::size_t frame_size = 2;
    std::size_t buffer_size = 128;

    std::audio_context<float> ctx{sample_rate,buffer_size,frame_size};//construct from values

    std::astream_process<float> proc(ctx,[](std::frame_buffer<float> const& input, std::frame_buffer<float>& output){
        output = input;//note the copy required
        for(auto&& i: output){
            i *= 1.0;
        }
    });

    proc.start();
    //do other stuff
    proc.stop();
}

alexande...@gmail.com

Jun 4, 2016, 11:01:54 AM
to ISO C++ Standard - Future Proposals, alexande...@gmail.com
Here is a proposed implementation of the definition of a sample:

#include <type_traits>

//if a sample must be signed when integral
template<typename T>
struct is_sample : public std::integral_constant<bool,
    std::is_floating_point<T>::value || (std::is_integral<T>::value && std::is_signed<T>::value)>{};

//otherwise, if unsigned samples are allowed
template<typename T>
using is_sample = std::is_arithmetic<T>;

In order for some type T to be valid as a sample, it must satisfy is_sample<T>.
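
Usage would then be a compile-time check; for example, against the signed-only variant above:

#include <cstdint>

static_assert(is_sample<float>::value, "floating point samples are valid");
static_assert(is_sample<std::int16_t>::value, "signed integer samples are valid");
static_assert(!is_sample<unsigned int>::value, "unsigned integers are rejected");
static_assert(!is_sample<char*>::value, "pointers are not samples");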



Tony V E

Jun 4, 2016, 12:15:31 PM
to ISO C++ Standard - Future Proposals, alexande...@gmail.com
But most of the time you don't want to copy the data. 

Whenever possible you want to process in place.

Sent from my BlackBerry portable Babbage Device
Sent: Saturday, June 4, 2016 10:41 AM
To: ISO C++ Standard - Future Proposals
Subject: Re: [std-proposals] Any interest in adding audio support to the std library?


alexande...@gmail.com

Jun 4, 2016, 12:39:50 PM
to ISO C++ Standard - Future Proposals, alexande...@gmail.com
I agree that in-place processing is desired, but an issue comes up when you need to preserve the data for some other process to use as well: you would either have to copy it beforehand and pass a copy to each process, or make the buffer a const& and force yourself to copy it and work in place on the output buffer.

Jim

Jun 4, 2016, 6:24:16 PM
to ISO C++ Standard - Future Proposals
Hi,

I think it would be more feasible and elegant to use a standard or *more* standard C++ media interface. Building on the existing design used for the C++ Boost networking APIs and the std::basic_ostream/std::basic_streambuf features, something like std::char_traits<sample> could be defined. Using an existing audio framework, or creating a less common C++ approach to handling media, might break the architectural flow of the C++ standard.

The same backend ideas used for a networking API interface hold true for an audio interface. What we read and write to/from an audio interface are samples, and the structure is the format settings. The same idea is used in networking: we can read and write a raw protocol or a more structured one like TCP packets. The audio format is like a protocol you set up between the hardware and software. The I/O features (async, sync, etc.) will to some extent be picked up automatically from the existing backend interface. Capturing concepts this way in a low-level implementation should make acceptance easier.
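
As a purely illustrative sketch of that direction (no such trait exists in the standard today), an audio analogue of std::char_traits might carry the per-sample semantics:

#include <cstdint>

template<typename Sample>
struct sample_traits;//hypothetical analogue of std::char_traits

template<>
struct sample_traits<std::int16_t> {
    using sample_type = std::int16_t;
    static constexpr sample_type silence(){ return 0; }
    static constexpr sample_type clamp(long v){
        return v > 32767 ? sample_type(32767)
             : v < -32768 ? sample_type(-32768)
             : sample_type(v);
    }
};

//a basic_audiobuf<Sample, sample_traits<Sample>> could then play the role
//that std::basic_streambuf plays for character I/O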


--Jim Smith

On Friday, June 3, 2016 at 11:43:42 AM UTC-4, Jeffrey Yasskin wrote:
> Is this based on an existing library? We're much more likely to adopt
> a proposal that's been used widely than one that was invented for the
> standard.

On Fri, Jun 3, 2016 at 2:37 AM,  <alexande...@gmail.com> wrote:
> I have drafted some ideas on how I think the c++ std library could support
> audio functionality.
>
> I know that audio functionality is a very operating specific problem, but
> with the recent trend towards implementing a file-system library and
> possibly a graphics library I believe that audio would not be too much of a
> reach anymore.
>
> Here are some of the ideas I have so far. I have both some code examples of
> the intended usage as well as a list of the types needed to implement the
> given examples.
>
> Please keep in mind my drafts are still very rough.
>
>
> CODE EXAMPLES
>
> //std::audio example 1 "single process"
> void example_1(){
>     double sample_rate = 44100;
>     std::size_t frame_size =2;
>     std::size_t buffer_size=128;
>
>     std::audio_context<float>
> ctx{sample_rate,buffer_size,frame_size};//contruct from values
>
>     std::astream_process<float> proc(ctx,[](std::iastream const& input,
> std::oastream& output){
>         std::frame_buffer<float>& buff = ctx.borrow_buffer();//borrow a
> buffer from the context for usage
>         //prevents the need for dynamic allocation of a temporary buffer
>         input>>buff;//stream data into buffer for manipulation
>         for(auto&& frame: buff){
>             frame=0.0;//do something with audio
>         }
>         output<<buff;//stream to output
>     });//dsp object
>     //uses implied routing equivilent to
>     //std::aout<<proc<<std::ain;
>     //
>
>     proc.start();
>     //do other stuff
>     proc.stop();
> }
>
> //std::audio example 2 "process group"
> void example_2(){
>
>     std::audio_context<float> ctx;//default context created with
> std::default_* values
>
>     //version 1: capture context via lambda
>     std::astream_process<float> proc1(ctx,[&ctx](std::iastream const& input,
> std::oastream& output){
>         std::frame_buffer<float>& buff = ctx.borrow_buffer();
>         input>>buff;
>         for(auto&& frame: buff){
>             frame*=0.5;
>         }
>         output<<buff;
>     });//dsp object
>
>     //version 2: have context passed as argument
>     std::astream_process<float> proc2(ctx,[](std::iastream const& input,
> std::oastream& output,std::audio_context<float> const& context){
>         std::frame_buffer<float>& buff = ctx.borrow_buffer();
>         input>>buff;
>         for(auto&& frame: buff){
>             frame*=2.0;
>         }
>         output<<buff;
>     });
>
>     std::process_group<float> pgroup;//a group of processes that will happen
> consecutivley
>     pgroup.push(proc1);//add to group
>     pgroup.push(proc2);//add to group
>
>     //configure stream relationships in terms of std::ain / std:aout
> manually
>     //std::ain/std::aout are std::astream globals that refer to the default
> audio inputs and outputs supplied by the context in use
>     //std::ain/std::aout will route the audio to the enpoint specified by
> the context reference held by the process that is streaming the data
>     std::aout<<proc1<<proc2<<std::ain;//method 1
>     //std::ain>>proc2>>proc1>>std::aout;//method 2
>
>     pgroup.start();
>     //do other stuff
>     pgroup.stop();
>
> }
>
>
> //std::audio example 3 "audio files"
> void example_3(){
>
>     std::audio_context<float> ctx;
>
>     std::astream_process<float> proc(ctx,[](std::iafstream const& input,
> std::oafstream& output){
>         std::frame_buffer<float>& buff = ctx.borrow_buffer();
>         input>>buff;
>         for(auto&& frame: buff){
>             frame=0.0;
>         }
>         output<<buff;
>     });//dsp object
>
>     std::iafstream audio_file1(ctx,"filename1.extension");//an audio file
> handle
>     std::oafstream audio_file2(ctx,"filename2.extension");//an audio file
> handle
>
>     //routing
>     audio_file2<<proc<<audio_file1;//take input from file nad write to file
>     //audio_file1>>proc>>audio_file2;//equivilent syntax
>     proc.start();
>     //do other stuff
>     proc.stop();
> }
>
>
> //std::audio example 4 "combination routing"
> void example_3(){
>
>     std::audio_context<float> ctx;
>     //manually select hardware endpoints
>     std::size_t device_id = ctx.default_device_id();
>     std::iastream input_device =
> ctx.get_device<std::input_device>(device_id);
>     std::oastream output_device =
> ctx.get_device<std::output_device>(device_id);
>
>     std::astream_process<float> proc(ctx,[](std::iastream const& input,
>                                             std::oastream& output,
>                                             std::iafstream const&
> input_file,
>                                              std::oafstream& output_file){
>         std::frame_buffer<float>& buff = ctx.borrow_buffer();
>         (input + input_file)>>buff;//add streams to perform sum before
> writing to buffer
>         //or you could use seperate buffers
>         //like this
>         /*
>             std::frame_buffer<float> buff1;
>             std::frame_buffer<float> buff2;
>
>             input>>buff1;
>             input_file>>buff2;
>             buff1+=buff2;//buffer arithmatic
>         */
>         output<<buff;//send the contents of buff to the hardware out and the
> file out
>         output_file<<buff;
>     });
>
>     std::iafstream audio_file1(ctx,"filename1.extension");//the actual files
> to be used above
>     std::oafstream audio_file2(ctx,"filename2.extension");
>
>     //connect the files to the process
>     //connect the hardware device to the process
>     audio_file2<<proc<<audio_file1;//take input from file
>     output_device<<proc<<input_device;//also take from hardware
>     proc.start();
>     //do other stuff
>     proc.stop();
> }
>
>
>
> REQUIRED LIBRARY MEMBERS
>
>
> namespace std{
>     inline namespace audio{
>         //working context for audio flow
>         template<typename>
>         class audio_context;
>         /*
>         *The context in which all audio data is centered.
>         *Contains: sampling rate, buffer size, frame size, etc...
>         *The values of ain,aout,afin,afout refer to the endpoints defined by
> the context, when applied to routing on a porocess tied to the context
>         *think of a context as the program level driver object
>         */
>
>         //audio streams (think like std::fstream and its friends)
>         class astream;//audio stream
>         class oastream;//output audio stream
>         class iastream;//input audio stream
>         class oafstream;//output audio file stream
>         class iafstream;//input audio file stream
>
>
>         //stream endpoints
>         class ain;//audio input endpoint
>         class aout;//audio output endpoint
>         class afin;//audio file input endpoint
>         class afout//audio file output endpoint
>
>         //stream processing
>         template<typename>
>         class astream_process;//a dsp process applied to a stream
>
>         template<typename>
>         class process_group;//a group of processes that will act as one
>
>         //containers
>         template<typename>
>         class frame_buffer;//a sequence container that is resizable at runtime, but only with explicit resize calls. contains frames(see below)
>         /*Implementation note on frame_buffer
>          *frame_buffer is intended to hold N number of frames which themselves can hold M number of samples
>          *meaning that the total size in samples of frame_buffer = N * M
>          *ideally frame_buffer's representation of its sample data will be contiguous in memory
>          *(a sketch of this layout follows the listing below)
>         */
>
>         template<typename>
>         class frame;//a container that holds samples, thin array wrapper
>
>
>         //hardware representation
>         class device;//an audio device as recognized by the OS
>         class input_device;//an input device
>         class output_device;//an output device
>
>         // audio file formats
>         enum class afformat{
>             raw,//raw headerless audio bytes, interpreted only by the settings of the context.
>             //best used for temporary storage within the life of a context
>             wav,
>             flac//etc...
>         };
>     }
> }
>
>
>
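A minimal sketch of the N-frames-by-M-samples layout that the frame_buffer implementation note above describes; the type and member names here are illustrative only, not the proposed API:

#include <cstddef>
#include <vector>

template <typename T>
class frame_buffer_sketch {
    std::vector<T> samples_;   // N * M samples, contiguous in memory
    std::size_t frame_size_;   // M samples per frame
public:
    frame_buffer_sketch(std::size_t frames, std::size_t frame_size)
        : samples_(frames * frame_size), frame_size_(frame_size) {}

    // resizable at runtime, but only via an explicit call, as the note requires
    void resize(std::size_t frames) { samples_.resize(frames * frame_size_); }

    std::size_t frames() const { return samples_.size() / frame_size_; }
    T* frame(std::size_t n) { return samples_.data() + n * frame_size_; }
};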

Robert Bielik

unread,
Jun 7, 2016, 4:59:50 AM6/7/16
to ISO C++ Standard - Future Proposals, alexande...@gmail.com
Dear all,

I've been programming audio for almost 20 years now, and just the other day I started thinking about writing a basic C++ audio API to meet my needs. So this is a very interesting topic to me indeed. I've written parts of the PortAudio library (WDM/KS WaveRT) and the WASAPI Exclusive mode port of JUCE, so hopefully I can be of some use :)

Some additional features I'd like to see, not sure if they've been addressed:
- "Hotplugging", i.e. the possibility to receive notifications when devices are inserted/removed, and handle this gracefully.
- Timestamp for callbacks, flags for input overflow and output underflow.

Regards
/Robert

Ross Bencina

unread,
Jun 9, 2016, 12:07:29 PM6/9/16
to ISO C++ Standard - Future Proposals, alexande...@gmail.com
Hello Everyone,

I've been involved with PortAudio since the beginning. PortAudio supports 10+ audio APIs. I didn't write all the implementations myself, but I've worked with the people who did. I use C++ almost daily and have been lurking in [sg14] recently. Although I'm not a modern C++ guru, I can offer some insights about the API and requirements. In particular things that PortAudio doesn't do (or doesn't do very well).

Here's a bit of a braindump:

- The streams (read(), write()) model is problematic if you want to do synchronous input/output (i.e. audio processing, or some forms of echo cancellation). The problem is that there is no explicit notion of time that can be used to synchronize input and output buffers. In PortAudio we offer both streams and callbacks. Most modern native APIs use callbacks or something equivalent. There are arguments for streams too; I don't remember them.

- Enumerating capabilities is a can of worms. There is a wide variety of capability management systems out there. Some of the challenges include: slow speed of accessing native device information, non-orthogonal formats of device information, non-observable constraints between device capabilities (e.g. this device supports 96k, but only in stereo; 44k 8 channel is ok though!).

- You may want to clarify who this API is targeted at. (Will it support Pro Audio?)

- A model of time seems to be lacking from the current proposal. There should be a way to correlate input/output samples with a system monotonic clock. This means you also need to be able to provide latency information.

- Periodicity of callbacks (or not) and fraction of total callback period that can be consumed should be documented.

- Concurrency issues need to be clearly documented. For example, it should be expressly prohibited to use blocking synchronisation primitives in an audio callback (a sketch of the non-blocking alternative follows this list). See e.g. http://www.rossbencina.com/code/real-time-audio-programming-101-time-waits-for-nothing

- There needs to be some mechanism to signal asynchronous failures (common for example on OS X where the audio subsystem can reconfigure itself while audio is playing back).

- Latency control (most APIs provide support for adjusting buffer sizes, and many also have different modes for low/high latency applications, e.g. bypass high-latency mixing).

- Robert already mentioned hot-plug. This is a must-have feature these days (although PortAudio doesn't yet support it).
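On the concurrency point above, a minimal sketch (not tied to any particular audio API) of publishing a parameter to an audio callback without blocking, using an atomic instead of a mutex:

#include <atomic>
#include <cstddef>

std::atomic<float> gain{1.0f};  // written from the UI/control thread

// the callback must never block, so it only performs a lock-free load
void audio_callback(float* out, std::size_t frames) {
    const float g = gain.load(std::memory_order_relaxed);
    for (std::size_t i = 0; i < frames; ++i)
        out[i] *= g;
}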

At a minimum I recommend reviewing the documentation for PortAudio, CoreAudio, WASAPI, ALSA and JACK to make sure that you have all of the domain elements.

A while ago I compiled some notes on buffering models in different audio APIs, it might be useful:

Unfortunately, right now I don't have time to participate much in this discussion due to deadlines. But if this proposal is serious and someone would like to send me drafts for review (to ro...@audiomulch.com), I'm happy to give feedback.

As an idea: you could consider building a demonstration implementation on top of PortAudio (either starting from our existing C++ binding, or from scratch).

Kind Regards,

Ross.

alexande...@gmail.com

unread,
Jun 9, 2016, 10:26:37 PM6/9/16
to ISO C++ Standard - Future Proposals, alexande...@gmail.com, ross.b...@gmail.com
Do you have any suggestions as to what an explicit notion of time should look like in this context?

Ron

unread,
Jun 9, 2016, 11:05:08 PM6/9/16
to ISO C++ Standard - Future Proposals, alexande...@gmail.com, ross.b...@gmail.com

On Thursday, June 9, 2016 at 7:26:37 PM UTC-7, alexande...@gmail.com wrote:
> Do you have any suggestions as to what an explicit notion of time should look like in this context?

As a suggestion, something I usually use is a start and an end for the buffer, plus the rate of the buffer. Basically a class that has a value equal to the start frame of the given buffer, a value equal to the end frame of the buffer (or the start of the next frame), and a value equal to the frame rate. So you can derive a real value for each, like start frame over frame rate and end frame over frame rate. From that you can get the seconds as a double or long double.

#include <cstdint>

class frame_ticks  // or some name that better describes this...
{
    uint64_t fstart;
    uint64_t fend;
    uint64_t frate;
    // ...
};

Allowing the start value to be set or reset on starting or stopping record/playback is another feature that some might need here.  And I don't recommend eliminating the rate portion from this class since it could lead to trouble where people are making assumptions they shouldn't, best to have them together.

Also, I would not add other data to this class since it could be re-used elsewhere as is.  Other properties of a buffer can be defined in another class/structure, but again this is just a suggestion...

alexande...@gmail.com

unread,
Jun 10, 2016, 12:17:17 AM6/10/16
to ISO C++ Standard - Future Proposals, alexande...@gmail.com, ross.b...@gmail.com
So essentially it is a class that provides the needed data to calculate the time in seconds a number of frames should take at a specified rate?

So:
uint64_t number_of_frames = fend - fstart;
double seconds_elapsed = static_cast<double>(number_of_frames) / frate; // cast to avoid integer division

Am I understanding that correctly?

alexande...@gmail.com

unread,
Jun 10, 2016, 12:21:25 AM6/10/16
to ISO C++ Standard - Future Proposals, alexande...@gmail.com, ross.b...@gmail.com
Or are we trying to define how to determine a specific timepoint that a specific frame corresponds to?

double start = static_cast<double>(fstart) / frate;  // timepoint corresponding to the start of the frame
double end = static_cast<double>(fend) / frate;

double elapsed = end - start;  // duration of frame?

alexande...@gmail.com

unread,
Jun 10, 2016, 12:30:02 AM6/10/16
to ISO C++ Standard - Future Proposals, alexande...@gmail.com, ross.b...@gmail.com
I suppose we could use the std::chrono library as a basis, and define a clock type that has a dynamically set epoch. This would allow a time-point to have a value relative to when the stream was started?

So maybe the introduction of:

std::audio::sample_clock;//a clock type that counts in frames(or samples)
/*
-has internal member of "rate" that is used for conversion
-keeps track of number of frames(or samples) elapsed
-can produce a time_point that is a real number of seconds from epoch.
-epoch can be reset
-clock will be synced with hardware??? if possible?? or good idea??
*/

std::chrono::time_point<std::audio::sample_clock> foo;//the current time since epoch in seconds






ron novy

unread,
Jun 10, 2016, 1:51:39 AM6/10/16
to std-pr...@isocpp.org
Yes, I believe that would work. ;)


ross.b...@gmail.com

unread,
Jun 10, 2016, 4:06:31 AM6/10/16
to ISO C++ Standard - Future Proposals, alexande...@gmail.com, ross.b...@gmail.com
> Do you have any suggestions as to what an explicit notion of time should look like in this context?

It needs to be a monotonic system clock.

And you need to be able to correlate it with your i/o buffers.

But you cannot assume (as I think you have) that i/o buffers represent a contiguous stream of samples, because there can be dropped buffers, and you need to represent this. (Btw, this is another example of how real-time audio is not like an iostream.)

Something I forgot earlier: each DAC or ADC may be running in a different clock domain. The CPU time is usually another clock domain. So you have multiple drifting clocks. One consequence of this is that you can't make simplifying assumptions that e.g. 44100 samples elapsed is exactly 1 second of system monotonic time. That's why you need timestamps associated with the buffers, so that the client can estimate the actual sample rate.
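To make that concrete, here is a minimal sketch (the struct and function are hypothetical, not part of the draft) of estimating a device's effective sample rate from two timestamped buffers:

#include <chrono>
#include <cstdint>

struct buffer_stamp {
    std::uint64_t frame_index;                      // index of the buffer's first frame
    std::chrono::steady_clock::time_point capture;  // when the buffer was captured
};

// frames elapsed divided by wall time elapsed gives the device's actual rate,
// which will drift around the nominal 44100/48000/etc.
double estimate_sample_rate(const buffer_stamp& a, const buffer_stamp& b) {
    const std::chrono::duration<double> dt = b.capture - a.capture;
    return static_cast<double>(b.frame_index - a.frame_index) / dt.count();
}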


ross.b...@gmail.com

unread,
Jun 10, 2016, 4:13:22 AM6/10/16
to ISO C++ Standard - Future Proposals, alexande...@gmail.com
> Do you have any suggestions as to what an explicit notion of time should look like in this context?

As a follow-up:

My default suggestion in all cases would be to consider what PortAudio does:

Here is a list of analysis documents that we produced when we redesigned the V19 API to cover a bunch of missing cases from the earlier API:

PortAudio doesn't necessarily cover all advanced use-cases. For that you'll need to consult other APIs.

Obviously representing system time with a double is not a great idea (we did it for portability and 32-bit compatibility). Using std::chrono in C++ makes sense.

Another thing that I forgot earlier: you need to consider that the client of your API may not be able to configure the sample rate of the device. Sometimes this is controlled by the client, but other times it's controlled by the device (e.g. the device is synced to another clock source, or its sample rate may be switched by a hardware switch on the device).

Ross.
 

Andrey Semashev

unread,
Jun 10, 2016, 6:34:52 AM6/10/16
to std-pr...@isocpp.org

On Friday, 10 June 2016 13:11:14 MSK alexande...@gmail.com wrote:

> I suppose we could use the std::chrono library as a basis, and define a
> clock type that has a dynamically set epoch. This would allow a time-point
> to have a value relative to when the stream was started?
>
> So maybe the introduction of:
>
> std::audio::sample_clock;//a clock type that counts in frames(or samples)
> /*
> -has internal member of "rate" that is used for conversion
> -keeps track of number of frames(or samples) elapsed
> -can produce a time_point that is a real number of seconds from epoch.
> -epoch can be reset
> -clock will be synced with hardware??? if possible?? or good idea??
> */
> std::chrono::time_point<std::audio::sample_clock> foo;//the current time since epoch in seconds

 

+1 for using chrono.

 

+1 for monotonic clock (as Ross suggested). At least, by default.

 

-1 for using an arbitrary epoch clock.

 

In my practice I found it useful to be able to synchronize multiple streams, which may not have started at the same time. Or in the same process at all.

 

I'm not sure there is much use in binding particular frames to the real world clock, at least not in audio processing domain. In video/image processing this could be useful, e.g. to present an image to the user at the given time. However, I feel it would still be useful to allow specifying a custom clock to the audio processing framework as well.

 

One use case I have in mind is providing a custom clock which is guaranteed to be equivalent to CLOCK_MONOTONIC on POSIX systems. Unlike std::chrono::steady_clock, this custom clock would be useful in interfacing with OS primitives like condition variables or events. Having such clock time points in audio frames would be useful.
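A sketch of such a clock (POSIX-only, and only one possible shape for it): it pins the reading to CLOCK_MONOTONIC, which std::chrono::steady_clock is not guaranteed to use:

#include <chrono>
#include <ctime>

struct posix_monotonic_clock {
    using duration   = std::chrono::nanoseconds;
    using rep        = duration::rep;
    using period     = duration::period;
    using time_point = std::chrono::time_point<posix_monotonic_clock>;
    static constexpr bool is_steady = true;

    static time_point now() noexcept {
        timespec ts{};
        clock_gettime(CLOCK_MONOTONIC, &ts);  // always a CLOCK_MONOTONIC reading
        return time_point(std::chrono::seconds(ts.tv_sec)
                          + std::chrono::nanoseconds(ts.tv_nsec));
    }
};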

 

One other thing that could be useful is a clock adaptor, which implements the usual clock interface, but provides time points in sample rate units. The actual time readings would be obtained from an underlying clock. Something like this:

 

#include <chrono>
#include <ratio>

template< typename BaseClock, unsigned int SampleRate >
class sample_rate_clock
{
public:
    typedef BaseClock base_clock;
    static constexpr unsigned int sample_rate = SampleRate;
    typedef std::ratio< 1, sample_rate > period;
    typedef typename base_clock::rep rep;
    typedef std::chrono::duration< rep, period > duration;
    typedef std::chrono::time_point< sample_rate_clock > time_point;
    // ...etc. - other clock members as usual,
    // imported from BaseClock as needed

    static time_point now()
    {
        return time_point(std::chrono::duration_cast< duration >(
            base_clock::now().time_since_epoch()));
    }
};
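For illustration, usage of the adaptor might look like this (44100 as a compile-time rate is an assumption; the runtime-rate problem comes up below):

// a 44.1 kHz sample-domain clock layered over steady_clock
using clock_44k1 = sample_rate_clock<std::chrono::steady_clock, 44100>;

auto t = clock_44k1::now();                  // "now" expressed in sample units
auto samples = t.time_since_epoch().count(); // samples since the base clock's epoch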

 

Or, on the second thought, this might be a generic tool, not related to audio processing at all...

 

Andrey Semashev

unread,
Jun 10, 2016, 6:37:26 AM6/10/16
to std-pr...@isocpp.org

On Friday, 10 June 2016 13:36:48 MSK alexande...@gmail.com wrote:

> Or are we trying to define how to determine a specific timepoint that a
> specific frame corresponds to?
>
> double start = static_cast<double>(fstart) / frate;  // timepoint corresponding to the start of the frame
> double end = static_cast<double>(fend) / frate;
>
> double elapsed = end - start;  // duration of frame?

 

Floating point types are definitely not the way to go for timestamps.

 

alexande...@gmail.com

unread,
Jun 10, 2016, 8:24:51 AM6/10/16
to std-pr...@isocpp.org
I like your thought on the clock idea, and +1 for the example. A few thoughts though; I would like to explain the distinction that I draw between an "arbitrary epoch" and the "dynamic epoch" I had proposed. It would not be arbitrary in the sense of being set to some random point; rather, it would be set to a significant time point in terms of the current instance of the program... It would be reset to the time when the stream was started. In PortAudio it would be set to the time when Pa_StartStream was called; having that as a reference point would allow you to manage the local time within that stream easily.

As a second note, the sample rate will most likely need to be a runtime parameter and cannot be a template parameter (unless we specify the need for a factory to create clocks of different sample rates), because we will probably not know the desired sample rate, or even the supported sampling rates, at compile time.

alexande...@gmail.com

unread,
Jun 10, 2016, 8:37:26 AM6/10/16
to std-pr...@isocpp.org
Or at least store a time point each time the stream is started using a clock with a well defined epoch, so that we might be able to find the length of time that the stream has been active and have meaningful measurements of time within the stream.

alexande...@gmail.com

unread,
Jun 10, 2016, 8:46:05 AM6/10/16
to ross.b...@gmail.com, ISO C++ Standard - Future Proposals
I see now what you mean.

Do the IO "driver" would take time stamps at the start and finish of each buffer and calculate the actual operating sample rate of each clock domain? That information would also be of use for  latency calculation because we could predict when the next buffer should come and then determine when it does and note the difference?

Andrey Semashev

unread,
Jun 11, 2016, 4:03:25 AM6/11/16
to std-pr...@isocpp.org

On Saturday, 11 June 2016 10:55:17 MSK alexande...@gmail.com wrote:

> I like your thought on the clock idea, and +1 for the example. A few
> thoughts though; I would like to explain the distinction that I draw
> between an "arbitrary epoch" and the "dynamic epoch" I had proposed. It
> would not be arbitrary in the sense of being set to some random point;
> rather, it would be set to a significant time point in terms of the
> current instance of the program... It would be reset to the time when the
> stream was started. In PortAudio it would be set to the time when
> Pa_StartStream was called; having that as a reference point would allow
> you to manage the local time within that stream easily.

 

If I understood you right, that's exactly what I'm arguing against.

 

> As a second note, the sample rate will most likely need to be a runtime
> parameter and cannot be a template parameter (unless we specify the need
> for a factory to create clocks of different sample rates), because we will
> probably not know the desired sample rate, or even the supported sampling
> rates, at compile time.

 

A runtime-set precision cannot be implemented with chrono. I would suggest just avoiding sample-rate-based clocks then and using timestamps in conventional units (ms, us, ns...).
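A minimal sketch of that approach, with the rate kept as a runtime value and frame counts converted to ordinary chrono durations:

#include <chrono>
#include <cstdint>

// convert a frame count at a runtime sample rate into a nanosecond duration;
// note the multiply overflows 64 bits only after ~1.8e10 frames (about 4.8
// days of audio at 44.1 kHz)
std::chrono::nanoseconds frames_to_duration(std::uint64_t frames,
                                            std::uint32_t sample_rate) {
    return std::chrono::nanoseconds(frames * 1000000000ull / sample_rate);
}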

 

Andrey Semashev

unread,
Jun 11, 2016, 4:09:54 AM6/11/16
to std-pr...@isocpp.org

On Saturday, 11 June 2016 11:03:41 MSK alexande...@gmail.com wrote:

> Or at least store a time point each time the stream is started using a
> clock with a well defined epoch, so that we might be able to find the
> length of time that the stream has been active and have meaningful
> measurements of time within the stream.

 

There is no need to have the epoch bound to the beginning of the stream to calculate its duration. Or any particular epoch at all. All you need is the timestamp of its first and last frames.

 

A fixed epoch becomes important when you try to synchronize multiple streams together. But even then it's not important what exactly the epoch is; what is important is that it is the same for all streams you are processing.

 

dgutson .

unread,
Jun 11, 2016, 11:00:04 PM6/11/16
to std-proposals

Sorry for top-posting, or if this has already been pointed out.
Is there any reason this hasn't been submitted to Boost to let it mature there for a few years?


Jeffrey Yasskin

unread,
Jun 12, 2016, 8:20:38 AM6/12/16
to std-pr...@isocpp.org

It doesn't have to go through boost, but I would like to see a public repository with a bunch of users.


alexande...@gmail.com

unread,
Jun 12, 2016, 5:09:44 PM6/12/16
to std-pr...@isocpp.org
In that case I would argue for the use of the "std::chrono::steady_clock" based on the fact that it is specified to be a monotonic clock. 

Or a clock defined in a similar manner that counts in samples.

I also would imagine that our clock would need to have "is_steady" always be true?

alexande...@gmail.com

unread,
Jun 12, 2016, 11:40:47 PM6/12/16
to ISO C++ Standard - Future Proposals, alexande...@gmail.com
How do you all feel about this definition for a clock?

#include <chrono>
#include <type_traits>

//define audio_clock as high_resolution_clock if high_resolution_clock::is_steady == true, else use steady_clock
//the conditional will result in a clock that is monotonic either way
//but can result in using a high_resolution_clock on certain platforms
using audio_clock = typename std::conditional<std::chrono::high_resolution_clock::is_steady,
                                              std::chrono::high_resolution_clock,
                                              std::chrono::steady_clock>::type;




Andrey Semashev

unread,
Jun 13, 2016, 4:24:38 AM6/13/16
to std-pr...@isocpp.org

As I suggested earlier, I think the clock should be specified by the user (probably as a template parameter for the audio frame).

ron novy

unread,
Jun 13, 2016, 5:32:04 AM6/13/16
to std-pr...@isocpp.org
Well, what if we just had both a sample clock and a high resolution clock together?  One could then use either or both at their own discretion.  And theoretically you could then calculate some approximation of clock jitter or drift between buffers/packets or separate record devices.  Or does that even make sense?


alexande...@gmail.com

unread,
Jun 13, 2016, 7:46:55 AM6/13/16
to std-pr...@isocpp.org
That could be a workable idea. The definition proposed earlier would be just for the high-resolution clock, though. That definition provides a type that will always be steady, and could be high resolution as well; the important part is steady. Please note that the standard says that high_resolution_clock can be an alias of steady_clock, system_clock, or its own third clock type. A sample clock would need to be custom built to count in samples as well as somehow sync with other clocks (hardware and software), and would probably not be as useful as I had originally thought. Having a steady high-resolution clock in seconds would be much more versatile than a sample clock.

alexande...@gmail.com

unread,
Jun 13, 2016, 7:48:49 AM6/13/16
to std-pr...@isocpp.org
If that should be the case, we would have to write a type trait that makes sure that the user-supplied clock is an acceptable clock source. I suggest using the built-in ones in the meantime, because the std makes certain guarantees about those clocks that ensure they meet our needs. Not to mention that a default would be needed even if we allowed user-supplied clocks.

Andrey Semashev

unread,
Jun 13, 2016, 8:05:32 AM6/13/16
to std-pr...@isocpp.org

On Monday, 13 June 2016 14:55:43 MSK alexande...@gmail.com wrote:

> If that should be the case, we would have to write a type trait that makes
> sure that the user-supplied clock is an acceptable clock source.

 

I'm not sure we need a trait for checking the clock for acceptance.

 

> I suggest using the built-in ones in the meantime, because the std makes
> certain guarantees about those clocks that ensure they meet our needs. Not
> to mention that a default would be needed even if we allowed user-supplied
> clocks.

 

The problem I have with std::chrono::steady_clock is that it's not guaranteed to be equivalent to CLOCK_MONOTONIC. Or any other POSIX clock type. It may work as a default clock type, should we decide one is needed, but IMO it should be customizable from the start.

 

alexande...@gmail.com

unread,
Jun 13, 2016, 11:52:40 AM6/13/16
to std-pr...@isocpp.org
It may not match any POSIX monotonic clock type, but it is specified to be monotonic as far as I remember (at least going off of http://en.cppreference.com/w/cpp/chrono/steady_clock). I will dig through the standard eventually to double-check. As long as is_steady is true, it will work for this context.

Robert Bielik

unread,
Jun 13, 2016, 12:48:31 PM6/13/16
to std-pr...@isocpp.org

Since, ultimately, the timing of audio frames could be used to synchronize audio from sources not sharing a clock, the resolution of the timestamp must be substantially finer than "number of samples". I propose what was mentioned earlier: a steady clock with nanosecond resolution.
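Spelled with existing std::chrono types, such a timestamp could be as simple as the following (the alias name is illustrative):

#include <chrono>

// a steady-clock time point carried at nanosecond resolution
using audio_timestamp =
    std::chrono::time_point<std::chrono::steady_clock, std::chrono::nanoseconds>;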

/R


alexande...@gmail.com

unread,
Jun 13, 2016, 1:06:42 PM6/13/16
to std-pr...@isocpp.org
That is why I am arguing for the definition of a clock I provided in the code example earlier. It will provide the highest-resolution steady clock that chrono offers. Otherwise we will need to create our own; but, at least for the meantime, I am trying to use what is already standardized.