|Intent to Implement: Media Stream Recording API||Rachel Blum||7/11/13 5:06 PM|
An API for media stream recording. Allows web applications to record encoded media streams.
getUserMedia makes raw media input available to web apps, but apps currently have no way to access the media streams after encoding. This has led to workarounds like recorder.js (https://github.com/mattdiamond/Recorderjs using web audio, https://github.com/jwagener/recorder.js/ for Flash-based recording) and weppy (movie recording as raw image sequences - http://antimatter15.github.io/weppy/demo.html) for video. The media files so produced are extremely bulky compared to compressed and encoded formats.
Specific applications that would benefit from this capability are audio recording apps, screen recording, video editing and compositing, and communications apps (such as Hangouts), which all currently require plugins or NaCl libraries to achieve acceptable performance.
Small to medium. One of the authors of the spec is Travis Leithead from Microsoft. Mozilla is currently actively working on this (http://lists.w3.org/Archives/Public/public-media-capture/2013Jul/0016.html , https://bugzilla.mozilla.org/show_bug.cgi?id=803414 ).
The spec is still a working draft; there are, for example, ongoing discussions about changing the API to use promises.
Ongoing technical constraints
Will this feature be supported on all five Blink platforms (Windows, Mac, Linux, Chrome OS and Android)?
The plan is to target all platforms where getUserMedia and WebRTC are available. This includes all of the above, some in experimental form.
OWP launch tracking bug?
Doesn’t exist for now. Will be created.
Row on feature dashboard?
Yes, needs to be created.
Requesting approval to ship?
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Darin Fisher||7/15/13 4:45 PM|
I haven't studied this too closely yet, but just some quick comments:
1- Note that we store blobs in the browser process. Is this API going to end up spamming the browser process with a lot of blobs?
2- For video processing, I can imagine wanting to route video frames to a worker thread to be manipulated. This probably involves the creation of a new video frame. I'd then want to send that video frame back to the main thread to be available as a media stream track. Is that a supported use case? How much attention will be paid to optimizing this use case? I'm worried blobs living in the browser could interfere with this use case.
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Rachel Blum||7/15/13 5:42 PM|
1. Yes, there will be a new blob created every 'timeslice' milliseconds if timeslice has been set. Otherwise, all data will be gathered into one large blob. So the blob generation rate is under the application's control.
2. Video processing is out of scope for MediaRecorder - it is strictly about recording/encoding. (Processing should be the domain of http://www.w3.org/TR/streamproc/)
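A minimal sketch of the arithmetic implied by point 1 above (the bitrate and timeslice numbers are illustrative, not from the spec):

```javascript
// Back-of-envelope arithmetic for blob production under record(timeslice):
// the app picks the timeslice, so it also picks the blob rate and blob size.
function blobBudget(timesliceMs, bitrateBps) {
  const blobsPerSecond = 1000 / timesliceMs;
  const bytesPerBlob = (bitrateBps / 8) * (timesliceMs / 1000);
  return { blobsPerSecond, bytesPerBlob };
}

// e.g. a 1 Mbps stream with one blob every 50 ms:
const b = blobBudget(50, 1_000_000);
// 20 blobs/s, 6250 bytes each
```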
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Harald Alvestrand||7/16/13 5:55 AM|
Speaking as the chair of the relevant W3C task force:
- yes, this will create blobs. The current thinking is that it will create a blob every <suitable unit of time>, so that the JS can (for instance) store them to a file, send them to a recording server, or otherwise get rid of them, so that one doesn't have to store the whole recorded video in memory.
- no, this is not an API for accessing frames out of a video. The purpose of this API is creating a recording. Applications that want to access frames out of a MediaStream should connect the MediaStream to a video tag, paint that into a canvas (which is already supported), and copy data from there.
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Darin Fisher||7/16/13 7:52 AM|
OK, sounds good.
Just please keep in mind that Blobs have a bit of extra cost in Chrome.
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Greg Billock||7/16/13 9:23 AM|
I'd expect the latency cost of the Blobs to be the biggest worry. I think we can likely manage that, but it is a good question. I'll do some investigation along those lines.
Do you think the API should use, for example, ArrayBuffer instead of Blob? Harald can likely point us to discussions of this nature if they've been had.
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Darin Fisher||7/16/13 9:31 AM|
I'm not arguing for ArrayBuffer. I just want to make sure people have considered the impact of blobs
being backed by data held by the browser.
Some notable implications:
1- If data is generated in the renderer process, then it must be uploaded to the browser process.
2- If data needs to be accessed by the renderer process, then it must be downloaded from the browser process.
3- Browser process memory usage will go up if a renderer process allocates a lot of blobs.
Thus, in cases where blob data is generated in the browser process and rarely read by renderer processes,
blobs, as implemented, are pretty good. #3 can mostly be addressed by storing blobs on disk, but there is
still a slight concern about bloat.
By the way, we store blobs in the browser process so that we can share URLs to them across processes.
It enables a blob: URL to be passed from a web page to a shared worker for instance, and then on to other
renderer processes, etc.
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Greg Billock||7/16/13 9:36 AM|
On Tuesday, July 16, 2013 9:31:32 AM UTC-7, Darin Fisher wrote:
Yes. I've emailed Michael asking about the impact of this. I think it's a good question. Blobs have semantics beyond just 'hold some data', so if the API wants to indicate that those powers are desirable, that's one thing. If it was just a convenient way for the authors to say 'hold this data', and no Blob-ish powers are meant to be used, then something like ArrayBuffer might be a better choice. I'll do some asking and report back.
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Harald Alvestrand||7/17/13 4:45 AM|
One question.... where should the process that actually creates the blobs live? Browser or renderer?
The origin of data is either the network (for remote streams), the camera/microphone (for live streams), or some other source like the screen, a file or a programmatic source. Many of these may live in the browser process, but I'm not sure which ones do.
If the Blob is recorded based on data in the browser process, and subsequently written to a File object that also lives in the browser process, the current Chrome implementation of Blob might just possibly be optimal for our purposes.
But I may be too optimistic....
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Darin Fisher||7/17/13 9:51 AM|
You hit on a key issue for sure. Any data that exists browser-side is already cheap to expose to a renderer as a blob ;-)
In the case of media processing, I'd imagine that decoding probably happens in a sandboxed process. The data is probably not in the browser process in that case.
It might help to explore this further with folks who know more details about how the media backend works.
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Ami Fischman||7/17/13 10:59 AM|
To be clear, the API under discussion is for exposing the result of /encoding/ a video stream, not /decoding/ or post-processing it.
While the source of the stream's frames is the browser process (for most scenarios), the encoder runs in the renderer for security/sandboxing reasons, and because it matches chrome's general process split model better. Someday soon the encoder may run in the GPU process when we have HW-accelerated encoding, but it doesn't seem likely that the encoder would ever run in the browser process.
Most/all output from the video encoder is destined for other processes (either for saving to disk or for sending on the network) so I suspect the right thing to do is to make the encoders emit their output to shared memory in the first place to minimize copying of encoded bits, and to allow cross-process access to them. Certainly this is the plan for HW-accelerated video encode, if only b/c that happens in the GPU process and we don't want to incur the extra copy to ship it to the renderer.
Darin: if Blobs can wrap shared-memory segments are you still concerned with "spamming the browser process with a lot of blobs"?
Is there guidance on what makes the difference between "spamming" and "prudent use"? :)
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Darin Fisher||7/17/13 11:07 AM|
On Wed, Jul 17, 2013 at 10:59 AM, Ami Fischman <fisc...@chromium.org> wrote:
I see. It is also possible for us to add complexity to the blob system to back blobs with promises to provide data from other sources. We avoided doing something like that originally to minimize complexity.
Using SHM can probably help a lot. We wouldn't need to map the SHM into the browser process for instance. We might want to think carefully about what it means for the blobs to be mutated by sandboxed processes after the browser has a handle to their data. There might be some security concerns there.
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Harald Alvestrand||7/17/13 11:37 AM|
Aren't blobs immutable?
That was an argument posed on the public-webrtc list in favour of using blobs rather than ArrayBuffers.
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Ami Fischman||7/17/13 12:22 PM|
Harald: Darin is talking about the possibility of a compromised renderer mutating the shared memory after handing a handle off to the browser (i.e. exploiting the mutability of the proposed implementation of Blob, ignoring the immutability of Blob's API).
Darin: agreed that this should be kept in mind during implementation. I suspect that the result will be that the browser never inspects the bits in these SHMs, only ever handing them off to other processes or sockets, and that all receivers of such data will treat it the same as encoded media data from the web - as untrusted potentially malicious bits.
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Darin Fisher||7/17/13 12:26 PM|
On Wed, Jul 17, 2013 at 12:22 PM, Ami Fischman <fisc...@chromium.org> wrote:
Yeah, hopefully it's a non-issue.
Note: the browser will need to map these blobs when writing them to files or sending them over the network, but that would just be a temporary thing. The "spamming" concern is about using up browser memory / address space.
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Greg Billock||7/21/13 11:53 PM|
Yes. This came up in the public-media-capture discussion -- if the intention is to send them out over the wire, the cost of transferring the bytes may end up not being that big a deal -- presuming they'd be transferred to the browser process anyhow to get copied to the network.
Would it make sense in the first pass to not use SHM and the worries that accompany it, and instead apply some rate limiting?
That may end up impairing the feature in ways that are platform-inconsistent, though (i.e. harder for high-res encodes on constrained platforms). But perhaps the API needs some back-pressure error handling to account for this kind of scenario... if the platform just can't handle whatever the app is trying to do, there are onerror and onwarning events in the API it can watch, but perhaps we need some more definition on what these kinds of errors/warnings will look like.
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Daniel Bratell||7/25/13 2:28 AM|
On 2013-07-16 18:23:19, Greg Billock <gbil...@chromium.org> wrote:
What is the lifespan of these objects? Blobs can be converted to URL strings through the createObjectURL method, and I wonder if that means blobs have to be kept around indefinitely? When blobs refer to files they are cheap, since all you need is a URL <-> file name mapping, but if they are going to refer to an in-memory representation of a media recording that can be MBs or GBs in size, then it matters quite a lot.
Also, if they are going to be stored in the browser process, it seems they could easily exhaust the browser process address space, which would be very bad for the whole browser (denial-of-service kind of).
I admit to not knowing all the details here, but memory usage (temporary peaks and long-term) needs to be considered.
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Darin Fisher||7/25/13 8:43 AM|
createObjectURL is a pretty unfortunate API. Yes, when you use it you are creating the potential for a memory leak. You have to call revokeObjectURL when you are done with the Blob URL. Otherwise, the Blob data is retained (browser-side) until the document is unloaded.
We can avoid some of the browser-side memory issues by storing Blob data in files or unmapped shared memory.
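A toy model of the retention semantics Darin describes -- a created URL pins its blob's data until revoked. This is an illustration of the leak mechanism only, not Chrome's implementation:

```javascript
// Toy registry mirroring the createObjectURL/revokeObjectURL contract:
// data handed to createObjectURL is retained "browser-side" until the
// URL is revoked (or the document unloads).
class BlobUrlRegistry {
  constructor() { this.store = new Map(); this.nextId = 0; }
  createObjectURL(data) {
    const url = `blob:example/${this.nextId++}`;
    this.store.set(url, data);            // data stays alive from here on
    return url;
  }
  revokeObjectURL(url) { this.store.delete(url); }
  retainedBytes() {
    let total = 0;
    for (const d of this.store.values()) total += d.length;
    return total;
  }
}

const reg = new BlobUrlRegistry();
const url = reg.createObjectURL(new Uint8Array(1024));
// Forgetting this call would keep the 1024 bytes alive until unload:
reg.revokeObjectURL(url);
```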
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Thibault Imbert||7/25/13 11:30 PM|
This is great news. I recently wanted to port an existing lib I wrote in AS3 a few years back, and it was brutal to have to rely on Web Audio for that. I shared some thoughts here.
A few more thoughts:
1. I am sorry if I missed it in the spec, but it seems the data is encoded to specific formats, which is very cool. But for best flexibility, why not also provide the raw PCM samples as an option for audio? That would, first, greatly simplify retrieving the stream (no Web Audio required), but also allow people to write the encoders they want using typed arrays if needed. Additionally, what if you want to access the samples to draw a simple spectrum? Again, having a very simple way to access the incoming raw samples would be very flexible.
2. If such raw samples were exposed, it would also be convenient to have both channels already interleaved (unlike Web Audio's getChannelData), or to make this optional, so that we don't have to store both channels and interleave them manually later.
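The manual interleaving step point 2 wants to avoid looks roughly like this (a sketch assuming two equal-length mono channels, as Web Audio's getChannelData returns them):

```javascript
// Interleave two mono Float32 channels into one L/R/L/R buffer --
// the step an app must do by hand today if it records via Web Audio.
function interleave(left, right) {
  const out = new Float32Array(left.length + right.length);
  for (let i = 0; i < left.length; i++) {
    out[2 * i] = left[i];       // left sample
    out[2 * i + 1] = right[i];  // right sample
  }
  return out;
}
```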
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Harald Alvestrand||7/25/13 11:39 PM|
On Fri, Jul 26, 2013 at 8:30 AM, Thibault Imbert <thibaul...@gmail.com> wrote:
What's raw about PCM?
If you want raw, audio/L16 is your friend.
(This is a good example of the wisdom of encoding to specific MIME types - all the questions like sample rate, number of bits, applied compression and so on either turn into parameters or "read the spec").
audio/L16; channels=2 provides that.
RFC 2586 is the registration of "raw audio data" as a MIME type.
There's also RFC 3190 for audio/L24 if you think 16 bits is too coarse.
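For concreteness, converting Web Audio's Float32 samples to the audio/L16 wire format Harald mentions might look like this sketch (16-bit signed linear PCM in network byte order, per RFC 2586):

```javascript
// Convert Float32 samples in [-1, 1] (Web Audio's native format) to
// audio/L16: 16-bit signed linear PCM, big-endian per RFC 2586.
function floatToL16(samples) {
  const buf = new ArrayBuffer(samples.length * 2);
  const view = new DataView(buf);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));  // clamp to [-1, 1]
    view.setInt16(2 * i, s < 0 ? s * 0x8000 : s * 0x7fff, false); // big-endian
  }
  return new Uint8Array(buf);
}
```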
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Thibault Imbert||7/26/13 12:04 AM|
Perfect then, thanks! For raw/PCM, thanks for the reminder about the terminology.
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Greg Billock||7/29/13 9:21 AM|
On Thu, Jul 25, 2013 at 2:28 AM, Daniel Bratell <bra...@opera.com> wrote:
Den 2013-07-16 18:23:19 skrev Greg Billock <gbil...@chromium.org>:
There are two use cases. One is that recording goes directly to a file, with the blob only being produced upon stream exhaustion or stop(). This shouldn't be a problem in terms of memory.
The other case is when the API produces intermediate results -- a blob every 50ms or something, say. (the record(timeslice) API). The assumption here is that the app will immediately do something with the intermediate blob (send it via the network, write it to disk) and then get rid of it. Obviously there's a lot more room for error there, and a lot more latency sensitivity to, for example, writing each intermediate blob to disk and then producing the handle.
This second use case is what I was thinking of in suggesting rate limiting, since as you say, there could be a lot of memory cost, especially in a low-memory device.
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Rachel Blum||7/29/13 2:17 PM|
If you read the blob spec creatively, I don't think there's anything in there that forbids expiration of blobs. In fact, the snapshot state seems to explicitly enable this kind of mechanism.
So, as the strange thought for the day: what if record(timeslice) required you to specify a sliding window size for blobs, so memory can be reserved (and failure be detected) at invocation, as opposed to some unspecified point later in history?
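The sliding-window idea could be sketched like this (the `windowSize` parameter and the expiry behavior are hypothetical -- nothing like this is in the current spec):

```javascript
// Sketch of a sliding window over recorded chunks: keep at most
// `windowSize` of them, expiring the oldest, so peak memory is fixed
// at invocation time rather than at some unspecified later point.
class ChunkWindow {
  constructor(windowSize) {
    this.windowSize = windowSize;
    this.chunks = [];
  }
  push(chunk) {
    this.chunks.push(chunk);
    if (this.chunks.length > this.windowSize) this.chunks.shift(); // expire oldest
  }
}

const w = new ChunkWindow(3);
[1, 2, 3, 4].forEach(c => w.push(c));
// w.chunks is now [2, 3, 4]
```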
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Greg Billock||8/2/13 9:14 AM|
Rachel and I have started a discussion on public-media-capture about using Stream for this API. It gets around some of the sharp edges of using Blob by using a piece of the File API that's a bit more suitable for, well, streamed results.
A complication is that Stream is under active discussion and hasn't fully stabilized at this point.
Under the covers, it looks like the internal structure we'll want to use is a mechanism where the encoded bits are appended to an internal buffer, which is then either piped to the JS API or to an underlying disk location that is then provided to the API as a Blob (Streams have a way to do that as well). I believe the Stream API could end up being a good match for a lot of the issues we've talked about -- it looks like it'll have the discard-on-read behavior that we need.
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Greg Billock||8/5/13 4:04 PM|
Thanks to everyone for comments about the implementation. I've forwarded the questions on to public-media-capture if you're interested to read more discussion there.
For next steps with Blink, what is your recommendation? We could move forward with the API as is (it'll be behind a flag and/or exposed only in dev for the time being). We could move forward with a variant of the API we think is more likely to stabilize. We could wait and see what the outcome is of the suggestion to leverage Stream to reuse the File API better.
My guess is that the bulk of the implementation work won't be that different regardless of approach -- my estimate is that it'll be in getting the plumbing from the encoders working correctly. We may learn important facts in that process that contribute to the development of the API. So my preference would be to start earlier with exploratory work, even if we're targeting an API signature we think is likely to change.
I think the concerns about memory exhaustion and SHM security and the like are important and I'm glad we discussed them before starting, and I'm guessing there'll be more questions that arise as we go. My reading of the discussion is that we don't anticipate any dealbreakers, however.
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Greg Billock||8/14/13 4:53 PM|
I've filed a bug for this:
Any other action needed here? WebRTC folks are currently reviewing the design doc. Any volunteers for Blink code reviews? Shouldn't be that much code.
On Mon, Aug 5, 2013 at 4:04 PM, Greg Billock <gbil...@chromium.org> wrote:
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||Rachel Blum||8/19/13 3:42 PM|
Just pinging to see if anybody volunteers for Blink code reviews of MediaRecorder - it seems there's mostly agreement that it's a good idea to implement and experiment?
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||kaust...@samsung.com||2/10/14 1:13 AM|
Hi, is anyone implementing this? I am interested in pushing the patches for MediaStreamRecorder implementation.
|Re: [blink-dev] Intent to Implement: Media Stream Recording API||anup.k...@gmail.com||4/29/14 6:09 AM|
Is there any update on this?