Sorry I was not clear last Tuesday :-) My Japanese is too rusty.
- even though the W3C MediaStream specifications have always indicated that one stream can have multiple tracks, for a very long time Firefox only supported a maximum of one video track per stream.
- For a very long time, the consumers (the <audio> and <video> elements) could only consume one track of a given type. Because of that, while the spec allows putting many tracks in a stream, in practice almost nobody does it, since no track but the first one of each media type would be directly audible/viewable.
- because of these assumptions, a lot of the APIs of the original stream object were eventually moved to the track object (e.g. stop())
- nowadays, all browsers support media streams with multiple tracks.
- the MediaStream is nowadays more a convenience container than anything else. It is the type of object produced by getUserMedia, and it is the type of object consumed by a peer connection, <audio> and <video> (and some more: Web Audio, ...).
- if you want to handle the media (mute, stop, ...) you need to manipulate tracks. For example, if you want to add an audio track to a screensharing video track, you sometimes need to make two capture calls (one for the screensharing, one for the audio), and move the tracks from one stream to the other, before you attach the result to the peer connection.
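The two-capture pattern above can be sketched as follows (an illustrative sketch, not from the original message: `shareScreenWithMic` and its `pc` parameter are hypothetical names, `pc` is assumed to be an already-created RTCPeerConnection, and getDisplayMedia is used for the screensharing capture):

```javascript
// Sketch: combine a screensharing video track with a microphone audio
// track into one MediaStream, then hand every track to a peer connection.
// Assumes a browser context; `pc` is an existing RTCPeerConnection.
async function shareScreenWithMic(pc) {
  // one capture call for the screen (video only)...
  const screen = await navigator.mediaDevices.getDisplayMedia({ video: true });
  // ...and a second one for the microphone (audio only)
  const mic = await navigator.mediaDevices.getUserMedia({ audio: true });

  // move the audio track into a stream alongside the screensharing track
  const combined = new MediaStream([
    ...screen.getVideoTracks(),
    ...mic.getAudioTracks(),
  ]);

  // attach the result to the peer connection, track by track
  for (const track of combined.getTracks()) {
    pc.addTrack(track, combined);
  }
  return combined;
}
```

Note that addTrack works per track, which is exactly the "manipulate tracks, not streams" point above; the stream is only passed along so the remote side can group the tracks.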
so according to the MediaStream spec, all three options you proposed are possible, but in practice not all of them are appealing. The real problem is how to handle the corresponding signaling.
THE SIGNALING (a.k.a. JSEP, SDP and SDP O/A)
- the biggest question for a long time was: if I have more than one stream in a peer connection, how do I let the remote side know?
- the first proposal was Plan B, which AFAIK is still in use by Chrome (here)
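For what it's worth, Chrome exposed the choice of SDP dialect through a non-standard `sdpSemantics` constructor option during the transition period (a hedged sketch; this option was Chrome-specific and later Chrome versions are Unified Plan only):

```javascript
// Sketch (assumption: a Chrome build from the Plan B era): pick which
// SDP dialect the peer connection uses to signal multiple streams/tracks.
// 'plan-b' groups same-type tracks under one m-line; 'unified-plan'
// (the standardized behavior) gives each track its own m-line.
function makePeerConnection(useUnifiedPlan) {
  return new RTCPeerConnection({
    sdpSemantics: useUnifiedPlan ? "unified-plan" : "plan-b",
  });
}
```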
Hope this helps.
alex.