Re: Starting SIPCC in Firefox for multiple tabs

Randell Jesup

Apr 16, 2012, 11:29:51 PM

to emannion, Eric Rescorla, Ethan Hugg (ehugg), dev-...@lists.mozilla.org, Maire Reavy, Anant Narayanan, Suhas Nandakumar (snandaku)

On 4/16/2012 6:10 PM, emannion wrote:
>
> Just wondering how we can get around multiple Firefox tabs with
> multiple PeerConnection objects and maybe multiple PeerConnection
> objects per tab each calling SIPCC's upcoming JSEP interface.
>
> We could start SIPCC once on the first PeerConnection via some kind
> of proxy, then refcount all new PeerConnections and terminate SIPCC
> on the last reference release. Each PC would then initiate its own SDP
> session and have its own SDP that it can update or regenerate. As
> SIPCC can currently handle 51 calls, this would equate to a current
> limit of 51 concurrent PeerConnection SDP sessions.

A singleton Signaling object would make sense, especially if
instantiating sipcc takes a lot of time or memory or other resources.
The biggest downside is that if there's a problem with it, it could
affect all calls using it, unless there's some way to recover the state.

On the other hand, if the per-instance resources aren't large and it
doesn't take much to start - then a singleton wouldn't make sense,
because of the extra code to write (and test!) and the extra tying of
fates together of the different PeerConnections. Also, the current
limit of 51 concurrent 'calls' could be an issue (though for most
reasonable use-cases 51 active PeerConnections is plenty, but there
might be some "interesting" ways to use PeerConnection that would lead
to a large number of inactive-but-instantiated sessions).
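
The refcounting proposal quoted above could look roughly like the following. This is a toy Python model rather than SIPCC or Firefox code: the class name, `acquire()`, and `release()` are hypothetical, and only the limit of 51 comes from the thread. It starts the engine on the first reference, stops it on the last, and gives each PeerConnection its own session state.

```python
class SharedSignalingEngine:
    """Toy stand-in for one process-wide signaling engine (hypothetical API)."""

    MAX_SESSIONS = 51  # call limit quoted in the thread

    def __init__(self):
        self.running = False
        self.start_count = 0   # how many times the engine has been started
        self._refs = 0         # live PeerConnection references
        self._next_id = 0
        self.sessions = {}     # session id -> per-PeerConnection SDP state

    def acquire(self):
        """Called when a PeerConnection is created; returns its session id."""
        if self._refs >= self.MAX_SESSIONS:
            raise RuntimeError("session limit reached")
        if self._refs == 0:
            self.running = True        # first reference starts the engine
            self.start_count += 1
        self._refs += 1
        self._next_id += 1
        self.sessions[self._next_id] = {"sdp_version": 0}
        return self._next_id

    def release(self, session_id):
        """Called when a PeerConnection is closed."""
        del self.sessions[session_id]
        self._refs -= 1
        if self._refs == 0:
            self.running = False       # last reference stops the engine
```

The sketch is not thread-safe and has no recovery path, so the shared-fate concern is visible directly: anything that corrupts `sessions` affects every PeerConnection holding a reference.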

> Any views on this would be very welcome. Currently I am developing
> the SIPCC JSEP interface with only one PC in mind and that one PC is
> instantiating and stopping SIPCC. I will give an update on this work
> tomorrow.

It makes sense to look at this first, and not optimize prematurely.

>
> Please forward if this is worthy of dev-media.

Consider it done.

Randell Jesup

Enda Mannion

Apr 17, 2012, 5:58:10 PM

to Randell Jesup, dev-...@lists.mozilla.org

Thanks for your feedback. I think running one instance of SIPCC as the
SDP engine for multiple PeerConnections is the best way to go, the main
reasons being performance and resources. SIPCC is currently very
stable at handling multiple calls, but we know less about its stability
when multiple instances are run to handle multiple PeerConnections.
Also, running multiple SIPCCs will use more message
queues and threads than needed. There is also a lot more SIPCC
startup and shutdown CPU usage when running N instances.

Right now SIPCC seems to be lending itself quite nicely to adding
CreateOffer and CreateAnswer APIs. I have been able to add a new
event called CreateOffer to SIPCC that is dedicated to what the name
states: creating an offer. This is a better solution than hijacking
the MakeCall functionality as I was doing. CreateOffer is nearly
complete and CreateAnswer is on the way, for one PeerConnection only,
with some more IPC tidy-up still to do. I have also
worked out an interface, based on pimpl, that will be used by the
PeerConnection backend; I will describe it in detail soon.
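
To illustrate the idea of one engine with one message queue serving several PeerConnections through dedicated events, here is a minimal Python sketch. It is not the SIPCC interface: the event names, `post()`, `run_until_idle()`, and the placeholder result strings are all assumptions made for illustration.

```python
from collections import deque


class SignalingEngine:
    """One engine and one queue for many PeerConnections (names hypothetical)."""

    def __init__(self):
        self.queue = deque()
        # Dedicated events instead of reusing a generic "make call" path.
        self.handlers = {
            "CREATE_OFFER": self._on_create_offer,
            "CREATE_ANSWER": self._on_create_answer,
        }

    def post(self, event, pc_id, payload=None):
        """Queue an event on behalf of the PeerConnection identified by pc_id."""
        self.queue.append((event, pc_id, payload))

    def run_until_idle(self):
        """Drain the queue in FIFO order and return the handler results."""
        results = []
        while self.queue:
            event, pc_id, payload = self.queue.popleft()
            results.append(self.handlers[event](pc_id, payload))
        return results

    def _on_create_offer(self, pc_id, _payload):
        # Placeholder text; a real engine would build a full session description.
        return (pc_id, "offer", f"placeholder-offer-for-pc-{pc_id}")

    def _on_create_answer(self, pc_id, _remote_offer):
        return (pc_id, "answer", f"placeholder-answer-for-pc-{pc_id}")
```

With N separate engines there would be N such queues (and N threads draining them); here every PeerConnection posts to the same one.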

Enda


Eric Rescorla

Apr 17, 2012, 6:41:06 PM

to Enda Mannion, dev-...@lists.mozilla.org, Randell Jesup

On Tue, Apr 17, 2012 at 2:58 PM, Enda Mannion <eman...@gmail.com> wrote:
> Thanks for your feedback. I think running one instance of SIPCC as the
> SDP engine for multiple PeerConnections is the best way to go, main
> reasons being performance and resources. SIPCC is currently very
> stable at handling multiple calls but we know less about its stability
> if it is run as multiple instances to handle multiple PeerConnections.
> Also running multiple SIPCCs will unnecessarily use more message
> queues and threads than needed. There is also a lot more SIPCC
> startup and shutdown CPU usage when running N instances.

I'm extremely concerned about this proposal. The security model clearly
indicates that PCs from different origins need to be isolated, and I'm
not convinced that that happens properly when you have one shared
SIPCC. What security analysis have you done to indicate that this is
OK?

-Ekr

Randell Jesup

Apr 17, 2012, 8:32:39 PM

to Eric Rescorla, dev-...@lists.mozilla.org, Enda Mannion

Security is a concern; I forget if I raised it in my last message, but
it was on my mind (if for no other reason than one PC could access the
shared Signaling instance, and if that PC can lock up or otherwise mung
the instance it would take down all current calls and probably require a
restart of FF to fix).

I'm a bit less worried about state from one call escaping to another
call, though it's certainly possible. Obviously sipcc needs to be
secure whether or not it's a singleton; singleton just means that the
thread/object has more data from other calls easily available to it.

One interesting idea would be to run the signaling singleton in a
separate process, much as we do for plugins. This would provide far
better firewalling, at the cost of some performance when doing call
signaling. Hmmm. And there still would be cross-call vulnerabilities
to worry about, but overall it might be a safer design.

Please excuse the off-the-top-of-my-head consideration here...

Randell Jesup

Timothy B. Terriberry

Apr 18, 2012, 8:40:15 AM

to dev-...@lists.mozilla.org

Randell Jesup wrote:
>> I'm extremely concerned about this proposal. The security model clearly
>> indicates that PCs from different origins need to be isolated, and I'm
>> not convinced that that happens properly when you have one shared
>> SIPCC. What security analysis have you done to indicate that this is
>> OK?
>
> Security is a concern; I forget if I raised it in my last message, but
> it was on my mind (if for no other reason than one PC could access the
> shared Signaling instance, and if that PC can lock up or otherwise mung
> the instance it would take all current calls and probably require a
> restart of FF to fix).

Not to minimize the need for security analysis, but it seems obvious to
me that we can't _completely_ isolate the state of multiple calls,
because we have finite resources that have to be provisioned between them.

Randell Jesup

Apr 18, 2012, 9:52:21 AM

to dev-...@lists.mozilla.org

Sure: shared browser state, shared address space, etc, etc.
Signaling/SIPCC has the potential (unlike most things) to be run in a
separate process, which might help mitigate/contain any security
breaches in that code (and, as with plugins, limit a failure to affecting
other active calls rather than taking down the browser). Whether that's worth
the additional work required is another question.

There's no easy way that I can think of to move the main webrtc code to
another process. It may be doable in theory, but it would be a
significant amount of work and risk in itself. (I'll think about it
some more, though, just in case.) A big issue is the amount of data
traffic back and forth for the MediaStreams coming in and going out,
though those would likely be via shared memory.

Randell Jesup

Timothy B. Terriberry

Apr 18, 2012, 10:24:20 AM

to dev-...@lists.mozilla.org

Randell Jesup wrote:
> Sure: shared browser state, shared address space, etc, etc.
> Signaling/SIPCC has the potential (unlike most things) to be run in a
> separate process, which might help mitigate/contain any security
> breaches in that code (and like plugins cause a failure to only affect
> other active calls, not take down the browser). Whether that's worth the
> additional work required is another question.

I wasn't addressing the "move sipcc out of the main process" question
but the "multiple instances of sipcc vs. one shared instance" question.
The interactions are a little more direct than "shared address space",
in that (as we start adding things like hardware encoder support) the
SDP generated by one CreateOffer() call will depend on the resources
reserved for other calls. I suspect trying to pretend those interactions
don't exist will cause more problems than it solves in the long run.
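
A small Python sketch of that interaction, under stated assumptions: `EncoderPool`, `create_offer()`, and the rule that H.264 is offered only when a hardware slot is free are all hypothetical, chosen only to show how one call's offer can depend on what other calls have reserved.

```python
class EncoderPool:
    """Finite pool of hardware encoder slots shared by every call (hypothetical)."""

    def __init__(self, hw_slots):
        self.free_hw_slots = hw_slots

    def reserve(self):
        """Return 'hw' if a hardware slot was reserved, else 'sw' as a fallback."""
        if self.free_hw_slots > 0:
            self.free_hw_slots -= 1
            return "hw"
        return "sw"

    def release(self, kind):
        """Give a hardware slot back; software encoders need no bookkeeping."""
        if kind == "hw":
            self.free_hw_slots += 1


def create_offer(pool):
    """Build a toy offer whose codec list depends on what is still available."""
    kind = pool.reserve()
    # Illustrative assumption: H.264 is offered only with a hardware encoder.
    codecs = ["H264", "VP8"] if kind == "hw" else ["VP8"]
    return {"encoder": kind, "codecs": codecs}
```

With a pool of one slot, the first `create_offer()` lists both codecs and the second lists only VP8, even though the two calls otherwise know nothing about each other.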

snandaku

Apr 18, 2012, 10:24:34 AM

to Randell Jesup, dev-...@lists.mozilla.org

On 4/18/12 6:52 AM, "Randell Jesup" <rje...@mozilla.com> wrote:

> On 4/18/2012 8:40 AM, Timothy B. Terriberry wrote:
>> Randell Jesup wrote:
>>>> I'm extremely concerned about this proposal. The security model clearly
>>>> indicates that PCs from different origins need to be isolated, and I'm
>>>> not convinced that that happens properly when you have one shared
>>>> SIPCC. What security analysis have you done to indicate that this is
>>>> OK?
>>>
>>> Security is a concern; I forget if I raised it in my last message, but
>>> it was on my mind (if for no other reason than one PC could access the
>>> shared Signaling instance, and if that PC can lock up or otherwise mung
>>> the instance it would take all current calls and probably require a
>>> restart of FF to fix).
>>
>> Not to minimize the need for security analysis, but it seems obvious
>> to me that we can't _completely_ isolate the state of multiple calls,
>> because we have finite resources that have to be provisioned between
>> them.
>

> Sure: shared browser state, shared address space, etc, etc.
> Signaling/SIPCC has the potential (unlike most things) to be run in a
> separate process, which might help mitigate/contain any security
> breaches in that code (and like plugins cause a failure to only affect
> other active calls, not take down the browser). Whether that's worth
> the additional work required is another question.
>

> There's no easy way that I can think of to move the main webrtc code to
> another process. It may be doable in theory, but it would be a
> significant amount of work and risk in itself. (I'll think about it
> some more, though, just in case.) A big issue is the amount of data
> traffic back and forth for the MediaStreams coming in and going out,
> though those would likely be via shared memory.

One possible approach is the way webrtc and signaling work in
Chrome today: run both in the browser process and transfer
signaling and frames (local and remote) to the other process that does the DOM
rendering. Use an IPC channel for signaling-like data, and sync-socket and
shared-memory-map based approaches for time-critical data such as audio
frames and video buffers.

Ethan Hugg

Apr 18, 2012, 11:18:13 AM

to dev-...@lists.mozilla.org

>>>
>>> Not to minimize the need for security analysis, but it seems obvious
>>> to me that we can't _completely_ isolate the state of multiple calls,
>>> because we have finite resources that have to be provisioned between
>>> them.

I'm still a bit confused about how multiple tabs with multiple calls
would work from a Firefox user's perspective.

If I were on a call with Alice on one tab and opened a new tab and
initiated a call to Bob, what do you expect to have happen?

1. Both calls continue with input mic/camera split. Alice and Bob
both hearing/seeing me, but not each other. When I hit hangup on
Bob's call do I need to remember that Alice can still hear/see me?
2. Automatic three-way conference. Potential for funny user mistakes here.
3. Put Alice on hold and talk to Bob; hanging up on Bob takes Alice
off hold. Don't forget about the background call to Alice or you'll
have an open mic/camera and not know it.
4. Disconnect call with Alice. May not be obvious this will happen
if this were an incoming call from Bob.
5. Disallow new tab to make new call to Bob because of existing call
with Alice.
6. Allow the JavaScript programmer to choose any of these, which would
bring in even more security concerns.

Has this already been discussed at Mozilla?

-EH

Eric Rescorla

Apr 18, 2012, 11:47:39 AM

to Ethan Hugg, dev-...@lists.mozilla.org

On Wed, Apr 18, 2012 at 8:18 AM, Ethan Hugg <etha...@gmail.com> wrote:
>>>>
>>>> Not to minimize the need for security analysis, but it seems obvious
>>>> to me that we can't _completely_ isolate the state of multiple calls,
>>>> because we have finite resources that have to be provisioned between
>>>> them.
>
> I'm still a bit confused about how multiple tabs with multiple calls
> would work from a Firefox user's perspective.
>
> If I were on a call with Alice on one tab and opened a new tab and
> initiated a call to Bob, what do you expect to have happen?

I'm assuming these are not on the same origin.

> 1. Both calls continue with input mic/camera split. Alice and Bob
> both hearing/seeing me, but not each other. When I hit hangup on
> Bob's call do I need to remember that Alice can still hear/see me?
> 2. Automatic three-way conference. Potential for funny user mistakes here.
> 3. Put Alice on hold and talk to Bob, hanging up Bob, takes Alice
> off hold. Don't forget about the background call to Alice or you'll
> have an open mic/camera and not know it.
> 4. Disconnect call with Alice. May not be obvious this will happen
> if this were an incoming call from Bob.
> 5. Disallow new tab to make new call to Bob because of existing call
> with Alice.
> 6. Allow JavaScript programmer to choose any of these which would
> bring in even more security concerns.

IMO it's precisely this kind of question that arises when you try to think
of multiple PeerConnections as somehow being handles to the same
softphone, and that I (and I think the Web model in general) wish
to avoid.

Rather, here's how I think about it:

1. Each PeerConnection exists totally independently.
2. The browser has a bunch of I/O resources (cameras,
microphones, displays, etc.).
3. The JS can attach to these I/O resources, but some of
them cannot be shared (this is the only place where
different pages interact.) So, it might or might not be
possible for two pieces of JS to share the same camera.
4. A piece of JS can therefore set up a PeerConnection
and connect it to the other side, with it being plumbed
to any resources that the JS can acquire.

So, what are the implications of this:
1. Any piece of JS can create a new PeerConnection with DataChannel
only, no matter how many already exist, including multiples in the same
tab.

2. Any piece of JS can create a new PeerConnection with
incoming A/V only, no matter how many already exist, including multiples
in the same tab.

3. Any piece of JS can acquire a handle to a camera and/or microphone
as long as at least one of those devices is free and use it to make a
call.

What I think remains to be determined is whether one can have shared
access to a microphone or camera. But that's a question for getUserMedia(),
not for PeerConnection.
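
The model above can be sketched in Python as follows. This is an illustration only, not browser code: `DeviceRegistry`, `DeviceBusy`, and the constructor arguments are hypothetical, and exclusive device access is an assumption, since whether devices can be shared is exactly the open question.

```python
class DeviceBusy(Exception):
    """Raised when no free device of the requested kind exists."""


class DeviceRegistry:
    """Browser-wide capture devices: the only state pages share in this model."""

    def __init__(self, devices):
        self.owner = {name: None for name in devices}

    def get_user_media(self, page, kind):
        """Grant the first free device of this kind (assumed exclusive)."""
        for name, holder in self.owner.items():
            if name.startswith(kind) and holder is None:
                self.owner[name] = page
                return name
        raise DeviceBusy(kind)


class PeerConnection:
    """Each instance is independent and holds no reference to any other."""

    def __init__(self, page, local_device=None, data_channel=False):
        self.page = page
        self.local_device = local_device   # None means receive-only or data-only
        self.data_channel = data_channel
```

In this sketch a second page that cannot get the camera can still create any number of DataChannel-only or receive-only PeerConnections, because the contention lives entirely in `get_user_media()`.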

Note that I'm not suggesting that a given calling site must allow you to
have multiple concurrent calls. For instance, GoogleTalk will not do so,
no matter how many tabs you open. However, that's something that's
implemented in JS, not in C++ and is at the discretion of the site.

-Ekr

Randell Jesup

Apr 18, 2012, 5:27:40 PM

to dev-...@lists.mozilla.org

Agreed.

> 2. The browser has a bunch of I/O resources (cameras,
> microphones, displays, etc.).
> 3. The JS can attach to these I/O resources, but some of
> them cannot be shared (this is the only place where
> different pages interact.) So, it might or might not be
> possible for two pieces of JS to share the same camera.

Right - within the same JS object (not multiple tabs), it could clone
the MediaStream from the camera and send the data to multiple
PeerConnections (i.e. mesh conference).
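
As a rough illustration of that fan-out, here is a toy Python model. The `MediaStream` and `PeerConnection` classes below are simplified stand-ins with hypothetical methods (`attach()`, `push()`, `receive()`), not the browser objects; only the idea of cloning one camera stream per PeerConnection comes from the paragraph above.

```python
class MediaStream:
    """Toy stream that forwards each frame to its sinks and to its clones."""

    def __init__(self, label):
        self.label = label
        self._sinks = []
        self._clones = []

    def clone(self):
        """Return a new stream that receives every frame pushed to this one."""
        copy = MediaStream(self.label)
        self._clones.append(copy)
        return copy

    def attach(self, sink):
        self._sinks.append(sink)

    def push(self, frame):
        for sink in self._sinks:
            sink.receive(frame)
        for copy in self._clones:
            copy.push(frame)


class PeerConnection:
    """Toy connection that records the frames it would send to its remote."""

    def __init__(self, remote):
        self.remote = remote
        self.sent = []

    def add_stream(self, stream):
        stream.attach(self)

    def receive(self, frame):
        self.sent.append(frame)


def start_mesh(camera, remotes):
    """Create one PeerConnection per remote, each fed by its own clone."""
    connections = []
    for remote in remotes:
        pc = PeerConnection(remote)
        pc.add_stream(camera.clone())
        connections.append(pc)
    return connections
```

All of this happens inside one page, so no cross-tab device arbitration is involved.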

If the device wanted is held by another tab, getUserMedia() and the user
interface could intervene to allow stealing of the camera/mic (and
muting of the other call); see #3 above. This could be invisible to the
other app (muting of the camera is outside of the JS's control anyways -
the user must always be able to mute camera/mic without going through
the JS app) - it would just return a "muted" image/silence instead of
real camera data.

This use case is actually important - if you're on a call with someone
using tab A on Service 1, and a call comes in on tab B via Service 2,
what happens? (You're chatting on the interstellar communicator with
James Kirk of the Enterprise, when a call comes in on WebRTC_Skype from
your mother, which you *must* take...)

I think automatically "muting" all existing users of the selected
camera/mic is the correct solution (at least if the user selected the
in-use camera in the permission/cam-selection UI or selected Answer With
Video in an 'approved-to-access-camera-directly' app). To un-mute
(after ending the new call, or to switch back to the original call
without ending the new call) you'll probably need to interact with the
browser chrome muting controls. We could also allow an app to request
un-mute (which would prompt the user), and we should also notify the app
it was muted. So, no automatic un-mute if the new call ends.
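
The muting policy described above can be written down as a small state model. This is a Python sketch with hypothetical names (`Camera`, `request_unmute()`, the `"black-frame"` placeholder), not a proposal for the actual getUserMedia() API; it only encodes the rules stated in the paragraph: a new user mutes existing users, muted apps are notified, and nothing is un-muted automatically.

```python
class Camera:
    """One capture device with several users, at most one of them live."""

    def __init__(self):
        self.users = {}           # app name -> "live" or "muted"
        self.notifications = []   # (app, event) pairs delivered to apps

    def acquire(self, app):
        """The new user takes the device; every existing live user is muted."""
        for other, state in self.users.items():
            if state == "live":
                self.users[other] = "muted"
                self.notifications.append((other, "muted"))
        self.users[app] = "live"

    def release(self, app):
        """Ending a call drops the user but un-mutes nobody automatically."""
        del self.users[app]

    def request_unmute(self, app, user_approves):
        """An app-initiated un-mute succeeds only if the user approves it."""
        if self.users.get(app) != "muted" or not user_approves:
            return False
        self.acquire(app)         # un-muting mutes whoever is live now
        return True

    def read_frame(self, app):
        """Muted users get placeholder data instead of real camera data."""
        return "camera-data" if self.users[app] == "live" else "black-frame"
```

The `user_approves` flag stands in for the browser prompt or chrome muting control, so the decision stays with the user rather than with the JS app.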

In cases where the user selects the camera in a chrome preview, it
should be somehow marked as in-use. If it's allowed direct access, the
app should have some way to know the camera requested was already
in-use, and have the option to override (steal/mute) or not (which
speaks to needing some way to specify this in the getUserMedia()
constraints).

> 4. A piece of JS can therefore set up a PeerConnection
> and connect it to the other side, with it being plumbed
> to any resources that the JS can acquire.

Right.

>
> So, what are the implications of this:
> 1. Any piece of JS can create a new PeerConnection with DataChannel
> only, no matter how many already exist, including multiples in the same
> tab.

Yup. And this is where the 51-call limit to SIPCC comes in; 51 real
calls - not so likely; 51 DataChannel-only calls (in a game) - possible,
or in a combination of a few games.

Also, a game or other similar app might proactively connect you in an
'idle-but-open' state (with audio/video negotiated but inactive) to
other players nearby to allow those connections to be warmed up quickly
as needed (perhaps using DataChannel for direct renegotiation to enable
audio/video). Perhaps a stretch, but people will figure out
interesting/unexpected things to do with an enabling API like WebRTC.
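
For a sense of scale, here is a short Python calculation. It assumes full-mesh topologies, where each of n participants holds n - 1 connections, and it assumes idle-but-open connections count against the limit; the function names are made up for this example and the figure of 51 is the one quoted in this thread.

```python
SIPCC_CALL_LIMIT = 51  # figure quoted earlier in the thread


def connections_needed(mesh_sizes):
    """PeerConnections one participant holds across several full meshes.

    In a full mesh of n participants, each one keeps n - 1 connections.
    """
    return sum(n - 1 for n in mesh_sizes)


def fits_limit(mesh_sizes, limit=SIPCC_CALL_LIMIT):
    """True if the combined connection count stays within the call limit."""
    return connections_needed(mesh_sizes) <= limit
```

Under these assumptions a single 52-player mesh exactly reaches the limit, and three concurrent games with 20, 20, and 15 players need 19 + 19 + 14 = 52 connections, one more than the limit allows.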

> 2. Any piece of JS can create a new PeerConnection with
> incoming A/V only, no matter how many already exist, including multiples
> in the same tab.

Makes sense.

> 3. Any piece of JS can acquire a handle to a camera and/or microphone
> as long as at least one of those devices is free and use it to make a
> call.

And perhaps even if it isn't free, see above.

> What I think remains to be determined is whether one can have shared
> access to a microphone or camera. But that's a question for getUserMedia(),
> not for PeerConnection.

Shared-video/mic between apps/tabs is a possible idea, but rather lower
utility and higher risk/complexity. But we should consider it in the UI
and permissions discussions.

> Note that I'm not suggesting that a given calling site must allow you to
> have multiple concurrent calls. For instance, GoogleTalk will not do so,
> no matter how many tabs you open. However, that's something that's
> implemented in JS, not in C++ and is at the discretion of the site.

Right. Within one app/tab, the JS can redirect the MediaStream where it
wants, and manage multiple PeerConnections/calls as it wishes (or
doesn't wish).

Randell Jesup

Timothy B. Terriberry

Apr 18, 2012, 5:32:06 PM

to dev-...@lists.mozilla.org

Randell Jesup wrote:
> Yup. And this is where the 51-call limit to SIPCC comes in; 51 real
> calls - not so likely; 51 DataChannel-only calls (in a game) - possible,
> or in a combination of a few games.

Or in a DHT-based server-less communications network.