Dealing with high RTCPeerConnection churn


Nick Tindall

May 1, 2014, 8:37:18 AM
to discuss...@googlegroups.com
Hi All,

I'm currently attempting to build a randomised, unstructured peer-to-peer overlay network to run over WebRTC in the browser as part of an academic project. A characteristic of the project is that peers engage in periodic exchanges of information with other random peers in the network, and the idea is that involvement in the network is potentially very long-lived, so in one browser session a peer might engage in exchanges with hundreds (even thousands?) of other peers.

My initial implementation created a new RTCPeerConnection for each exchange, which worked fine except for the small issue of (non-heap) browser memory consumption slowly increasing. I believe this is due to the following statement from the WebRTC spec (http://www.w3.org/TR/webrtc/#garbage-collection):

Window object has a strong reference to any RTCPeerConnection objects created from the constructor whose global object is that Window object.

i.e. the issue discussed here;


So I'm experimenting with the idea of keeping a pool of RTCPeerConnections that can be used for exchanges. The pool would only need to be small because each peer is only engaged in one, MAYBE two, exchanges concurrently. The problem is that the RTCPeerConnection in Chrome doesn't seem to like certain patterns of reuse.

The problems arise (I think) because it is impossible to unset the local/remote session descriptions in order to "reset" an RTCPeerConnection in readiness for connection to another peer. So when I pull a second-hand RTCPeerConnection from the pool and create a data channel in readiness to create an offer, the data channel immediately fires an onopen event. I assume this is because the browsers at each end are still connected and the RTCPeerConnection is still in the "stable" state (http://www.w3.org/TR/webrtc/#rtcpeerstate-enum).

If I close the RTCPeerConnection before it's returned to the pool, I of course get an InvalidStateError thrown when reuse is attempted.

If anyone has any advice as to whether/how this pattern of usage can be accommodated in WebRTC it would be sincerely appreciated. Alternatively is there any way to properly clean up the resources associated with no-longer-needed RTCPeerConnections?

The memory growth problem appears to be present in both Firefox and Chrome, the difference being that Chrome seems to release the memory when the offending windows/tabs are closed.

Regards,
Nick Tindall


Justin Uberti

May 1, 2014, 6:15:06 PM
to discuss-webrtc
Can you just stuff in empty local/remote descriptions to reset the peerconnection?

.close() should clean up PeerConnections sufficiently so that you don't run out of memory. Is there a bug filed on this?



Nick Tindall

May 1, 2014, 7:11:01 PM
to discuss...@googlegroups.com
Hi Justin,

You can't pass null into the local/remote descriptions. I'll give it a go with a blank string and report back if that improves the situation.

There's this issue, but from the comments it sounds like holding the peer connection reference indefinitely is by design?!


Cheers,
Nick T

Justin Uberti

May 1, 2014, 8:09:17 PM
to discuss-webrtc
I didn't mean an empty string, but rather one with media set to a=inactive. (Hacky, but perhaps a viable short-term solution.)

Nick Tindall

May 1, 2014, 8:11:56 PM
to discuss...@googlegroups.com
Thanks, I'll give that a go.

Shachar

May 2, 2014, 12:13:17 PM
to discuss...@googlegroups.com
FYI Nick, in FF:
"Right now there is no way to re-use a PC.
Basically once your PC has reached the closed state, there is no way out of the ‘closed’ back into any other state. That’s why I believe it should currently be impossible to re-use PC’s."
Could you file a bug in bugzilla though on the memory leak?
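
To illustrate the point about the closed state, here is a minimal sketch, assuming a hypothetical helper named `getUsableConnection` and a caller-supplied factory `createFn` (neither is part of WebRTC). It only relies on the standard `signalingState` attribute: once that reads "closed", the sketch gives up on the old object and builds a new one.

```javascript
// Hypothetical helper (not a WebRTC API): return a connection that can still
// be negotiated. A connection whose signalingState is "closed" cannot leave
// that state, so the only option is to create a fresh one.
function getUsableConnection(existing, createFn) {
  if (existing && existing.signalingState !== "closed") {
    return existing;
  }
  return createFn();
}

// In a browser this might be called as follows (the empty iceServers
// configuration is an assumption for illustration):
//   pc = getUsableConnection(pc, () => new RTCPeerConnection({ iceServers: [] }));
```

In practice this amounts to "one new RTCPeerConnection per exchange" whenever the previous one has been closed.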

Randell Jesup

May 2, 2014, 2:40:49 PM
to discuss...@googlegroups.com

On Thursday, May 1, 2014 3:37:18 PM UTC+3, Nick Tindall wrote:
> So I'm experimenting with the idea of keeping a pool of RTCPeerConnections that can be used for exchanges. This would only need to be small in size because each peer is only engaged in one, MAYBE two exchanges concurrently. The problem I have with this is the RTCPeerConnection in Chrome doesn't seem to like certain patterns of reuse.


I think you don't want to cache PCs - the overhead to build one isn't that large. In some situations, you might want to have one pre-built and ready to use if setup latency is critical, but given trickle ICE, that's less of an issue.

If you close() your peerconnection, and then drop *all* references to it, it should be garbage-collected "soon". The item you reference about Chrome and webrtc-internals I think was/is a Chrome bug where webrtc-internals kept the PC alive after close(). The PC does survive after close() so long as there is a reference to it, so stats can be gathered - but if the last reference is dropped, it should disappear (and does in FF). Note that GC is unpredictable unless forced by something like about:memory in FF.
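
A minimal sketch of the close-and-drop-all-references advice, assuming the application tracks its connections in a hypothetical `liveConnections` map keyed by an exchange id (the map, the id, and the `releaseConnection` name are made up for illustration). `close()` and the `on...` handler attributes are the standard RTCPeerConnection and RTCDataChannel members. Handlers are cleared because their closures are a common place for a stray reference to hide.

```javascript
// Hypothetical registry of connections currently in use, keyed by exchange id.
const liveConnections = new Map();

// Close the channel and connection for one exchange and forget about them.
// Returns true if something was released, false if the id was unknown.
function releaseConnection(id) {
  const entry = liveConnections.get(id);
  if (!entry) {
    return false;
  }
  const { pc, channel } = entry;
  if (channel) {
    // Detach handlers so their closures no longer reference these objects.
    channel.onopen = channel.onmessage = channel.onclose = null;
    channel.close();
  }
  pc.onicecandidate = null;
  pc.ondatachannel = null;
  pc.close();
  // Drop the last reference held here; when the memory is actually
  // reclaimed is up to the browser's garbage collector.
  liveConnections.delete(id);
  return true;
}
```

Any other place that still holds the object (a local variable in a long-lived closure, an array of past peers, a logging structure) would keep it alive, so those need the same treatment.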


 

> The problems arise (I think) due to the fact it is impossible to unset local/remote session descriptions in order to "reset" an RTCPeerConnection in readiness for connection to another peer, so when I pull a second-hand RTCPeerConnection from the pool and create a data channel in readiness to create an offer the data channel immediately fires an onopen event. I assume because the browsers at each end are still connected and the RTCPeerConnection is still in the "stable" state (http://www.w3.org/TR/webrtc/#rtcpeerstate-enum).

You want to renegotiate, but you want to do it with a *different* partner, which is not renegotiation per se. Which isn't to say it couldn't somehow be made to work, but I wouldn't try it. There's little if any win, and tons of pain. And on top of that, FF doesn't support renegotiation yet, either.



> If I attempt to close the RTCPeerConnection before it's returned to the pool I of course get the InvalidStateError thrown when reuse is attempted.

Right.

-- 
Randell Jesup -- rjesup a t mozilla d o t com

Nick Tindall

May 4, 2014, 12:23:51 AM
to discuss...@googlegroups.com
Hi Randell,

Thanks for your response!
 

> I think you don't want to cache PCs - the overhead to build one isn't that large. In some situations, you might want to have one pre-built and ready to use if setup latency is critical, but given trickle ICE, that's less of an issue.
>
> If you close() your peerconnection, and then drop *all* references to it, it should be garbage-collected "soon". The item you reference about Chrome and webrtc-internals I think was/is a Chrome bug where webrtc-internals kept the PC alive after close(). It does survive after close() so long as there is a reference to it, so stats can be gathered - but if the last reference is dropped, it should disappear (and does in FF). Note that GC is unpredictable unless forced by something like about:memory in FF.

       

You're right, I don't want to do pooling, and it sounds like a lot of trouble. I was only considering it because I got the impression that RTCPeerConnections would live on indefinitely for a particular window. I thought perhaps my use case was outside the design of WebRTC, but it sounds as though that's not the case, which is encouraging.

My reasons for suspecting RTCPeerConnections of living indefinitely included:
  • I have been doing most of my performance testing in Chrome because the memory profiling tools are a bit better (I couldn't find a way to do a heap dump in Firefox). Even when all references are dropped (i.e. the RTCPeerConnection object is gone from the heap dump altogether) and the GC is invoked using the bin button on the timeline tab, the RTCPeerConnections still stay in the chrome://webrtc-internals page. This is still the case in Chrome 34.0.1847.132 on Linux, and I assume that's what causes the slow increase in memory consumption in Chrome. When the window is closed, all RTCPeerConnections for that window disappear from chrome://webrtc-internals.
  • The only current memory profiling tool I could find in Firefox was about:memory. When I noticed the memory used by the Firefox process had grown to 3.2 GB (after running six instances of my P2P client for about 24 hours), I checked there to see whether it would tell me anything useful. The memory growth was mostly in heap-unclassified (about 2.9 GB, if I remember correctly); the section under "window-objects" for each P2P client page was less than 5 MB. When I closed the tabs containing the P2P clients, the heap-unclassified memory didn't go down. I also tried triggering a GC and "minimize memory usage", but to no avail. I read somewhere that it should not be possible to make the browser leak memory using JS APIs alone, but that seems to be the case here. This testing was in Firefox 29 on Linux.
  • That statement from the specification that I quoted.


> You want to renegotiate, but you want to do it with a *different* partner, which is not per-se renegotiation. Which isn't to say it couldn't be somehow made to work, but I wouldn't try it. There's little if any win, and tons of pain. And on top of that, FF doesn't support renegotiation yet, either.



Yeah, I've definitely gone off the idea. I need something that's going to work cross-browser (ideally now), and PC reuse is sounding too hacky and probably unnecessary.

I'm just trying to boil down a bare-bones test page that demonstrates the memory consumption so I can share it with you guys. My P2P app has too many moving parts and libraries involved, so it would probably be less helpful to point you to it. I'm also trying to convince myself it's not a bug in my code. I was basing the assumption that the memory leak is in the browser on the fact that the size of the heap dumps generated by Chrome is pretty much static (~10 MB per P2P instance page), even after running for days. I'm no JavaScript guru, so that could be a false assumption.

FYI: Each peer in my P2P network attempts to open an RTCPeerConnection every 30 seconds to do an exchange with another peer. I haven't got to the stage of testing different intervals, but I can't imagine 2 per minute should be too much for WebRTC? The connection is opened with a single data channel on it, then the channel and connection are closed after the exchange is complete. So if I leave 6 clients running in a browser for 24 hours, that's a total of 120 * 24 * 6 = 17,280 peer connections, each with a single data channel. Probably more than your average WebRTC application. Is it unrealistic to expect this to be performant?
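
The estimate above can be checked with a few lines of arithmetic. This is only a sketch; the variable names are made up, and the figures are the ones stated in the post (one outgoing exchange every 30 seconds, 6 clients, 24 hours).

```javascript
// Figures from the post: one outgoing exchange every 30 seconds per client.
const exchangeIntervalSeconds = 30;
const clients = 6;
const hours = 24;

// 3600 s per hour / 30 s per exchange = 120 connections per client per hour.
const perClientPerHour = 3600 / exchangeIntervalSeconds;
const totalConnections = perClientPerHour * hours * clients;

console.log(totalConnections); // 17280
```

This counts only connections a client initiates; incoming connections from other peers would add to the total.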

I'll post back as soon as I have a basic concrete example.

Thanks for your input,
Nick T
 

Nick Tindall

May 4, 2014, 12:29:20 AM
to discuss...@googlegroups.com
Hi Shachar,

My idea initially was to not close the RTCPeerConnections, but instead re-set the remote/local descriptions to their new values and renegotiate. It sounds like that's not a possibility and that I should be able to use RTCPeerConnections and they should be garbage collected when I'm finished anyway. So I'm abandoning the connection pooling idea.

I will wait until I've confirmed that it is definitely a leak in Firefox before I create a bug. At the moment I'm working on a very bare-bones page that will reproduce the problem. I will post a link to it here once it's done so you guys can sanity-check it before I raise an issue.

Thanks for your reply.

Cheers,
Nick T 

Nick Tindall

May 4, 2014, 3:41:35 AM
to discuss...@googlegroups.com
Hi All,

I've just completed what I think is a bare-bones example of the problem (see https://drive.google.com/file/d/0B1EeCghikL26bzl2SHBIY1lpbWM/edit?usp=sharing)

I started with a fresh Firefox window, no plugins other than those installed by default in Ubuntu, but not in safe mode.

The memory consumption before was;

Then I created and closed 10,000 RTCPeerConnections in the HTML demo attached, waited until it finished, triggered a GC and "Minimize memory usage" and the resulting memory consumption after was

Closing the window in which I ran the demo didn't change the memory consumption, even after another GC and minimize. The magnitude isn't quite what I'd observed with my P2P app (my estimate of 17,000ish didn't include incoming peer connections), but the number of connections is significantly lower, and the connections aren't actually being established (or having ICE candidates/remote descriptions set, etc.).
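
The attached demo is not reproduced in this thread. As a rough idea of what a create-and-close loop of this kind might look like, here is a hypothetical sketch (not the original leakdemo.html). The function name, the "exchange" channel label, and the choice to open one data channel per connection are assumptions; the constructor is passed in as a parameter so the loop can also be exercised with a stand-in outside a browser. It assumes the modern unprefixed API where the configuration argument is optional.

```javascript
// Hypothetical reconstruction, not the original demo page.
// Creates `count` connections, opens one data channel on each (an assumption
// mirroring the P2P app described earlier), then closes both and keeps no
// references. No offer/answer is exchanged, so nothing is ever connected.
function churnPeerConnections(count, PeerConnection) {
  let created = 0;
  for (let i = 0; i < count; i++) {
    const pc = new PeerConnection();
    const channel = pc.createDataChannel("exchange");
    channel.close();
    pc.close();
    created++;
  }
  return created;
}

// In a browser: churnPeerConnections(10000, RTCPeerConnection);
```

A tight synchronous loop like this may run into browser limits on the number of live connections before the garbage collector gets a chance to run, so spreading the work over timers might be needed for large counts.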

I've included the about:memory verbose dump in the archive and the one-page example I used to simulate the problem.

It's very unscientific and I could be way off, as I don't understand browser internals and am by no means a JavaScript expert, but I think the example is simple enough that if there were something inherently wrong with it you should be able to see it easily.

Running this, you can also see that Chrome doesn't remove old RTCPeerConnections from the chrome://webrtc-internals page until the window is closed, even when there are no references to them left hanging around.

Thanks again,
Nick T

Nick Tindall

May 8, 2014, 4:07:38 AM
to discuss...@googlegroups.com
Hi All,

Did anyone get a chance to look at the example? It's very simple. It'd be good to know whether it likely indicates an internal browser memory leak relating to closed RTCPeerConnections. If so, I'll raise an issue in the Firefox Bugzilla and copy the details to it so it can be tracked.

I've attached just the HTML demo to this message. It also illustrates the RTCPeerConnections staying in chrome://webrtc-internals indefinitely (until the window is closed) regardless of state. These are probably not issues for apps that don't use A LOT of PeerConnections; unfortunately, I do.

Any advice is appreciated.

Thanks
Nick Tindall
leakdemo.html

Harald Alvestrand

May 15, 2014, 6:23:44 AM
to discuss...@googlegroups.com
I think this is a spec bug - the spec says (effectively) that PeerConnections are never garbage collected while the Window object is alive, and I think that's not right.





Nick Tindall

May 15, 2014, 7:03:22 PM
to discuss...@googlegroups.com
Thanks Harald,

That was my interpretation of it too. I've also filed this bug https://bugzilla.mozilla.org/show_bug.cgi?id=1010198 to track the issue in Firefox.

Cheers,
Nick T

Felix Wolff

Oct 22, 2014, 2:53:00 PM
to discuss...@googlegroups.com
Hi guys,

Is there any news concerning this topic? I am running into exactly the same problems and am very happy to find others with similar observations. I am especially suspicious since I noticed that the webrtc-internals page isn't working very well with about 20+ connections on it.

Best,
Felix

Nick Tindall

Oct 22, 2014, 7:54:38 PM
to discuss...@googlegroups.com
Hi Felix,

The bug in the specification (https://www.w3.org/Bugs/Public/show_bug.cgi?id=25724) has been fixed, and I think Chrome seems to have improved things in M38 (https://code.google.com/p/chromium/issues/detail?id=373690), although I think it still leaks. Firefox has acknowledged the issue (https://bugzilla.mozilla.org/show_bug.cgi?id=1010198), but it hasn't been fixed.

That's all I know about it. I don't think most WebRTC applications behave the way ours do, so maybe the problem isn't affecting many people. At least there's two of us now :)

My concern is that both browsers now have WebRTC debug pages (chrome://webrtc-internals and about:webrtc) that appear to keep information about all the connections ever, so maybe the leaks are by design?

Anyhow, good luck :)

-Nick T

Felix Wolff

Oct 23, 2014, 12:51:24 PM
to discuss...@googlegroups.com
Thanks for the links. I think you're right and the leak is 'by design'. So I will have to put on my engineering hat and work around it. ;)

