Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

extensions/plugins and platform invariants

9 views
Skip to first unread message

L. David Baron

unread,
Oct 1, 2009, 1:19:15 PM10/1/09
to dev-pl...@lists.mozilla.org
I found yesterday (see comments 19-21 in
https://bugzilla.mozilla.org/show_bug.cgi?id=500879#c19 ) that a
substantial number of the crashes our users are seeing are due to
threadsafety problems in three or four common extensions, in which
they violate our platform's threading invariants by using
single-thread objects on the wrong thread. (That set of crashes
accounts for about 5% of the crashes on Windows Firefox 3.5.3.)

This is not the first time that I've installed an extension in a
debug build and hit large numbers of threadsafety assertions (or
other assertions).


It seems to me that we need better ways of exposing our platform
invariants (threading rules being a major one of them, but not the
only one) to people writing extensions, plugins, malware, etc., if
we want these things to crash less. Some possibilities include:
* doing some of the more important checks in release builds and
putting messages on the error console
* promoting builds compiled with DEBUG defined (though not
necessarily with debugging symbols) to authors of extensions

I'm not sure how effective either of these would be, though.

Thoughts? Better ideas?

-David

--
L. David Baron http://dbaron.org/
Mozilla Corporation http://www.mozilla.com/

Boris Zbarsky

unread,
Oct 1, 2009, 1:26:10 PM10/1/09
to
On 10/1/09 1:19 PM, L. David Baron wrote:
> Thoughts? Better ideas?

Random thought: what if our beta has the threadsafety checks be fatal
(as in, instead of just asserting, abort(), and in the release build)?
This will affect performance if done across the board, obviously, but if
these people bother to test at all with the beta they will quickly
discover that their code is buggy...

-Boris

Shawn Wilsher

unread,
Oct 1, 2009, 1:45:01 PM10/1/09
to dev-pl...@lists.mozilla.org
On 10/1/09 10:19 AM, L. David Baron wrote:
> I'm not sure how effective either of these would be, though.
What about another option:
* On getService and createInstance calls, throw an error if the object
being obtained doesn't implement nsIClassInfo and does not have the
threadsafe flag set.

This would require that all threadsafe objects have an nsIClassInfo, but
I don't think this is too terrible. We could even make an internal api
version of do_GetService and do_CreateInstace that does not do these
extra checks.

It's possible that this wouldn't stop some of the common cases we have
though, so maybe it's not good enough.

Cheers,

Shawn

John J. Barton

unread,
Oct 1, 2009, 2:08:31 PM10/1/09
to
L. David Baron wrote:
> * promoting builds compiled with DEBUG defined (though not
> necessarily with debugging symbols) to authors of extensions

This extension author would be interested in DEBUG with symbols and with
https://bugzilla.mozilla.org/show_bug.cgi?id=508562

I'm not sure what "promoting" means but simply reliable, clear link to a
try-server build would work (referring you to my past complaints about
finding nightly builds).

jjb

Benjamin Smedberg

unread,
Oct 1, 2009, 2:01:58 PM10/1/09
to
On 10/1/09 1:19 PM, L. David Baron wrote:

> It seems to me that we need better ways of exposing our platform
> invariants (threading rules being a major one of them, but not the
> only one) to people writing extensions, plugins, malware, etc., if
> we want these things to crash less. Some possibilities include:
> * doing some of the more important checks in release builds and
> putting messages on the error console
> * promoting builds compiled with DEBUG defined (though not
> necessarily with debugging symbols) to authors of extensions
>
> I'm not sure how effective either of these would be, though.
>
> Thoughts? Better ideas?

The stacks you put in the bug indicate that McAfee was trying to use XPCOM
proxies to achieve some kind of thread safety... they just didn't realize
that XPCOM proxies don't actually achieve thread safety if the underlying
object doesn't implement threadsafe refcounting. Since XPCOM proxies seem to
be one of the most dangerously subtle ways of introducing thread bugs, can
we just get of them? In almost all cases you'd be better off firing a
runnable anyway.

If plugins are showing up causing multi-thread crashes I think it would be
fairly straightforward to ensure correct threading at the NPAPI layer in
release builds.

If extension JS is causing problems, we should at least make sure that
XPConnect doesn't allow main-thread-only functions (ones which are attached
to a window global object) from being entered off the main thread.

If the performance of NS_IsMainThread is a big concern, we can definitely
make that faster using thread local storage.

--BDS

Mook

unread,
Oct 10, 2009, 8:38:05 PM10/10/09
to
Benjamin Smedberg wrote:

> The stacks you put in the bug indicate that McAfee was trying to use XPCOM
> proxies to achieve some kind of thread safety... they just didn't realize
> that XPCOM proxies don't actually achieve thread safety if the underlying
> object doesn't implement threadsafe refcounting. Since XPCOM proxies seem to
> be one of the most dangerously subtle ways of introducing thread bugs, can
> we just get of them? In almost all cases you'd be better off firing a
> runnable anyway.

Has there been a measurement on how costly it would be to always use
threadsafe refcounting (but keep the messages about using non-threadsafe
objects on the wrong thread)? The last time I poke somebody to test a
tryserver run for me, it showed ~ no difference (but then I decided that
I wasn't sure it was doing the right thing, and given the lack of
tryserver access myself that's annoyingly hard). It might be worthwhile
to try that again?

It won't magically make things threadsafe, it'll just make things not
stupidly unsafe. When I asked around on #developers way back, as far as
I could tell people just mostly agreed that on the platforms we target
it's all fairly cheap anyway. I don't have enough experience to know
how true that is, though.

--
Mook

Benjamin Smedberg

unread,
Oct 12, 2009, 9:12:32 AM10/12/09
to
On 10/10/09 8:38 PM, Mook wrote:

> Has there been a measurement on how costly it would be to always use
> threadsafe refcounting (but keep the messages about using non-threadsafe
> objects on the wrong thread)? The last time I poke somebody to test a

I have not seen such measurements, no.

Currently the non-threadsafe messages are only triggered on addref/release,
not typically on other methods. Are you proposing that we use threadsafe
refcounting but still assert that addref/release are called only on the main
thread? Or allow addref/release on any thread but assert that other
interface methods are main-thread-only?

Note that DOM objects (or anything that participates in cycle collection)
must not be refcounted on non-main threads no matter what, because cycle
collection purple triggers on calling Release(), and must only occur on the
main thread.

> It won't magically make things threadsafe, it'll just make things not
> stupidly unsafe. When I asked around on #developers way back, as far as
> I could tell people just mostly agreed that on the platforms we target
> it's all fairly cheap anyway. I don't have enough experience to know
> how true that is, though.

I would indeed like to know what 'fairly cheap' actually measures out as. I
expect it will cost us 1-3% at runtime, and I suspect that's not an
acceptable cost.

--BDS

Mook

unread,
Oct 12, 2009, 11:20:59 AM10/12/09
to
Benjamin Smedberg wrote:
> On 10/10/09 8:38 PM, Mook wrote:
>
>> Has there been a measurement on how costly it would be to always use
>> threadsafe refcounting (but keep the messages about using non-threadsafe
>> objects on the wrong thread)? The last time I poke somebody to test a
>
> I have not seen such measurements, no.
>
> Currently the non-threadsafe messages are only triggered on addref/release,
> not typically on other methods. Are you proposing that we use threadsafe
> refcounting but still assert that addref/release are called only on the main
> thread? Or allow addref/release on any thread but assert that other
> interface methods are main-thread-only?
The first option, since that's our best approximation of "no, really,
this isn't threadsafe" without rewriting the whole code base to add a
thread safety check to all other methods. It does mean having false
spew when you proxy things, though.

>
> Note that DOM objects (or anything that participates in cycle collection)
> must not be refcounted on non-main threads no matter what, because cycle
> collection purple triggers on calling Release(), and must only occur on the
> main thread.
Correct; they also have their own AddRef/Release implementation which is
specific to cycle collection, and therefore won't be touched at all.
I was thinking of changing just
http://mxr.mozilla.org/mozilla-central/search?string=%23define+NS_IMPL_ADDREF%28&find=nsISupportsImpl.h&findi=&filter=^[^\0]*%24&hitlimit=&tree=mozilla-central

>
>> It won't magically make things threadsafe, it'll just make things not
>> stupidly unsafe. When I asked around on #developers way back, as far as
>> I could tell people just mostly agreed that on the platforms we target
>> it's all fairly cheap anyway. I don't have enough experience to know
>> how true that is, though.
>
> I would indeed like to know what 'fairly cheap' actually measures out as. I
> expect it will cost us 1-3% at runtime, and I suspect that's not an
> acceptable cost.
You have better tools than I do (namely, a LDAP login), and more
experience at reading unit test output :) Sadly, the last tinderbox run
I had is long dead. Having somebody look at prove that I'm wrong would
make me happy, though.

--
Mook

L. David Baron

unread,
Oct 12, 2009, 12:34:09 PM10/12/09
to dev-pl...@lists.mozilla.org
On Monday 2009-10-12 08:20 -0700, Mook wrote:
> Benjamin Smedberg wrote:
>> On 10/10/09 8:38 PM, Mook wrote:
>>> Has there been a measurement on how costly it would be to always use
>>> threadsafe refcounting (but keep the messages about using non-threadsafe
>>> objects on the wrong thread)? The last time I poke somebody to test a

>> Note that DOM objects (or anything that participates in cycle collection)


>> must not be refcounted on non-main threads no matter what, because cycle
>> collection purple triggers on calling Release(), and must only occur on the
>> main thread.
> Correct; they also have their own AddRef/Release implementation which is
> specific to cycle collection, and therefore won't be touched at all.

Except that if you want to fix the threadsafety crashes that we're
seeing, that's by far the most important one to touch. That said, I
think doing https://bugzilla.mozilla.org/show_bug.cgi?id=521750
would solve the majority of the crashes caused by extensions
violating our threadsafety rules; protecting the reference count
(per-object) is less important than protecting the purple buffer
(global).

0 new messages