This is not the first time that I've installed an extension in a
debug build and hit large numbers of threadsafety assertions (or
other assertions).
It seems to me that we need better ways of exposing our platform
invariants (threading rules being a major one of them, but not the
only one) to people writing extensions, plugins, malware, etc., if
we want these things to crash less. Some possibilities include:
* doing some of the more important checks in release builds and
putting messages on the error console
* promoting builds compiled with DEBUG defined (though not
necessarily with debugging symbols) to authors of extensions
I'm not sure how effective either of these would be, though.
Thoughts? Better ideas?
-David
--
L. David Baron http://dbaron.org/
Mozilla Corporation http://www.mozilla.com/
Random thought: what if our beta has the threadsafety checks be fatal
(as in, instead of just asserting, abort(), and in the release build)?
This will affect performance if done across the board, obviously, but if
these people bother to test at all with the beta they will quickly
discover that their code is buggy...
-Boris
This would require that all threadsafe objects have an nsIClassInfo, but
I don't think this is too terrible. We could even make an internal api
version of do_GetService and do_CreateInstace that does not do these
extra checks.
It's possible that this wouldn't stop some of the common cases we have
though, so maybe it's not good enough.
Cheers,
Shawn
This extension author would be interested in DEBUG with symbols and with
https://bugzilla.mozilla.org/show_bug.cgi?id=508562
I'm not sure what "promoting" means but simply reliable, clear link to a
try-server build would work (referring you to my past complaints about
finding nightly builds).
jjb
> It seems to me that we need better ways of exposing our platform
> invariants (threading rules being a major one of them, but not the
> only one) to people writing extensions, plugins, malware, etc., if
> we want these things to crash less. Some possibilities include:
> * doing some of the more important checks in release builds and
> putting messages on the error console
> * promoting builds compiled with DEBUG defined (though not
> necessarily with debugging symbols) to authors of extensions
>
> I'm not sure how effective either of these would be, though.
>
> Thoughts? Better ideas?
The stacks you put in the bug indicate that McAfee was trying to use XPCOM
proxies to achieve some kind of thread safety... they just didn't realize
that XPCOM proxies don't actually achieve thread safety if the underlying
object doesn't implement threadsafe refcounting. Since XPCOM proxies seem to
be one of the most dangerously subtle ways of introducing thread bugs, can
we just get of them? In almost all cases you'd be better off firing a
runnable anyway.
If plugins are showing up causing multi-thread crashes I think it would be
fairly straightforward to ensure correct threading at the NPAPI layer in
release builds.
If extension JS is causing problems, we should at least make sure that
XPConnect doesn't allow main-thread-only functions (ones which are attached
to a window global object) from being entered off the main thread.
If the performance of NS_IsMainThread is a big concern, we can definitely
make that faster using thread local storage.
--BDS
> The stacks you put in the bug indicate that McAfee was trying to use XPCOM
> proxies to achieve some kind of thread safety... they just didn't realize
> that XPCOM proxies don't actually achieve thread safety if the underlying
> object doesn't implement threadsafe refcounting. Since XPCOM proxies seem to
> be one of the most dangerously subtle ways of introducing thread bugs, can
> we just get of them? In almost all cases you'd be better off firing a
> runnable anyway.
Has there been a measurement on how costly it would be to always use
threadsafe refcounting (but keep the messages about using non-threadsafe
objects on the wrong thread)? The last time I poke somebody to test a
tryserver run for me, it showed ~ no difference (but then I decided that
I wasn't sure it was doing the right thing, and given the lack of
tryserver access myself that's annoyingly hard). It might be worthwhile
to try that again?
It won't magically make things threadsafe, it'll just make things not
stupidly unsafe. When I asked around on #developers way back, as far as
I could tell people just mostly agreed that on the platforms we target
it's all fairly cheap anyway. I don't have enough experience to know
how true that is, though.
--
Mook
> Has there been a measurement on how costly it would be to always use
> threadsafe refcounting (but keep the messages about using non-threadsafe
> objects on the wrong thread)? The last time I poke somebody to test a
I have not seen such measurements, no.
Currently the non-threadsafe messages are only triggered on addref/release,
not typically on other methods. Are you proposing that we use threadsafe
refcounting but still assert that addref/release are called only on the main
thread? Or allow addref/release on any thread but assert that other
interface methods are main-thread-only?
Note that DOM objects (or anything that participates in cycle collection)
must not be refcounted on non-main threads no matter what, because cycle
collection purple triggers on calling Release(), and must only occur on the
main thread.
> It won't magically make things threadsafe, it'll just make things not
> stupidly unsafe. When I asked around on #developers way back, as far as
> I could tell people just mostly agreed that on the platforms we target
> it's all fairly cheap anyway. I don't have enough experience to know
> how true that is, though.
I would indeed like to know what 'fairly cheap' actually measures out as. I
expect it will cost us 1-3% at runtime, and I suspect that's not an
acceptable cost.
--BDS
--
Mook
>> Note that DOM objects (or anything that participates in cycle collection)
>> must not be refcounted on non-main threads no matter what, because cycle
>> collection purple triggers on calling Release(), and must only occur on the
>> main thread.
> Correct; they also have their own AddRef/Release implementation which is
> specific to cycle collection, and therefore won't be touched at all.
Except that if you want to fix the threadsafety crashes that we're
seeing, that's by far the most important one to touch. That said, I
think doing https://bugzilla.mozilla.org/show_bug.cgi?id=521750
would solve the majority of the crashes caused by extensions
violating our threadsafety rules; protecting the reference count
(per-object) is less important than protecting the purple buffer
(global).