ClassCache Map Thread Safety?

Tom Palkot

unread,

Jan 25, 2010, 3:03:48 PM1/25/10

to dev-tech-js-...@lists.mozilla.org

We are trying to resolve an issue where the Rhino engine occasionally
gets stuck early in the run.

We use a shared sealed global scope, and create scripts that execute in
multiple threads, inheriting the core objects from the single shared
global scope.

Under load, it looks like the HashMap instances in
org.mozilla.javascript.ClassCache get corrupted. We get a code thread
hung in this code below, and all the other threads waiting for this guy
to give up a synchronization lock.

We are trying the use of ConcurrentHashMap instead of HashMap inside the
ClassCache.

Has anyone seen this before, or have any suggestions? Our environment:

JDK 1.6

Rhino 1.7 R1

"[ACTIVE] ExecuteThread: '395' for queue: 'weblogic.kernel.Default
(self-tuning)'" daemon prio=3 tid=0x02f0b000 nid=0x1bc runnable
[0x8f87d000]

java.lang.Thread.State: RUNNABLE

at java.util.HashMap.get(HashMap.java:303)

at
org.mozilla.javascript.JavaMembers.lookupClass(JavaMembers.java:836)

at
org.mozilla.javascript.NativeJavaClass.initMembers(NativeJavaClass.java:
83)

at org.mozilla.javascript.NativeJavaClass.

(NativeJavaClass.java:78)

at
org.mozilla.javascript.NativeJavaPackage.getPkgProperty(NativeJavaPackag
e.java:159)

- locked

<0xc05ce0c0>

(a org.mozilla.javascript.NativeJavaPackage)

at
org.mozilla.javascript.NativeJavaPackage.get(NativeJavaPackage.java:105)

at
org.mozilla.javascript.ScriptableObject.getProperty(ScriptableObject.jav
a:1575)

at
org.mozilla.javascript.ScriptRuntime.getObjectProp(ScriptRuntime.java:13
97)

at
org.mozilla.javascript.ScriptRuntime.getObjectProp(ScriptRuntime.java:13
83)

Attila Szegedi

unread,

Jan 26, 2010, 2:57:47 AM1/26/10

to Tom Palkot, dev-tech-js-...@lists.mozilla.org

I have the same setup (ClassCache in a shared prototype scope). During a warmup phase (which happens quite often, as I have dynamic class reloading too backed by a soft-reference storage, so code versions get unloaded and reloaded as memory requires, and there are several of them in memory at once, each of them from a separate class loader, with a separate prototype scope and ClassCache), there'll be 50 threads pulling in the class workset concurrently. This happens several times an hour 24/7, on dozens of machines, for 6 years now, heavy production load. I never had that problem.

I'm not saying it's not real, though - race conditions are notorious for being obscure. I could've been lucky.

Under *correctly synchronized* multithreaded access, java.util.HashMap.get() can not possibly hang, except if the key's equals() or hashCode() hangs. However, the map you're referring to has java.lang.Class objects as keys, and its equals() and hashCode() don't hang either.

On the other hand, unsafe multithreaded mutations could corrupt it so that it never exits the loop in HashMap.get() - i.e. if a race condition victim thread observes an element of the hash table in a moment when e == e.next. I think I did encounter this once in the wild. You might be hitting the same problem.

So, do we have unsafe multithreaded mutation here? Well, the only lock is in the NativeJavaPackage.getPkgProperty(), and there can be more than one of those. So I believe that indeed more than one thread can access the ClassCache concurrently - they'll be synchronizing on different NativeJavaPackage objects and thus not mutually exclude one another.

Yeah, on the first sight, this indeed looks like we have a potential for an unsynchronized concurrent mutation of the ClassCache backing HashMap.

Anyone else care to double-check my analysis?

Attila.

--
home: http://www.szegedi.org
twitter: http://twitter.com/szegedi
weblog: http://constc.blogspot.com

Johan Compagner

unread,

Jan 26, 2010, 4:51:45 AM1/26/10

to dev-tech-js-...@lists.mozilla.org

In rhino that i still use (1.6R7) it is a HashTable, and it seems that that
is just being replaced with a HashMap
a HashTable is threadsafe but a hashmap isnt so i think the change done
there was a bit dangerous yes

In my code that i browse now it is not only called through
NativeJavaPackage.getPkgProperty(), but also through NativeJavaClass.get()
(also makes a NativeJavaClass instance) and NativeJavaObject classes
so it can be called from all over the place.

I think it should be replaced with a ConcurrentHashmap.

> _______________________________________________
> dev-tech-js-engine-rhino mailing list
> dev-tech-js-...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-tech-js-engine-rhino
>

Attila Szegedi

unread,

Jan 26, 2010, 5:54:41 AM1/26/10

to Johan Compagner, dev-tech-js-...@lists.mozilla.org

Indeed, that's a significant difference - I'm using Rhino 1.6R7 too in production, and since java.util.Hashtable synchronizes every operation, it's not prone to this. That'd explain why my system never hit this problem. Indeed, we'll need to switch to ConcurrentMap in Rhino code.

Thanks for the insight, Johan.

Attila.

Attila Szegedi

unread,

Jan 26, 2010, 6:02:30 AM1/26/10

to Rhino JS User List

Just a further detail, this is a regression bug introduced by fixing bug 414554 (ClassCache.java CVS revision 1.17 from 2008 January 29). As such, Rhino 1.7R1 and 1.7R2 are both affected.

Norris, I think this'd warrant an urgent bugfix release for 1.7R2.

Attila.