ConcurrentHashMap size() method could read stale value?

260 views
Skip to first unread message

yang liu

unread,
Sep 18, 2017, 1:42:20 AM9/18/17
to mechanical-sympathy

Recently I studied the source code of ConcurrentHashMap. 

In java 1.7, the segment field "count" got no volatile modifiers which is different from java 1.6. 

Is possible the size() method could read stale value through race read?


Nikolay Tsankov

unread,
Sep 18, 2017, 1:46:34 AM9/18/17
to mechanica...@googlegroups.com
* Bear in mind that the results of aggregate status methods including
* {@code size}, {@code isEmpty}, and {@code containsValue} are typically
* useful only when a map is not undergoing concurrent updates in other threads.
* Otherwise the results of these methods reflect transient states
* that may be adequate for monitoring or estimation purposes, but not
* for program control.

--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsub...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

yang liu

unread,
Sep 18, 2017, 2:54:45 AM9/18/17
to mechanical-sympathy
"...useful only when a map is not undergoing concurrent updates in other threads...
The size() method sums the segment field "modCount" twice and compares the result to ensure no concurrent updates in other threads.
If there's concurrent updates, the size() method resort to locking segments. So the size() method tries to get the mostly updated 
result even if after the result returns the result may already be stale. In java1.6 field "count" has volatile present and "modCount" 
field read has happen-before relation with "count" write, so the sum of "count" can has mostly updated result. But that's not the case for java 1.7.
In java1.7 the no volatile present field "modCount" and "count" may fail to get mostly updated value.

Gil Tene

unread,
Sep 18, 2017, 11:31:52 AM9/18/17
to mechanica...@googlegroups.com
In the presence of concurrent modification, size() can be stale by definition, since the modification could occur between size() establishing a notion of what size is and your next operation that might assume the size is somehow representative of the current state of the table.

When no concurrent modification exists between your call into size() and whatever the next thing you do that depends on that size is, the size will be up-to-date. The thing that achieves that in the OpenJDK 7 variant is the fact that segmentAt() does a volatile read, which is sufficient (at least in the specific OpenJDK 7 JVM implementation) to establish a LoadLoad ahead of the following reads of seg.count and seg.modCount to reflect external modifications that (by other ordering rules) occurred prior to your size() call. The unlock() at e.g. the bottom of seg.put() creates enough ordering on the modifying side after the changes to seg.count and seg.modCount.

A critical thing to keep in mind when reading JDK code like this is that the you should not necessarily assume that your code can do the same thing safely from an ordering perspective. JDK code is special because it *knows* what JVMs it ships with, and can validly make assumptions about the JVMs behavior. Since j.u.c.ConcurrentHashMap in OpenJDK 7 is part of the JDK, it can make certain assumptions about the OpenJDK 7 JVM's handling of ordering that may be stronger than the JMM guarantees. E.g. it could assume that the *sufficient* (but not fully required by the JMM) ordering rules in http://gee.cs.oswego.edu/dl/jmm/cookbook.html are actually implemented, based on knowledge of the specific JVM that ships with the JDK code. E.g. a JVM that satisfies these rules: 


Will meet the JMM ordering requirements between volatile loads and stores, regular loads and stores, and monitor enters and exits. But that doesn't actually mean that all JVMs (and future JDKs you may run on) will actually follow these rules. They may find some way to meet the JMM requirements without following these rules to the letter. E.g. they may apply the ordering between specific sets of fields (rather than globally, across all fields as stated in this matrix), and still meet the JMM without enforcing a LoadLoad between any volatile load and any field load.

yang liu

unread,
Sep 18, 2017, 10:59:10 PM9/18/17
to mechanical-sympathy
Thanks for the detailed reply. 
Yes, JDK code is special.  From an ordering  perspective, the segmentAt() does a volatile read and ensures visibility. 
However from happen-before perspective, I cannot find happen-before relation between seg.count read action with seg.count write action. 

thread 1                                                                    thread 2
--------------------------------------------------------------------------------------
1. segment reference volatile write                   |       3. segment reference volatile read
2. seg.count write                                              |       4. seg.count read

It's obvious that action 1 has a happen-before relation with action 3.
But there's no happen-before relation between action 2 and action 4.
Am I wrong? Or I just should not consider it from the happen-before perspective because the JDK code is special?

Gil Tene

unread,
Sep 19, 2017, 2:06:42 AM9/19/17
to mechanical-sympathy
My dive into the specifics (of e.g. the volatile load) probably confused the issue. Nothing on either side actually establishes a happens-before relationship between e.g. seg.count write in put() in Thread1 and a seg.count read in size() in Thread2... It's up to an [optional] something else to establish that, if at all. There are basically three possibilities:

1. Writes in the put() in Thread1 happen-before Reads in size() in Thread2 (via some other happens-before-establishing-thing on each side, one occurring after the return from put() on Thread1, and the other occurring before the call to size() in Thread2).

2. The Reads in size() in Thread2 happen-before the Writes in put() in Thread1 (via some other happens-before-establishing-thing on each side, one occurring after the return from size() in Thread2, and the other occurring before the call to put() in Thread1).

3. There is no happens-before-establishing-thing (in either direction) between the put() call or return in Thread1 and the size() call or return in Thread2.

If it's 1, the happens-before-establishing-things placed after the [Thread1] return from put() and before the [Thread2] call to size() establishes that e.g. the write to seg.count in put() happens-before the read of seg.count in size(). [and will therefore return the size after the put]

If it's 2, the happens-before-establishing-things placed after the [Thread2] return from size() and before the [Thread1] call to put() establishes that e.g. the read of seg.count in size() happens-before the write to seg.count in put() [and will therefore return the size before the put]

If it's 3, it's irrelevant, as no happens-before-establishing stuff means that size() is experiencing concurrent modification, and in the presence of concurrent modification, size() is allowed to return a stale notion of size and does not need an established happens-before relationship to the put()'s writes.
Reply all
Reply to author
Forward
0 new messages