Hashtables and Multithreading

TJoker .NET

unread,

Jun 19, 2003, 8:07:49 PM6/19/03

to

Hi. I'm trying to implement some sort of lazy-loading cache for some data in
my applcation.
I decided to implement it using Hashtables. My question is: What is the
right way to use Hashtables in
an multithreaded environment?
My sample code below expects the user to write: myObj =
MyLazyCacheV1.GetItemFromCache("someKey")
and get the object he is looking for.
Can somebody explain me the proper way to do this? Some of my doubts are in
comments in the code.

Thanks so much.
- TJ

public class MyLazyCacheV1
{
private MyLazyCacheV1() { }

private static Hashtable ht = new Hashtable();
// OR: (option 1)
//private static Hashtable ht = Hashtable.Synchronized(new Hashtable());

public static MyComplexObject GetItemFromCache(string itemKey)
{
MyComplexObject obj = (MyComplexObject)ht[itemKey];

if(obj == null)
{
lock(ht) //do I need this if I'm using (option 1) above ??
{
obj = (MyComplexObject)ht[itemKey];
if(obj == null)
{
obj = SlowQuery(itemKey);
ht.Add(itemKey, obj);
// OR: (option 2)
//Hashtable.Synchronized(ht).Add(itemKey, obj); ????
}
}
}

return obj;
}

private static MyComplexObject SlowQuery(string itemKey)
{ // query code here
return new MyComplexObject();
}
}

public class MyComplexObject
{
public string ItemKey;
public int Property1;
public int Property2;
// .... ....
public int PropertyN;
}

----------------------

Bruno Jouhier [MVP]

unread,

Jun 20, 2003, 2:52:14 AM6/20/03

to

Hi TJ,

Some comments:

Option 1 is a bad idea. If you use it without the lock, your code will look
like:

// V1

public static MyComplexObject GetItemFromCache(string itemKey)
{
MyComplexObject obj = (MyComplexObject)ht[itemKey];

if (obj == null)

{
obj = SlowQuery(itemKey);
ht.Add(itemKey, obj);
}
}

You won't corrupt the hashtable because all the method calls will be
synchronized, but your cache will not work right.
If a second thread calls GetItemFromCache while the first thread is inside
SlowQuery, the second thread will end up calling SlowQuery too (and more
threads may execute SlowQuery concurrently).

So, to get it right, you have to write your method as:

// V2

public static MyComplexObject GetItemFromCache(string itemKey)
{

lock (ht.SyncRoot)

{
MyComplexObject obj = (MyComplexObject)ht[itemKey];

if (obj == null)

{
obj = SlowQuery(itemKey);
ht.Add(itemKey, obj);
}
}
}

Note: you should lock on ht.SyncRoot rather than ht.

And then there is no advantage in having a Synchronized hash table, because
you would be acquiring the monitor (SyncLock) three times instead of one
(the explicit lock + ht[itemKey] + ht.Add)

I saw that you used the "double-checked locking" pattern. This is an
optimization to avoid the overhead of acquiring a lock when the data is
already in cache. Some remarks on this optiomisation:

First, it does not make any sense if the ht is Synchronized, because then
the first ht[itemKey] acquires the monitor, so the "optimization" will
actually slow down the code.

Second, it only works if:

a) The memory barriers have the right semantics (see
http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html).
From what I have read, this is the case in C# but not in Java.

b) The ht[itemKey] accessor is written in a way that make it safe to call in
a non synchronized context, even if the ht may be modified concurrently from
a synchronized context. You have to check that the implementation of
System.Collections.Hashtable has this special property (I think I read
somewhere that this is true but I cannot guarantee it) -- this is necessary
because the first test is then "outside" of the lock block.

So, you may check this further and try this additional optimization.
Personally, I would stay with V2 above: simple, clear, safe, not dependent
on subtle details of the VM / Hashtable implementation.

Bruno.

"TJoker .NET" <nos...@nonono.no> a écrit dans le message de
news:uPyf9$rNDHA...@TK2MSFTNGP12.phx.gbl...

TJoker .NET

unread,

Jun 20, 2003, 1:10:33 PM6/20/03

to

Bruno,
Thanks a lot for your very nice explanation. I was having a hard time
finding the correct information in the documentation.
The documentation that I found does not mention that reading ht[itemKey]
puts a lock in the object. The documentation also makes you believe that the
Synchronized() method should be involved in the code some how.

Merci beaucoup

TJ

"Bruno Jouhier [MVP]" <bjou...@club-internet.fr> wrote in message
news:eykB#hvNDH...@TK2MSFTNGP12.phx.gbl...

Brian Grunkemeyer [MSFT]

unread,

Jun 20, 2003, 7:20:21 PM6/20/03

to

This is a very good summary of what we did with Hashtable. I have a few
followup points to the questions Bruno raised.

First, use Bruno's second option of locking on the SyncRoot explicitly in
your code. There are two reasons for this. We had thought our synchronized
wrappers would be useful, but in practice you cannot write some seemingly
simple code correctly. This includes code that uses two operations on the
collection, like this:

ArrayList list = ArrayList.Synchronized(new ArrayList());
...
if (list.Count > 1) {
Object o = list[0];
// Do something with o now.
}

The reason is the synchronized wrappers only take the lock for the duration
of the method call. So this code could try accessing the 0'th element in an
empty ArrayList because it releases the lock in the call to list.Count. Our
synchronized wrappers are significantly less useful because they don't solve
this problem, so I'd like to encourage users to not rely on them all the
time. While certainly with Hashtable they seem to be useful, overall the
synchronized wrappers were probably a less than fully baked idea.

The second reason to explicitly take the locks in your own code when using
Hashtable is a bug in our Hashtable code. We thought it was threadsafe for
multiple readers and one concurrent writer, but it isn't. Sorry. We're
looking at how to fix that efficiently. More importantly, our Synchronized
wrapper around a Hashtable only ensures that we end up with multiple readers
and one concurrent writer - it doesn't take a lock in the read-only methods.
So it doesn't actually solve the problem, at least in V1 and V1.1 as they
shipped. It's possible we'll fix this in a service pack.

I hope this helps.

"Bruno Jouhier [MVP]" <bjou...@club-internet.fr> wrote in message

news:eykB%23hvND...@TK2MSFTNGP12.phx.gbl...

Bruno Jouhier [MVP]

unread,

Jun 21, 2003, 4:28:49 AM6/21/03

to

Hi Brian,

Some additional thoughts:

I did not really understand why you re-introduced Synchronized wrappers
around the collection classes. I think that it confuses the developer
instead of helping him (when I see a feature in a framework, I always wonder
why it was introduced, if I should use it, and sometimes I decide not to use
it but a lot of people don't question the relevance and think that because
the feature is there, it is good and it should be used).

Also, this is at odds with other design decisions that you made. For
example, C# does not let you qualify individual methods with lock (unlike
Java). To me, this was an excellent design decision because synchronized
methods (and Synchronized wrappers) only give a false sense of thread-safety
(the implementer of the class is on the safe side, but the consumer is not).
So, why ban "lock" at the method level and still introduce Synchronized
wrappers? This is rather inconsistent.

Your response also raises a very interesting issue: allowing multiple
readers to execute concurrently, while enforcing exclusive access for write
operations. The System.Threading.ReaderWriterLock class supports this
functionality but I think that it deserves "language-level" support. Instead
of having a single "lock" keyword, it would be nice if we had two:
"lockread" to get read-only access and "lockwrite" to get read/write access.
Unlike the lock keyword, these locks would only be applicable to instances
of ReaderWriterLock. Without keywords like these, we have to write lots of
ugly try/finally if we want to use ReaderWriterLock correctly.

BTW, I do not understand why "lock" can be applied to any object and why the
Monitor class has static methods rather than instance methods (actually, I
understand "why": the .NET designers adopted the Java model too quickly and
did not take enough time to criticize it). IMO, Monitor should be a normal
class with instance methods and lock should only be allowed on instances of
the Monitor class (or even better on instances that implement an ILockable
interface). This would force developers to put some discipline (and more
thought) into their locking code.

Bruno.

"Brian Grunkemeyer [MSFT]" <bria...@online.microsoft.com> a écrit dans le
message de news:ODPOSK4N...@TK2MSFTNGP12.phx.gbl...

Bruno Jouhier [MVP]

unread,

Jun 21, 2003, 4:38:34 AM6/21/03

to

"TJoker .NET" <nos...@nonono.no> a écrit dans le message de

news:%23H8Ld70...@TK2MSFTNGP11.phx.gbl...

> Bruno,
> Thanks a lot for your very nice explanation. I was having a hard time
> finding the correct information in the documentation.
> The documentation that I found does not mention that reading ht[itemKey]
> puts a lock in the object.

I was wrong on this one. The Synchronized wrapper does not lock when you
read ht[itemKey], but it is buggy (see Brian's response). Here I assumed too
quickly that the sychronized wrapper synchronized all methods but it is a
bit more subtle (and thus buggy).

> The documentation also makes you believe that the
> Synchronized() method should be involved in the code some how.

IMO, the documentation should discourage you from using the Synchronized
wrapper.

>
> Merci beaucoup

De rien, tout le plaisir est pour moi!