rwlock using pthread_cond


Hovik Melikyan

Sep 16, 2002, 6:58:44 AM
Hi,

This might be interesting to those who write portable multithreaded
programs.

I was looking for a read/write lock implementation based on the other
pthreads primitives for systems which lack built-in rwlock, e.g. Mac
OS X. I came across two implementations on the Net (I omit many others
since they evidently mimic these two, along with their bugs and
weaknesses) and thoroughly tested them both. The test program was
starting 50 threads each of which tries to acquire either a read or a
write lock (1 to 5, randomly selected) repeatedly.

The first one can be found in one of the AIX tutorials:
http://www.unet.univie.ac.at/aix/aixprggd/genprogc/complex_synchronization_objects.htm
. Many other articles and sites refer to this page. This
implementation tries to give higher priority to writers, however, it
contains a bug that makes this code unusable. Take a look at function
rwlock_lock_read() on the page above: the waiting condition must be

while ((rwl->lock_count < 0) || (rwl->waiting_writers))
pthread_cond_wait(&rwl->rcond, &rwl->lock);

instead of &&.
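(For completeness, the writer-priority scheme that page describes, with the corrected condition, boils down to something like this. It's a sketch reconstructed from the article's description -- field names follow the tutorial loosely, error handling is omitted:)

```c
#include <pthread.h>

/* lock_count > 0: that many active readers; -1: an active writer. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  rcond;       /* readers wait here */
    pthread_cond_t  wcond;       /* writers wait here */
    int lock_count;
    int waiting_writers;
} rwl_t;

void rwl_init(rwl_t *rwl)
{
    pthread_mutex_init(&rwl->lock, NULL);
    pthread_cond_init(&rwl->rcond, NULL);
    pthread_cond_init(&rwl->wcond, NULL);
    rwl->lock_count = 0;
    rwl->waiting_writers = 0;
}

void rwl_lock_read(rwl_t *rwl)
{
    pthread_mutex_lock(&rwl->lock);
    /* The corrected condition: wait while a writer is active OR any
       writer is pending -- this is what gives writers priority. */
    while (rwl->lock_count < 0 || rwl->waiting_writers)
        pthread_cond_wait(&rwl->rcond, &rwl->lock);
    rwl->lock_count++;
    pthread_mutex_unlock(&rwl->lock);
}

void rwl_lock_write(rwl_t *rwl)
{
    pthread_mutex_lock(&rwl->lock);
    rwl->waiting_writers++;
    while (rwl->lock_count != 0)
        pthread_cond_wait(&rwl->wcond, &rwl->lock);
    rwl->waiting_writers--;
    rwl->lock_count = -1;
    pthread_mutex_unlock(&rwl->lock);
}

void rwl_unlock(rwl_t *rwl)
{
    pthread_mutex_lock(&rwl->lock);
    if (rwl->lock_count < 0)
        rwl->lock_count = 0;          /* writer releasing */
    else
        rwl->lock_count--;            /* reader releasing */
    if (rwl->waiting_writers)
        pthread_cond_signal(&rwl->wcond);     /* writers go first */
    else
        pthread_cond_broadcast(&rwl->rcond);  /* wake all readers */
    pthread_mutex_unlock(&rwl->lock);
}
```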

The other one was posted in this newsgroup a few months ago by Sean
P. Burke: http://groups.google.com/groups?selm=82fzyzjq2k.fsf%40dev0.welkyn.com
. Sean's rwlock module is not writer-priority, but it is
reentrant. A few modifications to the source code make Sean's
implementation writer-priority and, according to my tests, a little
faster, but NOT reentrant (please see the patch below).

Regards,

--
Hovik Melikyan


*** rwlock.c.ORIG Mon Sep 16 15:44:19 2002
--- rwlock.c Mon Sep 16 15:47:05 2002
***************
*** 51,57 ****
*/

/* Wait while writers are active */
! while (lock->active_readers < 0)
pthread_cond_wait(&lock->ok_to_read, &lock->mutex);

lock->pending_readers--;
--- 51,57 ----
*/

/* Wait while writers are active */
! while (lock->active_readers < 0 || lock->pending_writers > 0)
pthread_cond_wait(&lock->ok_to_read, &lock->mutex);

lock->pending_readers--;
***************
*** 186,192 ****
/* Signal waiters while holding the mutex, to meet the
* requirements of our pthread_cond_wait emulation for WIN32.
*/
! if (lock->pending_readers != 0)
/* Wake all readers.
*/
pthread_cond_broadcast(&lock->ok_to_read);
--- 186,192 ----
/* Signal waiters while holding the mutex, to meet the
* requirements of our pthread_cond_wait emulation for WIN32.
*/
! if (lock->pending_writers == 0)
/* Wake all readers.
*/
pthread_cond_broadcast(&lock->ok_to_read);

Sean P. Burke

Sep 16, 2002, 1:43:09 PM

ho...@moon.yerphi.am (Hovik Melikyan) writes:

Thank you for subjecting my readers/writer lock to a rigorous
test. I have some questions about your conclusions, though.

My implementation was intended to allow a single writer thread
to alternate with a group of reader threads for access to the
lock - an attempt at "fairness". Your change appears to give
writers priority in all cases, which may be preferable, but
how does it become non-reentrant?

-SEan

Hovik Melikyan

Sep 17, 2002, 4:15:15 AM
Hi Sean,

sbu...@dev0.welkyn.com (Sean P. Burke) wrote in message news:<82ptvds...@dev0.welkyn.com>...


>
> My implementation was intended to allow a single writer thread
> to alternate with a group of reader threads for access to the
> lock - an attempt at "fairness".

Do you think "ladies first" is fair, especially if the female
population prevails? (I mean, if readers are ladies) :) Seriously,
from the point of view of practical programming it's much easier to
give writers exclusive privileges and forget about the problem of
starvation than to make the user of your module think about
whether his/her application will starve with your rwlock. With your
scheme, I'll have to test and evaluate the number of concurrent reads
and writes in my app to see if writes are possible at all. The
writer-priority scheme does not burden the developer with this kind of
problem. (How about making your code conditional at run-time, btw? I
mean, one additional bool flag, something like writer_prioritized.)

> Your change appears to give
> writers priority in all cases, which may be preferable, but
> how does it become non-reentrant?

Frankly, I don't remember how I came to this conclusion. Most likely I
was wrong in saying it is not reentrant.

BTW, as someone who writes threading algorithms, what do you think
about the rwlock implementation for Win32 found in APR (Apache
Portable Runtime)? Win32 lacks posix-like conditions, but it offers
Event instead. APR uses two Event objects and a mutex to implement
rwlocks, but I'm not sure I understand this code and how it
prioritizes the locks. The thing is that I'm writing a portable
library for multithreading and networking and I need to port the
rwlock interface to Win32 too.

Below is the APR rwlock module for Win32. Any comments and
explanations would be greatly appreciated. Thanks.

--
Hovik Melikyan


static apr_status_t thread_rwlock_cleanup(void *data)
{
    apr_thread_rwlock_t *rwlock = data;
    CloseHandle(rwlock->readevent);
    CloseHandle(rwlock->finished);
    CloseHandle(rwlock->writemutex);
    return APR_SUCCESS;
}

APR_DECLARE(apr_status_t) apr_thread_rwlock_create(apr_thread_rwlock_t **rwlock,
                                                   apr_pool_t *pool)
{
    (*rwlock) = apr_palloc(pool, sizeof(**rwlock));
    (*rwlock)->pool = pool;
    (*rwlock)->readevent = CreateEvent(NULL, TRUE, FALSE, NULL);
    (*rwlock)->finished = CreateEvent(NULL, FALSE, TRUE, NULL);
    (*rwlock)->writemutex = CreateMutex(NULL, FALSE, NULL);
    (*rwlock)->counter = -1;
    (*rwlock)->wrcounter = 0;
    return APR_SUCCESS;
}

APR_DECLARE(apr_status_t) apr_thread_rwlock_rdlock(apr_thread_rwlock_t *rwlock)
{
    if (InterlockedIncrement(&rwlock->counter) == 0) {
        WaitForSingleObject(rwlock->finished, INFINITE);
        SetEvent(rwlock->readevent);
    }
    WaitForSingleObject(rwlock->readevent, INFINITE);
    return APR_SUCCESS;
}

APR_DECLARE(apr_status_t) apr_thread_rwlock_wrlock(apr_thread_rwlock_t *rwlock)
{
    WaitForSingleObject(rwlock->writemutex, INFINITE);
    WaitForSingleObject(rwlock->finished, INFINITE);
    rwlock->wrcounter++;
    return APR_SUCCESS;
}

APR_DECLARE(apr_status_t) apr_thread_rwlock_unlock(apr_thread_rwlock_t *rwlock)
{
    if (rwlock->wrcounter) {
        /* If wrcounter is > 0, then we must have a writer lock */
        rwlock->wrcounter--;
        SetEvent(rwlock->finished);
        ReleaseMutex(rwlock->writemutex);
    }
    else {
        if (InterlockedDecrement(&rwlock->counter) < 0) {
            ResetEvent(rwlock->readevent);
            SetEvent(rwlock->finished);
        }
    }
    return APR_SUCCESS;
}

APR_DECLARE(apr_status_t) apr_thread_rwlock_destroy(apr_thread_rwlock_t *rwlock)
{
    return apr_pool_cleanup_run(rwlock->pool, rwlock, thread_rwlock_cleanup);
}

Alexander Terekhov

Sep 17, 2002, 6:54:28 AM

Hovik Melikyan wrote:
[...]

> BTW, as someone who writes threading algorithms, what do you think
> about the rwlock implementation for Win32 found in APR (Apache
> Portable Runtime)? Win32 lacks posix-like conditions, but it offers
> Event instead. APR uses two Event objects and a mutex to implement
> rwlocks, but I'm not sure I understand this code and how it
> prioritizes the locks. The thing is that I'm writing a portable
> library for multithreading and networking and I need to port the
> rwlock interface to Win32 too.

http://sources.redhat.com/pthreads-win32
(Win32 condvars emulation, to begin with)

http://groups.google.com/groups?selm=3B166244.F923B993%40web.de
(Subject: Re: how to distinguish between read and write locks?)

"....
for example (version w/o condition variables and with
"no read/write preference" policy; pseudo-code): ...."

> Below is the APR rwlock module for Win32. Any comments and
> explanations would be greatly appreciated. Thanks.

Comment: <as usual> MS-threading stuff is UTTERLY BRAIN-DEAD
and, for ENTIRELY that reason, this rwlock is totally broken
as well (unless I'm just missing and/or misunderstanding
something).

Explanation...

[...]


> APR_DECLARE(apr_status_t) apr_thread_rwlock_create(apr_thread_rwlock_t
> **rwlock,
> apr_pool_t *pool)
> {
> (*rwlock) = apr_palloc(pool, sizeof(**rwlock));
> (*rwlock)->pool = pool;
> (*rwlock)->readevent=CreateEvent(NULL,TRUE,FALSE,NULL);

That's manual reset event used by the first/last reader to
allow/disallow "more readers" to come in.

> (*rwlock)->finished = CreateEvent(NULL,FALSE,TRUE,NULL);

That's a binary semaphore (MS auto-reset event); it's acquired/
released by the first/last reader and by writers (the recursive
write case is broken, BTW).

> (*rwlock)->writemutex = CreateMutex(NULL,FALSE,NULL);

Well, that's MS "Mutex" {recursive} beast and it's used here to
exclude concurrent writes.

> (*rwlock)->counter = -1;

Count of pending and/or active readers.

> (*rwlock)->wrcounter = 0;

That's recursive write locking count (again, it's broken for > 1).

> return APR_SUCCESS;
> }
>
> APR_DECLARE(apr_status_t) apr_thread_rwlock_rdlock(apr_thread_rwlock_t
> *rwlock)
> {
> if (InterlockedIncrement(&rwlock->counter) == 0) {

This MS-thing should really provide ACQUIRE memory sync.
semantics in order to allow unlock to see proper value of
rwlock->wrcounter variable -- and you should really ask MS
folks whether it really works that way... ;-)

Anyway, here we've got a race condition with respect to
unlock (last reader in the "previous" readers batch).

> WaitForSingleObject(rwlock->finished, INFINITE);

Here's the point where it's decided who's next -- either a whole
bunch of readers [the first one will release all the others
(next line) and, AFAICS, a continuous stream of subsequent
readers WILL result in writer starvation here] or some
writer.

> SetEvent(rwlock->readevent);

Return is missing here, sort of. ;-)

> }
> WaitForSingleObject(rwlock->readevent,INFINITE);

Here pending readers wait to be released by the
first reader in the batch that was granted access
via the "finished" sema.

> return APR_SUCCESS;
> }
>
> APR_DECLARE(apr_status_t) apr_thread_rwlock_wrlock(apr_thread_rwlock_t
> *rwlock)
> {
> WaitForSingleObject(rwlock->writemutex,INFINITE);

Mutual exclusion among writers.

> WaitForSingleObject(rwlock->finished, INFINITE);

Write lock acquisition; here it competes with some lucky
reader -- first one in the batch of readers (and priorities
don't really work for that reason, BTW).

> rwlock->wrcounter++;

Recursive write lock counting.

> return APR_SUCCESS;
> }
>
> APR_DECLARE(apr_status_t) apr_thread_rwlock_unlock(apr_thread_rwlock_t
> *rwlock)
> {
> if (rwlock->wrcounter) {

``What am I?'' (might well be broken for readers, see above)

> /* If wrcounter is > 0, then we must have a writer lock */
> rwlock->wrcounter--;
> SetEvent(rwlock->finished);

BUG. if ( 0 == --rwlock->wrcounter ) SetEvent(rwlock->finished);

> ReleaseMutex(rwlock->writemutex);
> }
> else {
> if (InterlockedDecrement(&rwlock->counter) < 0) {

Bye-bye. Here's the major problem: another batch of readers
(all but first one) may easily pass lock() prior to ResetEvent(
rwlock->readevent) below... with concurrent access granted to
some pending writer (on "finished" sema) as well -- result of
SetEvent(rwlock->finished) below.

> ResetEvent(rwlock->readevent);
> SetEvent(rwlock->finished);

Last reader kinda turns-off-the-lights (race condition).

Hmmm. Am I missing and/or misunderstanding something?

regards,
alexander.


Sean P. Burke

Sep 17, 2002, 12:13:22 PM

ho...@moon.yerphi.am (Hovik Melikyan) writes:

> Hi Sean,
>
> sbu...@dev0.welkyn.com (Sean P. Burke) wrote in message news:<82ptvds...@dev0.welkyn.com>...
> >
> > My implementation was intended to allow a single writer thread
> > to alternate with a group of reader threads for access to the
> > lock - an attempt at "fairness".
>
> Do you think "ladies first" is fair, especially if the female
> population prevails? (I mean, if readers are ladies) :) Seriously,
> from the point of view of practical programming it's much easier to
> give writers exclusive privileges and forget about the problem of
> starvation than to make the user of your module think about
> whether his/her application will starve with your rwlock. With your
> scheme, I'll have to test and evaluate the number of concurrent reads
> and writes in my app to see if writes are possible at all. The
> writer-priority scheme does not burden the developer with this kind of
> problem. (How about making your code conditional at run-time, btw? I
> mean, one additional bool flag, something like writer_prioritized.)

I wrote "fairness" in quotes because people often disagree about
what is "fair", and not just in the field of computer software! :^>

To be serious, this lock shouldn't starve anyone. The intention is
that when a writer releases a lock, any currently pending readers
are allowed in. If there are also pending writers, then any new
readers attempting to acquire the lock will wait until the next
writer has finished, so that no thread is locked out indefinitely.
If you find that the code doesn't do that, please let me know.

> > Your change appears to give
> > writers priority in all cases, which may be preferable, but
> > how does it become non-reentrant?
>
> Frankly, I don't remember how I came to this conclusion. Most likely I
> was wrong in saying it is not reentrant.

The "fairness" policy I described above will deadlock if readers
take locks recursively. Probably that's what you were thinking of.
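To see why: under writer priority, the read-lock wait predicate refuses new readers whenever a writer is pending, so a thread that already holds a read lock and tries to take it again can end up blocked behind a writer that is itself blocked waiting for all readers to leave. A toy model of just that predicate (the function name and signature are mine, for illustration only):

```c
/* Models the reader-admission test of a writer-priority rwlock:
   lock_count > 0 means that many active readers, -1 means an active
   writer.  A reader may enter only if no writer is active AND no
   writer is waiting. */
int reader_may_enter(int lock_count, int waiting_writers)
{
    return !(lock_count < 0 || waiting_writers > 0);
}
```

With one active reader (lock_count == 1) and one pending writer, reader_may_enter() is false, so a recursive read attempt by the lock holder blocks; the writer blocks too, since lock_count never returns to 0. Deadlock.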

> BTW, as someone who writes threading algorithms, what do you think
> about the rwlock implementation for Win32 found in APR (Apache
> Portable Runtime)? Win32 lacks posix-like conditions, but it offers
> Event instead. APR uses two Event objects and a mutex to implement
> rwlocks, but I'm not sure I understand this code and how it
> prioritizes the locks. The thing is that I'm writing a portable
> library for multithreading and networking and I need to port the
> rwlock interface to Win32 too.
>
> Below is the APR rwlock module for Win32. Any comments and
> explanations would be greatly appreciated. Thanks.

I'd like to help, but the only thing I know about Win32 threads
is that pthread-win32 runs on them.

-SEan

Hovik Melikyan

Sep 18, 2002, 7:07:55 AM
Thanks for your comments, Alexander.

Alexander Terekhov <tere...@web.de> wrote in message >

> http://sources.redhat.com/pthreads-win32
> (Win32 condvars emulation, to begin with)
>

This is a condvar implementation, which in turn is based on two
semaphores and a mutex. As a result, you have 4 semaphores and 4
mutexes for a single rwlock object! Hmmm... I'd rather think about
using native Win32 objects than stick to anti-Microsoft sentiment or
whatever. The reality is that I want my daemon to work on Win32 too,
and I don't want it to be just a clumsy 'emulated' thing with a bunch
of soft pillows, that is, emulation DLLs. After all, while implicitly
promoting Unix, Apache and MySQL still run perfectly on almost
anything, even on a cell phone :) and maybe that's why they're number
one, aren't they?

In fact, if we can make sure the APR code works, we can suggest it for
pthreads-win32 too. Let's see if we are on the right track at all.

So, an Event is a binary semaphore with two different modes of
operation: signaling a manual-reset event releases all threads waiting
for the event; signaling an auto-reset event lets only one thread in
and then returns to the non-signaled state. As you can see, an Event's
behaviour somewhat resembles pthread_cond_signal and
pthread_cond_broadcast, but of course it's not quite the same.
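To make the comparison concrete, here is roughly how such an Event could be modeled on top of a pthread mutex and condvar. (A sketch only; the type and function names are made up, not any real API, and edge cases such as SetEvent racing with a new waiter are simplified.)

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t mtx;
    pthread_cond_t  cnd;
    int signaled;
    int manual_reset;
} event_t;

void event_init(event_t *e, int manual_reset, int initially_signaled)
{
    pthread_mutex_init(&e->mtx, NULL);
    pthread_cond_init(&e->cnd, NULL);
    e->signaled = initially_signaled;
    e->manual_reset = manual_reset;
}

void event_set(event_t *e)              /* like SetEvent */
{
    pthread_mutex_lock(&e->mtx);
    e->signaled = 1;
    if (e->manual_reset)
        pthread_cond_broadcast(&e->cnd);  /* release every waiter */
    else
        pthread_cond_signal(&e->cnd);     /* release at most one */
    pthread_mutex_unlock(&e->mtx);
}

void event_reset(event_t *e)            /* like ResetEvent */
{
    pthread_mutex_lock(&e->mtx);
    e->signaled = 0;
    pthread_mutex_unlock(&e->mtx);
}

void event_wait(event_t *e)   /* like WaitForSingleObject(..., INFINITE) */
{
    pthread_mutex_lock(&e->mtx);
    while (!e->signaled)
        pthread_cond_wait(&e->cnd, &e->mtx);
    if (!e->manual_reset)
        e->signaled = 0;      /* auto-reset: consume the signal */
    pthread_mutex_unlock(&e->mtx);
}
```

Note that an auto-reset event set with no waiters stays signaled until some later wait consumes it, whereas a pthread_cond_signal with no waiters is simply lost; that's the essential difference between the two models.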

I've re-written the APR code into a kind of OO pseudo-code; I'll then
try to answer your comments. (Actually, I did write a wrapper class
for this event/mutex implementation on Windows and tested it.
Surprisingly, it works perfectly :) even under very stressful
conditions.)

Cast:
writelock - a mutex
oktoread - a manual reset event
finished - an auto-reset event
readercnt - initialized to -1
writercnt - initialized to 0

1: void rwlock::rdlock()
2: {
3:     if (atomic_increment(&readercnt) == 0)
4:     {
5:         finished.wait();
6:         oktoread.post();
7:     }
8:     oktoread.wait();
9: }
10:
11: void rwlock::wrlock()
12: {
13:     writelock.lock();
14:     finished.wait();
15:     writercnt++;
16: }
17:
18: void rwlock::unlock()
19: {
20:     if (writercnt != 0)
21:     {
22:         writercnt--;
23:         finished.post();
24:         writelock.unlock();
25:     }
26:     else if (atomic_decrement(&readercnt) < 0)
27:     {
28:         oktoread.reset();
29:         finished.post();
30:     }
31: }

>
> > if (InterlockedIncrement(&rwlock->counter) == 0) {
>
> This MS-thing should really provide ACQUIRE memory sync.
> semantics in order to allow unlock to see proper value of
> rwlock->wrcounter variable -- and you should really ask MS
> folks whether it really works that way... ;-)
>

Are you talking about InterlockedIncrement? If so, indeed, it's a
strange thing that behaves differently on the Win95+ and NT+ families.
But anyway, you can safely test the result for 0 and for sign on any
platform (it's a CPU compatibility issue). As for not using atomic_inc
or _dec for writercnt, it's still safe: a reader can never see anything
other than 0 in writercnt (on line 20).
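The contract being relied on here -- the new value of the counter comes back as part of one atomic action, so it can safely be tested for zero or for sign -- can be modeled precisely with C11-style atomics (a sketch; C11 postdates this discussion, and the function names below are mine):

```c
#include <stdatomic.h>

/* Models the InterlockedIncrement/InterlockedDecrement contract
   (NT semantics): the NEW value is produced atomically with the
   update, so testing it for zero or sign is race-free. */

long atomic_increment(atomic_long *p)
{
    return atomic_fetch_add(p, 1) + 1;   /* the new value */
}

long atomic_decrement(atomic_long *p)
{
    return atomic_fetch_sub(p, 1) - 1;   /* the new value */
}
```

With the counter initialized to -1 as in the APR code, the first reader's increment yields exactly 0, and the last reader's decrement yields exactly -1, each observed by exactly one thread.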

>
> Anyway, here we've got a race condition with respect to
> unlock (last reader in the "previous" readers batch).
>

Could you explain why?

>
> > WaitForSingleObject(rwlock->finished, INFINITE);
>
> Here's the point to decided who's next -- either a whole
> bunch of readers [the first one will release all others
> (next line) and, AFAICS, continuous stream of subsequent
> readers WILL result in writers starvation here) or some
> writer.
>

This is a reader-priority scheme, but not because of that. Take a look
at lines 23 and 29: both the writer and the last reader do
finished.post(); however, the next writer is stuck on a mutex rather
than on the 'finished' event, and that's why readers will get in
first.

> > WaitForSingleObject(rwlock->writemutex,INFINITE);
>
> Mutual exclusion among writers.
>
> > WaitForSingleObject(rwlock->finished, INFINITE);
>
> Write lock acquisition; here it competes with some lucky
> reader -- first one in the batch of readers (and priorities
> don't really work for that reason, BTW).
>

Right, the next finished.post() will randomly choose a reader or a
writer, which makes this scheme somewhat "fairer" than plain
reader-priority. Hence, what we actually have here is: writers ALWAYS
release readers when they finish, and readers MAY release writers or
readers (presumably depending on the order in which they waited on the
'finished' event). Isn't that "fair"?

>
> > if (rwlock->wrcounter) {
>
> ``What am I?'' (might well be broken for readers, see above)
>

Actually not, see above.

> > /* If wrcounter is > 0, then we must have a writer lock */
> > rwlock->wrcounter--;
> > SetEvent(rwlock->finished);
>
> BUG. if ( 0 == --rwlock->wrcounter ) SetEvent(rwlock->finished);
>

I don't think so. This code is protected with a mutex, so writecnt can
never be > 1. It's just either 0 or 1.

>
> > ReleaseMutex(rwlock->writemutex);
> > }
> > else {
> > if (InterlockedDecrement(&rwlock->counter) < 0) {
>
> Bye-bye. Here's the major problem: another batch of readers
> (all but first one) may easily pass lock() prior to ResetEvent(
> rwlock->readevent) below... with concurrent access granted to
> some pending writer (on "finished" sema) as well -- result of
> SetEvent(rwlock->finished) below.
>

Nope! The thing is that if there are readers waiting on line 8, it
means readercnt is > 0 and thus the exiting reader on line 26 will
not execute lines 28/29. On the other hand, if readercnt is 0, there
seems to be a race condition between lines 3 and 26, but the order of
waits and posts on the two event objects actually prevents it.

> > ResetEvent(rwlock->readevent);
> > SetEvent(rwlock->finished);
>
> Last reader kinda turns-off-the-lights (race condition).
>

Why?


The other question is whether this rwlock is reentrant. And besides,
can the priorities be changed here, if that's possible at all? Other
than that, it seems pretty good to me.

Thanks,

--
Hovik Melikyan

Hovik Melikyan

Sep 18, 2002, 7:23:41 AM
sbu...@dev0.welkyn.com (Sean P. Burke) wrote in message news:<82hegos...@dev0.welkyn.com>...

>
> To be serious, this lock shouldn't starve anyone. The intention is
> that when a writer releases a lock, any currently pending readers
> are allowed in. If there are also pending writers, then any new
> readers attempting to acquire the lock will wait until the next
> writer has finished, so that no thread is locked out indefinitely.
>

... which means that if there are many writers (more than readers)
continuously approaching the point of locking, they will have priority
over the poor readers and will never let them in.

My concern is that in the real world of networking, any "unfairness" of
this kind opens up the possibility of denial-of-service attacks. If
someone knows how you distribute priorities between threads, he can
deliberately cause starvation and hence degradation of your service.
The best thing for the developer is to know that the service he writes
cannot be broken that way.

Regards,

--
Hovik Melikyan

Alexander Terekhov

Sep 18, 2002, 8:43:57 AM

Hovik Melikyan wrote:
>
> Thanks for your comments, Alexander.
>
> Alexander Terekhov <tere...@web.de> wrote in message >
> > http://sources.redhat.com/pthreads-win32
> > (Win32 condvars emulation, to begin with)
> >
>
> This is a condvar implementation, which in turn is based on two
> semaphores and a mutex. As a result, you have 4 semaphores and 4
> mutexes for a single rwlock object! Hmmm... I'd rather think about
> using native Win32 objects than stick to anti-Microsoft sentiment
> or whatever.

Have you also noticed that pthreads-win32's rwlock object is
actually built along the lines of: < from my previous reply >

: http://groups.google.com/groups?selm=3B166244.F923B993%40web.de


: (Subject: Re: how to distinguish between read and write locks?)
:
: "....
: for example (version w/o condition variables and with
: "no read/write preference" policy; pseudo-code): ...."

where cndSharedAccessCompleted could easily be replaced with a
binary sema (MS auto-reset event), thread cancellation aside
for a moment?

http://sources.redhat.com/cgi-bin/cvsweb.cgi/pthreads/implement.h?rev=1.88&content-type=text/x-cvsweb-markup&cvsroot=pthreads-win32

http://sources.redhat.com/cgi-bin/cvsweb.cgi/pthreads/pthread_rwlock_unlock.c?rev=1.1&content-type=text/x-cvsweb-markup&cvsroot=pthreads-win32

http://sources.redhat.com/cgi-bin/cvsweb.cgi/pthreads/pthread_rwlock_wrlock.c?rev=1.2&content-type=text/x-cvsweb-markup&cvsroot=pthreads-win32

http://sources.redhat.com/cgi-bin/cvsweb.cgi/pthreads/ptw32_rwlock_cancelwrwait.c?rev=1.1&content-type=text/x-cvsweb-markup&cvsroot=pthreads-win32

> The reality is [...Unix, Apache, etc...]

The "reality" is also this:

http://groups.google.com/groups?selm=%23AsFvRtHCHA.2144%40tkmsftngp13
(Subject: Re: Apache won't compile)

"....the new config.guess and config.sub understand about Interix3 and
are doing pretty well. ...."

> So, Event is a binary semaphore with two different modes of operation:
> when signaling a manual-reset event, it releases all threads waiting
> for the event; when signaling an auto-reset event, it lets only one
> thread in and remains in a non-signaled state.

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/setevent.asp

"The state of a manual-reset event object remains signaled until
it is set explicitly to the nonsignaled state by the ResetEvent
function. Any number of waiting threads, or threads that
subsequently begin wait operations for the specified event
object by calling one of the wait functions, can be released
while the object's state is signaled.

The state of an auto-reset event object remains signaled until
a single waiting thread is released, at which time the system
automatically sets the state to nonsignaled. If no threads are
waiting, the event object's state remains signaled.

Setting an event that is already reset has no effect."

> As you can see Event's
> behaviour is somewhat resembling pthread_cond_signal and
> pthread_cond_broadcast, but of course it's not quite the same.

http://groups.google.com/groups?selm=3CE0BCD3.EF695748%40web.de
(Subject: Re: The implementation of condition variables in pthreads-win32)

> I'm re-writing the APR code to some kind of a OO pseudo-code. I'll try
> then to answer to your comments. (Actually I did write a wrapper class
> for this event/mutex implementation on Windows and tested it.
> Surprisingly it works perfectly :) even under very stressing

> conditions.) ....

Yeah, and you can easily break it with one or two Sleeps added,
I guess. ;-)

> > > if (InterlockedIncrement(&rwlock->counter) == 0) {
> >
> > This MS-thing should really provide ACQUIRE memory sync.
> > semantics in order to allow unlock to see proper value of
> > rwlock->wrcounter variable -- and you should really ask MS
> > folks whether it really works that way... ;-)
> >
>

> Are you talking about InterlockedIncrement? ....

http://groups.google.com/groups?threadm=c29b5e33.0202140541.7df2bb9d%40posting.google.com
(Subject: Re: Can multiple threads set a global variable simultaneously?)

> > Anyway, here we've got a race condition with respect to
> > unlock (last reader in the "previous" readers batch).
> >
>
> Could you explain why?

See below.

> > > WaitForSingleObject(rwlock->finished, INFINITE);
> >
> > Here's the point to decided who's next -- either a whole
> > bunch of readers [the first one will release all others
> > (next line) and, AFAICS, continuous stream of subsequent
> > readers WILL result in writers starvation here) or some
> > writer.
> >
>
> This is a reader-priority scheme, but not because of that. Take a look
> at lines 23 and 29: both the writer and the last reader make
> finished.post(), however, the next writer is stuck on a mutex, rather
> than on the 'finished' event, and that's why readers will come in
> first.

Not really. See the event implementation pseudo-code (link above).

> > > WaitForSingleObject(rwlock->writemutex,INFINITE);
> >
> > Mutual exclusion among writers.
> >
> > > WaitForSingleObject(rwlock->finished, INFINITE);
> >
> > Write lock acquisition; here it competes with some lucky
> > reader -- first one in the batch of readers (and priorities
> > don't really work for that reason, BTW).
> >
>
> Right, the next finished.post() will randomly choose a reader or a
> writer, which makes this scheme somewhat "fairer" than just
> reader-priority. Hence, what we actually have here is: writers ALWAYS
> release readers when they finish, and readers MAY release writers or
> readers (presumably, depending on the order they locked the 'finished'
> event). Isn't it "fair"?

See rwlock implementation with one single "entry" lock ("try"-
readers aside) which is supposed to make it "kinda fair" (link
above).

> > > if (rwlock->wrcounter) {
> >
> > ``What am I?'' (might well be broken for readers, see above)
> >
>
> Actually not, see above.

Right. "see above" ;-)

> > > /* If wrcounter is > 0, then we must have a writer lock */
> > > rwlock->wrcounter--;
> > > SetEvent(rwlock->finished);
> >
> > BUG. if ( 0 == --rwlock->wrcounter ) SetEvent(rwlock->finished);
> >
>
> I don't think so. This code is protected with a mutex, so writecnt can
> never be > 1. It's just either 0 or 1.

The code is protected with an MS >>recursive<< mutex. To make it
"correct" (other problems aside for a moment), you'd need to
add one more "if":

APR_DECLARE(apr_status_t) apr_thread_rwlock_wrlock(apr_thread_rwlock_t *rwlock)
{
    WaitForSingleObject(rwlock->writemutex, INFINITE);

    if ( 0 == rwlock->wrcounter )
        WaitForSingleObject(rwlock->finished, INFINITE);

    rwlock->wrcounter++;
    return APR_SUCCESS;
}

> > > ReleaseMutex(rwlock->writemutex);


> > > }
> > > else {
> > > if (InterlockedDecrement(&rwlock->counter) < 0) {
> >
> > Bye-bye. Here's the major problem: another batch of readers
> > (all but first one) may easily pass lock() prior to ResetEvent(
> > rwlock->readevent) below... with concurrent access granted to
> > some pending writer (on "finished" sema) as well -- result of
> > SetEvent(rwlock->finished) below.
> >
>
> Nope! The thing is that if there are readers waiting on line 8, it
> means readercnt is > 0 and thus, the exiting reader on line 26 will
> not execute lines 28/29. On the other hand, if readercnt is 0, there
> seems to be a race condition between lines 3 and 26, but the order of
> wait's and post's on both event objects actually prevents it.
>
> > > ResetEvent(rwlock->readevent);
> > > SetEvent(rwlock->finished);
> >
> > Last reader kinda turns-off-the-lights (race condition).
> >
>
> Why?

Try to imagine what could happen if you'd add this:

: > > > ReleaseMutex(rwlock->writemutex);


: > > > }
: > > > else {
: > > > if (InterlockedDecrement(&rwlock->counter) < 0) {

----------------> Sleep( rand() ); // Let's BREAK it...

> The other question is whether this rwlock is reentrant.

What do you mean?

regards,
alexander.

Sean P. Burke

Sep 18, 2002, 11:34:11 AM

ho...@moon.yerphi.am (Hovik Melikyan) writes:

> sbu...@dev0.welkyn.com (Sean P. Burke) wrote in message news:<82hegos...@dev0.welkyn.com>...
> >
> > To be serious, this lock shouldn't starve anyone. The intention is
> > that when a writer releases a lock, any currently pending readers
> > are allowed in. If there are also pending writers, then any new
> > readers attempting to acquire the lock will wait until the next
> > writer has finished, so that no thread is locked out indefinitely.
> >
>
> ... which means that if there are many writers (more than readers)
> continuously approaching the point of locking, they will have priority
> over the poor readers and will never let them in.

Perhaps you overlooked this subtlety - we set lock->active_readers
to be -1 in the case of an active writer, so the write-lock operation
waits on the ok_to_write condition if the lock is held by anyone:

while (lock->valid && lock->active_readers)
pthread_cond_wait(&lock->ok_to_write, &lock->mutex);

When a writer thread releases the lock, it favors pending
readers:

if (lock->pending_readers != 0)
/* Wake all readers.
*/
pthread_cond_broadcast(&lock->ok_to_read);

else
/* Wake a single writer.
*/
pthread_cond_signal(&lock->ok_to_write);

Now, I suppose that a new writer thread could grab the lock
in a narrow window after the old writer releases the mutex,
but before a reader thread waking from the cond_wait can
acquire the mutex. I would expect this to be rare.

> My concern is that in the real world of networking any "unfairness" of
> such kind opens a possibility of denial-of-service attacks. If someone
> knows how you distribute priorities between threads, he can
> deliberately cause starvation and hence degradation of your service.
> The best thing for the developer is to know that the service he writes
> can not be broken that way.

Well, there's no need to speculate, since we can test the code.
Were you able to demonstrate lockout of readers in your testing?

-SEan

Hovik Melikyan

Sep 19, 2002, 4:00:48 AM9/19/02
to
sbu...@dev0.welkyn.com (Sean P. Burke) wrote in message
> Well, there's no need to speculate, since we can test the code.
> Were you able to demonstrate lockout of readers in your testing?

Well, I used a program that starts some number of threads. Each thread
iteratively locks the rwlock object for reading or writing, randomly
chosen but with a prespecified ratio, e.g. 1 writer and 5 readers,
or vice versa. Readers plot dots, and writers plot w's. The picture
you get when running the test gives an approximate notion of what's
going on, who's starving, etc. I tried both your code and my
modifications to it. Actually the test output is much longer; I'm
giving just a few lines from each run:

SEAN, readers/writers = 5/1

w........ww...w........w..w..ww............w...w..w..w..........w....w.........
......ww..........................w........w..w......w.....w.ww..w...ww.w..w..w

SEAN, readers/writers = 1/5

.www.w.www..w.wwwwww..wwwwwwww.wwwwwwwwwwww.wwwwww..wwwwwwwwwww.ww.w.wwwwwwwwww
wwwwwwww..wwwwww..w..wwwwwwwwww.w.www.wwww..wwwwwww.w.wwwww.wwwwwww.wwwwwww.w.w

HOVIK, readers/writers = 5/1

....wwwwwwwwwww.........................................wwwwwwwwwwwww..........
....................www.................................wwwwwww................

HOVIK, readers/writers = 1/5

wwwwwwwwwwwwwwwwwwwww..............................wwwwwwwwwwwwwwwwwwwwwwwwwwww
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww.......


In your case we see a much better distribution than in mine. However,
yours runs approx. 3-4 times slower (naturally). The maximum
number of readers that entered the critical section simultaneously is
twice as high in my case.

All in all, despite its slowness, the better distribution in your case
is convincing. I'm going to use your algorithm in my library, if you
allow. It's an open-source library and can be found at
http://www.melikyan.com/ptypes/

Thanks,

--
Hovik Melikyan

-----------------------------------------------------------------------------

//
// rwlock test
//

const int rw_max_threads = 30;
const int rw_max_tries = 30;
const int rw_max_delay = 10;
const int rw_rw_ratio = 5;
const bool rw_swap = false;


class rwthread: public thread
{
protected:
    virtual void execute();
public:
    rwthread(): thread(false) {}
    virtual ~rwthread() { waitfor(); }
};


rwlock rw;

int reader_cnt = 0;
int writer_cnt = 0;
int total_writers = 0;
int total_readers = 0;
int max_readers = 0;


int prand(int max)
{
    return rand() % max;
}


void rwthread::execute()
{
    for (int i = 0; i < rw_max_tries; i++)
    {
        psleep(prand(rw_max_delay));
        bool writer = prand(rw_rw_ratio) == 0;
        if (writer ^ rw_swap)
        {
            rw.wrlock();
            pout.put('w');
            if (pincrement(&writer_cnt) > 1)
                fatal(0xa0, "Writer: Huh?! Writers in here?");
            pincrement(&total_writers);
        }
        else
        {
            rw.rdlock();
            pout.put('.');

            // we have a race condition here, so max_readers
            // wouldn't be that precise.
            int t;
            if ((t = pincrement(&reader_cnt)) > max_readers)
                max_readers = t;

            if (writer_cnt > 0)
                fatal(0xa1, "Reader: Huh?! Writers in here?");
            pincrement(&total_readers);
        }

        psleep(prand(rw_max_delay));
        if (writer ^ rw_swap)
            pdecrement(&writer_cnt);
        else
            pdecrement(&reader_cnt);

        rw.unlock();
    }
}


void rwlock_test()
{
    // a bug in MSC does not allow us to declare
    // an array of static thread objects :(
    rwthread* threads[rw_max_threads];

    srand((unsigned)time(0));

    int i;
    for (i = 0; i < rw_max_threads; i++)
    {
        threads[i] = new rwthread();
        threads[i]->start();
    }
    for (i = 0; i < rw_max_threads; i++)
        delete threads[i];

    pout.putf("\nmax readers: %d\n", max_readers);
}

t...@cs.ucr.edu

Sep 19, 2002, 5:24:50 AM9/19/02
to
Hovik Melikyan <ho...@moon.yerphi.am> wrote:
+ sbu...@dev0.welkyn.com (Sean P. Burke) wrote in message
+> Well, there's no need to speculate, since we can test the code.
+> Were you able to demonstrate lockout of readers in your testing?

+ Well, I used a program that starts some number of threads. Each thread
+ iteratively locks the rwlock object for reading or writing, randomly
+ chosen but with a prespecified ratio, e.g. 1 writer and 5 readers,
+ or vice versa. Readers plot dots, and writers plot w's. The picture
+ you get when running the test gives an approximate notion of what's
+ going on, who's starving, etc. I tried both your code and my
+ modifications to it. Actually the test output is much longer; I'm
+ giving just a few lines from each run:

+ SEAN, readers/writers = 5/1

+ w........ww...w........w..w..ww............w...w..w..w..........w....w.........
+ ......ww..........................w........w..w......w.....w.ww..w...ww.w..w..w

+ SEAN, readers/writers = 1/5

+ .www.w.www..w.wwwwww..wwwwwwww.wwwwwwwwwwww.wwwwww..wwwwwwwwwww.ww.w.wwwwwwwwww
+ wwwwwwww..wwwwww..w..wwwwwwwwww.w.www.wwww..wwwwwww.w.wwwww.wwwwwww.wwwwwww.w.w

+ HOVIK, readers/writers = 5/1

+ ....wwwwwwwwwww.........................................wwwwwwwwwwwww..........
+ ....................www.................................wwwwwww................

+ HOVIK, readers/writers = 1/5

+ wwwwwwwwwwwwwwwwwwwww..............................wwwwwwwwwwwwwwwwwwwwwwwwwwww
+ wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww.......


+ In your case we see a much better distribution than in mine. However,
+ yours runs approx. 3-4 times slower (naturally). The maximum
+ number of readers that entered the critical section simultaneously is
+ twice as high in my case.

Hmmmmm. I have difficulty accounting for that "3-4 times slower". If
I understand correctly, each algorithm takes the same amount of time
to accomplish the writing tasks, but Hovik's writers-get-priority
policy leads to about half as many batches of concurrent readers.

Even if Hovik's algorithm cut reading time to zero, I can't see a
factor of 3-4 speedup. What am I missing?

Tom Payne

David Schwartz

Sep 19, 2002, 6:00:30 AM9/19/02
to
Hovik Melikyan wrote:

> ... which means, if there are many writers (more than readers)
> continuously approaching the point of locking, they will have priority
> over poor readers and they will never let them in.

If there are many writers, more than readers, you shouldn't be using an
rwlock.



> My concern is that in the real world of networking any "unfairness" of
> such kind opens a possibility of denial-of-service attacks. If someone
> knows how you distribute priorities between threads, he can
> deliberately cause starvation and hence degradation of your service.

If you don't want writer priority, you probably don't want an rwlock.

> The best thing for the developer is to know that the service he writes
> can not be broken that way.

If you are really concerned about 'fairness', you probably want a
mode-switching rwlock. It has a more complex state machine than a
regular rwlock.

Essentially, what it does is it allows one writer through, then all
presently waiting readers. As soon as a write lock is requested, no new
readers are granted the lock. But two consecutive write locks are not
granted if there are waiting readers.

So it basically alternates. One writer, all waiting readers, one
writer, all waiting readers. Where 'waiting' means that they've been
waiting since before the writer released the lock.

DS

Hovik Melikyan

Sep 19, 2002, 10:14:22 AM9/19/02
to
> The state of an auto-reset event object remains signaled until
> a single waiting thread is released, at which time the system
> automatically sets the state to nonsignaled. If no threads are
> waiting, the event object's state remains signaled.
>

The description is incomplete: "If no threads are waiting, the event
object's state remains signaled" ... until the next thread waits on
the object. However, waits and sets are NOT balanced the way they are
for an ordinary semaphore. If you call SetEvent() twice with no waiter
in between, it doesn't necessarily mean you will release two threads.

This is important for understanding the implementation of rwlock on
Win32.

>
> > > > if (InterlockedIncrement(&rwlock->counter) == 0) {
> > >
> > > This MS-thing should really provide ACQUIRE memory sync.
> > > semantics in order to allow unlock to see proper value of
> > > rwlock->wrcounter variable -- and you should really ask MS
> > > folks whether it really works that way... ;-)
> > >
> >
> > Are you talking about InterlockedIncrement? ....
>
> http://groups.google.com/groups?threadm=c29b5e33.0202140541.7df2bb9d%40posting.google.com
> (Subject: Re: Can multiple threads set a global variable simultaneously?)
>

InterlockedXXX rely on 'LOCK XADD', an i486+ instruction. LOCK is for
SMP, actually. I don't understand why I should care about how the
cache works. LOCK locks the main memory for the next instruction, and
that means any instruction with LOCK should work exactly the same way
as without it on a single-CPU system. The CPU _should_ write through,
validate the cache or the main memory or whatever it is doing, I don't
care, to work AS EXPECTED.

(Frankly, if I would become anti-something, I would start with Intel
rather than with Microsoft).

> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/synchro_11df.asp
>
> Does NOT look very convincing to me!
>
> ("Processors can be instructed to force their memory
> caches to agree with main memory..." w.r.t mem.sync
> IS JUST PLAIN MS-STYLE-BULLSHIT, AFAIK/AFAICT/IMHO!)
>

All is not bullshit that smells :) As you can see, they are solving
the problem with InterlockedExchange, because it issues a lock on the
main memory (LOCK XCHG) and thus validates its contents.

>
> > > > if (rwlock->wrcounter) {
> > >
> > > ``What am I?'' (might well be broken for readers, see above)
> > >
> >
> > Actually not, see above.
>
> Right. "see above" ;-)
>

Wrong... (Does anyone look above in this list, after all?!) :)

> > > > /* If wrcounter is > 0, then we must have a writer lock */
> > > > rwlock->wrcounter--;
> > > > SetEvent(rwlock->finished);
> > >
> > > BUG. if ( 0 == --rwlock->wrcounter ) SetEvent(rwlock->finished);
> > >
> >
> > I don't think so. This code is protected with a mutex, so writecnt can
> > never be > 1. It's just either 0 or 1.
>
> The code is protected with MS >>recursive<< mutex. To make it
> "correct" (other problems aside for a moment), you'd need to
> add more "if":
>

... only if the 'finished' event is recursive too. I'm not sure it is.
I'm gonna play with it a little to see what it's doing... Kind of
'experimental programming', which reminds me of my true specialty -
experimental physics :)

>
> Try to imagine what could happen if you'd add this:
>
> : > > > ReleaseMutex(rwlock->writemutex);
> : > > > }
> : > > > else {
> : > > > if (InterlockedDecrement(&rwlock->counter) < 0) {
> ----------------> Sleep( rand() ); // Let's BREAK it...
>

I tried it, and again, it worked fine. With 100 threads, or just 2 or
even 1. With Sleep or sleepless - it still works. How many threads and
what situation do you need to break it, exactly?

Regards,

--
Hovik Melikyan

Hovik Melikyan

Sep 19, 2002, 11:00:00 AM9/19/02
to
David Schwartz <dav...@webmaster.com> wrote in message news:<3D89A03E...@webmaster.com>...

>
> > If there are many writers, more than readers, you shouldn't be using an
> rwlock.
>

Why shouldn't I? If writers prevail, I would give them higher priority
but still, I want my readers to do their job simultaneously as they
don't bother each other when reading the shared resource.

>
> > The best thing for the developer is to know that the service he writes
> > can not be broken that way.
>
> If you are really concerned about 'fairness', you probably want a
> mode-switching rwlock. It has a more complex state machine than a
> regular rwlock.
>

Agreed. And a few more words about breaking a service that uses an
rwlock.

Let's say you have an open service with two public functions: Fr and
Fw for reading a shared resource and for writing to it respectively.
If I know that you are using any of the 'unfair' schemes, say,
w-priority, I can open two connections to your service and start
calling Fw repeatedly so that the calls always overlap. Other users
calling Fr will wait forever. Notice, I need only 2 connections, not
1000.

> Essentially, what it does is it allows one writer through, then all
> presently waiting readers. As soon as a write lock is requested, no new
> readers are granted the lock. But two consecutive write locks are not
> granted if there are waiting readers.
>

I think that's exactly what Sean's algorithm is doing. Correct me if
I'm mistaken.

Regards,

--
Hovik Melikyan

Hovik Melikyan

Sep 19, 2002, 11:14:42 AM9/19/02
to
t...@cs.ucr.edu wrote in message news:<amc552$nn6$1...@glue.ucr.edu>...

>
> Hmmmmm. I have difficulty accounting for that "3-4 times slower". If
> I understand correctly, each algorithm takes the same amount of time
> to accomplish the writing tasks, but Hovik's writers-get-priority
> policy leads to about half as many batches of concurrent readers.
>

This comes from real timing tests. In Sean's case the maximum number
of readers is always half the total number of threads, whereas in my
case they are equal. In other words, my algorithm (conventionally
"mine" - it's not mine actually) uses readers more effectively. As for
"3-4 times", as I said, it comes from the tests; I can't tell you
exactly why it's 3-4, although there must be some explanation, of course.

--
Hovik Melikyan

Alexander Terekhov

Sep 19, 2002, 11:33:18 AM9/19/02
to

Hovik Melikyan wrote:
[...]
> InterlockedXXX rely on 'LOCK XADD', an i486+ instruction. ....

Really? How fascinating.

http://search.microsoft.com/default.asp?qu=%22non%2Dx86+systems%22&boolean=ALL&nq=NEW&so=RECCNT&p=1&ig=01&i=00&i=01&i=02&i=03&i=04&i=05&i=06&i=07&i=08&i=09&i=10&i=11&i=12&i=13&i=14&i=15&i=16&i=17&i=18&i=19&i=20&i=21&i=22&i=23&i=24&i=25&i=26&i=27&i=28&i=29&i=30&i=31&i=32&i=33&i=34&i=35&i=36&i=37&i=38&i=39&i=40&i=41&siteid=us/dev
(Search Results for ""non-x86 systems"" using All words)

> > : > > > if (InterlockedDecrement(&rwlock->counter) < 0) {
> > ----------------> Sleep( rand() ); // Let's BREAK it...
> >
>
> I tried it, and again, it worked fine. With 100 threads, or just 2 or
> even 1. With Sleep or sleepless - it still works. How many threads and
> what situation do you need to break it, exactly?

Nah, let's first settle MS-InterlockedXXX-class-action-lawsuit, okay? ;-)

regards,
alexander.

Sean P. Burke

Sep 19, 2002, 1:46:01 PM9/19/02
to

ho...@moon.yerphi.am (Hovik Melikyan) writes:

> David Schwartz <dav...@webmaster.com> wrote in message news:<3D89A03E...@webmaster.com>...
> >
> > > If there are many writers, more than readers, you shouldn't be using an
> > rwlock.
> >
>
> Why shouldn't I? If writers prevail, I would give them higher priority
> but still, I want my readers to do their job simultaneously as they
> don't bother each other when reading the shared resource.

Look at the overhead of the readers/writer lock: in the uncontended
case, you must do two mutex lock/unlock operations for each access
to the protected object, compared to a single lock/unlock if you use
a simple mutex. When writers contend for a lock, the overhead is
three mutex_lock/unlock cycles plus a cond_wait and cond_signal.

Mutex lock and unlock operations are expensive to the extent that they
operate on shared data, and can therefore induce cache ping-pong, and
they invoke memory barriers on most SMP machines. I think that testing
would reveal that many applications of rwlocks could be replaced by
simple mutexes with a net gain in performance.


> > > The best thing for the developer is to know that the service he writes
> > > can not be broken that way.
> >
> > If you are really concerned about 'fairness', you probably want a
> > mode-switching rwlock. It has a more complex state machine than a
> > regular rwlock.
> >
>
> Agreed. And a few more words about breaking a service that uses an
> rwlock.
>
> Let's say you have an open service with two public functions: Fr and
> Fw for reading a shared resource and for writing to it respectively.
> If I know that you are using any of the 'unfair' schemes, say,
> w-priority, I can open two connections to your service and start
> calling Fw repeatedly so that the calls always overlap. Other users
> calling Fr will wait forever. Notice, I need only 2 connections, not
> 1000.
>
> > Essentially, what it does is it allows one writer through, then all
> > presently waiting readers. As soon as a write lock is requested, no new
> > readers are granted the lock. But two consecutive write locks are not
> > granted if there are waiting readers.
> >
>
> I think that's exactly what Sean's algorithm is doing. Correct me if
> I'm mistaken.

I copied the algorithm from Uresh Vahalia's "UNIX Internals: The New
Frontiers", ISBN 0-13-101908-2, a fascinating book that I recommend
most heartily. Curt Schimmel's "UNIX Systems for Modern Architectures",
ISBN 0-201-63338-8, is also an excellent read in the same area.

-SEan

Drazen Kacar

Sep 19, 2002, 2:05:46 PM9/19/02
to
Sean P. Burke wrote:

> Look at the overhead of the readers/writer lock: in the uncontended
> case, you must do two mutex lock/unlock operations for each access
> to the protected object, compared to a single lock/unlock if you use
> a simple mutex. When writers contend for a lock, the overhead is
> three mutex_lock/unlock cycles plus a cond_wait and cond_signal.
>
> Mutex lock and unlock operations are expensive to the extent that they
> operate on shared data, and can therefore induce cache ping-pong, and
> they invoke memory barriers on most SMP machines. I think that testing
> would reveal that many applications of rwlocks could be replaced by
> simple mutexes with a net gain in performance.

Is that still true if one uses say, pthread_rwlock_* functions? The thread
library should be able to optimize somewhat, but I don't know how
effective that is compared to the implementation built on a mutex and
condition variables.

--
.-. .-. I don't work here. I'm a consultant.
(_ \ / _)
| da...@willfork.com
|

David Schwartz

Sep 19, 2002, 3:45:06 PM9/19/02
to
Hovik Melikyan wrote:

> > If there are many writers, more than readers, you shouldn't be using an
> > rwlock.

> Why shouldn't I?

Because the added complexity will not gain you anything unless there
are more readers than writers.

> If writers prevail, I would give them higher priority
> but still, I want my readers to do their job simultaneously as they
> don't bother each other when reading the shared resource.

But they won't unless there are more readers than writers. They'll wind
up waiting for the writers to finish.



> > > The best thing for the developer is to know that the service he writes
> > > can not be broken that way.

> > If you are really concerned about 'fairness', you probably want a
> > mode-switching rwlock. It has a more complex state machine than a
> > regular rwlock.

> Agreed. And few more words about breaking a service that uses an
> rwlock.

> Let's say you have an open service with two public functions: Fr and
> Fw for reading a shared resource and for writing to it respectively.
> If I know that you are using any of the 'unfair' schemes, say,
> w-priority, I can open two connections to your service and start
> calling Fw repeatedly so that the calls always overlap. Other users
> calling Fr will wait forever. Notice, I need only 2 connections, not
> 1000.

You shouldn't do this anyway. Mutexes will break the same way. In
principle, a mutex could be implemented to give the thread that was
created the earliest priority. So if you map threads to connections, a
connection that repeatedly does anything involving the mutex will starve
all other connections that want the mutex.

But I don't think mutex or lock fairness is the correct solution. I
think the solution is to write your code so that you don't care who gets
the lock. Since you decide what work is done once the lock is acquired,
you shouldn't care which thread gets it, since any thread will do
exactly what work you ask it to.

To a certain extent, rwlock's are an exception because there's a
fundamental difference between a thread that gets a read lock and a
thread that gets a write lock. So you can't sensibly code them to behave
exactly the same.

IMO, the correct and safe way to use an rwlock is in a case where you
would otherwise use a mutex, but you want the possibility that
concurrent readers will give you a performance boost. Because there's
more overhead, that performance boost should be expected to be
significant because otherwise you will wind up losing more than you
gain.

The rationale behind write priority is that if you have many more
readers than writers and you don't allow a write lock until there happen
to be no readers, you may delay threads that want the write lock for
huge amounts of time. However, unless you have a special case (for
example, where writers obsolete previous data entirely), you probably
would prefer for writers not to be able to monopolize the lock and shut
out readers either.

DS

Ziv Caspi

Sep 19, 2002, 6:23:12 PM9/19/02
to
On Thu, 19 Sep 2002 17:33:18 +0200, Alexander Terekhov
<tere...@web.de> wrote:

I don't get your point. Obviously, on non-x86 systems, the
implementation is not 'LOCK XADD', but something that is appropriate
for the target architecture.

Ziv

t...@cs.ucr.edu

Sep 20, 2002, 2:12:26 AM9/20/02
to
Hovik Melikyan <ho...@moon.yerphi.am> wrote:
+ t...@cs.ucr.edu wrote in message news:<amc552$nn6$1...@glue.ucr.edu>...
+>
+> Hmmmmm. I have difficulty accounting for that "3-4 times slower". If
+> I understand correctly, each algorithm takes the same amount of time
+> to accomplish the writing tasks, but Hovik's writers-get-priority
+> policy leads to about half as many batches of concurrent readers.
+>

+ This comes from real timing tests. In Sean's case the maximum number
+ of readers is always half the total number of threads, whereas in my
+ case they are equal. In other words, my algorithm (conventionally,
+ it's not mine actually) uses reads more effectively. As for "3-4
+ times", as I said, it's coming from tests, I can't tell you exactly
+ why it's just 3-4, although there must be some explanation, of course.
^^^^^^^^
I would expect far less speedup. Was it the same for the 5/1 test as
for the 1/5 test? I'm suspecting that there was some flaw in the
implementation or the testing of those algorithms.

Tom Payne

Hovik Melikyan

Sep 20, 2002, 3:30:00 AM9/20/02
to
David Schwartz <dav...@webmaster.com> wrote in message news:<3D8A2942...@webmaster.com>...

>
> > Let's say you have an open service with two public functions: Fr and
> > Fw for reading a shared resource and for writing to it respectively.
> > [...]

>
> You shouldn't do this anyway. Mutexes will break the same way.

They won't, because mutexes work either on a FIFO or on a random
basis. If you provide a public function F which uses a simple mutex,
starvation of concurrent users will be exponential, whereas with
'unfair' Fr/Fw it is discrete: only two overlapping iterative calls to
the prioritized function will _stop_ the service, not just slow it
down.

>
> IMO, the correct and safe way to use an rwlock is in a case where you
> would otherwise use a mutex, but you want the possibility that
> concurrent readers will give you a performance boost. Because there's
> more overhead, that performance boost should be expected to be
> significant because otherwise you will wind up losing more than you
> gain.
>

Regarding the overhead: (1) you can use rwlocks in your app with the
hope that once the question of prioritization is settled and somehow
standardized, they will become efficient system-level objects; (2) in
the real-world cases I currently deal with, the protected parts of my
code may make intensive use of dynamic memory allocation and even disk
I/O (at least for logging), so a few more CPU instructions are nothing
compared to that.

Regards,

--
Hovik Melikyan

P.S. I'm sorry for my slowness, but actually it's not me; Google posts
with a big delay, from 3 to 9 hours. It's some damn anti-spam
or anti-I-dunno-what policy... :((

David Butenhof

Sep 20, 2002, 9:32:13 AM9/20/02
to
Drazen Kacar wrote:

"An implementation [of read-write locks] built on a mutex and condition
variables" is a poor read-write lock. Really poor. The only advantage is
that it "implements the API".

Remember that CV wakeup requires locking the mutex. Every CV wakeup. So what
happens when you broadcast to a CV to wake up a swarm of "concurrent
readers"? Well, there are various optimizations, but essentially EACH of
them, IN TURN (one at a time) will acquire the mutex, manipulate the RWlock
shared data to get out of the API, and then start executing the protected
application code.

Now, if your readers do a lot of work while holding the read lock (which is
generally a bad idea), then there's a chance that some of the other readers
will have time to wake up and progress in parallel before you're done. If
it's short, most likely you'll have serial access anyway -- exactly as if
you'd used a mutex but with more overhead.

A real implementation of pthread_rwlock_* CAN (but need not) support true
parallel scheduling of concurrent readers across a set of available CPUs.
It's not trivial, because it means the awakened readers cannot require any
locks on "the way out". Even atomic memory operations (like incrementing a
reader count) reduce concurrency by provoking memory contention (and
transfer of cache line ownership).

So the answer is that the ideal thread library implementation can do far
better than you can possibly do with anything resembling portable code.
Otherwise, you're pretty much out of luck and you might as well stick with
just a mutex.

--
/--------------------[ David.B...@hp.com ]--------------------\
| Hewlett-Packard Company Tru64 UNIX & VMS Thread Architect |
| My book: http://www.awl.com/cseng/titles/0-201-63392-2/ |
\----[ http://homepage.mac.com/dbutenhof/Threads/Threads.html ]---/

Alexander Terekhov

Sep 20, 2002, 9:41:16 AM9/20/02
to

Ziv Caspi wrote:
>
> On Thu, 19 Sep 2002 17:33:18 +0200, Alexander Terekhov
> <tere...@web.de> wrote:

[...InterlockedXXX...]

> I don't get your point.

My point is this:

http://groups.google.com/groups?threadm=c29b5e33.0202140541.7df2bb9d%40posting.google.com
(Subject: Re: Can multiple threads set a global variable simultaneously?)

> Obviously, on non-x86 systems, the


> implementation is not 'LOCK XADD', but something that is appropriate
> for the target architecture.

http://altair.cs.oswego.edu/pipermail/concurrency-interest/2002-February/000023.html
http://altair.cs.oswego.edu/pipermail/concurrency-interest/2002-February/000024.html
http://gee.cs.oswego.edu/dl/concurrent/java/util/concurrent/AtomicInteger.html

So,

// Java's {proposed} atomic integer increment operation: <behaviorally>
for ( ;; ) {
int i = get(); // VOLATILE-ACQUIRE
if ( attemptUpdate(i, i+1) ) // VOLATILE-RELEASE
break;
}

HTH.

regards,
alexander.

David Schwartz

Sep 20, 2002, 2:17:05 PM9/20/02
to
Hovik Melikyan wrote:

> > You shouldn't do this anyway. Mutexes will break the same way.

> They won't, because mutexes work either on a FIFO or on a random
> basis.

Do you have a reference for this claim? It was my understanding that
which thread gets the mutex when priorities are equal is
unspecified.

> If you provide a public function F which uses a simple mutex,
> starvation of concurrent users will be exponential, whereas with
> 'unfair' Fr/Fw it is discrete: only two overlapping iterative calls to
> the prioritized function will _stop_ the service, not just slow it
> down.

What if the mutex implementation is "thread that has the lowest thread
ID gets the mutex"?

DS

Ziv Caspi

Sep 21, 2002, 5:50:05 AM9/21/02
to
On Fri, 20 Sep 2002 15:41:16 +0200, Alexander Terekhov
<tere...@web.de> wrote:

>> I don't get your point.
>
>My point is this:
>
>http://groups.google.com/groups?threadm=c29b5e33.0202140541.7df2bb9d%40posting.google.com
>(Subject: Re: Can multiple threads set a global variable simultaneously?)

Perhaps I'm a bit dense, but I still don't see the point. Win32's
InterlockedXxx guarantee the proper behavior for global vars. It's the
OS's job to work in a single-processor system, x86 or not, as well as
SMP system. Nobody suggests that you use LOCK XADD in your code -- you
are supposed to use InterlockedXxx instead, and are assured by the
system that it would work across architectures.

(The link to MSDN about flushing processor caches is broken, so I
can't follow that one to see what you were referring to.)

I'm not sure why Java is relevant to this discussion. It's the "job"
of Win32 to provide these guarantees; the OS is, after all, the best
place to make such decisions. This works regardless of the runtime
that called these functions.

Ziv

t...@cs.ucr.edu

Sep 21, 2002, 5:33:11 AM9/21/02
to
David Butenhof <David.B...@compaq.com> wrote:
[...]
+ A real implementation of pthread_rwlock_* CAN (but need not) support true
+ parallel scheduling of concurrent readers across a set of available CPUs.

Hmmmmm. Extractiing those readers from the ready queue (or wherever)
seems inherently serial. What am I missing?

Tom Payne

Alexander Terekhov

Sep 21, 2002, 10:03:23 AM9/21/02
to

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/synchronization_and_multiprocessor_issues.asp
("Synchronization and Multiprocessor Issues", *MS-STYLE-BULLSHIT*)

http://www.crhc.uiuc.edu/ece412/papers/models_tutorial.pdf
(Shared Memory Consistency Models: A Tutorial)

Ziv Caspi wrote:
>
> On Fri, 20 Sep 2002 15:41:16 +0200, Alexander Terekhov

> <tere...@web.de> wrote:
>
> >> I don't get your point.
> >
> >My point is this:
> >
> >http://groups.google.com/groups?threadm=c29b5e33.0202140541.7df2bb9d%40posting.google.com
> >(Subject: Re: Can multiple threads set a global variable simultaneously?)


>
> Perhaps I'm a bit dense, but I still don't see the point. Win32's
> InterlockedXxx guarantee the proper behavior for global vars. It's the
> OS's job to work in a single-processor system, x86 or not, as well as
> SMP system. Nobody suggests that you use LOCK XADD in your code -- you
> are supposed to use InterlockedXxx instead, and are assured by the
> system that it would work across architectures.
>
> (The link to MSDN about flushing processor caches is broken, so I
> can't follow that one to see what you were referring to.)
>

> >http://altair.cs.oswego.edu/pipermail/concurrency-interest/2002-February/000023.html
> >http://altair.cs.oswego.edu/pipermail/concurrency-interest/2002-February/000024.html
> >http://gee.cs.oswego.edu/dl/concurrent/java/util/concurrent/AtomicInteger.html

Alexander Terekhov

Sep 21, 2002, 10:41:18 AM9/21/02
to

t...@cs.ucr.edu wrote:

>
> David Butenhof <David.B...@compaq.com> wrote:
> [...]
> + A real implementation of pthread_rwlock_* CAN (but need not) support true
> + parallel scheduling of concurrent readers across a set of available CPUs.
>
> Hmmmmm. Extracting those readers from the ready queue (or wherever)
> seems inherently serial. What am I missing?

Consider the following rwlock algorithm where pending readers stay
blocked on mutex (in addition to "kinda fairness" it allows us to
bypass condvar broadcast operation [for readers], which may somewhat
"confuse" MP-optimized mutexes, though ;-) ). Another problem here
is exactly the "inherently serial" wakeup for readers (unless they
just happen to spin-wait long enough ;-) ).

http://groups.google.com/groups?selm=3B166244.F923B993%40web.de
(Subject: Re: how to distinguish between read and write locks?)

< slightly changed/simplified version; error checking,
cancel-cleanup and code for recursive readlocks omitted >

mtxEntryLock - mutex
mtxReadExitLock - mutex
cndWriteAccessGranted - condvar
nReadEntryCount - int
nReadExitCount - int
nWriteAccessCount - int

rdlock() {

lock( mtxEntryLock );

if ( INT_MAX == ++nReadEntryCount ) {
lock( mtxReadExitLock );
nReadEntryCount -= nReadExitCount;
nReadExitCount = 0;
unlock( mtxReadExitLock );
}

unlock( mtxEntryLock );
}

wrlock() {

lock( mtxEntryLock );

if ( 0 == nWriteAccessCount ) {

lock( mtxReadExitLock );

if ( 0 != nReadExitCount ) {
nReadEntryCount -= nReadExitCount;
nReadExitCount = 0;
}

if ( 0 != nReadEntryCount ) {

nReadExitCount = -nReadEntryCount;
nReadEntryCount = 0;
do {
wait( cndWriteAccessGranted, mtxReadExitLock );
} while( 1 != nWriteAccessCount );
}
else {
nWriteAccessCount = 1;
}

unlock( mtxReadExitLock );
}
else { // recursive write locking
nWriteAccessCount++; // via recursive mtxEntryLock
}
}

unlock() {

auto: bool bGrantWriteAccess

if ( 0 == nWriteAccessCount ) {

lock( mtxReadExitLock );

if ( 0 == ++nReadExitCount ) {
nWriteAccessCount = 1;
bGrantWriteAccess = true;
}
else {
bGrantWriteAccess = false;
}

unlock( mtxReadExitLock );

if ( bGrantWriteAccess ) {
signal( cndWriteAccessGranted );
}
}
else {
nWriteAccessCount--;
unlock( mtxEntryLock );
}
}

regards,
alexander.

Ziv Caspi
Sep 21, 2002, 4:19:42 PM

On Sat, 21 Sep 2002 16:03:23 +0200, Alexander Terekhov
<tere...@web.de> wrote:

The MSDN article you quote is certainly inadequate. It *does*,
however, indicate the Win32 approach to solving shared-memory
problems -- use a Win32 function that is guaranteed to
synchronize.

Ziv

Alexander Terekhov
Sep 21, 2002, 5:44:16 PM

Really? This silly MSDN article advocates "Classic-DCL"-like totally
*BROKEN* stuff [AFAICS even on IA32 under its SSE-extension-"relaxed
memory" model, *IA64* aside for a moment], to begin with.

"....
Consequently, the multiprocessor race condition above can be repaired
as follows:

void CacheComputedValue()
{
if (!fValueHasBeenComputed) {
iValue = ComputeValue();
InterlockedExchange((LONG*)&fValueHasBeenComputed, TRUE);
}
}"

Heck, they even claim that:

"The InterlockedExchange function ensures that the value of iValue is
updated for all processors before the value of fValueHasBeenComputed
is set to TRUE."

Yeah, ``inadequate'', indeed.

regards,
alexander.

Ziv Caspi
Sep 22, 2002, 8:07:59 AM

On Sat, 21 Sep 2002 23:44:16 +0200, Alexander Terekhov
<tere...@web.de> wrote:

>Really? This silly MSDN article advocates "Classic-DCL"-like totally
>*BROKEN* stuff [AFAICS even on IA32 under its SSE-extension-"relaxed
>memory" model, *IA64* aside for a moment], to begin with.

I think you're missing the fact that the "relaxed" memory model in
Intel's advanced processors does *not* mean that the lock model
featured in previous processors breaks. It simply means there are now
more efficient ways to do the same thing. If there ever was a company
that "understands" binary backward compatibility, it's Intel.

See "Intel Software Developer's Manual Vol. 3: System Programming",
chapter 7 "Multiple-Processor Management", 7.1 "Locked Atomic
Operations", 7.2 "Memory Ordering", and 7.4 "Serializing
Instructions".

>"....
> Consequently, the multiprocessor race condition above can be repaired
> as follows:
>
> void CacheComputedValue()
> {
> if (!fValueHasBeenComputed) {
> iValue = ComputeValue();
> InterlockedExchange((LONG*)&fValueHasBeenComputed, TRUE);
> }
> }"

This statement in MSDN is, in fact, true. The Win32 Interlocked* model
assures callers that there's a memory barrier there. You can rely on
the above pattern working on any machine Windows is available for (and
it has been shown to work under stress on up to 32 processors).

Ziv

t...@cs.ucr.edu
Sep 22, 2002, 9:14:50 AM

Alexander Terekhov <tere...@web.de> wrote:

+ t...@cs.ucr.edu wrote:
+>
+> David Butenhof <David.B...@compaq.com> wrote:
+> [...]
+> + A real implementation of pthread_rwlock_* CAN (but need not) support true
+> + parallel scheduling of concurrent readers across a set of available CPUs.
+>
+> Hmmmmm. Extracting those readers from the ready queue (or wherever)
+> seems inherently serial. What am I missing?

+ Consider the following rwlock algorithm where pending readers stay
+ blocked on mutex (in addition to "kinda fairness" it allows us to
+ bypass condvar broadcast operation [for readers], which may somewhat
+ "confuse" MP-optimized mutexes, though ;-) ). Another problem here
+ is exactly the "inherently serial" wakeup for readers (unless they
+ just happen to spin-wait long enough ;-) ).

+ http://groups.google.com/groups?selm=3B166244.F923B993%40web.de
+ (Subject: Re: how to distinguish between read and write locks?)

[... code snipped ...]

An interesting implementation, but waiters on spinlocks already have
CPUs. What I'm curious about is Dave Butenhof's mention of "true
parallel scheduling of concurrent readers across a set of available
CPUs".

My question presumed that there would be a ready list, i.e., a
priority queue of runnable threads from which available CPUs would get
their threads to run. I've also seen situations where CPUs obtain
their work from a master list of all threads, where each thread has
state bits that tell whether it is runnable, running, waiting etc. I
suppose that the available CPUs could concurrently scan such a list
looking for work if each thread (descriptor) were given its own mutex.
I'm hoping that Dave knows of a better alternative.

Tom Payne

Alexander Terekhov
Sep 23, 2002, 4:24:15 AM

Ziv Caspi wrote:
>
> On Sat, 21 Sep 2002 23:44:16 +0200, Alexander Terekhov
> <tere...@web.de> wrote:
>
> >Really? This silly MSDN article advocates "Classic-DCL"-like totally
> >*BROKEN* stuff [AFAICS even on IA32 under its SSE-extension-"relaxed
> >memory" model, *IA64* aside for a moment], to begin with.
>
> I think you're missing the fact that....

http://groups.google.com/groups?selm=3C3C9B63.DFDC9920%40web.de
(Subject: Re: Multi-Processor Concurrency Problem)

> >"....
> > Consequently, the multiprocessor race condition above can be repaired
> > as follows:
> >
> > void CacheComputedValue()
> > {
> > if (!fValueHasBeenComputed) {
> > iValue = ComputeValue();
> > InterlockedExchange((LONG*)&fValueHasBeenComputed, TRUE);
> > }
> > }"
>
> This statement in MSDN is, in fact, true.

<quote>

BOOL FetchComputedValue(int *piResult)
{
if (fValueHasBeenComputed)
{
*piResult = iValue;
return TRUE;
}
else
return FALSE;
}

</quote>

> The Win32 Interlocked* model assures callers that there's a memory

> barrier there. ....

http://groups.google.com/groups?threadm=c29b5e33.0202140541.7df2bb9d%40posting.google.com


(Subject: Re: Can multiple threads set a global variable simultaneously?)

http://www.crhc.uiuc.edu/ece412/papers/models_tutorial.pdf
(Shared Memory Consistency Models: A Tutorial,
see "Figure 1: What value can a read return?")

http://groups.google.com/groups?selm=G3Q49.14%24Fr2.86101%40news.cpqcorp.net
(Subject: Re: DCL and code optimization and etc.)

regards,
alexander.

David Butenhof
Sep 23, 2002, 10:21:35 AM

t...@cs.ucr.edu wrote:

Sure, the threads are probably "extracted" serially; but that doesn't take
much time. There's no reason that all the extracted threads can't then be
parcelled out across a set of processors and scheduled concurrently.
(Whether that actually works out to "parallel" depends on many factors; but
at least it's POSSIBLE.)

The point is that CV/MUTEX synchronization requires that the waiters WAKE
serially -- concurrently scheduling the broadcast waiters would merely
guarantee that all but one will need to reblock on the mutex. This
PROHIBITS parallel continuation. (Simple implementations in fact may do
that; more sophisticated implementations optimize the operation by
scheduling only one thread and directly blocking the remainder on the
mutex.)

A true read-write lock cannot generally guarantee parallel continuation, but
can be designed not to prohibit parallel continuation. A CV/MUTEX protocol,
while "simple and general", cannot.

Alexander Terekhov
Sep 23, 2002, 12:20:15 PM

David Butenhof wrote:
[...]

> The point is that CV/MUTEX synchronization requires that the waiters WAKE
> serially -- concurrently scheduling the broadcast waiters would merely
> guarantee that all but one will need to reblock on the mutex. This
> PROHIBITS parallel continuation. (Simple implementations in fact may do
> that; more sophisticated implementations optimize the operation by
> scheduling only one thread and directly blocking the remainder on the
> mutex.)

"more sophisticated" == "wait morphing" and, probably, "simple
implementations" == "thundering herd", correct?

http://sunsite.uakom.sk/sunworldonline/swol-09-1999/swol-09-insidesolaris.html
("Kernel synchronization primitives")

regards,
alexander.

t...@cs.ucr.edu
Sep 24, 2002, 4:05:32 AM

David Butenhof <David.B...@compaq.com> wrote:
+ t...@cs.ucr.edu wrote:

+> David Butenhof <David.B...@compaq.com> wrote:
+> [...]
+> + A real implementation of pthread_rwlock_* CAN (but need not) support
+> true + parallel scheduling of concurrent readers across a set of available
+> CPUs.
+>
+> Hmmmmm. Extracting those readers from the ready queue (or wherever)
+> seems inherently serial. What am I missing?

+ Sure, the threads are probably "extracted" serially; but that doesn't take
+ much time. There's no reason that all the extracted threads can't then be
+ parcelled out across a set of processors and scheduled concurrently.
+ (Whether that actually works out to "parallel" depends on many factors; but
+ at least it's POSSIBLE.)

+ The point is that CV/MUTEX synchronization requires that the waiters WAKE
+ serially -- concurrently scheduling the broadcast waiters would merely
+ guarantee that all but one will need to reblock on the mutex. This
+ PROHIBITS parallel continuation. (Simple implementations in fact may do
+ that; more sophisticated implementations optimize the operation by
+ scheduling only one thread and directly blocking the remainder on the
+ mutex.)

If I understood correctly, the OP was interested in a passive-waiting
protocol.

+ A true read-write lock cannot generally guarantee parallel continuation, but
+ can be designed not to prohibit parallel continuation. A CV/MUTEX protocol,
+ while "simple and general", cannot.

So, in a true read-write lock, readers do not recheck their predicate
after waiting, and, therefore, true read-write locks cannot be based
on possibly spurious signalling. Also, true read-write locks use the
tail-waiting optimization, where waiters do not reacquire the mutex
after waiting.


class RWlock : Monitor {

// no spurious signals or broadcasts.
// tail_wait() unlocks Monitor::mutex but doesn't relock it.

Condition okToRead;
Condition okToWrite;
bool writing;
int readerCount;

public:

RWlock()
: okToRead(this),
okToWrite(this),
writing(false),
readerCount(0)
{}

start_read() {
Monitor::lock();
if ( writing || okToWrite.count() ) {
okToRead.tail_wait() // no relocking
}
}

start_write() {
Monitor::lock();
if ( writing || okToRead.count() ) {
okToWrite.tail_wait() // no relocking
}
}

end_read() {
Monitor::lock();
assert( readerCount && ! writing );
if ( ! --readerCount ) {
writing = true;
okToWrite.signal()
}
Monitor::unlock();
}

end_write() {
Monitor::lock();
assert( writing && ! readerCount );
if ( okToRead.count() ) {
readerCount = okToRead.count(); // unoptimized
writing = false;
okToRead.broadcast()
} else {
okToWrite.signal()
}
Monitor::unlock();
}

};

Tom Payne

Alexander Terekhov
Sep 24, 2002, 5:40:27 AM

t...@cs.ucr.edu wrote:
[...]

> class RWlock : Monitor {
>
> // no spurious signals or broadcasts.
> // tail_wait() unlocks Monitor::mutex but doesn't relock it.
// tail_signal() unlocks Monitor::mutex and signals condvar.
// tail_broadcast() unlocks Monitor::mutex and broadcasts condvar.

>
> Condition okToRead;
> Condition okToWrite;
> bool writing;
> int readerCount;
>
> public:
>
> RWlock()
> : okToRead(this),
> okToWrite(this),
> writing(false),
> readerCount(0)
> {}
>
> start_read() {
> Monitor::lock();
> if ( writing || okToWrite.count() ) {
> okToRead.tail_wait() // no relocking
> }
else {
++readerCount;

}
> }
>
> start_write() {
> Monitor::lock();
> if ( writing || okToRead.count() ) {
> okToWrite.tail_wait() // no relocking
> }
else {
writing = true;

}
> }
>
> end_read() {
> Monitor::lock();
> assert( readerCount && ! writing );
/*

> if ( ! --readerCount ) {
*/
if ( ! --readerCount && okToWrite.count() ) {
> writing = true;
/*
> okToWrite.signal()
> }
> Monitor::unlock();
*/
okToWrite.tail_signal();
}
else {

Monitor::unlock();
}
> }
>
> end_write() {
> Monitor::lock();
> assert( writing && ! readerCount );
> if ( okToRead.count() ) {
> readerCount = okToRead.count(); // unoptimized
> writing = false;
/*

> okToRead.broadcast()
> } else {
> okToWrite.signal()
> }
> Monitor::unlock();
*/
okToRead.tail_broadcast();
}
else if ( okToWrite.count() ) {
okToWrite.tail_signal();
}
else {
writing = false;
Monitor::unlock();
}
> }
>
> };

---
class RWlock : Monitor {

// no spurious signals or broadcasts.
// tail_wait() unlocks Monitor::mutex but doesn't relock it.

// tail_signal() unlocks Monitor::mutex and signals condvar.
// tail_broadcast() unlocks Monitor::mutex and broadcasts condvar.

Condition okToRead;
Condition okToWrite;
bool writing;
int readerCount;

public:

RWlock()
: okToRead(this),
okToWrite(this),
writing(false),
readerCount(0)
{}

start_read() {
Monitor::lock();
if ( writing || okToWrite.count() ) {
okToRead.tail_wait() // no relocking
}

else {
++readerCount;
}
}

start_write() {
Monitor::lock();
if ( writing || okToRead.count() ) {
okToWrite.tail_wait() // no relocking
}

else {
writing = true;
}
}

end_read() {
Monitor::lock();
assert( readerCount && ! writing );

if ( ! --readerCount && okToWrite.count() ) {
writing = true;
okToWrite.tail_signal();
}
else {
Monitor::unlock();
}
}

end_write() {
Monitor::lock();
assert( writing && ! readerCount );
if ( okToRead.count() ) {
readerCount = okToRead.count(); // unoptimized
writing = false;

okToRead.tail_broadcast();
}
else if ( okToWrite.count() ) {
okToWrite.tail_signal();
}
else {
writing = false;
Monitor::unlock();
}
}

};

---

regards,
alexander.

Alexander Terekhov
Sep 24, 2002, 5:56:07 AM

Alexander Terekhov wrote:
[...]
> class RWlock : Monitor {
>
> // no spurious signals or broadcasts.
> // tail_wait() unlocks Monitor::mutex but doesn't relock it.
> // tail_signal() unlocks Monitor::mutex and signals condvar.
> // tail_broadcast() unlocks Monitor::mutex and broadcasts condvar.
>
> Condition okToRead;
> Condition okToWrite;
> bool writing;
> int readerCount;
>
> public:
>
> RWlock()
> : okToRead(this),
> okToWrite(this),
> writing(false),
> readerCount(0)
> {}
>
> start_read() {
> Monitor::lock();
> if ( writing || okToWrite.count() ) {
> okToRead.tail_wait() // no relocking
> }
> else {
> ++readerCount;
Monitor::unlock();

> }
> }
>
> start_write() {
> Monitor::lock();
> if ( writing || okToRead.count() ) {
> okToWrite.tail_wait() // no relocking
> }
> else {
> writing = true;
Monitor::unlock();

> }
> }
>
> end_read() {
> Monitor::lock();
> assert( readerCount && ! writing );
> if ( ! --readerCount && okToWrite.count() ) {
> writing = true;
> okToWrite.tail_signal();
> }
> else {
> Monitor::unlock();
> }
> }
>
> end_write() {
> Monitor::lock();
> assert( writing && ! readerCount );
> if ( okToRead.count() ) {
> readerCount = okToRead.count(); // unoptimized
> writing = false;
> okToRead.tail_broadcast();
> }
> else if ( okToWrite.count() ) {
> okToWrite.tail_signal();
> }
> else {
> writing = false;
> Monitor::unlock();
> }
> }
>
> };

regards,
alexander.

Alexander Terekhov
Sep 24, 2002, 7:55:43 AM

Alexander Terekhov wrote: < My bad, twice! >
[...]
> > bool writing;
> > int readerCount;
[...]

> > start_write() {
> > Monitor::lock();
> > if ( writing || okToRead.count() ) {

Yeah, "never thread [&& drink] before lunch". ;-)

< another try >

class RWlock : Monitor {

// no spurious signals or broadcasts.
// tail_wait() unlocks Monitor::mutex but doesn't relock it.
// tail_signal() unlocks Monitor::mutex and signals condvar.
// tail_broadcast() unlocks Monitor::mutex and broadcasts condvar.

Condition okToRead;
Condition okToWrite;
int count; // ==-1:writing ==0:free >=+1:reading

public:

RWlock()
: okToRead(this),
okToWrite(this),
count(0)
{}

start_read() {
Monitor::lock();
if ( 0 > count || okToWrite.count() ) {
okToRead.tail_wait() // no relocking
}
else {
++count;
Monitor::unlock();
}
}

start_write() {
Monitor::lock();
if ( 0 != count ) {
okToWrite.tail_wait() // no relocking
}
else {
count = -1;
Monitor::unlock();
}
}

end_read() {
Monitor::lock();
assert( 0 < count );
if ( 0 == --count && okToWrite.count() ) {
count = -1;
okToWrite.tail_signal();
}
else {
Monitor::unlock();
}
}

end_write() {
Monitor::lock();
assert( 0 > count );
if ( 0 != (count = okToRead.count()) ) {
okToRead.tail_broadcast();
}
else if ( 0 != okToWrite.count() ) {
count = -1;
okToWrite.tail_signal();
}
else {
Monitor::unlock();
}
}

};

regards,
alexander.

David Butenhof
Sep 24, 2002, 8:08:29 AM

t...@cs.ucr.edu wrote:

> So, in a true read-write lock, readers do not recheck their predicate
> after waiting, and, therefore, true read-write locks cannot be based
> on possibly spurious signalling. Also, true read-write locks use the
> tail-waiting optimization, where waiters do not reacquire the mutex
> after waiting.

A read-write lock is a LOCK, not a "signalling mechanism" like a CV. The
important distinction is that the implementation knows (and controls) the
predicate, like a mutex. In contrast, a CV is designed to allow the caller
to communicate changes to a predicate about which the implementation can
know (and assume) nothing.

In a read-write lock package based on a mutex and CV, then the caller waits
for the lock (read or write) by waiting on the CV, and on wakeup must
(serially) reacquire the mutex to validate the predicate.

With a "true" read-write lock, there are various alternatives. While a
tail-wait is the "cleanest", it's not the only possibility. The
implementation might instead validate via low-level hardware mechanisms
rather than by taking a "scheduling lock", for example, using
compare-and-swap or simply memory barrier ordering.

t...@cs.ucr.edu
Sep 24, 2002, 12:20:14 PM

David Butenhof <David.B...@compaq.com> wrote:
[...]
+ With a "true" read-write lock, there are various alternatives. While a
+ tail-wait is the "cleanest", it's not the only possibility. The
+ implementation might instead validate via low-level hardware mechanisms
+ rather than by taking a "scheduling lock", for example, using
+ compare-and-swap or simply memory barrier ordering.

Thanks. I'd be interested in references on implementing rwlocks via
barriers and/or compare-and-swap instructions.

Tom Payne

Ziv Caspi
Sep 24, 2002, 3:50:47 PM

On Mon, 23 Sep 2002 10:24:15 +0200, Alexander Terekhov
<tere...@web.de> wrote:

>http://groups.google.com/groups?selm=3C3C9B63.DFDC9920%40web.de
>(Subject: Re: Multi-Processor Concurrency Problem)

Yes, Intel doesn't guarantee it. This is besides the point -- you get
that guarantee from the Interlocked* functions. If ever Intel were to
issue a processor in which the current implementation of Interlocked*
would fail, you can be sure they'd do their best to have MS issue a
Windows release that supports it (possibly by modifying the
implementation of the Interlocked* functions).

Your logic in that post could also be taken further: Intel doesn't
guarantee anything about future processors, so not only can't we use
Interlocked functions, we *also* can't use critical-sections -- who
knows, perhaps a future processor will break Windows' current
implementation of EnterCriticalSection.

>
>> >"....
>> > Consequently, the multiprocessor race condition above can be repaired
>> > as follows:
>> >
>> > void CacheComputedValue()
>> > {
>> > if (!fValueHasBeenComputed) {
>> > iValue = ComputeValue();
>> > InterlockedExchange((LONG*)&fValueHasBeenComputed, TRUE);
>> > }
>> > }"
>>
>> This statement in MSDN is, in fact, true.
>
><quote>
>
> BOOL FetchComputedValue(int *piResult)
> {
> if (fValueHasBeenComputed)
> {
> *piResult = iValue;
> return TRUE;
> }
> else
> return FALSE;
> }
>
></quote>
>
>> The Win32 Interlocked* model assures callers that there's a memory
>> barrier there. ....
>
>http://groups.google.com/groups?threadm=c29b5e33.0202140541.7df2bb9d%40posting.google.com
>(Subject: Re: Can multiple threads set a global variable simultaneously?)

I think I already answered your concerns here.

>http://www.crhc.uiuc.edu/ece412/papers/models_tutorial.pdf
>(Shared Memory Consistency Models: A Tutorial,
> see "Figure 1: What value can a read return?")

This is irrelevant to our discussion. Win32 *assures* you that
Interlocked will put a barrier in all architectures that require it...

>http://groups.google.com/groups?selm=G3Q49.14%24Fr2.86101%40news.cpqcorp.net
>(Subject: Re: DCL and code optimization and etc.)

...*including* the Alpha architecture to which David was referring.

Ziv

t...@cs.ucr.edu
Sep 25, 2002, 2:01:45 AM

Alexander Terekhov <tere...@web.de> wrote:

+ Alexander Terekhov wrote: < My bad, twice! >
+ [...]
+> > bool writing;
+> > int readerCount;
+ [...]
+> > start_write() {
+> > Monitor::lock();
+> > if ( writing || okToRead.count() ) {

+ Yeah, "never thread [&& drink] before lunch". ;-)

+ < another try >

+ class RWlock : Monitor {

+ // no spurious signals or broadcasts.
+ // tail_wait() unlocks Monitor::mutex but doesn't relock it.
+ // tail_signal() unlocks Monitor::mutex and signals condvar.
+ // tail_broadcast() unlocks Monitor::mutex and broadcasts condvar.

I don't like such naked signalling/broadcasting. A passive
Condition's queue of waiters is a shared data structure that requires
lock protection (or some very clever use of low-level instructions).
If we don't protect the Condition with the Monitor's lock, then the
Condition needs its own lock, which adds overhead.

Tom Payne

Alexander Terekhov
Sep 25, 2002, 6:30:04 AM

Ziv Caspi wrote:
>
> On Mon, 23 Sep 2002 10:24:15 +0200, Alexander Terekhov
> <tere...@web.de> wrote:
>
> >http://groups.google.com/groups?selm=3C3C9B63.DFDC9920%40web.de
> >(Subject: Re: Multi-Processor Concurrency Problem)
>
> Yes, Intel doesn't guarantee it. This is besides the point -- you get
> that guarantee from the Interlocked* functions.

What guarantee, exactly?

[...]


> >> >"....
> >> > Consequently, the multiprocessor race condition above can be repaired
> >> > as follows:
> >> >
> >> > void CacheComputedValue()
> >> > {
> >> > if (!fValueHasBeenComputed) {
> >> > iValue = ComputeValue();
> >> > InterlockedExchange((LONG*)&fValueHasBeenComputed, TRUE);
> >> > }
> >> > }"
> >>
> >> This statement in MSDN is, in fact, true.
> >
> ><quote>
> >
> > BOOL FetchComputedValue(int *piResult)
> > {
> > if (fValueHasBeenComputed)
> > {
> > *piResult = iValue;
> > return TRUE;
> > }
> > else
> > return FALSE;
> > }
> >
> ></quote>
> >
> >> The Win32 Interlocked* model assures callers that there's a memory
> >> barrier there. ....
> >
> >http://groups.google.com/groups?threadm=c29b5e33.0202140541.7df2bb9d%40posting.google.com
> >(Subject: Re: Can multiple threads set a global variable simultaneously?)
>
> I think I already answered your concerns here.

Nope. Here's my "concerns":

---
Well, but *what* (if any) MEMORY SYNCHRONIZATION semantics do they
provide?

ACQUIRE (mutex.lock)?

RELEASE (mutex.unlock)?

ACQUIRE-RELEASE (both)?
---

And, BTW, I guess, now I should really expand this list:

ACQUIRE (mutex.lock)?

RELEASE (mutex.unlock)?

ACQUIRE-RELEASE (both)?

ZIV-CASPI-MAGIC ("a thing" that would make the CacheComputedValue/
FetchComputedValue "example" above work
on IPF(IA64)/etc.)?

> >http://www.crhc.uiuc.edu/ece412/papers/models_tutorial.pdf
> >(Shared Memory Consistency Models: A Tutorial,
> > see "Figure 1: What value can a read return?")
>
> This is irrelevant to our discussion. Win32 *assures* you that
> Interlocked will put a barrier in all architectures that require it...
>
> >http://groups.google.com/groups?selm=G3Q49.14%24Fr2.86101%40news.cpqcorp.net
> >(Subject: Re: DCL and code optimization and etc.)
>
> ...*including* the Alpha architecture to which David was referring.

Yeah, now read this Butenhof's article again. Please pay some special
attention to phrases in >>CAPS<<. (especially the last one. ;-) )

regards,
alexander.

Alexander Terekhov
Sep 25, 2002, 6:57:54 AM

t...@cs.ucr.edu wrote:
[...]

> + class RWlock : Monitor {
>
> + // no spurious signals or broadcasts.
> + // tail_wait() unlocks Monitor::mutex but doesn't relock it.
> + // tail_signal() unlocks Monitor::mutex and signals condvar.
> + // tail_broadcast() unlocks Monitor::mutex and broadcasts condvar.
>
> I don't like such naked signalling/broadcasting.

I meant:

condition::tail_broadcast() {
waiter* w = this->queue;
this->queue = 0;
unlock( this->monitor_mutex );
/**/
while ( 0 != w ) { /* "schedule" w...*/ w = w->next; }
}
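One way to realize the tail_wait/tail_signal/tail_broadcast primitives is to give each blocked thread its own semaphore and hand wakeups over directly, so a woken thread never reacquires the monitor mutex. A sketch, with names of my choosing (the queue is LIFO for brevity; a real implementation would keep FIFO order):

```c
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

/* Each blocked thread brings its own node; the semaphore is the
 * hand-off token posted directly by the signaller. */
typedef struct waiter {
    sem_t sem;
    struct waiter *next;
} waiter;

typedef struct {
    pthread_mutex_t *monitor_mutex; /* the owning monitor's mutex */
    waiter *queue;                  /* blocked threads (LIFO for brevity) */
    int n;                          /* queue length */
} condition;

/* Call with the monitor mutex held. */
int cond_count(condition *c) { return c->n; }

/* Unlocks the monitor mutex and blocks until woken; does NOT relock. */
void cond_tail_wait(condition *c) {
    waiter w;
    sem_init(&w.sem, 0, 0);
    w.next = c->queue;
    c->queue = &w;
    c->n++;
    pthread_mutex_unlock(c->monitor_mutex);
    sem_wait(&w.sem);               /* woken directly, no mutex reacquire */
    sem_destroy(&w.sem);
}

/* Unlocks the monitor mutex and wakes exactly one waiter, if any. */
void cond_tail_signal(condition *c) {
    waiter *w = c->queue;
    if (w != NULL) { c->queue = w->next; c->n--; }
    pthread_mutex_unlock(c->monitor_mutex);
    if (w != NULL) sem_post(&w->sem);
}

/* Unlocks the monitor mutex and wakes ALL waiters. */
void cond_tail_broadcast(condition *c) {
    waiter *w = c->queue;
    c->queue = NULL;
    c->n = 0;
    pthread_mutex_unlock(c->monitor_mutex);
    while (w != NULL) {
        waiter *next = w->next;     /* read next BEFORE posting: once
                                       posted, w may leave its stack frame */
        sem_post(&w->sem);
        w = next;
    }
}
```

Note that the wakers detach a node from the queue before posting it: as soon as the semaphore is posted, the waiter may return and destroy the node on its stack.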

Here's another version (without "tail_wait"/"lock hand off"
for writers -- I don't like such determinism ;-) ):

class RWlock : Monitor {

// tail_wait() [no spurious wakeups] unlocks Monitor::mutex but doesn't relock it.
// tail_signal() unlocks Monitor::mutex and schedule ONE waiter.
// tail_broadcast() unlocks Monitor::mutex and schedule ALL waiters.

Condition okToRead;
Condition okToWrite;
int count; // ==-1:writing ==0:free >=+1:reading

public:

RWlock()
: okToRead(this),
okToWrite(this),
count(0)
{}

start_read() {
Monitor::lock();
if ( 0 > count || okToWrite.count() ) {
okToRead.tail_wait(); // no relocking
}
else {
++count;
Monitor::unlock();
}
}

start_write() {
Monitor::lock();
while ( 0 != count ) {
okToWrite.wait(); // relocking
}
count = -1;
Monitor::unlock();
}

end_read() {
Monitor::lock();
assert( 0 < count );

if ( 0 == --count && 0 != okToWrite.count() ) {
okToWrite.tail_signal();
}
else {
Monitor::unlock();
}
}

end_write() {
Monitor::lock();
assert( 0 > count );
if ( 0 != (count = okToRead.count()) ) {
okToRead.tail_broadcast();
}
else if ( 0 != okToWrite.count() ) {
okToWrite.tail_signal();
}
else {
Monitor::unlock();
}
}

};

t...@cs.ucr.edu
Sep 25, 2002, 9:46:41 AM

Alexander Terekhov <tere...@web.de> wrote:

+ t...@cs.ucr.edu wrote:
+ [...]
+> + class RWlock : Monitor {
+>
+> + // no spurious signals or broadcasts.
+> + // tail_wait() unlocks Monitor::mutex but doesn't relock it.
+> + // tail_signal() unlocks Monitor::mutex and signals condvar.
+> + // tail_broadcast() unlocks Monitor::mutex and broadcasts condvar.
+>
+> I don't like such naked signalling/broadcasting.

+ I meant:

+ condition::tail_broadcast() {
+ waiter* w = this->queue;
+ this->queue = 0;
+ unlock( this->monitor_mutex );
+ /**/
+ while ( 0 != w ) { /* "schedule" w...*/ w = w->next; }
+ }

Okay. That has the advantage that the lock is available while the
thundering herd of readers is being scheduled.

+ Here's another version (without "tail_wait"/"lock hand off"
+ for writers -- I don't like such determinism ;-) ):

In which case, I presume, signal works pretty much the same as
broadcast, i.e., extracts the first waiter, releases the lock, and then
schedules the waiter. Very efficient.

For shorter-term waiting, one could simply use implementations of
Condition that are based on spin-waiting.

By the way, has anyone looked at short-term/long-term condition based
on spin-down locking, where the waiter initially spins but shifts to a
queue of passive waiters after spinning for a predetermined time?

Tom Payne

Ziv Caspi
Sep 27, 2002, 5:29:46 AM

On Wed, 25 Sep 2002 12:30:04 +0200, Alexander Terekhov
<tere...@web.de> wrote:

>
>Ziv Caspi wrote:
>> Yes, Intel doesn't guarantee it. This is besides the point -- you get
>> that guarantee from the Interlocked* functions.
>
>What guarantee, exactly?

That Interlocked* operations put a memory barrier.

>> I think I already answered your concerns here.
>
>Nope. Here's my "concerns":

I apologize.

>
>---
>Well, but *what* (if any) MEMORY SYNCHRONIZATION semantics do they
>provide?
>
>ACQUIRE (mutex.lock)?
>
>RELEASE (mutex.unlock)?
>
>ACQUIRE-RELEASE (both)?
>---
>
>And, BTW, I guess, now I should really expand this list:
>
>ACQUIRE (mutex.lock)?
>
>RELEASE (mutex.unlock)?
>
>ACQUIRE-RELEASE (both)?
>
>ZIV-CASPI-MAGIC ("a thing" that would make the CacheComputedValue/
> FetchComputedValue "example" above work
> on IPF(IA64)/etc.)?

The example is partial, and therefore (as I've already said)
inadequate. It explains how to put an Interlocked* operation in the
write path. Another interlocked operation is also needed in the read
path, but the article is silent on this issue when it should (IMHO)
explicitly mention it.

So ZIV-CASPI-MAGIC (in this case) (it's not actually my magic) is to
perform the read using an interlocked operation. This can be done by
InterlockedCompareExchange whose compare value and set value are the
same.

This is a known bug in the doc.

>Yeah, now read this Butenhof's article again.

I was thrown off as this thread started on x86 issues, and then
migrated to general discussion on processors that don't preserve write
order or read order. As I said, I apologize.

Alexander Terekhov
Sep 27, 2002, 7:15:10 AM

Ziv Caspi wrote:
[...]

> >> Yes, Intel doesn't guarantee it. This is besides the point -- you get
> >> that guarantee from the Interlocked* functions.
> >
> >What guarantee, exactly?
>
> That Interlocked* operations put a memory barrier.

Okay. < Intel's SSE/SSE2 stuff, IA32 >

"....
The SFENCE, LFENCE, and MFENCE instructions provide a
performance-efficient way of insuring load and store memory
ordering between routines that produce weakly-ordered results
and routines that consume that data. The functions of these
instructions are as follows:

- SFENCE-Serializes all store (write) operations that
occurred prior to the SFENCE instruction in the program
instruction stream, but does not affect load operations.

- LFENCE-Serializes all load (read) operations that occurred
prior to the LFENCE instruction in the program instruction
stream, but does not affect store operations.

- MFENCE-Serializes all store and load operations that
occurred prior to the MFENCE instruction in the program
instruction stream."

Which one? ;-)
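For reference, all three fences are reachable from C through Intel's SSE intrinsics. A sketch with hypothetical publish/consume names; note that for ordinary cacheable loads and stores IA32 already orders these cases, so the fences matter mainly for the weakly-ordered paths the manual describes:

```c
#include <emmintrin.h>  /* _mm_lfence, _mm_mfence (SSE2) */
#include <xmmintrin.h>  /* _mm_sfence (SSE) */

static int payload;
static volatile int flag;

/* Publisher: make the payload store visible before the flag store. */
void publish(int v)
{
    payload = v;
    _mm_sfence();      /* SFENCE: order prior stores before the flag store */
    flag = 1;
}

/* Consumer: keep the payload load from being hoisted above the flag load. */
int consume(int *out)
{
    if (flag) {
        _mm_lfence();  /* LFENCE: order the flag load before the payload load */
        *out = payload;
        return 1;
    }
    return 0;
}
```

_mm_mfence() combines both effects, serializing prior loads and stores.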

[...]


> >ZIV-CASPI-MAGIC ("a thing" that would make the CacheComputedValue/
> > FetchComputedValue "example" above work
> > on IPF(IA64)/etc.)?
>
> The example is partial, and therefore (as I've already said)
> inadequate. It explains how to put an Interlocked* operation in the
> write path. Another interlocked operation is also needed in the read

> path, .... ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^^^^

I don't think so. In kinda-proposed "java atomic terms", you'd need:

< ``Safety'' of *concurrent* "iValue = ComputeValue()" thing ASIDE >

A) AtomicInt.get() -- ACQUIRE semantics (fValueHasBeenComputed read).
B) AtomicInt.set() -- RELEASE semantics (fValueHasBeenComputed write).
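This acquire/release pairing maps directly onto C11 atomics. A sketch of the MSDN example with the missing acquire on the read path added (ComputeValue() is a stand-in; the safety of two threads computing iValue concurrently is set aside, as above):

```c
#include <stdatomic.h>

static int iValue;                         /* the published payload */
static atomic_bool fValueHasBeenComputed;  /* the publication flag  */

static int ComputeValue(void) { return 42; }  /* stand-in */

void CacheComputedValue(void)
{
    if (!atomic_load_explicit(&fValueHasBeenComputed, memory_order_acquire)) {
        iValue = ComputeValue();
        /* RELEASE: the iValue store becomes visible before the flag store. */
        atomic_store_explicit(&fValueHasBeenComputed, 1, memory_order_release);
    }
}

int FetchComputedValue(int *piResult)
{
    /* ACQUIRE: seeing the flag guarantees seeing the published iValue.
     * The plain read in the MSDN version is exactly what is missing. */
    if (atomic_load_explicit(&fValueHasBeenComputed, memory_order_acquire)) {
        *piResult = iValue;
        return 1;
    }
    return 0;
}
```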

regards,
alexander.

Ziv Caspi
Oct 3, 2002, 3:33:07 PM

None. As I said, a LOCK prefix is used. With respect to memory, it has
an effect similar to MFENCE, IIRC.

>
>[...]
>> >ZIV-CASPI-MAGIC ("a thing" that would make the CacheComputedValue/
>> > FetchComputedValueexample "example" above work
>> > on IPF(IA64)/etc.)?
>>
>> The example is partial, and therefore (as I've already said)
>> inadequate. It explains how to put an Interlocked* operation in the
>> write path. Another interlocked operation is also needed in the read
>> path, .... ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> ^^^^
>
>I don't think so. In kinda-proposed "java atomic terms", you'd need:
>
>< ``Safety'' of *concurrent* "iValue = ComputeValue()" thing ASIDE >
>
>A) AtomicInt.get() -- ACQUIRE semantics (fValueHasBeenComputed read).
>B) AtomicInt.set() -- RELEASE semantics (fValueHasBeenComputed write).

I don't understand what you mean by that. In the MSDN example, there
are two independent reads from memory which can, on some
architectures, be re-ordered. That's why you need an interlocked
operation. (I believe the Alpha docs actually provide this scenario as
a code example, but I don't have them handy to make sure.)

Ziv

Alexander Terekhov

Oct 4, 2002, 7:41:17 AM

Ziv Caspi wrote:
>
> On Fri, 27 Sep 2002 13:15:10 +0200, Alexander Terekhov
> <tere...@web.de> wrote:

[...IA32's SSE/SSE2 mbars...]

> >Which one? ;-)
>
> None. As I said, a LOCK prefix is used. With respect to memory, it has
> a similar effect as MFENCE, IIRC.

Well, Intel's IA32 docu says:

"....
Beginning with the P6 family processors, when the LOCK
prefix is prefixed to an instruction and the memory area
being accessed is cached internally in the processor, the
LOCK# signal is generally not asserted. Instead, only the
processor's cache is locked. Here, the processor's cache
coherency mechanism insures that the operation is carried
out atomically with regards to memory.
...."

Can you show me something "official" related to *guarantees*
w.r.t. memory access {re-}ordering and the LOCK-thing?

The stuff I've found "merely" says that:

"....
For areas of memory where weak ordering is acceptable, the
write back (WB) memory type can be chosen. Here, reads can
be performed speculatively and writes can be buffered and
combined. For this type of memory, cache locking is performed
on atomic (locked) operations that do not split across cache
lines, which helps to reduce the performance penalty associated
with the use of the typical synchronization instructions, such
as XCHG, that lock the bus during the entire read-modify-write
operation. With the WB memory type, the XCHG instruction locks
the cache instead of the bus if the memory access is contained
within a cache line.

It is recommended that software written to run on Pentium 4,
Intel Xeon, and P6 family processors assume the processor-
ordering model or a weaker memory-ordering model. The Pentium 4,
Intel Xeon, and P6 family processors do not implement a strong
memory-ordering model, except when using the UC memory type.
Despite the fact that Pentium 4, Intel Xeon, and P6 family
processors support processor ordering, Intel does not guarantee
that future processors will support this model."

regards,
alexander.

Ziv Caspi

Oct 6, 2002, 5:12:54 AM