
We have to be more precise in computer science...

Ramine

Dec 8, 2014, 7:09:20 PM
Hello,


We have to be more precise in computer science...

To be more sure, I have just benchmarked a cache-line
transfer between cores, and you will not believe it:
it is very expensive on x86. In my benchmark it takes around
800 CPU cycles, so I think I reasoned correctly
when I said that the following reader-writer lock
(by Joe Duffy, an architect at Microsoft) is garbage.

So please reread carefully my reasoning that follows,
because it is correct:


I must say that we have to be careful, because I have just
read the following webpage about a more scalable reader/writer lock
by an architect at Microsoft called Joe Duffy. You have to be
careful because this reader/writer lock is not really scalable;
it is garbage, and I will think as an architect and explain why.

Here is the webpage, and my explanation follows...


http://joeduffyblog.com/2009/02/20/a-more-scalable-readerwriter-lock-and-a-bit-less-harsh-consideration-of-the-idea/


So look inside the EnterWriteLock() of the reader/writer lock above.
You will notice that it first executes Interlocked.Exchange(ref
m_writer, 1), which means it atomically sets m_writer to 1
so as to block readers from entering the reader section.
But this is garbage, because look at what he does after that:

    for (int i = 0; i < m_readers.Length; i++)
        while (m_readers[i].m_taken != 0) sw.SpinOnce();
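
For readers who do not have the page open, here is a minimal C11 sketch of the pattern being discussed. It is not Joe Duffy's C# code: the names striped_rwlock, SLOTS and CACHE_LINE are hypothetical, the 64-byte cache-line size is an assumption, and the reader path is only my reconstruction of how such a lock is commonly written. The writer path mirrors the two steps quoted above: an atomic exchange on the writer flag, then a scan over every reader slot.

```c
#include <assert.h>
#include <stdatomic.h>

#define SLOTS 4          /* hypothetical: e.g. one slot per core */
#define CACHE_LINE 64    /* assumed cache-line size in bytes */

/* One reader counter per slot, aligned so each slot has its own cache line. */
typedef struct {
    _Alignas(CACHE_LINE) atomic_int taken;
} reader_slot;

typedef struct {
    atomic_int  writer;            /* 1 while a writer holds or is acquiring the lock */
    reader_slot readers[SLOTS];
} striped_rwlock;

void striped_init(striped_rwlock *l) {
    atomic_init(&l->writer, 0);
    for (int i = 0; i < SLOTS; i++)
        atomic_init(&l->readers[i].taken, 0);
}

void striped_read_lock(striped_rwlock *l, int slot) {
    assert(slot >= 0 && slot < SLOTS);
    for (;;) {
        while (atomic_load(&l->writer) != 0) { }       /* wait until no writer is active */
        atomic_fetch_add(&l->readers[slot].taken, 1);  /* announce this reader in its slot */
        if (atomic_load(&l->writer) == 0)
            return;                                    /* no writer raced in: lock held */
        atomic_fetch_sub(&l->readers[slot].taken, 1);  /* a writer arrived: back off, retry */
    }
}

void striped_read_unlock(striped_rwlock *l, int slot) {
    atomic_fetch_sub(&l->readers[slot].taken, 1);
}

void striped_write_lock(striped_rwlock *l) {
    /* Step 1: set the writer flag, which blocks new readers and other writers. */
    while (atomic_exchange(&l->writer, 1) != 0) { }
    /* Step 2: scan every slot; each load may pull a cache line from another core. */
    for (int i = 0; i < SLOTS; i++)
        while (atomic_load(&l->readers[i].taken) != 0) { }
}

void striped_write_unlock(striped_rwlock *l) {
    atomic_store(&l->writer, 0);
}
```

Note that between step 1 and the end of step 2 no new reader can enter, which is the window the argument here is about.
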


So after setting m_writer to 1 to block the readers,
he transfers many cache lines between cores (one for each entry of
m_readers that the loop reads), and this is really expensive. It makes
the serial part of Amdahl's law bigger and bigger as more and more
cores are used, so this will not scale, so it is garbage.
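
To put numbers on the Amdahl's law argument, here is a small C sketch. Amdahl's law says that with a serial fraction s, the speedup on n cores is 1 / (s + (1 - s) / n). The function writer_scan_cycles is a hypothetical worst-case cost model, not a measurement: it assumes every reader slot costs one full cache-line transfer, using the roughly 800 cycles from my benchmark as the input.

```c
#include <assert.h>

/* Amdahl's law: speedup on n cores when a fraction s (0..1) of the work is serial. */
double amdahl_speedup(double s, int n) {
    assert(n > 0 && s >= 0.0 && s <= 1.0);
    return 1.0 / (s + (1.0 - s) / n);
}

/* Hypothetical worst-case model: the writer's scan touches one cache line
   per reader slot, so its serial cost grows linearly with the slot count. */
double writer_scan_cycles(int slots, double cycles_per_transfer) {
    return slots * cycles_per_transfer;
}
```

For example, writer_scan_cycles(16, 800.0) gives 12800 cycles of serial work per write acquisition under this model, and amdahl_speedup(0.1, 8) is only about 4.7 even though 8 cores are used.
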

Dmitry Vyukov's distributed reader-writer mutex doesn't have this
weakness; look at the source code here:

http://www.1024cores.net/home/lock-free-algorithms/reader-writer-problem/distributed-reader-writer-mutex


He is doing this on the distr_rw_mutex_wrlock() side:

    for (i = 0; i != mtx->proc_count; i += 1)
        pthread_rwlock_wrlock(&mtx->cell[i].mtx);


So we have to be smart here and notice that as the "i" counter
variable goes from 0 to proc_count, readers are still allowed to
enter, and to enter again, the reader section on the cells that the
writer has not yet locked, in scenarios with more contention. So, in
contrast with the reader-writer lock above, this part of the
distributed lock does not count only as a serial part of Amdahl's law:
because it also lets the reader threads enter and re-enter the reader
section, it contains a parallel part of Amdahl's law, and this makes
this distributed reader-writer lock scale effectively. It is even
better with my Distributed sequential lock, because it scales even
better than Dmitry Vyukov's distributed reader-writer mutex.



I hope you have understood my architect's way of thinking.



Thank you for your time.



Amine Moulay Ramdane.






