A new algorithm for for a distributed priority and ,non priority LOCK...

aminer

unread,

Sep 24, 2013, 1:33:49 PM9/24/13

to

Hello,

Here is my new algorithm for a distributed priority and
non priority LOCK, this algorithm reduce a lot the
cache coherency traffic and is good, so follow me please...

First you you allocate an array with the same number of elements
as the numbers of cores and the elements in the array must be 64 bytes
cache aligned and must be a record containing one element that is the
a flag for the first CAS that will be used as a critical section .

You first initialise the distributed priority LOCK by setting
a flag for the second CAS that will be used as a critical section to 0
and you set the flag of the first CAS that will be used also as a
critical section to 1 (to permit the first thread to enter the lock())

The lock() function will look like this:

Every thread that enters the lock() must acquire his processor
number by using GetProcessorNumber() function, but this function
will be optimized to amortize the number of calls to it, and that's
easy to do.

You enter the lock() function by pushing the processor number of the
threads into a queue or priority queue that will be optimized, this
queue will be using a CAS on the push() and will not use any CAS on the
pop(), we don't need it on the pop() cause only one thread will pop()
from the unlock() function, after that you enter a repeat loop where you
test with the first CAS and you test also with the second CAS the flag
of the corresponding element(flag) of the array using the processor
number of the threads as an index , the flag for the second CAS will be
set to 1 so the first thread will enter the lock() and the other threads
will test in a repeat loop for the first CAS and the second CAS and
there flags will have been set to zero so they will wait..

After that the first thread will arrive to the unlock() function and
he will pop() the processor number from the optimized priority queue
or non priority queue and set the flag for the first CAS to 0 for the
corresponding processor core this will allow a thread to enter the lock
, if there is no elements in the queue the thread will set the flag of
the second CAS to zero and this will allow a thread to enter the lock.

So as you have noticed my algorithm is efficient also cause if there
is threads waiting, the cache-coherence traffic will be reduced a lot
cause we are using local variables in each element of the array alligned
to 64 byte.

So if you have noticed, in fact i am using 2 CASes , so my algorithm
is good.

This was my algorithm for a distributed priority and non priority LOCK
that is efficient, and i will code it as soon as possible.

Thank you,
Amine Moulay Ramdane.

aminer

unread,

Sep 24, 2013, 2:44:10 PM9/24/13

to

i mean: set the flag for the first CAS to 1

> corresponding processor core this will allow a thread to enter the lock
> , if there is no elements in the queue the thread will set the flag of
> the second CAS to zero and this will allow a thread to enter the lock.

i mean: set the flag for the second CAS to 1

aminer

unread,

Sep 24, 2013, 2:48:14 PM9/24/13

to

I correct some typos, please read again...

Hello,

Here is my new algorithm for a distributed priority and
non priority LOCK, this algorithm reduce a lot the

cache-coherence traffic and is good, so follow me please...

or non priority queue and set the flag for the first CAS to 1 for the

corresponding processor core this will allow a thread to enter the lock
, if there is no elements in the queue the thread will set the flag of

the second CAS to 1 and this will allow a thread to enter the lock.

aminer

unread,

Sep 24, 2013, 3:40:32 PM9/24/13

to

Hello,

Be smart please, to reduce a lot the cache-coherence traffic,
you have to choose a queue that reduces a lot the cache-coherence
traffic, and for that you have to code a queue as a linklist
(similar to to the MCS queue) that reduces a lot the cache-coherence
traffic, but you have to avoid the lockfree queues cause they will
higher the cache-coherence. This is how you have to code my distributed
LOCK and it will efficient and be good.

aminer

unread,

Sep 24, 2013, 3:44:51 PM9/24/13

to

I correct:

Hello,

Be smart please, to reduce a lot the cache-coherence traffic,
you have to choose a queue that reduces a lot the cache-coherence
traffic, and for that you have to code a queue as a linklist
(similar to to the MCS queue) that reduces a lot the cache-coherence
traffic, but you have to avoid the lockfree queues cause they will

higher the cache-coherence traffic. This is how you have to code my

distributed LOCK and it will efficient and be good.

Thank you,
Amine Moulay Ramdane.

On 9/24/2013 11:48 AM, aminer wrote:
>