Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

There is a solution to satisfy the energy efficiency requirement

4 views
Skip to first unread message

aminer

unread,
Oct 12, 2013, 9:55:26 PM10/12/13
to

Hello,

I wrote:
"Question:
But Amine, why do you want to satisfy many requirements for your FIFO
queue, such
as minimizing efficiently the cache-coherence traffic,
and the FIFO fairness, and also the energy efficiency ? why
the energy efficiency ?
answer:
Energy efficieny is also important, cause we must not think just
for today , we must think for the future when there will be many
more cores, this requirement of energy efficiency will become much more
important, so as i told you it's sad that those waitfree and lockfree
FIFO queue algorithms must use spin-wait when there no item in the queue
and the threads must wait for new items etc. so as i told you, to
satisfy the requirement of energy efficiency we must use my FIFO fair
SemaCondvar or FIFO fair Semaphores inside the FIFO queue, but since my
SemaCondvar
and Smeaphores are slow , this will slow the FIFO queue, and that's sad."



I think there is a solution to satisfy the energy efficiency requirement
, you have to use my manual event object that is supported by FreePascal
and Delphi and that is portable,
so after you push an item in the queue you have to call SetEvent() of
the manual event object, and in the pop() side you have to use something
like this:

--

if self.count <> 0
then continue;

if self.count=0
then
begin
self.event.waitfor(INFINITE);
self.event.resetevent;
end;
end;

---

This method is more energy efficient, so when there is no items in the
queue the push() method will wait on the manual event object so
it will not use CPU ressources like with the spin-wait mechanism,
and i think you have to avoid Semaphores cause semaphores are slow.

Here is the complete source code:

===


unit Lockfree_MPMC;

{$IFDEF FPC}
{$ASMMODE intel}
{$ENDIF}


interface

uses
syncobjs,
{$IFDEF Delphi}expbackoff,sysutils;{$ENDIF}
{$IFDEF FPC}expbackoff,sysutils;{$ENDIF}

{$I defines.inc}

const margin=1000; // limited to 1000 threads...
{$IFDEF CPU64}
INFINITE = qword($FFFFFFFFFFFFFFFF);
{$ENDIF CPU64}
{$IFDEF CPU32}
INFINITE = longword($FFFFFFFF);
{$ENDIF CPU32}
type
{$IFDEF CPU64}
long = qword;
{$ENDIF CPU64}
{$IFDEF CPU32}
long = longword;
{$ENDIF CPU32}


tNodeQueue = tObject;
typecache1 = array[0..15] of longword;

// TLockfree_MPMC = class(TFreelist)
TLockfree_MPMC = class
private
tail:long;
tmp1:typecache1;
head: long;
fMask : long;
fSize : long;
temp:long;
backoff1,backoff2:texpbackoff;
event:TSimpleEvent;
tab : array of tNodeQueue;
procedure setobject(lp : long;const aobject : tNodeQueue);
function getLength:long;
function getSize:long;
function getObject(lp : long):tNodeQueue;
public
{$IFDEF CPU64}
constructor create(aPower : int64 =20); {allocate tab with size
equal 2^aPower, for 20 size is equal 1048576}
{$ENDIF CPU64}
{$IFDEF CPU32}
constructor create(aPower : integer =20); {allocate tab with size
equal 2^aPower, for 20 size is equal 1048576}
{$ENDIF CPU32}


destructor Destroy; override;
function push(tm : tNodeQueue):boolean;
function pop(var obj:tNodeQueue;wait:boolean=false):boolean;
property length : long read getLength;
property count: long read getLength;
property size : long read getSize;

end;


implementation

function LockedIncLong(var Target: long): long;
asm
{$IFDEF CPU32}
// --> EAX Target
// <-- EAX Result
MOV ECX, EAX
MOV EAX, 1
//sfence
LOCK XADD [ECX], EAX
inc eax
{$ENDIF CPU32}
{$IFDEF CPU64}
// --> RCX Target
// <-- EAX Result
MOV rax, 1
//sfence
LOCK XADD [rcx], rax
INC rax
{$ENDIF CPU64}
end;

function CAS(var Target:long;Comp ,Exch : long): boolean;assembler;stdcall;
asm
{$IFDEF CPU64}
mov rax, comp
lock cmpxchg [Target], Exch
setz al
{$ENDIF CPU64}
{$IFDEF CPU32}
mov eax, comp
mov ecx,Target
mov edx,exch
lock cmpxchg [ecx], edx
setz al

{$ENDIF CPU32}

end; { CAS }



{function CAS(var Target: long; Comperand: long;NewValue: long ):
boolean; assembler;stdcall;
asm
mov ecx,Target
mov edx,NewValue
mov eax,Comperand
//sfence
lock cmpxchg [ecx],edx
JNZ @@2
MOV AL,01
JMP @@Exit
@@2:
XOR AL,AL
@@Exit:
end;}
{$IFDEF CPU64}
constructor TLockfree_MPMC.create(aPower : int64 );
{$ENDIF CPU64}
{$IFDEF CPU32}
constructor TLockfree_MPMC.create(aPower : integer );
{$ENDIF CPU32}


begin
if aPower < 10
then
begin
writeln('Constructor argument must be greater or equal to 10');
halt;
end;
if (aPower < 0) or (aPower > high(integer))
then
begin
writeln('Constructor argument is incorrect');
halt;
end;
backoff1:=texpbackoff.create(1,2);
backoff2:=texpbackoff.create(1,2);
{$IFDEF CPU64}
fMask:=not($FFFFFFFFFFFFFFFF shl aPower);{$ENDIF CPU64}
{$IFDEF CPU32}
fMask:=not($FFFFFFFF shl aPower);
{$ENDIF CPU32}

fSize:=(1 shl aPower) - margin;
setLength(tab,1 shl aPower);
tail:=0;
head:=0;
temp:=0;
Event := TSimpleEvent.Create;

end;

destructor TLockfree_MPMC.Destroy;

begin
event.free;
backoff1.free;
backoff2.free;
setLength(tab,0);
inherited Destroy;
end;


procedure TLockfree_MPMC.setObject(lp : long;const aobject : tNodeQueue);
begin
tab[lp and fMask]:=aObject;
end;

function TLockfree_MPMC.getObject(lp : long):tNodeQueue;
begin
result:=tab[lp and fMask];
end;


function TLockfree_MPMC.push(tm : tNodeQueue):boolean;
var lasttail,newtemp:long;
i,j:integer;
begin

if getlength >= fsize
then
begin
result:=false;
exit;
end;
result:=true;

newTemp:=LockedIncLong(temp);

lastTail:=newTemp-1;
//asm mfence end;
setObject(lastTail,tm);

repeat

if CAS(tail,lasttail,newtemp)
then
begin
event.setevent;
exit;
end;
sleep(0);
until false;

end;


function TLockfree_MPMC.pop(var obj:tNodeQueue;wait:boolean=false):boolean;

var lastHead : long;
begin
repeat
lastHead:=head;
if tail<>head
then
begin
obj:=getObject(lastHead);

if CAS(head,lasthead,lasthead+1)
then
begin
result:=true;
exit;
end
else sleep(0); //backoff2.delay;

end
else
begin
if wait=false
then
begin
result:=false;
exit;
end;

if self.count <> 0
then continue;

if self.count=0
then
begin
self.event.waitfor(INFINITE);
self.event.resetevent;
end;
end;
until false;
end;

function TLockfree_MPMC.getLength:long;
var head1,tail1:long;
begin
head1:=head;
tail1:=tail;
if tail1 < head1
then result:= (High(long)-head1)+(1+tail1)
else result:=(tail1-head1);
end;

function TLockfree_MPMC.getSize:long;

begin
result:=fSize;
end;

end.
===




Thank you,
Amine Moulay Ramdane.




aminer

unread,
Oct 12, 2013, 9:57:57 PM10/12/13
to

Hello.

So as you have noticed this method that i am using is
more efficient cause you will not wait on the manual event object until
self.count = 0.


Thank you,
Amine Moulay Ramdane.

aminer

unread,
Oct 12, 2013, 10:28:23 PM10/12/13
to
On 10/12/2013 6:55 PM, aminer wrote:
> if self.count=0
> then
> begin
> self.event.waitfor(INFINITE);
> self.event.resetevent;
> end;
> end;


This method doesn't work , cause if a thread1 was on
"self.event.waitfor(INFINITE)" and another thread2 corossed
the "if self.count=0" , and if thread2 has not arrive yet to
"self.event.waitfor(INFINITE)" , and thread1 has
received two setevent() from the push() method and it has called after
that "self.event.resetevent" , so thread2 will stop and wait
on "self.event.waitfor(INFINITE)" even if there is items in the queue,
so this is a problem.



I will still think more about a method to satisfy the efficency
requirement without using SemaCondvar or Semaphore cause they are slow.
0 new messages