Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

ParallelSort library version 3.0

3 views
Skip to first unread message

aminer

unread,
Oct 24, 2012, 11:08:10 AM10/24/12
to
Hello,

Parallel Sort Library was updated to version 3.0.

In this new version i have implemented a Parallel hybrid divide-and-conquer
merge algorithm that performs 0.9-5.8 times better than sequential merge, on
a quad-core processor, with larger arrays outperforming by over 5 times.
Parallel processing combined with a hybrid algorithm approach provides a
powerful high performance result.


Description:

Parallel Sort Library that supports Parallel Quicksort, Parallel HeapSort
and Parallel MergeSort on Multicores systems.

Parallel Sort Library uses my Thread Pool Engine and quicksort many array
parts - of your array - in parallel using Quicksort or HeapSort or
MergeSort and after that it finally merge them - with the merge()
procedure -

In the previous parallelsort version i have parallelized only the sort part,
but in this new parallelsort version i have parallelized also the merge
procedure part and it gives better performance.

I have implemented a Parallel hybrid divide-and-conquer merge algorithm that
performs 0.9-5.8 times better than sequential merge, on a quad-core
processor, with larger arrays outperforming by over 5 times. Parallel
processing combined with a hybrid algorithm approach provides a powerful
high performance result.

And as you know , Quicksort is a divide and conquer algorithm that have the
following best case performance:

T(n) = T(n/2) + T(n/2) + O(n)
= 2T(n/2) + (n)

And from case 2 of Master theorem, the recurrence equation gives:

T(n) = (n lg n)


Parallelizing the Sorts

One way to parallelize the sorts is:

Divide the data among the processors
Sort the data on the individual processors.
Merge the various data

Note that the merge operation is a reduction operation !


And please look at test.pas inside the zip, a parallel quicksort on many
cores, you have to compile and run gendata.pas before running test.pas, and
please set the number of threads to 8 times the number of cores to be more
optimal, i have giving you an exemple on how to do it , look inside
test.pas...

You can download ParallelSort library from:

http://pages.videotron.com/aminer/



Thank you,
Amine Moulay Ramdane.



aminer

unread,
Oct 24, 2012, 11:49:29 AM10/24/12
to
Hello,


For exemple i have added the following in ParallelSort library:


The idea:

Let's assume we want to merge sorted arrays X and Y. Select X[m]
median element in X. Elements in X[ .. m-1] are less than or equal to
X[m]. Using binary search find index k of the first element in Y greater
than X[m]. Thus Y[ .. k-1] are less than or equal to X[m] as well.
Elements in X[m+1..] are greater than or equal to X[m] and Y[k .. ]
are greater. So merge(X, Y) can be defined as
concat(merge(X[ .. m-1], Y[ .. k-1]), X[m], merge(X[m+1.. ], Y[k .. ]))
now we can recursively in parallel do merge(X[ .. m-1], Y[ .. k-1]) and
merge(X[m+1 .. ], Y[k .. ]) and then concat results.


I have implemented a Parallel hybrid divide-and-conquer
merge algorithm that performs 0.9-5.8 times better than sequential merge, on
a quad-core processor, with larger arrays outperforming by over 5 times.
Parallel processing combined with a hybrid algorithm approach provides a
powerful high performance result.


And now ParallelSort library gives better performance and scalability.


Please look at the source code to understand more...

You can download ParallelSort library from:

http://pages.videotron.com/aminer/



Thank you,
Amine Moulay Ramdane.



"aminer" <ami...@toto.com> wrote in message
news:k68sh0$ca2$1...@dont-email.me...

aminer

unread,
Oct 24, 2012, 11:57:15 AM10/24/12
to
Hello,

My Parallel BucketSort works with strings only at the moment...

But ParallelSort library works with all sort of types, like
Integers , strings, doubles etc.


I have tried to bring you the best, so i hope that you will
be happy with the new version of ParallelSort library.


You can download ParallelSort library and all my other parallel libraries
from:

http://pages.videotron.com/aminer/



Thank you,
Amine Moulay Ramdane.



"aminer" <ami...@toto.com> wrote in message
news:k68sh0$ca2$1...@dont-email.me...

aminer

unread,
Oct 24, 2012, 12:25:30 PM10/24/12
to
Hello,

Here is what have changed in version 3.0:

I have implemented a Binary search like this:

==
function BinSearch(var a: TTabPointer; value:pointer;left,
right:long;SCompare: TSortCompare): Integer;
var
First: long;
Last: long;
Pivot: long;
Found: Boolean;

begin

First := left;
Last := right;
Found := False;
Result := -1;

while (First <= Last) and (not Found) do
begin
Pivot := (First+Last)div 2;
if scompare(a[Pivot],value)=0 then
begin
Found := True;
Result := Pivot;
end
else if scompare(a[Pivot],value)>0 then
Last := Pivot - 1
else
First := Pivot + 1;
end;
end;

==

And i have added a ne SequentialMerge like this:

===

Procedure SequentialMerge(var to1:TTabPointer; var temp:TTabPointer; lowX,
highX,lowY, highY, lowTo:long;SCompare: TSortCompare);

var highto:long;
begin
highTo := lowTo + highX - lowX + highY - lowY + 1;

for lowto:=lowto to highto
do
begin

if (lowX > highX)
then
begin
to1[lowTo] := temp[lowY];
inc(lowY);
end
else if (lowY > highY)
then
begin
to1[lowTo] := temp[lowX];
inc(lowX);
end
else
begin
if scompare(temp[lowX],temp[lowY])<0
then
begin
to1[lowTo]:=temp[lowX];
inc(lowX);
end
else
begin
to1[lowTo]:=temp[lowY];
inc(lowY);
end;

end;
end;
end;

===


And i have added the TParallelSort.merge_parallel1()
method that will be called in parallel. like this:


===
procedure TParallelSort.merge_parallel1(job:pointer);

begin
merge_parallel(tjob2(job).t,tjob2(job).p1,tjob2(job).r1,tjob2(job).p2,tjob2(job).r2,tjob2(job).a,tjob2(job).p3,tjob2(job).comp,tjob2(job).value,tjob2(job).event);
lockeddeclong1(tjob2(job).value);
if long(tjob2(job).value^)=0 then tjob2(job).event.setevent;
tjob2(job).free;

end;

==


And i have added a TParallelSort.merge_parallel() method that
uses my threadpool engine to call the TParallelSort.merge_parallel1 method:


===

procedure TParallelSort.merge_parallel(var t:TTabPointer; p1, r1:long;
p2, r2:long;var a:TTabPointer; p3:long;SCompare:
TSortCompare;value:pointer;event:TSimpleEvent);

var length1,length2,tmp,q1,q2,q3:long;
mergeLeft,mergeRight:TJob2;
begin
length1 := (r1 - p1) + 1;
length2 := (r2 - p2) + 1;

if ( length1 = 0 ) then exit;
//if ((( length1 + length2 ) <= 8000) ) or (long(value^)>=(nbrprocs1-1))
then
if (( length1 + length2 ) <= 8000) then

begin
SequentialMerge(a, t, p1, r1, p2, r2, p3,scompare);
exit;
end;
if (length1 < length2) then
begin
merge_parallel(t, p2,r2,p1, r1, a,p3,SCompare,value,event);
exit;
end;


q1 := (p1 +r1) div 2;

q2 :=my_binary_search( t,t[q1 ], p2, r2,SCompare);
// q2 := binsearch( t,t[q1 ], p2, r2,SCompare);
{if (q2=-1)
then
begin
writeln('titi');
SequentialMerge(a, t, p1, r1, p2, r2, p3,scompare);
exit;
end;}
q3 := p3 + ((q1 - p1)) + ((q2 - p2)) ;
a[q3] := t[q1];



//mergeLeft:=TParallelMerge.create();
mergeLeft:=TJob2.create();

mergeLeft.t:=t;
mergeLeft.p1:=p1;
mergeLeft.r1:=q1-1;
mergeLeft.p2:=p2;
mergeLeft.r2:=q2-1;
mergeLeft.a:=a;
mergeLeft.p3:=p3;
//mergeLeft.obj:=self;
mergeLeft.comp:=scompare;
mergeleft.value:=value;
mergeLeft.event:=event;
lockedinclong1(value);

{ if (long(value^)>=(nbrprocs1-2)) then
begin
lockeddeclong1(value);
SequentialMerge(a, t, p1, r1, p2, r2, p3,scompare);
mergeleft.free;
exit;
end;}


mergeRight:=TJob2.create();

mergeRight.t:=t;
mergeRight.p1:=q1+1;
mergeRight.r1:=r1;
mergeRight.p2:=q2;
mergeRight.r2:=r2;
mergeRight.a:=a;
mergeRight.p3:=q3+1;
mergeRight.comp:=scompare;
mergeRight.value:=value;
mergeRight.event:=event;

lockedinclong1(value);
{if (long(value^)>=(nbrprocs1-2)) then
begin
lockeddeclong1(value);
SequentialMerge(a, t, p1, r1, p2, r2, p3,scompare);
mergeRight.free;
exit;
end;}


TP.execute(self.merge_parallel1,pointer(mergeLeft));
TP.execute(self.merge_parallel1,pointer(mergeRight));


end;

===




And i have changed the TParallelSort.parallelmerge() method
, now it calls TParallelSort.merge_parallel((), not merge1(), like this:


==

procedure TParallelSort.parallelmerge(obj:pointer) ;

var

local_count:long;
value:long;
event1:TSimpleEvent;
i:long;
begin
event1:=tsimpleevent.create;
value:=0;

//Merge1(Tjob1(obj).a1,Tjob1(obj).a2,Tjob1(obj).Low,Tjob1(obj).Mid,Tjob1(obj).Hi,Tjob1(obj).comp);
merge_parallel(Tjob1(obj).a1,Tjob1(obj).Low,Tjob1(obj).Mid,Tjob1(obj).Mid+1,Tjob1(obj).Hi,Tjob1(obj).a2,0,Tjob1(obj).comp,@value,event1);

event1.waitfor(INFINITE);
event1.resetevent;


for i:=Tjob1(obj).Low to Tjob1(obj).Hi
do
begin
Tjob1(obj).a1[i]:=Tjob1(obj).a2[i-Tjob1(obj).Low]
end;

setlength(Tjob1(obj).a2,0);

local_count:=LockedIncLong(count1);

if local_count=TJob1(obj).count then Tjob1(obj).event.setevent;


TJob1(obj).free;
event1.free;

end;


===




Thank you,
Amine Moulay Ramdane.


"aminer" <ami...@toto.com> wrote in message
news:k68sh0$ca2$1...@dont-email.me...
0 new messages