Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

ParallelHashList was updated to version 1.3 ...

2 views
Skip to first unread message

aminer

unread,
May 17, 2012, 6:30:36 PM5/17/12
to

Hello,


In my previous version of ParallelHashList even if i have used
lock striping for the hash chains, synchronizing access to the counter
that computes the number of entries in the hashmap was reintroducing
the scalability problem of exclusive locking, this counter was called
a hot field because every mutative operation needs to access it.
In this new version 1.3 of parallelhashlist i have splited the global
counter to many counters to enhance the scalability... also i have changed
parallelhashlist to use only 100 lightweight MREWs
(multiple-readers -exclusive-writer)
this will lower the memory consumption and this will allow to parallelize
the writes and reads in separate chains , and also to parallelize the reads
in the same chain of the hashmap , so this will give a good performance
and a good scalability .

Description:

A Parallel HashList with O(1) (best case) and O(log(n)(worst case)
access that use a hash based method that uses lock striping and
100 lightweight MREWs (multiple-readers -exclusive-writer).
This will allow to parallelize the writes and reads in separate chains , and
also to parallelize the reads in the same chain.


You can download parallelhashlist from:

http://pages.videotron.com/aminer/


Language: FPC Pascal v2.2.0+ / Delphi 5+: http://www.freepascal.org/


Operating Systems: Win , Linux and Mac (x86).

and please take a look at the benchmarks here:

http://pages.videotron.com/aminer/parallelhashlist/queue.htm

Note: When i have done those benchmarks , there was not enough/much items
organized as a self-balancing tree in the individual chains of the
hashtable, so ,
almost all the items was found and inserted in O(1) , so the parallel part
in the
Amdahl equation was not much bigger compared to to the serial part. But you
will notice in pratice that as soon as you will have more items on the
chains of
the Hashtable, organized as self-balancing tree, with a worst case log(n) ,
the
parallel part will become bigger in the Amdahl equation and you will have
better
performance and scalability than the numbers in the graph of the benchmarks
...



Thank you.

Amine Moulay Ramdane.




aminer

unread,
May 17, 2012, 7:00:40 PM5/17/12
to

Hello again,


I have designed the Iterate() method to be scalable.

The iterate method calls the AIterateFunc function on each item in the hash

And you can use Iterate like this:

First declare two variables like this

var
:
IterFunc: TIterateFunc;
MyKey:pchar;


And after that declare your TIterateFunc like this:

---

function Iter(AUserData: Pointer; const AStr: string;
var APtr: Pointer): Boolean;

begin

Iter:=true;

writeln(Astr);

if Astr=Pchar(AUserData) then Iter:=false;

end;

---

and after that you add for example the following in the main body:



---

IterFunc:=Iter;

MyKey:='amine moulay ramdane 1';

hash.Iterate(MyKey,IterFunc);

---



That's all, i have just put a test1.pas example that uses the iterate()
method

inside the zipfile, you have to uncomment three lines in the test1.pas
example

to be able to use this method , please download again the zipfile from:

http://pages.videotron.com/aminer/





Thank you.

Amine Moulay Ramdane.



"aminer" <ami...@videotron.ca> wrote in message
news:jp3qm5$hb0$1...@dont-email.me...

aminer

unread,
May 17, 2012, 7:28:24 PM5/17/12
to


I wrote:
> I have designed the Iterate() method to be scalable.


I mean parallel.


Amine Moulay Ramdane.



"aminer" <ami...@videotron.ca> wrote in message
news:jp3vuq$iup$2...@dont-email.me...

aminer

unread,
May 17, 2012, 8:17:38 PM5/17/12
to


Hello

You can use the Parallel Iterate() method of parallelhashlist
when also you want to do a sort by keys...



Amine Moulay Ramdane.






"aminer" <ami...@videotron.ca> wrote in message
news:jp3vuq$iup$2...@dont-email.me...

aminer

unread,
May 17, 2012, 8:21:29 PM5/17/12
to


When do we use parallelhashlist ?

"We use hash tables when their magic fits our problem.

For example, caching frequently ends up using a hash table -- for example,
let's say we have 45,000 students in a university and some process needs to
hold on to records for all of them. If you routinely refer to student by ID
number, then a ID => student cache makes excellent sense. The operation you
are optimizing for this cache is fast lookup.

Hashes are also extraordinarily useful for storing relationships between
data when you don't want to go whole hog and alter the objects themselves.
For example, during course registration, it might be a good idea to be able
to relate students to the classes they are taking. However, for whatever
reason you might not want the Student object itself to know about that. Use
a studentToClassRegistration hash and keep it around while you do whatever
it is you need to do.

They also make a fairly good first choice for a data structure except when
you need to do one of the following:"






"aminer" <ami...@videotron.ca> wrote in message
news:jp3qm5$hb0$1...@dont-email.me...
>

aminer

unread,
May 17, 2012, 8:58:19 PM5/17/12
to

Hello again,

One final touch to parallelhashlist: i have just added this
to the constructor:

if AHashSize < 1000
then
begin
writeln('Constructor''s Hash size argument must be superior or equal to
1000');
exit;
end;



So as you have noticed the hash size argument must be superior or equal to
1000


Please download again the parallelhashlist zipfile from:

http://pages.videotron.com/aminer/



Amine Moulay Ramdane.




"aminer" <ami...@videotron.ca> wrote in message
news:jp3qm5$hb0$1...@dont-email.me...
>

aminer

unread,
May 17, 2012, 9:02:22 PM5/17/12
to


I wrote:
>One final touch to parallelhashlist


I mean a final touch to version 1.3



Amine Moulay Ramdane.


otron.ca> wrote in message news:jp46rd$slo$2...@dont-email.me...

aminer

unread,
May 17, 2012, 10:01:59 PM5/17/12
to


When i wrote:

"This will allow to parallelize the writes and reads in separate chains ,
and also to parallelize the reads in the same chain."

I was speaking about the previous version...

Here is the new description of parallelhashlist:


Description:

A parallel HashList with O(1) (best case) and O(log(n)(worst case)
access that use a hash based method that uses lock striping and
100 lightweight MREWs (multiple-readers -exclusive-writer) .
This allows multiple threads to write and read concurently.



Thank you.


Amine Moulay Ramdane.









"aminer" <ami...@videotron.ca> wrote in message
news:jp3qm5$hb0$1...@dont-email.me...
>

aminer

unread,
May 17, 2012, 10:23:37 PM5/17/12
to

I am using lock striping with 100 lightweight MREWs
(multiple-readers -exclusive-writer)

this allows up to 100 parallel writes, but this upper bound on writes will
not affect parallel reads...



Amine Moulay Ramdane.



"aminer" <ami...@videotron.ca> wrote in message
news:jp3qm5$hb0$1...@dont-email.me...
>
0 new messages