
Please read again

Ramine

Dec 11, 2014
I have corrected some typos, please read again...

Hello,



As you may have noticed, I have just come up with a new algorithm for my
parallel conjugate gradient solver library that is NUMA-aware, and this
is really a big improvement over my previous algorithm. My new algorithm
contains two parts that are the most expensive: a multiplication of a
vector by the transpose of a matrix, and a multiplication of a vector by
a matrix.

When I parallelized my previous algorithm, I parallelized only the data
transfer from an L2 cache-line hit to the CPU, which costs around 10 CPU
cycles for every double, and I also parallelized the multiplication and
addition of doubles. But this was not enough, because we also have to
parallelize the data transfers from main memory to the L2 cache. This is
what we call a NUMA-aware algorithm, one that really scales on a NUMA
architecture, and this is what I have done in my new algorithm: the data
transfers from main memory to the L2 cache are also parallelized, and
this has made my new algorithm NUMA-aware and really scalable on NUMA
architectures. But to be NUMA-aware you also need to allocate the memory
of the arrays of your matrix on different NUMA nodes, and that's easy to
do.

My Parallel conjugate gradient solver library supports dense matrices.
It is a library that solves linear systems of equations, including large
and very large dense linear systems.


You can download the Parallel conjugate gradient solver library from:


https://sites.google.com/site/aminer68/parallel-implementation-of-conjugate-gradient-linear-system-solver


Thank you,
Amine Moulay Ramdane.

