HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out output file name (if any)
6 device out (6=stdout,7=stderr,file)
1 # of problems sizes (N)
15000 Ns
1 # of NBs
256 NBs
0 PMAP process mapping (0=Row-,1=Column-major)
1 # of process grids (P x Q)
2 Ps
1 Qs
16.0 threshold
3 # of panel fact
0 1 2 PFACTs (0=left, 1=Crout, 2=Right)
2 # of recursive stopping criterium
2 4 NBMINs (>= 1)
1 # of panels in recursion
2 NDIVs
3 # of recursive panel fact.
0 1 2 RFACTs (0=left, 1=Crout, 2=Right)
1 # of broadcast
0 BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1 # of lookahead depth
0 DEPTHs (>=0)
2 SWAP (0=bin-exch,1=long,2=mix)
64 swapping threshold
0 L1 in (0=transposed,1=no-transposed) form
0 U in (0=transposed,1=no-transposed) form
1 Equilibration (0=no,1=yes)
8 memory alignment in double (> 0)
The parameters used in HPL.dat are explained here
in detail:
http://www.netlib.org/benchmark/hpl/tuning.html
If, for some reason, you need to recompile your
HPL, you may use the instructions in
http://sgowtham.net/blog/2007/07/02/hpl-benchmark-for-single-processor-machines/
Hope this helps,
Best,
gowtham
--
Gowtham
Department of Physics
Michigan Technological University
Houghton, MI
This number hints at what's going on here. You have a threaded GotoBLAS.
> But when I changed the
> arguments (Ps or Qs or both, i.e. 2, 3) of HPL.dat and ran mpirun -np 2
> -machinefile ./machines ./xhpl, I got bad performance
...which then over-allocates your cores.
Try setting the number of threads per MPI-rank to 1 before mpirun:
$ export OMP_NUM_THREADS=1
$ mpirun -np X (X > 1...) ...
Also monitor what's actually running on your two nodes with top.
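To make the over-allocation concrete, here is a minimal Python sketch. It is not part of HPL or GotoBLAS; the function names and the 4-core node are assumptions chosen only for illustration. It multiplies the MPI ranks on a node by the BLAS threads each rank starts and compares the result with the cores available.

```python
def threads_per_node(ranks_per_node: int, threads_per_rank: int) -> int:
    """Total compute threads on one node when every MPI rank starts its own BLAS threads."""
    return ranks_per_node * threads_per_rank


def is_oversubscribed(ranks_per_node: int, threads_per_rank: int, cores_per_node: int) -> bool:
    """True when the node is asked to run more compute threads than it has cores."""
    return threads_per_node(ranks_per_node, threads_per_rank) > cores_per_node


# Hypothetical 4-core node (the real core count is not given in this thread).
print(is_oversubscribed(2, 4, 4))  # 2 ranks x 4 threads = 8 threads on 4 cores -> True
print(is_oversubscribed(2, 1, 4))  # 2 ranks x 1 thread  = 2 threads on 4 cores -> False
```

With a threaded BLAS, a single rank may already keep every core busy, so adding ranks without lowering the thread count can push the node past its core count.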
/Peter
> (about 10e-2
> GFLOPS). I don't know how to handle this problem. Attached are the
> machinefile and HPL.dat. Actually, I don't know the meaning of the parameters
> after the first 13 lines, so I always keep the defaults. Could anyone help me?
As I recall, I also set OMP_NUM_THREADS and GOTO_NUM_THREADS to 1
and passed them via "mpirun -x ..."
Getting the HPL parameters set correctly to maximize your teraflops
rating is a bit of an art... here are a couple of links that I found
handy, though I continued to tweak beyond this... you want to ratchet
some of your parameters up and monitor memory usage to the point where
you're using all available memory without paging...
http://www.advancedclustering.com/faq/how-do-i-tune-my-hpldat-file.html
http://www.intel.com/support/performancetools/sb/CS-025964.htm
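As a starting point for that tweaking, here is a rough Python sketch for picking N from the memory available. The function name, the 80% memory fraction, and the sample 2 x 4 GiB cluster are assumptions for illustration, not values from this thread. It sizes an N x N matrix of 8-byte doubles to fit the chosen fraction of total memory and rounds N down to a multiple of NB.

```python
import math


def suggest_n(total_mem_bytes: int, nb: int, mem_fraction: float = 0.8) -> int:
    """Largest multiple of nb whose N x N matrix of doubles fits in mem_fraction of memory."""
    if not 0.0 < mem_fraction <= 1.0:
        raise ValueError("mem_fraction must be in (0, 1]")
    if nb < 1:
        raise ValueError("nb must be >= 1")
    max_elements = int(total_mem_bytes * mem_fraction) // 8  # 8 bytes per double
    n = math.isqrt(max_elements)                             # largest N with N*N <= max_elements
    return (n // nb) * nb                                    # round down to a multiple of NB


# Hypothetical cluster: 2 nodes with 4 GiB each, NB = 256 as in the HPL.dat above.
total = 2 * 4 * 1024**3
print(suggest_n(total, 256))
```

Treat the result as an upper starting value only; watch memory use during the run and lower N if the nodes begin to page.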
For the meaning of the HPL.dat parameters,
see the "TUNING" file that comes with HPL,
or the online documentation on the netlib HPL web pages:
http://www.netlib.org/benchmark/hpl/tuning.html
The FAQs may also help:
http://www.netlib.org/benchmark/hpl/faqs.html
Performance is highly dependent on N, NB, and (P,Q).
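For the (P,Q) part, a small Python sketch can list the candidate grids for a given number of MPI processes. The helper name is hypothetical, and the ordering follows the common rule of thumb of keeping P <= Q and the grid close to square; on some interconnects a flatter grid may do better, so the list is only a set of candidates to benchmark.

```python
import math


def process_grids(nprocs: int) -> list[tuple[int, int]]:
    """All (P, Q) with P * Q == nprocs and P <= Q, closest-to-square first."""
    if nprocs < 1:
        raise ValueError("nprocs must be >= 1")
    grids = [
        (p, nprocs // p)
        for p in range(1, math.isqrt(nprocs) + 1)
        if nprocs % p == 0
    ]
    # Sort by how far the grid is from square (Q - P), smallest gap first.
    return sorted(grids, key=lambda g: g[1] - g[0])


print(process_grids(2))  # [(1, 2)]
print(process_grids(8))  # [(2, 4), (1, 8)]
```

Each candidate goes into the Ps and Qs lines of HPL.dat, and P * Q must equal the process count given to mpirun -np.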
I hope this helps.
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------