linpack benchmark

Dr Beco

Nov 3, 2012, 2:50:01 PM
Dear linuxers,


How do I run a LINPACK benchmark on my computer?

I was able to compile and run this old code (parts from 1978!)
http://www.netlib.org/benchmark/linpackc
but I'm afraid it may be inaccurate or unable to measure multiple
cores.

So I looked for more recent code and, to my surprise, found this
Debian package:

$ dpkg -s hpcc
Package: hpcc
Status: install ok installed
Priority: extra
Section: science
Installed-Size: 1568
Maintainer: Debian Science Maintainers
<debian-scienc...@lists.alioth.debian.org>
Architecture: amd64
Version: 1.4.1-2
Depends: libatlas3gf-base, libc6 (>= 2.7), libopenmpi1.3, mpi-default-bin
Description: HPC Challenge benchmark
The High Performance Computing (HPC) Challenge benchmark runs a suite
of 7 tests that measure the performance of CPU, memory and network for
HPC clusters. Amongst others, it includes the High-Performance LINPACK
(HPL) benchmark, used by the Top500 ranking (http://www.top500.org/).
Homepage: http://icl.cs.utk.edu/hpcc/


But there is no manual page, nor can I find instructions for running it.
Simply running hpcc gives an error:

$ hpcc
HPL WARNING from process # 0, on line 313 of function HPL_pdinfo:
>>> cannot open file hpccinf.txt <<<


Is there a guideline? I just want to see the FLOPS, nothing too
complex, I presume.
And, of course, I want to configure the test to use the full
capabilities of the machine (an Intel i5). I suspect hpccinf.txt is for
that, isn't it?


BTW, from the compiled program, most of the time I get:


$ ./linpakc
Rolled Double Precision Linpack

Rolled Double Precision Linpack

norm. resid  resid            machep           x[0]-1           x[n-1]-1
1.7          7.41628980e-14   2.22044605e-16   -1.49880108e-14  -1.89848137e-14
times are reported for matrices of order 100
dgefa    dgesl    total    kflops   unit     ratio
times for array with leading dimension of 201
0.00     0.00     0.00     inf      0.00     0.00
0.00     0.00     0.00     inf      0.00     0.00
0.00     0.00     0.00     inf      0.00     0.00
0.00     0.00     0.00     858333   0.00     0.01
times for array with leading dimension of 200
0.00     0.00     0.00     inf      0.00     0.00
0.00     0.00     0.00     inf      0.00     0.00
0.00     0.00     0.00     inf      0.00     0.00
0.00     0.00     0.00     1716667  0.00     0.01
Rolled Double Precision 858333 Kflops ; 10 Reps


but sometimes the number is:
Rolled Double Precision 1716667 Kflops ; 10 Reps
and once in a while I even get a negative value, which raised
suspicions about the code.

$ ./linpakc
norm. resid resid machep x[0]-1 x[n-1]-1
1.7 7.41628980e-14 2.22044605e-16 -1.49880108e-14 -1.89848137e-14
Rolled Double Precision -2147483648 Kflops ; 10 Reps


compiled with:

gcc linpakc.c -o linpakc -DDP -DROLL -O4 -lm


Thanks,
Beco.




--
Dr Beco
A.I. researcher

--> . <--

"Look again at that dot. That's here. That's home. That's us. On it
everyone you love, everyone you know, everyone you ever heard of,
every human being who ever was, lived out their lives. The aggregate
of our joy and suffering, thousands of confident religions,
ideologies, and economic doctrines, every hunter and forager, every
hero and coward, every creator and destroyer of civilization, every
king and peasant, every young couple in love, every mother and father,
hopeful child, inventor and explorer, every teacher of morals, every
corrupt politician, every "superstar", every "supreme leader", every
saint and sinner in the history of our species lived there - on a mote
of dust suspended in a sunbeam." (Carl Sagan, 1934-1996)


--
To UNSUBSCRIBE, email to debian-us...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listm...@lists.debian.org
Archive: http://lists.debian.org/CALuYw2x4XO75NA3HEbz+SuSA...@mail.gmail.com

Dr Beco

Nov 4, 2012, 11:50:02 AM
Hi,

After running LINPACK from the Debian hpcc package, I find that some of
my computers report very different FLOPS from the figures given by the
old code cited before.

Also, even without any hpccinf.txt it is still possible to gather
information about your computer. It is saved in a file named
hpccoutf.txt.

Here (at the end of this email) are some results (I cut the file to
focus on flops and CPU time).

There is a lot of information, but documentation on it is hard to find.

$ man hpcc
shows: No manual entry for hpcc

With so many processors out there, each one installed in a different
system with different amount of RAM, kinds of disks (normal, SSD,
flash, etc.), a benchmark like linpack came to fill a gap when
comparing systems (not only processors). It's also a very important
part of computing history, if you look at the list of old test results
per year.

I hope someone can shed some light on this topic.

Thanks,
Beco.

PS. Some parts of the output file:


$ cat hpccoutf.txt
########################################################################
This is the DARPA/DOE HPC Challenge Benchmark version 1.4.1 October 2003
Produced by Jack Dongarra and Piotr Luszczek
Innovative Computing Laboratory
University of Tennessee Knoxville and Oak Ridge National Laboratory
[...cut...]


Begin of MPIRandomAccess section.
Running on 1 processors (PowerofTwo)
CPU time used = 2.068129 seconds

Begin of StarRandomAccess section.
CPU time used = 0.256016 seconds
Average GUP/s 0.065661

Begin of SingleRandomAccess section.
CPU time used = 0.260016 seconds
Single GUP/s 0.064915

Begin of MPIRandomAccess_LCG section.
Running on 1 processors (PowerofTwo)
CPU time used = 2.132133 seconds
Found 0 errors in 4194304 locations (passed).

Begin of StarRandomAccess_LCG section.
CPU time used = 0.256016 seconds
Found 0 errors in 4194304 locations (passed).
Average GUP/s 0.064661

Begin of SingleRandomAccess_LCG section.
CPU time used = 0.260016 seconds
Single GUP/s 0.064008

Begin of PTRANS section.
Finished 5 tests, with the following results:
5 tests completed and passed residual checks.
0 tests completed and failed residual checks.
0 tests skipped because of illegal input values.
END OF TESTS.
End of PTRANS section.

Begin of StarDGEMM section.
Average Gflop/s 15.281532

Begin of SingleDGEMM section.
Single DGEMM Gflop/s 16.926896

Begin of StarSTREAM section.
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 2201 microseconds. (= 2201
clock ticks)
-------------------------------------------------------------
Results Comparison:
Expected : 2519423615566406144.000000
503884723113281280.000000 671846297484375040.000000
Observed : 2519423615622480384.000000
503884723094127232.000000 671846297499976832.000000
Solution Validates
-------------------------------------------------------------
Node(s) with error 0
Average Copy GB/s 10.771604
Average Scale GB/s 11.293547
Average Add GB/s 12.187004
Average Triad GB/s 12.354061

Begin of SingleSTREAM section.
Each test below will take on the order of 2207 microseconds. (= 2207
clock ticks)
-------------------------------------------------------------
Results Comparison:
Expected : 2519423615566406144.000000
503884723113281280.000000 671846297484375040.000000
Observed : 2519423615622480384.000000
503884723094127232.000000 671846297499976832.000000
Solution Validates
-------------------------------------------------------------
Single STREAM Copy GB/s 10.874678
Single STREAM Scale GB/s 11.325829
Single STREAM Add GB/s 12.187004
Single STREAM Triad GB/s 12.354061

Begin of MPIFFT section.
Gflop/s: 1.463

Begin of StarFFT section.
Average Gflop/s 2.549725

Begin of SingleFFT section.
Single FFT Gflop/s 2.343712



================================================================================
HPLinpack 2.0 -- High-Performance Linpack benchmark -- September 10, 2008
Written by A. Petitet and R. Clint Whaley, Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0048654 ...... PASSED
================================================================================

Finished 1 tests with the following results:
1 tests completed and passed residual checks,
0 tests completed and failed residual checks,
0 tests skipped because of illegal input values.

End of HPL section.
Begin of Summary section.
VersionMajor=1
VersionMinor=4
VersionMicro=1
VersionRelease=f
LANG=C
Success=1
sizeof_char=1
sizeof_short=2
sizeof_int=4
sizeof_long=8
sizeof_void_ptr=8
sizeof_size_t=8
sizeof_float=4
sizeof_double=8
sizeof_s64Int=8
sizeof_u64Int=8
sizeof_struct_double_double=16
CommWorldProcs=1
MPI_Wtick=1.000000e-06
HPL_Tflops=0.0132394
HPL_time=0.845557
HPL_eps=1.11022e-16
HPL_RnormI=2.53869e-12
HPL_Anorm1=666.101
HPL_AnormI=663.835
HPL_Xnorm1=1494.04
HPL_XnormI=2.76479
HPL_BnormI=0.499975
HPL_N=2560
HPL_NB=80
HPL_nprow=1
HPL_npcol=1
HPL_depth=1
HPL_nbdiv=2
HPL_nbmin=4
HPL_cpfact=R
HPL_crfact=C
HPL_ctop=1
HPL_order=C
HPL_dMACH_EPS=1.110223e-16
HPL_dMACH_SFMIN=2.225074e-308
HPL_dMACH_BASE=2.000000e+00
HPL_dMACH_PREC=2.220446e-16
HPL_dMACH_MLEN=5.300000e+01
HPL_dMACH_RND=1.000000e+00
HPL_dMACH_EMIN=-1.021000e+03
HPL_dMACH_RMIN=2.225074e-308
HPL_dMACH_EMAX=1.024000e+03
HPL_dMACH_RMAX=1.797693e+308
HPL_sMACH_EPS=5.960464e-08
HPL_sMACH_SFMIN=1.175494e-38
HPL_sMACH_BASE=2.000000e+00
HPL_sMACH_PREC=1.192093e-07
HPL_sMACH_MLEN=2.400000e+01
HPL_sMACH_RND=1.000000e+00
HPL_sMACH_EMIN=-1.250000e+02
HPL_sMACH_RMIN=1.175494e-38
HPL_sMACH_EMAX=1.280000e+02
HPL_sMACH_RMAX=3.402823e+38
dweps=1.110223e-16
sweps=5.960464e-08
HPLMaxProcs=1
HPLMinProcs=1
DGEMM_N=1477
StarDGEMM_Gflops=15.2815
SingleDGEMM_Gflops=16.9269
PTRANS_GBs=0.712858
PTRANS_time=0.0182569
PTRANS_residual=0
PTRANS_n=1280
PTRANS_nb=80
PTRANS_nprow=1
PTRANS_npcol=1
MPIRandomAccess_LCG_N=4194304
MPIRandomAccess_LCG_time=4.33102
MPIRandomAccess_LCG_CheckTime=0.323538
MPIRandomAccess_LCG_Errors=0
MPIRandomAccess_LCG_ErrorsFraction=0
MPIRandomAccess_LCG_ExeUpdates=16777216
MPIRandomAccess_LCG_GUPs=0.00387373
MPIRandomAccess_LCG_TimeBound=-1
MPIRandomAccess_LCG_Algorithm=0
MPIRandomAccess_N=4194304
MPIRandomAccess_time=4.33569
MPIRandomAccess_CheckTime=0.319197
MPIRandomAccess_Errors=0
MPIRandomAccess_ErrorsFraction=0
MPIRandomAccess_ExeUpdates=16777216
MPIRandomAccess_GUPs=0.00386956
MPIRandomAccess_TimeBound=-1
MPIRandomAccess_Algorithm=0
RandomAccess_LCG_N=4194304
StarRandomAccess_LCG_GUPs=0.0646613
SingleRandomAccess_LCG_GUPs=0.0640075
RandomAccess_N=4194304
StarRandomAccess_GUPs=0.0656609
SingleRandomAccess_GUPs=0.0649147
STREAM_VectorSize=2184533
STREAM_Threads=1
StarSTREAM_Copy=10.7716
StarSTREAM_Scale=11.2935
StarSTREAM_Add=12.187
StarSTREAM_Triad=12.3541
SingleSTREAM_Copy=10.8747
SingleSTREAM_Scale=11.3258
SingleSTREAM_Add=12.187
SingleSTREAM_Triad=12.3541
FFT_N=1048576
StarFFT_Gflops=2.54973
SingleFFT_Gflops=2.34371
MPIFFT_N=524288
MPIFFT_Gflops=1.46342
MPIFFT_maxErr=1.39111e-15
MPIFFT_Procs=1
MaxPingPongLatency_usec=-1
RandomlyOrderedRingLatency_usec=-1
MinPingPongBandwidth_GBytes=-1
NaturallyOrderedRingBandwidth_GBytes=-1
RandomlyOrderedRingBandwidth_GBytes=-1
MinPingPongLatency_usec=-1
AvgPingPongLatency_usec=-1
MaxPingPongBandwidth_GBytes=-1
AvgPingPongBandwidth_GBytes=-1
NaturallyOrderedRingLatency_usec=-1
FFTEnblk=16
FFTEnp=8
FFTEl2size=1048576
M_OPENMP=-1
omp_get_num_threads=0
omp_get_max_threads=0
omp_get_num_procs=0
MemProc=64
MemSpec=-1
MemVal=-1
MPIFFT_time0=0
MPIFFT_time1=0.00453091
MPIFFT_time2=0.00598502
MPIFFT_time3=0.00165105
MPIFFT_time4=0.0168791
MPIFFT_time5=0.00342178
MPIFFT_time6=1.19209e-06
CPS_HPCC_FFT_235=0
CPS_HPCC_FFTW_ESTIMATE=0
CPS_HPCC_MEMALLCTR=0
CPS_HPL_USE_GETPROCESSTIMES=0
CPS_RA_SANDIA_NOPT=0
CPS_RA_SANDIA_OPT2=0
CPS_USING_FFTW=0
End of Summary section.
########################################################################
End of HPC Challenge tests.





Archive: http://lists.debian.org/CALuYw2xeKvLMLw9znvPHS0Riy8xr87BRox=jsQtz9U...@mail.gmail.com

Stan Hoeppner

Nov 4, 2012, 5:30:01 PM
On 11/4/2012 10:42 AM, Dr Beco wrote:

> With so many processors out there, each one installed in a different
> system with different amount of RAM, kinds of disks (normal, SSD,
> flash, etc.), a benchmark like linpack came to fill a gap when
> comparing systems (not only processors).

That is incorrect. HPL exercises only the CPU/memory subsystem (mostly
CPU). It doesn't test interconnect performance. The program "solves a
dense system of linear equations" and that's it. It is a "show off"
test which demonstrates maximal FP throughput of the processor.

The HPC Challenge is a collection of 7 programs, including HPL, that
perform a variety of tests to show more accurately the real-world
performance of a parallel HPC system.

Keep in mind that these tests are designed to show the differences in
performance among parallel supercomputers and compute clusters with
thousands to hundreds of thousands of cores. They are not targeted to
the type of single system "historical" performance data you seem to be
collecting. There are probably better programs available to meet your
needs.

Debian may have an HPCC package, but this list is not a good place to
ask for help as there are likely no HPC users/experts on the
debian-users list who are familiar with these things. I suggest you ask
on one of the Linux cluster lists, or simply read the HPCC documentation:

http://icl.cs.utk.edu/hpcc/

--
Stan


Archive: http://lists.debian.org/5096EC42...@hardwarefreak.com

Dr Beco

Nov 4, 2012, 6:10:02 PM
On Sun, Nov 4, 2012 at 7:29 PM, Stan Hoeppner <st...@hardwarefreak.com> wrote:
> That is incorrect. HPL exercises only the CPU/memory subsystem (mostly
> CPU). It doesn't test interconnect performance. The program "solves a
> dense system of linear equations" and that's it. It is a "show off"
> test which demonstrates maximal FP throughput of the processor.
>
>
> http://icl.cs.utk.edu/hpcc/
>
> --
> Stan


Hi Stan,

Thanks for your time, and for the link to the documentation. Now I see
there are 7 programs (tests) and I'm only interested in one, namely
HPL, that is, the use of a calculation (in this case a linear system,
but any would do the job) to estimate the FLOPS of a system.

I see it is like using a rocket to kill a fly, but I think I might be
able to understand and adapt the results, as long as they are stable
and consistent.

I'm finishing here. Any further questions I'll ask on a more
specialized list, as you pointed out.

Cheers,
Beco





Archive: http://lists.debian.org/CALuYw2zU9hGaBctX88t_dW4C...@mail.gmail.com