Compiling CP2K with AMD Libraries


Bich Phuong

May 22, 2019, 2:58:20 PM
to cp...@googlegroups.com
Hello,

I would like to compile CP2K with the AMD libraries and compare its performance against a build that uses OpenBLAS, LAPACK and ScaLAPACK,
since I have a system with 2 x EPYC 7601.

I checked install_cp2k_toolchain.sh and install_acml.sh, but ACML no longer exists (https://developer.amd.com/amd-cpu-libraries/), so I cannot use this tool.
I modified the local.sopt arch file created by install_cp2k_toolchain.sh (trying sopt first) by replacing the OpenBLAS path and -lopenblas. I tried linking BLIS, BLIS + libFLAME, and LibM, using both the prebuilt libraries and ones I compiled myself following this documentation: https://developer.amd.com/wp-content/resources/AMDCPULibrariesUserGuide_1.3.pdf. Nothing works.
The linker complains about undefined references to some BLAS functions.
-----------------------
/home/phuong/cp2k-6.1/lib/amd/sopt/libdbcsrops.a(dbcsr_operations.o): In function `__dbcsr_operations_MOD_dbcsr_init_random':
/home/phuong/cp2k-6.1/src/dbcsr/ops/dbcsr_operations.F:1104: undefined reference to `dlarnv_'
/home/phuong/cp2k-6.1/src/dbcsr/ops/dbcsr_operations.F:1104: undefined reference to `dlarnv_'
/home/phuong/cp2k-6.1/lib/amd/sopt/libdbcsrops.a(dbcsr_blas_operations.o): In function `__dbcsr_blas_operations_MOD_dbcsr_lapack_larnv':
/home/phuong/cp2k-6.1/src/dbcsr/ops/dbcsr_blas_operations.F:84: undefined reference to `zlarnv_'
/home/phuong/cp2k-6.1/src/dbcsr/ops/dbcsr_blas_operations.F:82: undefined reference to `clarnv_'
/home/phuong/cp2k-6.1/src/dbcsr/ops/dbcsr_blas_operations.F:78: undefined reference to `slarnv_'
/home/phuong/cp2k-6.1/src/dbcsr/ops/dbcsr_blas_operations.F:80: undefined reference to `dlarnv_'
collect2: error: ld returned 1 exit status
make[3]: *** [/home/phuong/cp2k-6.1/exe/amd/dbcsr_example_3.sopt] Error 1
make[3]: *** Waiting for unfinished jobs....
--------------------------

Does anyone have experience with this?
Which AMD library should I use in place of ACML?

Thank you.
amd.sopt
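A link log like the one above repeats the same symbol several times, so it can help to reduce it to the distinct missing routines. The following is a minimal sketch, not part of CP2K: the function names `missing_symbols` and `routine_families` are hypothetical, and the grouping assumes the usual BLAS/LAPACK convention of a one-letter precision prefix (s, d, c, z).

```python
import re

# Matches GNU ld messages of the form: undefined reference to `dlarnv_'
UNDEFINED = re.compile(r"undefined reference to `([^']+)'")


def missing_symbols(log):
    """Return the sorted, de-duplicated symbols the linker could not resolve."""
    return sorted(set(UNDEFINED.findall(log)))


def routine_families(symbols):
    """Group symbols such as dlarnv_ by routine name.

    Assumption: each symbol follows the BLAS/LAPACK naming scheme, i.e. a
    precision prefix (s, d, c or z), the routine name, and a trailing
    underscore added by the Fortran compiler.
    """
    families = {}
    for sym in symbols:
        name = sym.rstrip("_")
        if len(name) > 1 and name[0] in "sdcz":
            families.setdefault(name[1:], set()).add(name[0])
    return {routine: sorted(prefixes) for routine, prefixes in families.items()}
```

Applied to the log in this post, all four missing symbols collapse into the single routine family `larnv`, which narrows the search to whichever library is supposed to provide that routine.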

Alfio Lazzaro

May 23, 2019, 4:08:26 AM
to cp2k
Hello Bich Phuong,
larnv is a LAPACK function. However, I don't see any LAPACK library in your arch file:

LIBS        = -lblis -lxsmmf -lxsmm -ldl -lpthread -lsmm_dnn -lxcf03 -lxc -lderiv -lint -lfftw3 -lstdc++



BTW, I see that you are using both libxsmm and libsmm; they are mutually exclusive. You can just use libxsmm.

Alfio
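The two observations in this reply (no LAPACK library on the LIBS line, and libxsmm linked together with libsmm) can be checked mechanically. The sketch below is only an illustration: `libs_from_arch`, `check_libs` and the `LAPACK_PROVIDERS` list are hypothetical names, and the list of name fragments is an assumption, not an exhaustive catalogue of libraries that ship LAPACK routines.

```python
import re

# Assumption: illustrative name fragments of libraries that commonly provide
# LAPACK routines such as dlarnv. Extend this for your own installation.
LAPACK_PROVIDERS = ("lapack", "flame", "openblas", "mkl")


def libs_from_arch(text):
    """Return the names passed via -l on the LIBS line(s) of an arch file."""
    libs = []
    for line in text.splitlines():
        if re.match(r"\s*LIBS\s*[+:]?=", line):
            libs += re.findall(r"(?<!\S)-l(\S+)", line)
    return libs


def check_libs(libs):
    """Return warnings for the two issues raised in this thread."""
    warnings = []
    has_lapack = any(p in lib.lower() for lib in libs for p in LAPACK_PROVIDERS)
    if not has_lapack:
        warnings.append("no LAPACK provider found on the LIBS line")
    has_xsmm = any(lib.startswith("xsmm") for lib in libs)
    has_smm = any(lib.startswith("smm_") for lib in libs)
    if has_xsmm and has_smm:
        warnings.append("both libxsmm and libsmm are linked; keep only libxsmm")
    return warnings
```

For the LIBS line quoted above, `check_libs(libs_from_arch(arch_text))` reports both problems. A name-based check cannot tell whether a given build of a library really exports a routine, so inspecting the library's symbol table is still the reliable test.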

Bich Phuong

May 23, 2019, 5:37:16 AM
to cp...@googlegroups.com
Hi Alfio,

Thank you for your answer.
libFLAME is the AMD library that provides the LAPACK functions. I had actually linked libflame together with BLIS before, and it still showed the same errors.
After reading your answer, I tried linking LAPACK + BLIS, and it finally works, although I haven't run any benchmarks yet. I assume there is some problem with my libflame build, so I will try to compile it again.

I also appreciate your suggestion about libxsmm and libsmm. We actually have CP2K as an application task in a cluster competition.
Do you know any way to boost performance a bit without changing the given input file?



--
You received this message because you are subscribed to a topic in the Google Groups "cp2k" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/cp2k/4vFftwwDEJg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to cp2k+uns...@googlegroups.com.
To post to this group, send email to cp...@googlegroups.com.
Visit this group at https://groups.google.com/group/cp2k.
To view this discussion on the web visit https://groups.google.com/d/msgid/cp2k/add016e6-b3a7-48f5-8dd9-d38eb9b50330%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Alfio Lazzaro

May 24, 2019, 3:39:55 AM
to cp2k

> I also appreciate your suggestion about libxsmm and libsmm. We actually have CP2K as an application task in a cluster competition.
> Do you know any way to boost performance a bit without changing the given input file?


Well, libxsmm can still give you good performance on AMD too, so I suggest using it.
As for other possible optimizations, it depends on what you are running. Could you share your output?

Alfio


 



Bich Phuong

May 25, 2019, 3:06:01 PM
to cp...@googlegroups.com
Hi,

I'm currently running some benchmarks and tests with different libraries and compilers on 1 node.
I don't know which input file or calculation they will give us during the competition.
I am also thinking about tuning RDMA for jobs that span nodes, and about trying KNEM to see whether the performance changes.

Attached is the output of the benchmarks on 1 node (2U).
Thanks for your help.

out_H2O-64
out_LiH
out_FIST

Bich Phuong

May 25, 2019, 5:42:51 PM
to cp...@googlegroups.com
Another question, regarding compiling CP2K with CUDA (let me know if I should make a separate post).
I compiled CP2K with CUDA for V100s. I built DBCSR myself from Shoshijak's develop branch, cloned recursively as a submodule of CP2K, as it should be.
Everything went well with DBCSR, and I got parameters for the V100.

For the CP2K local_cuda.psmp arch file, I started from the one automatically generated by install_cp2k_toolchain.sh with --gpu-ver=P100.
Then I modified it (attached).

Compiling with local_cuda.psmp gave this error:
-------------------------------------------------------------
/hpchome/phuong/cp2k/src/mscfg_types.F:241.81:
    Included at mscfg_types.F90:2:

      CALL dbcsr_distribution_get(dist_qs, row_dist=blk_distr, row_cluster=clus
                                                                           1
Error: Keyword argument 'row_cluster' at (1) is not in the procedure
/hpchome/phuong/cp2k/src/mscfg_types.F:244.81:
    Included at mscfg_types.F90:2:

      CALL dbcsr_distribution_get(dist_qs, col_dist=blk_distr, col_cluster=clus
                                                                           1
Error: Keyword argument 'col_cluster' at (1) is not in the procedure
/hpchome/phuong/cp2k/src/mscfg_types.F:290.46:
    Included at mscfg_types.F90:2:

                                  row_cluster=row_cluster_new, col_cluster=col_
                                              1
Error: Keyword argument 'row_cluster' at (1) is not in the procedure

------------------------------------------------------------------
Can you take a look as well?

P.S.: I gave up on BLIS + libFLAME. I could compile and test the sopt and ssmp versions with them, but I need psmp, or at least popt.
The problem is that these libraries do not work with ScaLAPACK: ScaLAPACK compiles and links against them without errors, but its tests fail. Thus no psmp or popt.
local_cuda.psmp

Alfio Lazzaro

May 26, 2019, 12:03:22 PM
to cp2k
Concerning the file out_H2O-64, I see that you spend most of the time in:

calculate_rho_elec               198.213


You can try to optimize it by retuning the kernels; see tools/autotune_grid.
The out_LiH run did not complete.
No comments on out_FIST.

Alfio
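Finding the dominant routine in a timing report, as done for out_H2O-64 above, can be scripted. This is a rough sketch under a stated assumption: each timing line starts with the routine name, is followed only by numeric columns, and the last column is the total time to rank by. The function name `top_routines` is hypothetical, and the column layout should be checked against your own output before trusting the result.

```python
def top_routines(timing_text, n=3):
    """Return the n (routine, time) pairs with the largest last-column value.

    Assumption: a timing line is "<name> <number> [<number> ...]". Lines that
    do not fit this pattern (headers, separators, blank lines) are skipped.
    """
    rows = []
    for line in timing_text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            values = [float(p) for p in parts[1:]]
        except ValueError:
            continue  # not a pure "name + numbers" line
        rows.append((parts[0], values[-1]))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:n]
```

Passing the timing section of an output file to `top_routines` gives a short list of candidates to look at first, such as calculate_rho_elec in this case.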

Alfio Lazzaro

May 26, 2019, 12:06:09 PM
to cp2k
Please use the default DBCSR version that comes with CP2K master (it is already in place). We don't allow DBCSR develop to be included in CP2K; only official releases can be included.


Alfio