Install GPU version HOOMD 4.3.0


Xinyan Yang

Nov 13, 2023, 9:57:38 AM
to hoomd-users
Hi,

I built HOOMD from source on our cluster using CUDA 11.2 (the cuda-11.2.0 module) and the library 'libcuda.so'. Here is the CMake configuration output:

___________________________________________________________
-- The C compiler identification is GNU 8.4.0
-- The CXX compiler identification is GNU 8.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /software/gcc/8.4.0/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /software/gcc/8.4.0/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring HOOMD v4.3.0-6-gcee1960
-- The CUDA compiler identification is NVIDIA 11.2.67
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /software/cuda/cuda-11.2.0/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found cudart: /software/cuda/cuda-11.2.0/lib64/libcudart.so
-- Found nvrtc: /software/cuda/cuda-11.2.0/lib64/libnvrtc.so
-- Found cuda: /software/cuda/cuda-11.2.0/targets/x86_64-linux/lib/stubs/libcuda.so
-- Found cufft: /software/cuda/cuda-11.2.0/lib64/libcufft.so
-- Found cusolver: /software/cuda/cuda-11.2.0/lib64/libcusolver.so
-- Found cusparse: /software/cuda/cuda-11.2.0/lib64/libcusparse.so
-- Found compute-sanitizer: /software/cuda/cuda-11.2.0/bin/compute-sanitizer
-- Found CUDALibs: /software/cuda/cuda-11.2.0/lib64/libcudart.so  
-- Found MPI_C: /hpc/software/mpi/openmpi-4.1.4-gcc-11.2.0/lib/libmpi.so (found version "3.1")
-- Found MPI_CXX: /hpc/software/mpi/openmpi-4.1.4-gcc-11.2.0/lib/libmpi.so (found version "3.1")
-- Found MPI: TRUE (found version "3.1")  
-- Found cereal: /home/xys3549/hoomd-env-gpu/lib64/cmake/cereal
-- Found PythonInterp: /home/xys3549/hoomd-env-gpu/bin/python (found suitable version "3.8.4", minimum required is "3")
-- Found PythonLibs: /software/python/3.8.4/lib/libpython3.8.so
-- Performing Test HAS_FLTO
-- Performing Test HAS_FLTO - Success
-- Found pybind11: /home/xys3549/hoomd-env-gpu/include (found version "2.10.1")
-- Found pybind11: /home/xys3549/hoomd-env-gpu/share/cmake/pybind11 /home/xys3549/hoomd-env-gpu/include (version 2.10.1)
-- Installing hoomd python module to: /home/xys3549/hoomd-env-gpu/lib/python3.8/site-packages/hoomd
-- Found eigen: /usr/share/eigen3 /usr/include/eigen3 (version 3.3.7)
-- Found ZLIB: /usr/lib64/libz.so (found version "1.2.7")
-- Found LibXml2: /usr/lib64/libxml2.so (found version "2.9.1")
-- Found LLVM: /home/xys3549/hoomd-env/lib/python3.8/site-packages/lib/cmake/llvm /home/xys3549/hoomd-env/lib/python3.8/site-packages/lib/libLLVM.so /home/xys3549/hoomd-env/lib/python3.8/site-packages/lib/libclang-cpp.so /home/xys3549/hoomd-env/lib/python3.8/site-packages/include -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS (version 13.0.0)
-- Found plugin: example_plugins/pair_plugin
-- Found plugin: example_plugins/updater_plugin
-- Found plugin: example_plugins/shape_plugin
-- Configuring done
-- Generating done
-- Build files have been written to: /home/xys3549/hoomd-blue/build

______________________________________________________________________
Currently Loaded Modules:
  1) openssl/1.1.1u   2) python/3.8.4   3) cmake/3.25.2   4) mpi/openmpi-4.1.4-gcc.11.2.0   5) cuda/cuda-11.2.0-openmpi-4.1.4   6) gcc/8.4.0
______________________________________________________________________

The build and installation appeared to succeed, but when I tried "import hoomd", the following error showed up:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/xys3549/hoomd-blue/build/hoomd/__init__.py", line 81, in <module>
    from hoomd import hpmc
  File "/home/xys3549/hoomd-blue/build/hoomd/hpmc/__init__.py", line 32, in <module>
    from hoomd.hpmc import pair
  File "/home/xys3549/hoomd-blue/build/hoomd/hpmc/pair/__init__.py", line 11, in <module>
    from . import user
  File "/home/xys3549/hoomd-blue/build/hoomd/hpmc/pair/user.py", line 18, in <module>
    from hoomd.hpmc import _jit
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory

______________________________________________________________________

I don't think the build used 'libcuda.so.1', and no libcuda.so.1 exists on the cluster. We only have 'libcuda.so' and 'libcuda.so.7'.

Are there any ways to fix this issue?

Best,

Xinyan



Lourens Veen

Nov 13, 2023, 10:21:51 AM
to hoomd...@googlegroups.com
Dear Xinyan,

Could it be that you compiled on a login node and are now running this test on the login node as well? The libcuda.so.1 file is part of the NVIDIA driver, not the CUDA toolkit. If your login node doesn't have a GPU, it probably won't have the driver installed either, so the file cannot be found.

Could you try this test on a compute node, either by submitting a batch job or getting an interactive session if your cluster supports that? Be sure to load the CUDA module there as well, as you'll still need the library too.
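A quick way to tell whether a given node has the driver library is to ask the dynamic loader for it directly, before importing HOOMD. The following is a minimal sketch using only the Python standard library; the helper name cuda_driver_loadable is hypothetical and not part of HOOMD.

```python
import ctypes


def cuda_driver_loadable(soname="libcuda.so.1"):
    """Return True if the dynamic loader can open the given shared library.

    Hypothetical helper for illustration; the default name is the library
    reported missing in the ImportError above.
    """
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the loader cannot find or open the library.
        return False
    return True


if __name__ == "__main__":
    if cuda_driver_loadable():
        print("libcuda.so.1 can be loaded on this node")
    else:
        print("libcuda.so.1 cannot be loaded; this node likely has no NVIDIA driver")
```

Running it once on the login node and once inside a batch job should show whether the two environments differ.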

Best,

Lourens


| Lourens Veen | Senior eScience Research Engineer Email: l.v...@esciencecenter.nl |
| Netherlands eScience Center | Science Park 402 | 1098 XH Amsterdam The Netherlands |



Xinyan Yang

Nov 13, 2023, 10:48:48 AM
to hoomd...@googlegroups.com
I tried a batch job. It works. Thank you!

Best,
Xinyan

Joshua Anderson

Nov 13, 2023, 10:59:14 AM
to hoomd...@googlegroups.com
The components of HOOMD that perform run time compilation require linking to the CUDA driver library (this is a limitation of jitify). When you compile with ENABLE_GPU=on and ENABLE_LLVM=on, then you must compile and run on a system with the same NVIDIA graphics driver. As you found, on a cluster this is most often the compute nodes only.

If you don't use run time compiled potentials, then you can compile with ENABLE_LLVM=off and HOOMD will no longer depend directly on the NVIDIA graphics driver. It will load it at run time when it exists and report no GPUs available when the driver is not installed.
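To see which case applies to a particular build and node, something like the sketch below may help. It assumes the attribute names hoomd.version.gpu_enabled, hoomd.version.llvm_enabled, and hoomd.device.GPU.is_available() as documented for HOOMD 4; please check them against your installed version. The function name describe_hoomd_build is hypothetical.

```python
def describe_hoomd_build():
    """Return a dict of HOOMD build flags, or None if the import fails.

    Hypothetical helper; the hoomd attribute names are assumed from the
    HOOMD 4 documentation.
    """
    try:
        import hoomd
    except ImportError as err:
        # With ENABLE_GPU=on and ENABLE_LLVM=on this is where a missing
        # libcuda.so.1 shows up on nodes without the NVIDIA driver.
        print(f"import hoomd failed: {err}")
        return None
    return {
        "gpu_enabled": hoomd.version.gpu_enabled,
        "llvm_enabled": hoomd.version.llvm_enabled,
        "gpu_available": hoomd.device.GPU.is_available(),
    }


if __name__ == "__main__":
    print(describe_hoomd_build())
```

With an ENABLE_LLVM=off build, the import should succeed on a node without the driver and the GPU should be reported as unavailable there.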
------
Joshua A. Anderson, Ph.D.
Research Area Specialist, Chemical Engineering, University of Michigan
