Compile CP2K with CUDA on Mac OS X

Samuel Lamphier

May 14, 2012, 3:00:51 PM

to cp2k

Hello all,
I am trying to compile CP2K with CUDA on a Mac Pro. I have two NVIDIA
Quadro 4000 cards for GPU computation. However, the build tries to link
against -lrt, which as I understand it is the POSIX realtime library,
and that library is not available on a Mac.
How can I work around this?

Thanks in advance

Urban Borštnik

May 15, 2012, 3:40:31 AM

to cp...@googlegroups.com

Hi,

You can edit the file cuda_tools/dbcsr_cuda_timing to remove the calls to
the realtime functions, and then drop librt from the link line.
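
As a rough sketch of what a replacement timer could look like: the thread
does not say which realtime calls the timing file uses, so this assumes
something like clock_gettime(), which needs -lrt on older Linux systems and
is missing on Mac OS X 10.6. The helper name wall_seconds is hypothetical and
not part of CP2K; gettimeofday() lives in the standard C library, so nothing
extra has to be linked.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper, not a CP2K function: wall-clock time in seconds.
 * Uses gettimeofday(), which is in libc on both Linux and Mac OS X,
 * so no -lrt is required. Returns -1.0 if the call fails. */
static double wall_seconds(void)
{
    struct timeval tv;

    if (gettimeofday(&tv, NULL) != 0)
        return -1.0;

    /* Combine whole seconds and microseconds into one double. */
    return (double)tv.tv_sec + (double)tv.tv_usec * 1e-6;
}
```

Taking the difference of two wall_seconds() readings gives an elapsed time
with microsecond resolution. Unlike a monotonic clock it can jump if the
system time is adjusted, which is usually acceptable for coarse timing output.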

Cheers,
Urban.


Samuel Lamphier

May 18, 2012, 11:40:20 AM

to cp2k

All, after a number of modifications, CP2K with CUDA support compiled
without errors. However, on testing I received the following
out-of-memory error:

CUDA Error: out of memory
ASSERTION FAILED: 1.EQ. 0

stack:
error in host_mem_alloc_i at line 166 with error type -1
message: Could not allocate host pinned memory
2 error in host_mem_alloc_i at line 166
1 called from dbcsr_init_lib

libdbcsr| Abnormal program termination, stopped by process number 6
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 6
[0]0:Return code = 0, signaled with Interrupt
[0]1:Return code = 0, signaled with Interrupt
[0]2:Return code = 0, signaled with Interrupt
[0]3:Return code = 0, signaled with Interrupt
[0]4:Return code = 0, signaled with Interrupt
[0]5:Return code = 0, signaled with Interrupt
[0]6:Return code = 1
[0]7:Return code = 0, signaled with Interrupt

Does anyone have any idea what would cause this? The machine is a Mac Pro
(MacPro5,1, Snow Leopard 10.6.8) with two NVIDIA Quadro 4000 cards and
24 GB of RAM.

Urban Borštnik

May 21, 2012, 3:30:49 AM

to cp...@googlegroups.com

Hi,

Which defines did you use when compiling? Since the error you report is
an out-of-memory error, I assume you used both __CUDAPW and
__DBCSR_CUDA. Unfortunately, these are slightly incompatible without a
modified input file, because one reserves most of the memory for itself,
leaving the other without any. Try setting
GLOBAL/CUDA/MEMORY (in KiB) to some sane value. I would guess around
20000, and then adjust as necessary from there: if you get a similar
out-of-memory message, the value is too big. If it's too small, you will
probably get some other message.
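
In input-file form, that setting might look like the fragment below. This is
only a sketch: it assumes the GLOBAL/CUDA/MEMORY path maps onto the usual
nested &SECTION ... &END SECTION layout of a CP2K input file, and 20000 is
just the starting guess from above. The exact placement should be checked
against the input reference for the CP2K version in use.

```
&GLOBAL
  ! Assumed layout for GLOBAL/CUDA/MEMORY; the value is in KiB.
  &CUDA
    MEMORY 20000
  &END CUDA
&END GLOBAL
```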

Also note that using multiple cards from one process is unsupported.
You should use 2 MPI processes per box to take advantage of both cards.
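
For illustration, a common way to spread MPI processes over the cards in a
box is a round-robin mapping from the process's rank on the node to a device
index. The sketch below is an assumption about how such a mapping could be
written, not a description of how CP2K selects its device; the helper name
device_for_local_rank is hypothetical.

```c
/* Hypothetical helper, not a CP2K function: choose a GPU index for an
 * MPI process from its rank on the local node.
 * Returns -1 for invalid input (no devices or a negative rank). */
static int device_for_local_rank(int local_rank, int device_count)
{
    if (device_count <= 0 || local_rank < 0)
        return -1;

    /* Round-robin: with 2 cards, ranks 0 and 1 get devices 0 and 1. */
    return local_rank % device_count;
}
```

With two cards and two processes per box, each process ends up on its own
card. In a CUDA program the returned index would typically be passed to
cudaSetDevice() before any other CUDA call.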

Cheers,
Urban.
