BorisCUDALib/Reduction.cuh(138): error: no instance of overloaded function "atomicAdd" matches the argument list

Robert Ryans

Oct 22, 2025, 12:09:15 PM

to Boris Computational Spintronics

Hi

I was asked to build Boris on one of our systems for a student. The PC has a 4080 and is running Ubuntu 22.04 LTS. I'm using arch=90 for building.

I have tried both CUDA 12.9 and 13.0 and get the same error when starting to compile the CUDA version (see below). The C++ version builds fine; the error comes from the first CUDA file it tries to compile. Google doesn't turn up much of use. Other people see the overloaded function error, but I'm not a C person, so I don't quite know what to try next.

Would be grateful for any suggestions!

Thanks

Robert

$ /usr/local/cuda-12.9/bin/nvcc -I/usr/local/cuda-12.9/targets/x86_64-linux/include/ -rdc=true -c -std=c++14 -IBorisLib -IBorisCUDALib -w -arch=sm_90 Boris/Atom_DiffEqCubic_EvalsCUDA_TEuler.cu -o Boris/Boris_cuo/Atom_DiffEqCubic_EvalsCUDA_TEuler.o

BorisCUDALib/Reduction.cuh(138): error: no instance of overloaded function "atomicAdd" matches the argument list

argument types are: (size_t *, size_t)

atomicAdd(&points_count, shared_memory_count[0]);

^

BorisCUDALib/atomics.cuh(183): note #3327-D: candidate function template "atomicAdd(cuVAL3<Type> *, const cuVAL3<Type> &)" failed deduction

__attribute__((device)) void atomicAdd(cuVAL3<Type>* result, const cuVAL3<Type>& value)

^

BorisCUDALib/atomics.cuh(176): note #3327-D: candidate function template "atomicAdd(cuVAL2<Type> *, const cuVAL2<Type> &)" failed deduction

__attribute__((device)) void atomicAdd(cuVAL2<Type>* result, const cuVAL2<Type>& value)

^

/usr/local/cuda-12.9/targets/x86_64-linux/include/crt/sm_90_rt.h(112): note #3326-D: function "atomicAdd(float4 *, float4)" does not match because argument #1 does not match parameter

static __attribute__((device)) __inline__ float4 atomicAdd(float4 *__address, float4 val) { }

^

/usr/local/cuda-12.9/targets/x86_64-linux/include/crt/sm_90_rt.h(109): note #3326-D: function "atomicAdd(float2 *, float2)" does not match because argument #1 does not match parameter

static __attribute__((device)) __inline__ float2 atomicAdd(float2 *__address, float2 val) { }

^

/usr/local/cuda-12.9/targets/x86_64-linux/include/sm_60_atomic_functions.h(93): note #3326-D: function "atomicAdd(double *, double)" does not match because argument #1 does not match parameter

static __inline__ __attribute__((device)) double atomicAdd(double *address, double val) { }

^

/usr/local/cuda-12.9/targets/x86_64-linux/include/sm_20_atomic_functions.h(88): note #3326-D: function "atomicAdd(float *, float)" does not match because argument #1 does not match parameter

static __inline__ __attribute__((device)) float atomicAdd(float *address, float val) { }

^

/usr/local/cuda-12.9/targets/x86_64-linux/include/device_atomic_functions.h(169): note #3326-D: function "atomicAdd(unsigned long long *, unsigned long long)" does not match because argument #1 does not match parameter

static __inline__ __attribute__((device)) unsigned long long int atomicAdd(unsigned long long int *address, unsigned long long int val) { }

^

/usr/local/cuda-12.9/targets/x86_64-linux/include/device_atomic_functions.h(91): note #3326-D: function "atomicAdd(unsigned int *, unsigned int)" does not match because argument #1 does not match parameter

static __inline__ __attribute__((device)) unsigned int atomicAdd(unsigned int *address, unsigned int val) { }

^

/usr/local/cuda-12.9/targets/x86_64-linux/include/device_atomic_functions.h(89): note #3326-D: function "atomicAdd(int *, int)" does not match because argument #1 does not match parameter

static __inline__ __attribute__((device)) int atomicAdd(int *address, int val) { }

^

1 error detected in the compilation of "Boris/Atom_DiffEqCubic_EvalsCUDA_TEuler.cu".
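
A note on why none of the listed candidates fits: the call passes a size_t*, and the closest built-in overload takes unsigned long long*. On typical 64-bit Linux toolchains size_t is unsigned long, which has the same width but is a distinct type, so the pointer does not convert implicitly. The host-only C++ sketch below illustrates that reasoning with standard type traits. It is not Boris code, and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// True when size_t and unsigned long long are literally the same type.
bool size_t_is_unsigned_long_long() {
    return std::is_same<std::size_t, unsigned long long>::value;
}

// True when a size_t* argument can be passed implicitly where an
// unsigned long long* parameter is expected, i.e. when a call like
// atomicAdd(size_t*, size_t) could pick the unsigned long long overload.
bool size_t_ptr_binds_to_ull_ptr() {
    return std::is_convertible<std::size_t*, unsigned long long*>::value;
}

// True when the two types have the same storage size, even if distinct.
bool same_width() {
    return sizeof(std::size_t) == sizeof(unsigned long long);
}

// Platform-independent sanity check: pointers to integer types convert
// implicitly only when the pointed-to types are identical.
void check_overload_reasoning() {
    assert(size_t_ptr_binds_to_ull_ptr() == size_t_is_unsigned_long_long());
}
```

With GCC or Clang on 64-bit Linux the first two functions should return false while same_width() returns true, which matches the "argument #1 does not match parameter" notes in the log. On platforms where size_t is defined as unsigned long long, both return true and the built-in overload would be selected directly.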

Robert Ryans

Oct 22, 2025, 12:28:48 PM

to Boris Computational Spintronics

OK, I found the previous post, which included this advice:

Find the file atomics.cuh in BorisCUDALib directory. There, on line 163, replace the line "#if __CUDA_ARCH__ <= 700" with "#if __CUDA_ARCH__ <= 800".

So I just edited that from 800 to 900, and that seems to be letting it build. That may or may not have been a reasonable thing to do, but it built.

Serban Lepadatu

Oct 23, 2025, 2:35:31 AM

to Boris Computational Spintronics

Hi Robert,

Yes, that was what I was going to advise.

That being said, it's worth double-checking that you do need arch 900. I recently had the same problem with an NVIDIA L40S GPU, which has compute capability 8.9. Compiling with architecture 900 did not work on this GPU; it still needs 800.

Kind regards,

Serban
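
The guard edit and the choice of -arch can be related with a little arithmetic. __CUDA_ARCH__ encodes compute capability X.Y as X*100 + Y*10, so 8.9 becomes 890 and 9.0 becomes 900. The sketch below is plain host C++ with hypothetical helper names, not part of Boris. It evaluates a guard of the form "#if __CUDA_ARCH__ <= limit" for a given capability, and it assumes, as the thread suggests but does not show, that the guard in atomics.cuh encloses the size_t version of atomicAdd.

```cpp
#include <cassert>
#include <string>

// Value of __CUDA_ARCH__ for compute capability major.minor,
// e.g. 8.9 -> 890 and 9.0 -> 900.
int cuda_arch_value(int major, int minor) {
    assert(major > 0 && minor >= 0 && minor <= 9);
    return major * 100 + minor * 10;
}

// Matching nvcc architecture name, e.g. 8.9 -> "sm_89".
std::string arch_flag(int major, int minor) {
    assert(major > 0 && minor >= 0 && minor <= 9);
    return "sm_" + std::to_string(major) + std::to_string(minor);
}

// Mirrors a preprocessor guard "#if __CUDA_ARCH__ <= limit":
// returns true when the guarded code would be compiled in.
// Assumption: the guarded code is the size_t atomicAdd overload.
bool guard_enabled(int major, int minor, int limit) {
    return cuda_arch_value(major, minor) <= limit;
}
```

Under these assumptions guard_enabled(8, 9, 800) and guard_enabled(9, 0, 800) are both false, while the same calls with a limit of 900 are true, which is consistent with the build succeeding after the edit. Code built for sm_80 can generally run on an 8.9 device, whereas code built only for sm_90 cannot, which would explain the L40S observation above.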

Subham Das

Dec 24, 2025, 4:05:15 AM

to Boris Computational Spintronics

Hi, I successfully compiled Boris with CUDA 13.1 for the RTX 4070 Ti SUPER. However, I had to change the value on line 163 of BorisCUDALib/atomics.cuh to 900. Additionally, I had to change C++14 to C++17 on line 81 of the makefile.

However, the following warning appeared when I ran "make install":

nvlink warning : Stack size for entry function '_Z38Anisotropy_BiaxialCUDA_AFM_UpdateFieldR15ManagedMeshCUDAR18ManagedModulesCUDAb' cannot be statically determined

Will this affect the accuracy of the simulations? Thanks in advance.

Kind regards,

Subham
