(2.000239) ExternalFunctionSet.cpp:371: Assertion message: LLVM required to call external host functions from PTX.
solver: ocelot/ir/implementation/ExternalFunctionSet.cpp:371: void ir::ExternalFunctionSet::ExternalFunction::call(void*, const ir::PTXKernel::Prototype&): Assertion `false' failed.
solver: ocelot/cuda/implementation/CudaRuntimeInterface.cpp:811: virtual cudaError_t cuda::CudaRuntimeInterface::cudaDeviceGetLimit(size_t*, cudaLimit): Assertion `0 && "unimplemented"' failed.
Hi Pawel,
Ocelot does support dynamic memory allocation on the device, but LLVM must be installed so that device code can call into host code (for malloc and other functions).
Regarding the heap size, we left that interface unimplemented because Ocelot has no limit on the heap size other than the limit of the host implementation of malloc. I plan to remove that assertion so that code which sets the heap size can run with Ocelot, but the call won't actually impose a limit on the heap size.
Greg
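For reference, a minimal sketch of the heap-size interface discussed in the reply above, using the standard CUDA runtime calls cudaDeviceSetLimit and cudaDeviceGetLimit with cudaLimitMallocHeapSize. The 16 MiB value is an arbitrary illustrative choice. As described above, Ocelot at the time of this thread asserted on cudaDeviceGetLimit, and even with the assertion removed the requested limit would not be enforced there.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // Illustrative value only: request a 16 MiB device malloc heap.
    // The limit must be set before launching any kernel that calls malloc.
    size_t requested = 16 * 1024 * 1024;
    cudaError_t err = cudaDeviceSetLimit(cudaLimitMallocHeapSize, requested);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceSetLimit failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Read the limit back; the runtime may adjust the requested value.
    size_t actual = 0;
    err = cudaDeviceGetLimit(&actual, cudaLimitMallocHeapSize);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceGetLimit failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("Device malloc heap limit: %zu bytes\n", actual);
    return 0;
}
```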
--
You received this message because you are subscribed to the Google Groups "gpuocelot" group.
To view this discussion on the web visit https://groups.google.com/d/msg/gpuocelot/-/gsBjDpE7NcsJ.
To post to this group, send email to gpuo...@googlegroups.com.
To unsubscribe from this group, send email to gpuocelot+...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/gpuocelot?hl=en.
nvcc --cuda -arch=sm_21 malloc.cu
g++ -o malloc malloc.cu.cpp.ii `OcelotConfig -l`
Assertion message: LLVM support not compiled into ocelot.
#include <stdlib.h>
#include <stdio.h>
#include <cuda.h>
__global__ void mallocTest() {
    size_t size = 123;
    char* ptr = (char*)malloc(size);
    memset(ptr, 0, size);
    printf("Thread %d got pointer: %p\n", threadIdx.x, ptr);
    free(ptr);
    __syncthreads();
}
int main() {
    int threads = 7;
    mallocTest<<<1, threads>>>();
    cudaDeviceSynchronize();
    return 0;
}
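A hedged variant of the test above, as a sketch only: device-side malloc can return NULL when the heap is exhausted, so this version guards the memset and checks the result of cudaDeviceSynchronize. The kernel name mallocTestChecked is made up for illustration, and this is not offered as a fix for the behavior reported in this thread.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical variant of mallocTest that tolerates a failed allocation.
__global__ void mallocTestChecked() {
    size_t size = 123;
    char* ptr = (char*)malloc(size);
    if (ptr != NULL) {
        // Only touch the buffer when the allocation succeeded.
        memset(ptr, 0, size);
    }
    printf("Thread %d got pointer: %p\n", threadIdx.x, ptr);
    // Device-side free ignores a NULL pointer.
    free(ptr);
}

int main() {
    int threads = 7;
    mallocTestChecked<<<1, threads>>>();

    // Surface any launch or execution error instead of ignoring it.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "Kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```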
__global__ void mallocTest() {
    char* ptr = NULL;
    printf("Thread %d got pointer: %p\n", threadIdx.x, ptr);
    __syncthreads();
}
That shouldn't deadlock, but I would guess that there is a bug in the cudaDeviceSynchronize call. Can you try removing it and seeing if that fixes the deadlock?
Greg
==Ocelot== WARNING - No CUDA devices found or all devices disabled!
==Ocelot== Consider enabling the emulator in configure.ocelot.
Thread 0 got pointer: 0x7f832e79aef8
Thread 1 got pointer: 0x7f832e79aef8
Thread 2 got pointer: (nil)
Thread 3 got pointer: (nil)
Thread 4 got pointer: (nil)
Thread 5 got pointer: (nil)
Thread 6 got pointer: (nil)