Hi Dan,
I have been working with SASSI for the last few weeks and have run into an issue with my particular application. I'm trying to compile a .cu file and then link it with a .cpp file. To do that, I first compile with the -dc flag, then use -dlink to create the respective objects, and finally link them with the compiled .cpp file to create the executable. The executable is created, but when I run it I get a memory-copy error. I'm injecting memory analysis using the default SASSI handlers.
You can find my implementation in this GitHub repository:
https://github.com/fninaparavecino/mcx.git
The compilation steps were:
/usr/local/sassi7/bin/nvcc -c -gencode arch=compute_35,code=sm_35 -Xptxas --sassi-inst-before="memory" -Xptxas --sassi-before-args="mem-info" -g -O3 -dc -o mcx_core.o mcx_core.cu
/usr/local/sassi7/bin/nvcc -I~/Desktop/SASSI/example/inc -c -gencode arch=compute_35,code=sm_35 -g -O3 -dlink -o mcx_core_dlink.o mcx_core.o
/usr/local/sassi7/bin/nvcc -I/usr/local/sassi7/include -g -O3 -c -o mcx_utils.o mcx_utils.c
/usr/local/sassi7/bin/nvcc -I/usr/local/sassi7/include -g -O3 -c -o mcx_shapes.o mcx_shapes.c
/usr/local/sassi7/bin/nvcc -I/usr/local/sassi7/include -g -O3 -c -o tictoc.o tictoc.c
/usr/local/sassi7/bin/nvcc -I/usr/local/sassi7/include -g -O3 -c -o mcextreme.o mcextreme.c
/usr/local/sassi7/bin/nvcc -I/usr/local/sassi7/include -g -O3 -c -o cjson/cJSON.o cjson/cJSON.c
/usr/local/sassi7/bin/nvcc mcx_core.o mcx_core_dlink.o mcx_utils.o mcx_shapes.o tictoc.o mcextreme.o cjson/cJSON.o -o ../bin/mcx -L/usr/local/sassi7/lib64 -lcudart -lm -lstdc++ -m64 -L~/Desktop/SASSI/instlibs/lib -L/usr/local/sassi7/extras/CUPTI/lib64 -lcupti -lcudadevrt
When the executable runs, it segfaults (signal 11).
Running it under valgrind gives the following output:
...
lauching MCX simulation for time window [0.00e+00ns 5.00e+00ns] ...
simulation run# 1 ... ==17665== Invalid write of size 8
==17665== at 0x413A98: ???
==17665== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==17665==
==17665==
==17665== Process terminating with default action of signal 11 (SIGSEGV)
==17665== Access not within mapped region at address 0x0
==17665== at 0x413A98: ???
==17665== If you believe this happened as a result of a stack
==17665== overflow in your program's main thread (unlikely but
==17665== possible), you can try to increase the size of the
==17665== main thread stack using the --main-stacksize= flag.
==17665== The main thread stack size used in this run was 8388608.
==17665==
==17665== HEAP SUMMARY:
==17665== in use at exit: 77,616,868 bytes in 273,680 blocks
==17665== total heap usage: 291,353 allocs, 17,673 frees, 80,843,756 bytes allocated
The error happens at this particular line:
cudaMemcpyToSymbol(gproperty, cfg->prop, cfg->medianum*sizeof(Medium), 0, cudaMemcpyHostToDevice);
mcx_cu_assess(cudaGetLastError(),__FILE__,__LINE__); // -> triggers a cuda error
cudaMemcpyToSymbol(gdetpos, cfg->detpos, cfg->detnum*sizeof(float4), 0, cudaMemcpyHostToDevice);
cudaGetLastError() reported the following cuda error:
MCX ERROR(-13):invalid device symbol in unit mcx_core.cu:1172
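To narrow this down, each copy could be checked through its own return value instead of a later cudaGetLastError(), so the failing call is identified directly. This is only a minimal sketch: the Medium struct, the MAX_PROP capacity, and the check() helper below are hypothetical placeholders, not the actual definitions in mcx_core.cu.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the Medium struct used in mcx_core.cu.
struct Medium { float mua, mus, g, n; };

// Assumed capacity for illustration only; not taken from MCX.
#define MAX_PROP 128

// Constant-memory symbol, declared in the same translation unit as the copy.
__constant__ Medium gproperty[MAX_PROP];

// Hypothetical helper: print the status returned by one specific runtime call.
static int check(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return 1;
    }
    return 0;
}

int main() {
    Medium host[4] = {};

    // Reset any error recorded by an earlier call so that it is not
    // attributed to the copy below.
    cudaGetLastError();

    // Check the return value of this copy on its own.
    int failed = check(
        cudaMemcpyToSymbol(gproperty, host, sizeof(host), 0,
                           cudaMemcpyHostToDevice),
        "cudaMemcpyToSymbol(gproperty)");

    return failed;
}
```

Building this small file with and without the SASSI flags, using the same -dc and -dlink steps as above, might show whether the "invalid device symbol" error depends on the instrumentation or on the separate-compilation setup.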
Do you have any idea why SASSI is not able to inject the memory analysis properly for this kernel? It seems that enabling the SASSI memory handler causes the executable to misbehave.
Any advice would be highly appreciated.
Fanny.