Hi Stamatis,
This is definitely a problem with linking the benchmark. Hopefully, I can give some pointers, but let me clarify some things first:
1) You must be running in full-system mode, since you're seeing IRQs and the kernel prints the segmentation fault.
2) I'll need to assume that you are using the common.mk version that is included with the gem5-gpu benchmark repo. This would mean that the benchmark is statically linked, which is important.
3) The segfault indicates that the CPU code is trying to execute from instruction pointer (PC) 0x0, which is obviously an invalid instruction address.
Alright, in addition to Konstantinos' recommendation of using gcc-4.4 (which we also recommend), there are a few things you can try here:
1) For those taking notes, if you run this benchmark in SE mode, you will likely run into a panic: "Tried to execute unmapped address 0x0". SE mode may be quicker for tracking down such bugs.
2) Look at the benchmark's CPU assembly code with "objdump -d -C <binary>". I suspect that you will find something like
"4002f0: e8 0b fd bf ff callq 0 <__libc_tsd_LOCALE>"
near where the ceil() function should be called. This is an indicator that the linker didn't properly set up the linkage for the ceil() function. It basically just gave up and used 0x0 as the function address. You can check the binary's symbol table to establish this. You should see something like the following:
$ objdump -t <binary> | grep UND
...
0000000000000000 w *UND* 0000000000000000 ceil
...
Use this check to your advantage: after each recompile, inspect the symbol table to see whether the linking came out right, instead of re-running the benchmark every time.
The problem here is that gcc (all versions) confuses cmath/math.h functions with GPU math functions. The GPU functions are defined in CUDA include files, and they are declared as __device__ __host__, which means they can be executed on either the GPU or the CPU. Unfortunately, when compiling statically, gcc cannot decide whether to use the cmath/math.h version or the CUDA version, so it chooses the worst possible option: neither.
To fix this you'll need to play around with unmapping functions from the symbol table that cause the linker conflict. One route is to use the '-u <func>' compiler option to specifically define the function to be unmapped. Sometimes this works, because gcc will choose the CUDA version to unmap, leaving the cmath/math.h version for the CPU to use. Other times (as I believe is the case for you), it will unmap the cmath/math.h version, but not use the CUDA version. You can try reordering the include files in the benchmark source, reordering '-I' compiler options when calling gcc, or reordering the compilation steps to get gcc to select a version (your mileage may vary).
The root problem is that both versions of ceil are globally defined for your benchmark. Another option is to modify the benchmark so that it includes the cmath/math.h library inside of a namespace. Then, wherever ceil is called by CPU code, qualify the call with that namespace. For example:
...
// At head of file, include math:
namespace cmath {
#include <cmath>
}
...
cmath::ceil(<params>)
...
Hope this helps,
Joel