Hi all,
My understanding of the parameter max_num_core_per_appl is as follows:
-- max_num_core_per_appl : the number of cores assigned to one GPU application. Is this right?
max_num_core_per_appl = 1
num_sim_small_cores = 4
num_sim_medium_cores = 0
num_sim_large_cores = 20
max_threads_per_core = 80
Since I ran only one application, I have only one trace file, as follows:
newptx
14
-1
/usr/local/src/gpuocelot/tests/cuda4.1sdk/.release_build/bd/ptx_jul2015/cuda4.1/QuasirandomGenerator/_Z26quasirandomGeneratorKernelPfjj_0/Trace.txt
----------------------------------
But the output is like this:
....
src/process_manager.cc:539: (I=0 C=0): thread_count:1536
...
**Core 20 Thread 36 Finished: insts:12268 cycles:1166718 (18667472) seconds:76 -- 0.01 IPC (0.01 IPC) -- N/A KHz (0.16 KHz)
...
**Core 21 Thread 34 Finished: insts:11946 cycles:2302355 (36837664) seconds:263 -- 0.01 IPC (0.01 IPC) -- N/A KHz (0.05 KHz)
...
**Core 22 Thread 10 Finished: insts:11946 cycles:2302778 (36844432) seconds:263 -- 0.01 IPC (0.01 IPC) -- N/A KHz (0.05 KHz)
...
**Core 23 Thread 46 Finished: insts:11946 cycles:2303051 (36848800) seconds:263 -- 0.01 IPC (0.01 IPC) -- N/A KHz (0.05 KHz)
...
**Core 21 Thread 372 Finished: insts:12588 cycles:10453845 (167261504) seconds:1590 -- 0.00 IPC (0.00 IPC) -- N/A KHz (0.01 KHz)
...
**Core 22 Core_Total Finished: insts:4330124 cycles:9848051 (157568800) seconds:1534 -- 0.44 IPC (0.44 IPC) -- N/A KHz (2.82 KHz)
...
**Core 23 Core_Total Finished: insts:4475270 cycles:10153031 (162448480) seconds:1571 -- 0.44 IPC (0.44 IPC) -- N/A KHz (2.85 KHz)
...
**Core 20 Core_Total Finished: insts:5051438 cycles:10155485 (162487744) seconds:1571 -- 0.50 IPC (0.50 IPC) -- N/A KHz (3.22 KHz)
...
**Core 21 Core_Total Finished: insts:4619456 cycles:10453845 (167261504) seconds:1591 -- 0.44 IPC (0.44 IPC) -- N/A KHz (2.90 KHz)
I have two questions about this output:
1. Since I ran only one application and set max_num_core_per_appl = 1, why are all 4 GPU cores (cores 20 to 23) running?
2. The application has 1536 threads, and max_threads_per_core is set to 80, so 1536 > 80 and even 1536 > 80*4. How can the number of threads exceed the limit?
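To make the arithmetic behind the thread-count question explicit, here is a minimal Python sketch. The values are copied from the configuration and output above. Two things are my assumptions and not confirmed behaviour of the simulator: that max_threads_per_core caps the threads resident on a core at one time, and that cores 20 to 23 are the four small (GPU) cores.

```python
# Values copied from the configuration and output in this post.
max_threads_per_core = 80
num_sim_small_cores = 4   # assumed to be the GPU cores, numbered 20-23 in the output
thread_count = 1536       # from src/process_manager.cc:539

# Assumption: max_threads_per_core limits threads resident on one core at a time.
capacity_one_core = max_threads_per_core
capacity_all_cores = max_threads_per_core * num_sim_small_cores

# Ceiling division: how many full batches would be needed if threads were
# dispatched in waves that fill every core (hypothetical scheduling model).
batches_needed = -(-thread_count // capacity_all_cores)

print("capacity of one core  :", capacity_one_core)
print("capacity of four cores:", capacity_all_cores)
print("threads exceed total capacity:", thread_count > capacity_all_cores)
print("batches needed under the assumption:", batches_needed)
```

If the limit applies only to threads resident at the same moment, a total of 1536 threads would not contradict it, but I do not know whether that is how the parameter is meant to work.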
Thanks for your time!
Best regards!
Applee