OpenMP threads

Matheus Alcântara Souza

Oct 24, 2018, 12:03:03 PM
to Sniper simulator
Hi there!

After setting up the simulator and running the fft example, I tried writing a simple program to test OpenMP:

    int a[100000];
    SimRoiStart();
    #pragma omp parallel for
    for (int i = 0; i < 100000; i++) {
        a[i] = 2 * i;
    }
    SimRoiEnd();

The problem is that the number of threads is always 1. I've tried setting it, for instance, with "OMP_NUM_THREADS=4" before the ../../run-sniper command.
I also tried calling "omp_set_num_threads(4)" both before and after SimRoiStart(), and using "#pragma....num_threads(4)", but the results were the same:

                                     | Core 0     | Core 1     | Core 2     | Core 3    
  Instructions                       |     799912 |          0 |          0 |          0
  Cycles                             |     619451 |     619451 |     619451 |     619451
  IPC                                |       1.29 |       0.00 |       0.00 |       0.00
  Time (ns)                          |     232877 |     232877 |     232877 |     232877
  Idle time (ns)                     |       1779 |     232877 |     232877 |     232877
  Idle time (%)                      |       0.8% |     100.0% |     100.0% |     100.0%

Note that Cores 1, 2 and 3 are 100.0% idle!

If I check "omp_get_num_threads()" outside the ROI, I get the right number (4).
The provided "SimGetNumThreads()" always returns 1.

What am I doing wrong? Thank you!


COMPILE CMD: $(CC) hello.o -lm -fopenmp -static -L/home/matheus/sniper/lib -o hello
RUN CMD: OMP_NUM_THREADS=4 ; ../../run-sniper -v -n 4 -c gainestown --roi -- ./hello
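
The thread-count check described above can be made explicit with a small helper. The following is only a sketch, not part of the original post: the function names `count_team_threads` and `fill_doubled` are hypothetical, and the Sniper ROI calls are left out so the snippet builds without the simulator headers. It relies on the fact that `omp_get_num_threads()` reports the size of the current team, so it returns 1 when called outside a parallel region; the query is therefore made from inside one.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: returns the team size seen inside a parallel region.
 * Falls back to 1 when the file is compiled without -fopenmp. */
int count_team_threads(void)
{
    int team = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        /* One thread records the team size; the implicit barrier at the
         * end of "single" makes the value visible after the region. */
        #pragma omp single
        team = omp_get_num_threads();
    }
#endif
    return team;
}

/* Hypothetical helper: the same loop as in the post, a[i] = 2 * i. */
void fill_doubled(int *a, int n)
{
    assert(n == 0 || a != NULL);
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        a[i] = 2 * i;
    }
}
```

Calling `count_team_threads()` from the existing `main` and printing the result, both natively and under the simulator, would show whether the OpenMP runtime actually creates more than one thread for the region.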

Matheus Alcântara Souza

Oct 25, 2018, 2:58:29 AM
to Sniper simulator
Please ignore the question. I believe it was a matter of compiling and linking the right way.
I've changed the Makefile, basing it on the one from "mpi-omp", and it now works.
Below is the Makefile content, for those with similar problems:


TARGET=hello
include ../shared/Makefile.shared

OMP_CFLAGS=$(GRAPHITE_CFLAGS) -fopenmp
OMP_LDFLAGS=$(filter-out -static,$(GRAPHITE_LDFLAGS)) -fopenmp

.c.o:
	$(CC) -c -o $@ $< $(OMP_CFLAGS)

hello: hello.o Makefile
	$(CC) $(OMP_LDFLAGS) -o hello hello.o -lm

run_$(TARGET):
	OMP_NUM_THREADS=4 ../../run-sniper -v -n 4 -c gainestown --roi -- ./hello

Best,
Matheus

Ayushi Agarwal

Feb 18, 2022, 8:31:59 AM
to Sniper simulator
Hi Matheus,

I have been trying to run some OpenMP benchmarks in a similar way. I think the above command schedules each thread on a different simulated core. Would that take care of synchronization between different threads of the same application? As I understand it, Sniper cores are inherently asynchronous and synchronize after some quantum.

I also see that in my sim.out file, in such a case, the number of L3 misses is greater than the number of DRAM accesses. Did you encounter any such issue?

Thanks for your help.

-Ayushi