Running CPU benchmark and GPU benchmark simultaneously in full-system simulation.


YongGon Kim

Dec 4, 2014, 6:47:11 AM
to gem5-g...@googlegroups.com
Hello, I'm trying to run a CPU benchmark and a GPU benchmark in gem5-gpu full-system simulation.

What I want to simulate is CPU and GPU benchmarks running simultaneously,
so that I can investigate things such as memory traffic.
(For example, what should the memory controller do
 when a CPU-GPU heterogeneous architecture has a unified memory controller and CPU and GPU programs are running simultaneously?)

I have compiled gem5-gpu for X86 with the default configuration and used a Linux image (2.6.22.9) for full-system simulation.
However, when I tried to run a simple GPU program and a simple CPU program simultaneously (the Rodinia gaussian benchmark for the GPU, a simple application (queens) for the CPU),
the gem5-gpu simulator behaved in an unexpected way.

First, after connecting a terminal to the full-system simulator, I tried to run the GPU application as a background process. (I typed "/gem5_fusion_gaussian /matrix208.txt &" in m5term.)
The GPU application runs as expected, but the m5term shell prompt doesn't appear until the GPU application ends.
So, this way, it's impossible to run the CPU and GPU programs simultaneously.

Second, I tried to run the GPU and CPU applications using the following script.
---
#!/bin/sh
echo "Simulating fusion_gaussian"
/gpu-bench/gem5_fusion_gaussian /gpu-bench/matrix208.txt &
echo "Simulating queens"
/queens 16
m5 exit
----
I used the default fs_fusion.py with the following command.
"./build/VI_hammer/gem5.opt ../gem5-gpu/configs/fs_fusion.py --script=./configs/boot/cpu-gpu.rcS --restore-with-cpu=timing -r 1"

In this case, the echo commands work as expected. ("Simulating queens" appears immediately after "Simulating fusion_gaussian".)
However, the queens program doesn't seem to work properly.
Its result (which I expect to be printed to stdout immediately)
doesn't show up until the background GPU program ends.

My question is: why does gem5-gpu behave this way when CPU and GPU programs run simultaneously?
- Is it just a simple problem (e.g., something related to standard input and output), or is it an inherent limitation of the simulator?
- What should I do to simulate CPU and GPU benchmarks running simultaneously? Please give me a hint about what I should look at first.

Please help me to get through this confusion.
Thank you.

Jason Power

Dec 4, 2014, 9:50:49 AM
to YongGon Kim, gem5-g...@googlegroups.com
Hello,

You may want to try creating the system with multiple CPU cores and pinning each application to a different CPU core. gem5-gpu assumes that the GPU process is the only process running on the CPU core while the GPU is active.

Cheers,

Jason

YongGon Kim

Dec 5, 2014, 10:03:06 AM
to gem5-g...@googlegroups.com, ili...@gmail.com
Thank you so much for the advice. Now, it works well!

Also, I have a quick question about the gem5 simulator.

I just used the '--num-cpus 2' option for the fs_fusion.py script and used the 'm5 pin' operation to assign each application to its own CPU.
I'm curious about the '--num-cpus' option. Does it model a dual-core processor or a symmetric multiprocessor?
And what should I do to distinguish these two architecture models? Is there an option for this purpose?

I searched the gem5 mailing list, but I couldn't find the relevant information. Do you know anything about it? Thank you.
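
For readers who want to reproduce this setup, here is a minimal Python sketch that generates a run script pinning each workload to its own core. It assumes the 'm5 pin' syntax is "m5 pin <cpu> <program> [args ...]" and reuses the paths from the script earlier in this thread; the helper name build_pinned_rcs and the added "wait" line are illustrative choices, not something from the original posts.

```python
import shlex


def build_pinned_rcs(gpu_argv, cpu_argv, gpu_core=0, cpu_core=1):
    """Return run-script text that starts each workload pinned to its own core."""
    if gpu_core == cpu_core:
        raise ValueError("the GPU and CPU workloads need different cores")

    def pinned(core, argv):
        # Assumed syntax: m5 pin <cpu> <program> [args ...]
        return "m5 pin %d %s" % (core, " ".join(shlex.quote(a) for a in argv))

    lines = [
        "#!/bin/sh",
        pinned(gpu_core, gpu_argv) + " &",  # GPU benchmark in the background
        pinned(cpu_core, cpu_argv),         # CPU benchmark in the foreground
        "wait",                             # let the background job finish too
        "m5 exit",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_pinned_rcs(
        ["/gpu-bench/gem5_fusion_gaussian", "/gpu-bench/matrix208.txt"],
        ["/queens", "16"]), end="")
```

The generated text can be saved as an .rcS file and passed with --script, together with --num-cpus 2, so that the GPU process is alone on its core.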


On Thursday, December 4, 2014 at 11:50:49 PM UTC+9, Jason Power wrote:

Jason Power

Dec 5, 2014, 10:35:44 AM
to YongGon Kim, gem5-g...@googlegroups.com
Whether you're modeling an SMP or a multicore depends on the interconnect parameters. The options for the interconnect when using Ruby can be found in gem5/configs/topologies.

Cheers,
Jason

Chuanwei Sun

Dec 10, 2014, 9:32:45 AM
to gem5-g...@googlegroups.com
Hi, I'm new to gem5-gpu and am trying to run a CPU benchmark and a GPU benchmark on gem5-gpu in SE (syscall emulation) mode.
Have you successfully run CPU and GPU benchmarks simultaneously?
I have done some tests, but they always failed.
Is there anything I need to pay attention to while configuring the simulation?

On Thursday, December 4, 2014 at 7:47:11 PM UTC+8, YongGon Kim wrote:

Jason Power

Dec 10, 2014, 10:25:56 AM
to Chuanwei Sun, gem5-g...@googlegroups.com
Hi Chuanwei,

Similar to the above answer, this should be possible if you pin your applications to different CPU cores. If there is only one CPU in the system, then it will not work as the CPU is stalled until the GPU kernel is complete.

Jason

YongGon Kim

Dec 11, 2014, 5:23:26 AM
to gem5-g...@googlegroups.com
The previous discussion at https://groups.google.com/d/msg/gem5-gpu-dev/S16pVNEk5EE/yE3C8o9VyuUJ
was helpful to me.

I used the '--num-cpus' option to use two CPUs.

Also, I modified 'se_fusion.py' to dedicate one CPU to the GPU application.

I have attached my diff for 'se_fusion.py' below.
gpu_benchmark, gpu_option, gpu_stdout, and gpu_errout are options that I added.
For example, options.gpu_benchmark = "/home/ilios/workspace/gem5gpu/benchmarks/rodinia/gaussian/gem5_fusion_gaussian",
    options.gpu_option = "/home/ilios/workspace/gem5gpu/benchmarks/data/gaussian/matrix208.txt", etc.

----
                 mem_ranges = [cpu_mem_range],
                 cache_line_size = options.cacheline_size)

+
+if options.gpu_benchmark != "":
+    if options.num_cpus != 2:
+        print >> sys.stderr, "--num-cpus must be 2 to run CPU and GPU workloads simultaneously"
+        sys.exit(1)
+    process2 = LiveProcess()
+    process2.executable = options.gpu_benchmark
+    process2.cmd = [process2.executable] + options.gpu_option.split()
+    process2.output = options.gpu_stdout
+    process2.errout = options.gpu_errout
+    system.cpu[1].workload = process2
+
+
 # Create a top-level voltage domain
 system.voltage_domain = VoltageDomain(voltage = options.sys_voltage)
-----
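
The diff reads four new attributes from options but does not show how they are registered. A minimal sketch of one way to do that with optparse follows. The option names mirror the variable names in the diff; the helper names, the 'cout'/'cerr' defaults (assumed to match gem5's defaults for process output), and the stand-in --num-cpus option are assumptions for illustration only.

```python
import optparse


def add_gpu_workload_options(parser):
    """Register the four options that the diff reads from `options`."""
    parser.add_option("--gpu_benchmark", type="string", default="",
                      help="GPU benchmark binary to run on the second CPU")
    parser.add_option("--gpu_option", type="string", default="",
                      help="Arguments for the GPU benchmark, as one string")
    # 'cout' and 'cerr' are assumed to match gem5's process output defaults.
    parser.add_option("--gpu_stdout", type="string", default="cout",
                      help="File for the GPU benchmark's stdout")
    parser.add_option("--gpu_errout", type="string", default="cerr",
                      help="File for the GPU benchmark's stderr")


def gpu_workload_error(options):
    """Return an error message if the options are inconsistent, else None."""
    if options.gpu_benchmark != "" and options.num_cpus != 2:
        return "--num-cpus must be 2 when --gpu_benchmark is given"
    return None


if __name__ == "__main__":
    parser = optparse.OptionParser()
    # Stand-in for the --num-cpus option that gem5's own option setup provides.
    parser.add_option("-n", "--num-cpus", type="int", default=1)
    add_gpu_workload_options(parser)
    # The /path/to/... values are placeholders, not real locations.
    options, _ = parser.parse_args(
        ["--num-cpus", "2",
         "--gpu_benchmark", "/path/to/gem5_fusion_gaussian",
         "--gpu_option", "/path/to/matrix208.txt"])
    print(gpu_workload_error(options))
```

In the real config script, the registration would have to happen before the options are parsed, and the check corresponds to the num_cpus test in the diff.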



On Wednesday, December 10, 2014 at 11:32:45 PM UTC+9, Chuanwei Sun wrote:

DAVESH SHINGARI

Dec 22, 2014, 2:37:12 AM
to gem5-g...@googlegroups.com
Hi Kim,

I was just running queens as a first simulation, but I couldn't see any output or find the results. I followed these steps.

1. Compiled queens for ARM using "arm-linux-gnueabi-gcc -DUNIX -o queens queens.c".
2. Mounted the disk image with "mount -o loop,offset=32256 ./disks/aarch32-ubuntu-natty-headless.img ./tempdir/" and put queens in the root directory.
3. Created a checkpoint by booting the system, and then created the following script in gem5/config/boot/queens.rcS:
-----------
#!/bin/sh
echo "Simulating queens"
/queens 16
m5 exit
-----------
4. I have two problems when running. First, I can't see the output of the echo statement. Second, when the system boots and I log in to the machine, I can't see queens in the root directory. I tried placing queens in the home directory, but I couldn't see it there either.

Please let me know whether you encountered the same issue and how you ran the test.

Thanks and Warm Regards
Davesh Shingari

Joel Hestness

Dec 22, 2014, 4:41:03 PM
to DAVESH SHINGARI, gem5-gpu developers
Hi Davesh,
  I'm not exactly clear how you handled checkpointing. If you checkpointed using a terminal attached to the simulated system, restoring from that checkpoint will put you right back at that point in the terminal. If you checkpointed by passing the hack back script (--script=gem5/configs/boot/hack_back_ckpt.rcS), restoring from the checkpoint will result in the hack back script trying to re-read the script parameter file, so you can pass a new script to the restoring simulation (--script=gem5/config/boot/queens.rcS). This should behave as expected if the queens benchmark is visible on the disk image.

  Also, if you're unable to see the queens benchmark, you might be running into the disk caching issue: gem5 caches the file system from the disk during boot. If you modified the disk after you collected the checkpoint but did not remount the disk in the restored simulated system, you won't be able to see modifications that you made to the disk after the checkpoint/restore. Unfortunately, if you're using the same disk for your root and benchmarks, you cannot remount the disk (Linux restriction). There are two ways you can deal with this:
1) Put all benchmarks on the disk before collecting your checkpoint
2) Add another disk in the gem5 configurations that will contain your benchmarks. In your run script (e.g. queens.rcS), mount or remount the benchmarks disk to clear the gem5 disk cache, and Linux will be able to see the changes made to the disk.
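
As a rough illustration of option 2, the Python sketch below generates a run script that mounts a separate benchmarks disk before launching the benchmark. The device name /dev/hdb1 and the mount point /benchmarks are hypothetical; the real device depends on how the second disk is added in the gem5 configuration, and the helper name is only for illustration.

```python
def build_benchmark_disk_rcs(device, mount_point, argv):
    """Return run-script text that mounts a benchmarks disk, then runs argv."""
    lines = [
        "#!/bin/sh",
        "mkdir -p %s" % mount_point,
        # Mounting after restore makes Linux read the disk contents afresh.
        "mount %s %s" % (device, mount_point),
        " ".join(argv),
        "m5 exit",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical device and mount point; adjust to the simulated system.
    print(build_benchmark_disk_rcs(
        "/dev/hdb1", "/benchmarks", ["/benchmarks/queens", "16"]), end="")
```

If the benchmarks disk was already mounted when the checkpoint was taken, an umount line before the mount would presumably be needed as well.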

Hope this helps,
Joel


--
  Joel Hestness
  PhD Candidate, Computer Architecture
  Dept. of Computer Science, University of Wisconsin - Madison
  http://pages.cs.wisc.edu/~hestness/

YongGon Kim

Dec 23, 2014, 6:46:57 AM
to gem5-g...@googlegroups.com
Hi Davesh,
I'm not very familiar with gem5, so I'm not sure about your problem.
I think Joel answered your question well, so I will just attach my command-line history below.

1. Modify and compile the queens program.
I modified the initial values of variables in queens.c so that it does not search for all solutions and so that it prints the answer.
Also, I just used gcc since I assume the X86 architecture.
>> gcc -DUNIX -o queens queens.c -static

2. Download the image and disk files.
I downloaded the files from http://www.m5sim.org/dist/current/x86/x86-system.tar.bz2
and extracted them to the gem5 root directory.

3. Set the environment variables for gem5.
export M5_PATH=<root of gem5; the disks and binaries folders must exist in this folder>
export LINUX_IMAGE=$M5_PATH/disks/linux-x86.img

4. Mount the image and copy queens.
mount -o loop,offset=32256 linux-x86.img /mnt
Copy queens into the mounted image, then umount it.

5. Run in full-system mode.
./build/VI_hammer/gem5.opt ../gem5-gpu/configs/fs_fusion.py

6. Or create a checkpoint in full-system mode.
./build/VI_hammer/gem5.opt ../gem5-gpu/configs/fs_fusion.py --script=./configs/boot/hack_back_ckpt.rcS
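
A note on the offset=32256 used in the mount command above: it equals 63 sectors of 512 bytes, which is where the first partition starts in many older MBR disk images. The Python sketch below just does that arithmetic and builds the mount command. The start sector of 63 is an assumption about the image layout (running fdisk -l on the image shows the real value), and the helper names are illustrative.

```python
SECTOR_SIZE = 512  # bytes per sector, the usual value for these images


def partition_offset_bytes(start_sector, sector_size=SECTOR_SIZE):
    """Byte offset of a partition that begins at start_sector."""
    return start_sector * sector_size


def mount_command(image, mount_point, start_sector=63):
    """Build the loop-mount command for the partition inside a disk image."""
    offset = partition_offset_bytes(start_sector)
    return ["mount", "-o", "loop,offset=%d" % offset, image, mount_point]


if __name__ == "__main__":
    print(" ".join(mount_command("linux-x86.img", "/mnt")))
```

If an image has a different partition layout, only start_sector needs to change.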

Thanks.

On Monday, December 22, 2014 at 4:37:12 PM UTC+9, DAVESH SHINGARI wrote:

DAVESH SHINGARI

Dec 27, 2014, 3:33:22 PM
to gem5-g...@googlegroups.com
Hi Joel and YongGon,

Thanks a lot for the replies.
I tried what you suggested, and this time queens was present. (Earlier I had created the checkpoint first and copied queens in afterwards, which caused the problem.)
However, when I run "/queens 8" in the terminal, it gives the result instantly, but when I execute the script, it never finishes. I can't even see the echo output. (I connected with "telnet 127.0.0.1 3456" and could see the terminal, but no logs.) Is there another terminal for logs? Moreover, I put m5 exit in the script, but the simulation didn't end.
Please suggest how I can debug the problem.

Thanks and Warm Regards
Davesh Shingari

Joel Hestness

Dec 29, 2014, 2:57:53 PM
to DAVESH SHINGARI, gem5-gpu developers
Hi Davesh,
  This depends on how you collect your checkpoint, as I described previously. If you collect the checkpoint using the hack_back_ckpt.rcS script, then you should be able to specify the new runscript to execute in the restored system. However, if you collected the checkpoint by typing something into the simulated system terminal (e.g. with telnet or m5term), restoring from the checkpoint will just put you back at the terminal on checkpoint restore, but it won't run a script.
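
  To make the two-step flow concrete, here is a small Python sketch that builds the two gem5 command lines, one to collect the checkpoint with the hack back script and one to restore it with a new run script. The binary, config path, and flags are the ones quoted earlier in this thread; the helper names are illustrative, and keeping --num-cpus identical in both commands reflects the assumption that the restored system should match the checkpointed one.

```python
GEM5 = "./build/VI_hammer/gem5.opt"
CONFIG = "../gem5-gpu/configs/fs_fusion.py"
HACK_BACK = "./configs/boot/hack_back_ckpt.rcS"


def checkpoint_cmd(num_cpus=2):
    """Command that boots the system and collects a checkpoint."""
    return [GEM5, CONFIG, "--num-cpus=%d" % num_cpus, "--script=" + HACK_BACK]


def restore_cmd(run_script, num_cpus=2, checkpoint=1):
    """Command that restores a checkpoint and runs run_script in the guest."""
    return [GEM5, CONFIG, "--num-cpus=%d" % num_cpus,
            "--script=" + run_script,
            "-r", str(checkpoint), "--restore-with-cpu=timing"]


if __name__ == "__main__":
    print(" ".join(checkpoint_cmd()))
    print(" ".join(restore_cmd("./configs/boot/cpu-gpu.rcS")))
```

  This only applies to checkpoints collected with the hack back script; a checkpoint taken from the terminal will not pick up the --script argument on restore.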

  Joel

DAVESH SHINGARI

Mar 23, 2015, 7:22:53 PM
to gem5-g...@googlegroups.com, shingar...@gmail.com
Hi Joel and Jason,

To run multiple programs (one on the CPU and one on the GPU), I incorporated the changes to se_fusion.py mentioned above. But I want to collect memory traces for the two applications, so I need to make the same changes in the FS config file. I couldn't find the same kind of process enumeration there.
I read your comment: "Similar to the above answer, this should be possible if you pin your applications to different CPU cores". Will pinning work here the way it does on a real Ubuntu Linux machine, i.e. with set_affinity? If not, can you guide me on how to do that in FS mode?
For the simulation I am planning to use 4 cores, so the argument should be "--num-cpus=4". Should an argument also be passed for "--smt"? I couldn't understand its description, "Only used if multiple programs are specified". How do I specify multiple programs, or does it mean a single multi-threaded program?

Jason Power

Mar 24, 2015, 12:01:52 PM
to DAVESH SHINGARI, gem5-g...@googlegroups.com
Hi Davesh,

I don't follow what you need to change in fs_fusion.py. Are you using full-system mode, not syscall emulation mode? In FS mode, there is not the same process enumeration because the operating system deals with processes. The CPU is agnostic to what process it's running.

Yes, I believe that set_affinity works in fs mode. Though you may want to search the gem5-users list for a definitive answer.
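
For reference, this is roughly what pinning through the Linux affinity interface looks like, sketched in Python. It uses os.sched_setaffinity, which exists only on Linux, and it illustrates the system call on an ordinary Linux machine rather than anything gem5-specific; whether a Python interpreter is available inside the simulated disk image is a separate question, and the helper names are illustrative.

```python
import os


def pin_current_process(cpu):
    """Restrict the calling process to a single CPU (Linux only)."""
    os.sched_setaffinity(0, {cpu})  # pid 0 means the calling process


def pin_and_exec(cpu, argv):
    """Pin to cpu, then replace this process with the program in argv."""
    pin_current_process(cpu)
    os.execvp(argv[0], argv)  # does not return if the exec succeeds


if __name__ == "__main__":
    allowed = os.sched_getaffinity(0)
    print("CPUs this process may run on:", sorted(allowed))
```

The affinity mask is inherited across exec, so the launched program stays on the chosen core.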

For your last question, you should read the code in the config files to find out. I believe that the description you are quoting only applies to SE mode, not FS mode. Thus, in SE mode, you can specify multiple programs and then execute them all on the same SMT core, with that option.  I do not believe you need to use SMT to accomplish your goal.

Hopefully this clears things up.

Jason

lucky

Jun 2, 2020, 9:01:19 PM
to gem5-gpu Developers List
Hello,
How did you create the system with multiple CPU cores and pin each application to a different CPU core?

On Thursday, December 4, 2014 at 7:47:11 PM UTC+8, YongGon Kim wrote:

TwilighT

Mar 23, 2022, 5:17:16 AM
to gem5-gpu Developers List
Hi Kim,
I am using gem5-gpu and am not very familiar with it.
I have a question about how you created the system with multiple CPU cores and pinned each application to a different CPU core.
Also, how can I run collaborative benchmarks like Chai on it?
I ran a Chai benchmark in FS mode, but there is an error that I can't figure out.
Would you please help me overcome this problem? :(
Thanks and regards
Screenshot from 2022-03-19 19-45-59.png