About using Ruby + Garnet


Evania Tsai

Nov 30, 2016, 4:13:15 PM
to gem5-gpu Developers List
Hi,
Sorry, I am new to gem5-gpu and am not familiar with the command I should use to run gem5-gpu with Ruby + Garnet.
Currently I keep encountering the error below:

fatal: Garnet only supports uniform bw across all links and NIs
 @ tick 0
[BaseGarnetNetwork:build/X86_VI_hammer_GPU/mem/ruby/network/garnet/BaseGarnetNetwork.cc, line 51]
Memory Usage: 4319016 KBytes
Program aborted at cycle 0
Aborted (core dumped)

with command:
/home/evania/gem5-gpu/gem5/build/X86_VI_hammer_GPU/gem5.opt /home/evania/gem5-gpu/gem5-gpu/configs/se_fusion.py -c  /home/evania/gem5-gpu/benchmarks/rodinia/backprop/gem5_fusion_backprop -o "16" --ruby --cpu-type=detailed --restore-with-cpu=timing --num-cpus=4 --clusters=8 --topology=Cluster --garnet-network=fixed

Could you advise me on what command I should use so that it executes successfully?
Thanks for your help.
Best,
Evania

Jason Lowe-Power

Nov 30, 2016, 4:40:22 PM
to Evania Tsai, gem5-gpu Developers List
Hi Evania,

I doubt this is as simple as a set of command line parameters. To use gem5-gpu + garnet you'll probably need to modify the python config scripts.

I believe the problem you're running into is that the config file gem5-gpu/configs/gpu_protocol/VI_hammer_fusion.py specifies the intBW and extBW of the links. It looks like Garnet doesn't support this. However, it seems that all of the links are the same bandwidth, so it's possible that some link that doesn't matter (e.g., DMA) has an unspecified BW and if you fixed that things would work. I'm not super familiar with Garnet, so I don't know the exact problem.
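
As a rough illustration of the check described above (a sketch, not code from gem5 or gem5-gpu), one could collect the bandwidth value of every link into a dictionary and look for the odd one out. The link names and values in the usage note are hypothetical; how the values are gathered from the config script is left open.

```python
from collections import Counter


def find_nonuniform_links(link_bandwidths):
    """Return the names of links whose bandwidth differs from the most common value.

    link_bandwidths is a hypothetical {link_name: bandwidth} dictionary that
    the reader fills in from their own config script.
    """
    if not link_bandwidths:
        return []
    # The most frequent bandwidth is treated as the intended uniform value.
    common_bw, _ = Counter(link_bandwidths.values()).most_common(1)[0]
    return sorted(name for name, bw in link_bandwidths.items() if bw != common_bw)
```

For example, find_nonuniform_links({"cpu_l1": 64, "gpu_l1": 64, "dma": 16}) returns ["dma"], which would point at the link to inspect first.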

I would start by investigating the config scripts like gem5-gpu/configs/gpu_protocol/VI_hammer_fusion.py and understand what they are doing so you can extend them to use Garnet. 

Jason

Matt

Dec 1, 2016, 2:17:07 PM
to Jason Lowe-Power, Evania Tsai, gem5-gpu Developers List
In addition to what Jason suggests, you should also take a look at src/mem/ruby/network/garnet/BaseGarnetNetwork.py in the gem5 directory and ensure the ni_flit_size parameter matches both extBW and intBW.

-Matt

Evania Tsai

Dec 5, 2016, 9:25:28 AM
to gem5-gpu Developers List, ja...@lowepower.com, evani...@gmail.com
Hi,
Thanks so much for the advice. I got it to run!
I would also like to know whether stats.txt in m5out contains any output or data related to Garnet.
In addition, is it possible to build gem5-gpu with the latest gem5? (The latest gem5 includes Garnet 2.0.)
Thanks for the help!
Best,
Evania


On Friday, December 2, 2016 at 3:17:07 AM UTC+8, Matt Poremba wrote:

Jason Lowe-Power

Dec 5, 2016, 9:44:35 AM
to Evania Tsai, gem5-gpu Developers List
Hi Evania,

We have not updated gem5-gpu to the latest version of gem5 in quite a while. However, it should be pretty straightforward. I can't think of any recent gem5 changes that will cause any headaches when updating. You should be able to do it pretty simply.

For the stats, I'm not sure. You can look in stats.txt, or in the Garnet code, you can look at regStats() to see what stats it exports.
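
One way to look through stats.txt is to filter it by stat name. The sketch below assumes the usual stats.txt layout of "name value # description" on each line; the keyword to search for (e.g., "network") is an assumption, since the exact Garnet stat names depend on the gem5 version.

```python
def filter_stats(lines, keyword):
    """Return {stat_name: value_text} for stat lines whose name contains keyword."""
    matches = {}
    for line in lines:
        fields = line.split()
        # A stat line has at least a name and a value; banner and blank
        # lines either have fewer fields or a name without the keyword.
        if len(fields) >= 2 and keyword in fields[0]:
            matches[fields[0]] = fields[1]
    return matches
```

It can be applied to an open file, for instance by passing open("m5out/stats.txt") as lines and "network" as keyword, then printing the resulting dictionary.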

Cheers,
Jason

Evania Tsai

Dec 6, 2016, 2:53:39 PM
to gem5-gpu Developers List, ja...@lowepower.com
Hi,
Thanks for answering.
I just tried downloading the gem5 directory without updating it to a specific version, so that it stays at the latest version, but it seems I failed to apply the gem5 patches (I am not very sure about this).
After that, when I try to build gem5-gpu with the VI_hammer or MOESI_hammer protocol, the build fails.
With MOESI_hammer, the error is:
home/evania/gem5-gpu_new/gem5-gpu/src/gpu/gpgpu-sim/cuda_core.hh:45:37: fatal error: mem/ruby/system/System.hh: No such file or directory
 #include "mem/ruby/system/System.hh"

With VI_hammer, the build terminates quickly and the error is:
Syntax error at /home/evania/gem5-gpu_new/gem5-gpu/src/mem/protocol/VI_hammer-CPUCache.sm:36:1817
>>{<<

May I ask how to fix this kind of problem when building gem5-gpu? Sorry, I am not familiar with this and not good at it.
Best,
Evania

Jason Lowe-Power

Dec 6, 2016, 6:11:51 PM
to Evania Tsai, gem5-gpu Developers List
Hi Evania,

The steps are something like this:
1. Pop all patches off of gem5 (hg qpop -a)
2. Update gem5 (hg pull -u http://repo.gem5.org/gem5)
3. Re-apply the patches (hg qpush -a)
4. Fix all patches that fail to apply cleanly (this may take a while)
5. Build gem5.

Usually, if the patches apply cleanly gem5 will build. But, if there are build errors you'll have to dig into the code to figure it out.
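
The first three steps could be scripted along these lines. This is only a sketch: it assumes Mercurial with the mq extension enabled, that it is run from inside the gem5 directory, and that a non-zero exit code means the step needs manual attention. The function name update_gem5 and the injectable run argument are made up for illustration.

```python
import subprocess


def update_gem5(run=subprocess.call, repo_url="http://repo.gem5.org/gem5"):
    """Pop all patches, pull the new gem5, then push the patches back.

    Returns None if every command exits with 0, otherwise the command
    (as a list) that failed and needs to be fixed by hand.
    """
    steps = [
        ["hg", "qpop", "-a"],            # step 1: pop all patches
        ["hg", "pull", "-u", repo_url],  # step 2: update gem5
        ["hg", "qpush", "-a"],           # step 3: re-apply the patches
    ]
    for cmd in steps:
        if run(cmd) != 0:
            return cmd
    return None
```

If the function returns the qpush command, that is the point at which step 4 (fixing the patches that failed to apply) starts.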

Hopefully this gets you started in the right direction.

Jason

Evania Tsai

Dec 19, 2016, 12:32:46 PM
to gem5-gpu Developers List, evani...@gmail.com, ja...@lowepower.com
Hi,
Thanks so much for these hints.
May I ask further about how to fix the patches?
Sorry, I have not done this before.
Thanks for your help.
Best,
Evania

On Wednesday, December 7, 2016 at 7:11:51 AM UTC+8, Jason Lowe-Power wrote:

Jason Lowe-Power

Dec 20, 2016, 11:03:03 AM
to Evania Tsai, gem5-gpu Developers List
Hi Evania,

It's hard to explain how to do it. I just take it one patch at a time. Usually, if the patch applies cleanly then things will "just work". Though, this isn't always the case. I push a patch, then I fix any conflicts. Then I move on to the next patch.

I hope this helps some.

Cheers,
Jason

Evania Tsai

Dec 22, 2016, 2:18:11 AM
to gem5-gpu Developers List, evani...@gmail.com
Hi,
Thanks for the reply.
Sorry, I have another question, about the Ruby topology.
From the website and other posts in the Google group, is it true that I need to modify the code to connect the cores myself if I want to use topology=Mesh instead of topology=Cluster (since Cluster is the default)? So if I just pass topology=mesh on the command line, it has no effect? Sorry, I am confused.
If so, may I ask how to connect the cores?
Thanks for your help.
Best,
Evania

On Wednesday, December 21, 2016 at 12:03:03 AM UTC+8, Jason Lowe-Power wrote:

Jason Lowe-Power

Dec 24, 2016, 8:40:22 AM
to Evania Tsai, gem5-gpu Developers List
If you use VI_hammer, yes, you need to modify the config scripts to use the mesh topology. The scripts for VI_hammer (i.e., VI_hammer_fusion.py) explicitly use the cluster topology. If you're using any of the coherence protocols from gem5, you should be able to use the --topology parameter exactly as normal; all topologies should be supported. The only problem you may run into is that some of the gem5 topologies assume that components are created in a certain order (e.g., mesh-dir-corners puts the directories at the corners based on the order they are created). So, if you use one of these topologies, you'll need to study the config scripts in gem5 to understand the order in which you need to create the controllers.
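
To see why creation order matters, here is a toy model of an order-dependent placement. It is not gem5's topology code; place_on_mesh and the controller names are hypothetical, and the only point is that position follows from the index in the list.

```python
def place_on_mesh(controllers, num_rows, num_cols):
    """Assign controllers to mesh routers round-robin, in creation order.

    Returns {controller_name: (row, col)}. Toy model only: a real topology
    script may distribute controllers differently.
    """
    num_routers = num_rows * num_cols
    placement = {}
    for index, name in enumerate(controllers):
        router = index % num_routers  # the list position decides the router
        placement[name] = (router // num_cols, router % num_cols)
    return placement
```

With ["cpu0", "cpu1", "gpu0", "dir0"] on a 2x2 mesh, gpu0 lands at (1, 0); moving gpu0 to the front of the list puts it at (0, 0) instead, without any other change to the topology.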

Cheers,
Jason

Evania Tsai

Dec 29, 2016, 8:57:10 AM
to gem5-gpu Developers List, evani...@gmail.com
Hi,
Thanks for your detailed explanation!
So even if I use the config scripts from the gpu_protocol directory (i.e., gpu_protocol/MOESI_hammer_fusion.py), it would still cause the error, right?
But the error I encounter is:
" from Ruby import create_topology
ImportError: No module named Ruby..."
I think I built gem5-gpu with Ruby, though?
Do you have any suggestions that could shed light on this bug?
Also, is there any script or example I can look at for setting up the controllers?
Thanks so much.
Best,
Evania

On Saturday, December 24, 2016 at 9:40:22 PM UTC+8, Jason Lowe-Power wrote:

Joel Hestness

Dec 30, 2016, 7:19:12 PM
to Evania Tsai, gem5-gpu Developers List
Hi Evania,
  Did you specify which Ruby protocol to use when you built gem5-gpu (as in the quick start guide)? It should automatically build all the Ruby controllers.

  You might compare the MOESI_hammer_fusion.py file to the VI_hammer_fusion.py file also. Make sure that the way you're importing Ruby is the same as one of the scripts that works.

  Joel

Evania Tsai

Jan 1, 2017, 3:32:28 PM
to gem5-gpu Developers List, evani...@gmail.com
Hi,
Yes, I specified Ruby=True when I built gem5-gpu. Is that right?
I would also like to ask how to set up the controllers if I want to use the mesh topology (last time you said I do need to modify the scripts).
Is there any tutorial or other material that I can refer to? Thanks a lot!
Best,
Evania

On Saturday, December 31, 2016 at 8:19:12 AM UTC+8, Joel Hestness wrote:

Jason Lowe-Power

Jan 2, 2017, 9:39:00 AM
to Evania Tsai, gem5-gpu Developers List
Hi Evania,

There isn't any good documentation on how to write config scripts for Ruby. To understand general config scripts in gem5 you can check out Learning gem5: http://learning.gem5.org/. For Ruby-specific stuff, I would just study the included scripts (e.g., gem5/configs/ruby) to understand what they are doing. I might try to re-write one of the scripts in a more object-oriented fashion to understand better what is in the scripts.

Jason 

Evania Tsai

Jan 3, 2017, 10:16:06 AM
to gem5-gpu Developers List, evani...@gmail.com
Hi,
OK, thanks so much for your reply.
I will try it. Sorry for bothering you so much about this ><.
Best,
Evania

On Monday, January 2, 2017 at 10:39:00 PM UTC+8, Jason Lowe-Power wrote:

Evania Tsai

Jan 4, 2017, 5:56:34 PM
to gem5-gpu Developers List, evani...@gmail.com

Hi,
Is there any advice on how to find out which core a virtual channel (in Ruby) is linked to?
Or how to tell whether a memory packet comes from the CPU or the GPU?
I have read this post.
You said that Ruby doesn't access the memory controller anymore, so how can I collect this information?
Thanks so much for your help.

Jason Lowe-Power

Jan 5, 2017, 12:41:47 PM
to Evania Tsai, gem5-gpu Developers List
Hi Evania,

For the most part, all of the advice in that thread is still relevant. The only thing that has changed is that you need to pass the original requestor when enqueuing the memory request. Then, in queueMemoryRead/queueMemoryWrite in AbstractController, you can modify the packet creation to pass this information on to the memory controller, or just track it there.

Cheers,
Jason

Evania Tsai

Jan 8, 2017, 11:34:03 AM
to gem5-gpu Developers List, evani...@gmail.com
Hi,
Thanks for your advice, I will try to work on it.
Regarding the scripts (e.g., se_fusion.py and VI_hammer_fusion.py), I am puzzled about when VI_hammer_fusion.py is used.
Is it used while building gem5-gpu with the protocol (VI_hammer), or after the build, so that the simulation command is given VI_hammer_fusion.py instead of se_fusion.py?
Best,
Evania

On Friday, January 6, 2017 at 1:41:47 AM UTC+8, Jason Lowe-Power wrote:

Jason Lowe-Power

Jan 9, 2017, 10:43:23 AM
to Evania Tsai, gem5-gpu Developers List
Hi Evania,

You may want to look into this documentation I've been writing, http://learning.gem5.org, which explains how the gem5 configuration scripts work. Only the files in src/ are compiled. All of the files in configs/ are interpreted by the embedded Python interpreter at runtime.

Jason

Evania Tsai

Jan 9, 2017, 9:50:33 PM
to gem5-gpu Developers List, evani...@gmail.com
Hi,
Thanks for the reply.
After reading the documentation, may I ask:
If I just run se_fusion.py, it simulates only the classic caches, but if I run se_fusion.py with --ruby, it simulates Ruby with the cache coherence protocol I chose at build time (using PROTOCOL=VI_hammer)?
Or do I need to run with VI_hammer_fusion.py?
And is the hard-coded crossbar for VI_hammer specified in VI_hammer_fusion.py or only in the source code?
Sorry for my poor understanding, and thanks for your help.
Best,
Evania

On Monday, January 9, 2017 at 11:43:23 PM UTC+8, Jason Lowe-Power wrote:

Jason Lowe-Power

Jan 10, 2017, 11:17:14 AM
to Evania Tsai, gem5-gpu Developers List
Hi Evania,

Answers inline below.

On Mon, Jan 9, 2017 at 8:50 PM Evania Tsai <evani...@gmail.com> wrote:
> Hi,
> Thanks for the reply.
> After reading the documentation, may I ask:
> If I just run se_fusion.py, it simulates only the classic caches, but if I run se_fusion.py with --ruby, it simulates Ruby with the cache coherence protocol I chose at build time (using PROTOCOL=VI_hammer)?

This is correct.
 
> Or do I need to run with VI_hammer_fusion.py?
> And is the hard-coded crossbar for VI_hammer specified in VI_hammer_fusion.py or only in the source code?

If you read the code in VI_hammer_fusion.py, you'll see that it uses the cluster topology. This file configures the system with the cluster topology.
If you read the more general code in gem5/configs/ruby/ you'll see that those files are more flexible and allow the topology to be specified on the command line. 
--

Jason

Evania Tsai

Jan 11, 2017, 2:33:57 AM
to gem5-gpu Developers List, evani...@gmail.com
Hi,
Thanks for your detailed explanation.
When I try to use the VI_hammer_fusion.py script, it reports this error: ImportError: No module named Cluster.
Is it also related to Ruby?
I did add --RUBY=True when building gem5-gpu, though.
Is there any way to fix this?
Thanks for your help.
Best,
Evania

On Wednesday, January 11, 2017 at 12:17:14 AM UTC+8, Jason Lowe-Power wrote:

Jason Lowe-Power

Jan 12, 2017, 10:05:07 AM
to Evania Tsai, gem5-gpu Developers List
Hi Evania,

This is a surprising error. The Cluster module is in gem5/configs/topologies, which should be part of the Python path (you can check by printing sys.path).

It's possible that you don't have the correct version of gem5 and you may need to run gem5 update <gem5-revision found in the Google doc>.

Another possibility: You should run se_fusion.py or fs_fusion.py. These are the main runscripts. These scripts call Ruby.py (gem5/configs/ruby/) which in turn takes the compiled coherence protocol (VI_hammer) and calls VI_hammer_fusion.py and VI_hammer.py. I suggest you read through these scripts step-by-step to understand what is going on.

Jason


Evania Tsai

Jan 18, 2017, 12:03:00 AM
to gem5-gpu Developers List, evani...@gmail.com
Hi,
May I ask whether I can use ../gem5-gpu/configs/gpu_protocol/VI_hammer_fusion.py directly?
After checking my gem5 version (I rebuilt it), I still encounter the same error.
What does the import of Cluster depend on? I tried adding the following to VI_hammer.py, but it still reports the Cluster import error:

from m5.util import addToPath, fatal
addToPath('../../gem5/configs/common')
addToPath('../../gem5/configs/ruby')
addToPath('../../gem5/configs/topologies')
addToPath('gpu_protocol')
Thanks so much for your help.
Best,
Evania
 

On Thursday, January 12, 2017 at 11:05:07 PM UTC+8, Jason Lowe-Power wrote:

Jason Lowe-Power

Jan 18, 2017, 11:15:12 AM
to Evania Tsai, gem5-gpu Developers List
Hi Evania,

You cannot execute VI_hammer_fusion.py as the only script to gem5. It does not contain enough code to initialize all of gem5. It only initializes part of the coherence protocol, not the rest of the system. You should use gem5 with the se_fusion.py or fs_fusion.py Python config files.

For addToPath, it's a very finicky function. Cluster.py should be in gem5/configs/topologies (though, you can check for yourself to be sure). I would include a print statement after addToPath to make sure the right path is added and fix it if it's not. You can check what the Python path is by running: "import sys; print sys.path". See https://docs.python.org/2/library/sys.html#sys.path.
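
A slightly more targeted version of that check is sketched below: instead of printing the whole path, it reports which directories on the path actually contain the module file. The helper name find_module_dirs is made up for this example; only os and sys from the standard library are used.

```python
import os
import sys


def find_module_dirs(module_name, search_path=None):
    """Return the directories on search_path that contain module_name.py."""
    if search_path is None:
        search_path = sys.path
    filename = module_name + ".py"
    # An empty entry in sys.path stands for the current directory.
    return [d for d in search_path
            if os.path.isfile(os.path.join(d or ".", filename))]
```

Calling find_module_dirs("Cluster") right after the addToPath lines and printing the result shows whether any directory on the path provides Cluster.py; an empty list means the path that was added is not the one expected.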

Jason

Evania Tsai

Jul 25, 2017, 4:41:14 AM
to gem5-gpu Developers List, evani...@gmail.com
Hi Jason, 
May I ask: if I use a protocol other than VI_hammer (VI_hammer_fusion.py) together with a Ruby topology such as Mesh, how are the CPU and GPU connected to the mesh, and where are they located on it? Or does the mesh only include the CPU controllers?
Thanks in advance.
Best,
Evania

On Thursday, January 19, 2017 at 12:15:12 AM UTC+8, Jason Lowe-Power wrote:

Jason Lowe-Power

Jul 25, 2017, 10:04:47 AM
to Evania Tsai, gem5-gpu Developers List
Hi Evania,

You can read the code in gem5/configs/topologies to find your answers. In short, you have to specify your controllers in the "right" order to get them in the locations you want. This doesn't matter for homogeneous multi-CPU systems, but it will for heterogeneous systems.

Jason

Evania Tsai

Jul 28, 2017, 12:31:19 AM
to gem5-gpu Developers List, evani...@gmail.com
Hi Jason, 
Thanks for your reply, I will look into it.
May I ask about an assertion error?
Recently I keep encountering the error below:
 RequestStatus Sequencer::insertRequest(PacketPtr, RubyRequestType): Assertion `m_outstanding_count == (m_writeRequestTable.size() + m_readRequestTable.size())' failed.

I wondered whether it is caused by the gcc/g++ version, but after recompiling it sometimes works and sometimes fails, so my guess is wrong, right?
What might be the main cause of this kind of error?
Thanks in advance.
Best,
Evania

On Tuesday, July 25, 2017 at 10:04:47 PM UTC+8, Jason Lowe-Power wrote:

Evania Tsai

Aug 2, 2017, 3:58:13 PM
to gem5-gpu Developers List
Hi,
May I ask whether there is a way to know which GPU will later use memory that was first accessed by the CPU? Thanks in advance.
Best,
Evania

Jason Lowe-Power

Aug 3, 2017, 10:51:25 AM
to Evania Tsai, gem5-gpu Developers List
Hi Evania,

I don't think there is anything already in the code to do what you're asking. If you want to know after the fact what data was first touched by the CPU then used by the GPU you should be able to easily add some information to the memory that tracks this. If you're looking for a way to *predict* what will be touched by the GPU, that is much harder. You could use something like the NVIDIA cudaMemcpy() functions to register the memory with the driver, then track its usage in the simulator.
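
The after-the-fact bookkeeping could look roughly like the sketch below. It is plain Python to show the idea, not simulator code: FirstTouchTracker, its record() method, and the requestor names are all hypothetical, and in the simulator the same logic would have to be hooked in wherever memory requests are observed.

```python
class FirstTouchTracker(object):
    """Remember who touched each address first and which GPU used CPU-first data."""

    def __init__(self):
        self.first_toucher = {}  # address -> (requestor, is_gpu) of the first access
        self.cpu_then_gpu = {}   # address -> first GPU requestor to use CPU-first data

    def record(self, address, requestor, is_gpu):
        if address not in self.first_toucher:
            # First access to this address: remember who made it.
            self.first_toucher[address] = (requestor, is_gpu)
            return
        _, first_is_gpu = self.first_toucher[address]
        # Only the CPU-first, GPU-later pattern is of interest here.
        if not first_is_gpu and is_gpu and address not in self.cpu_then_gpu:
            self.cpu_then_gpu[address] = requestor
```

After a run, cpu_then_gpu holds one entry per address that the CPU touched first and a GPU used afterwards, keyed by address and naming the first such GPU.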

Jason