MESI_Two_Level in gem5-gpu

Yuchen Hao

Mar 19, 2016, 8:38:56 PM

to gem5-gpu Developers List

Dear all,

I was wondering how MESI_Two_Level works in gem5-gpu. I notice that private L1s and a shared L2 are configured on the CPU side, whereas only private L1s are configured for the GPU, page walker, and copy engine (no L2, unlike VI_hammer and MOESI). My understanding was that Ruby does not allow heterogeneous coherence configurations, so wouldn't the L1s on the GPU side expect a shared L2 as well? Or is there a way to bypass the shared L2 even though an L2 is actually instantiated somewhere in the system?

Regards,

Yuchen

Jason Lowe-Power

Mar 21, 2016, 10:43:03 AM

to Yuchen Hao, gem5-gpu Developers List

Hi Yuchen,

The L2 is shared between the CPU and GPU in MESI two level. Think of the L2 cache in MESI two level as an LLC.

While all of the protocols included in gem5 are homogeneous, it is POSSIBLE to have heterogeneous protocols in Ruby (e.g., VI_hammer).

You may be able to bypass the L2 if you modify the protocol. It should be pretty easy to add a flag in the L1 controller to bypass the L2. Then, you would add a flag to the messages sent to the L2 that it should bypass. Then, at the L2 you would check the flag and either bypass or not.
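The flag-based bypass outlined above can be modelled outside gem5 to check the intended behaviour before touching the protocol. The following is a minimal Python sketch, not gem5 or SLICC code: the names `RequestMsg`, `L1Controller`, `L2Cache`, and `bypass_l2` are hypothetical stand-ins for the L1 controller flag, the message field, and the check at the L2.

```python
from dataclasses import dataclass, field


@dataclass
class RequestMsg:
    addr: int
    bypass_l2: bool = False  # hypothetical flag carried in the request


@dataclass
class L1Controller:
    bypass_l2: bool  # hypothetical controller flag, e.g. True for GPU L1s

    def make_request(self, addr: int) -> RequestMsg:
        # Copy the controller-level flag into every request sent to the L2.
        return RequestMsg(addr=addr, bypass_l2=self.bypass_l2)


@dataclass
class L2Cache:
    lines: set = field(default_factory=set)
    memory_fetches: int = 0

    def handle(self, msg: RequestMsg) -> str:
        if msg.bypass_l2:
            # Bypass: go to memory without allocating a line in the L2.
            self.memory_fetches += 1
            return "bypass"
        if msg.addr in self.lines:
            return "hit"
        # Normal miss: fetch from memory and allocate the line.
        self.memory_fetches += 1
        self.lines.add(msg.addr)
        return "miss"


cpu_l1 = L1Controller(bypass_l2=False)
gpu_l1 = L1Controller(bypass_l2=True)
l2 = L2Cache()

print(l2.handle(cpu_l1.make_request(0x40)))  # first CPU access allocates
print(l2.handle(cpu_l1.make_request(0x40)))  # second CPU access finds the line
print(l2.handle(gpu_l1.make_request(0x80)))  # GPU access is not allocated
print(0x80 in l2.lines)
```

This toy model ignores coherence entirely. A real protocol change would also have to decide what happens when a bypassing request targets a block the L2 already holds.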

Let us know if you have more questions,

Jason

7933...@qq.com

May 25, 2016, 11:28:37 PM

to gem5-gpu Developers List, haoba...@gmail.com

Hi,
I am very interested in this. The folder gem5-gpu/gem5-gpu/configs/gpu_protocol/ contains a file named "MESI_Two_Level_fusion.py". Is this the protocol file to modify?

Thanks!

Hao

On Monday, March 21, 2016 at 10:43:03 PM UTC+8, Jason Power wrote:

Jason Lowe-Power

May 26, 2016, 10:38:17 AM

to 7933...@qq.com, gem5-gpu Developers List, haoba...@gmail.com

Hi Hao,

You need to modify the protocol files (*.sm) as well as the configuration file. I'd start by familiarizing yourself with how Ruby/SLICC works (http://gem5.org/SLICC). For MESI_Two_Level you would need to modify the files gem5/src/mem/protocol/MESI_Two_Level*.sm

Cheers,

Jason

7933...@qq.com

Jun 7, 2016, 3:18:12 AM

to gem5-gpu Developers List, 7933...@qq.com, haoba...@gmail.com

Hello,

I just want the GPU L1 cache requests to bypass the L2. How can I distinguish whether an L1 request comes from the CPU or the GPU?

Thanks

hao

On Thursday, May 26, 2016 at 10:38:17 PM UTC+8, Jason Power wrote:

Jason Lowe-Power

Jun 7, 2016, 12:11:28 PM

to 7933...@qq.com, gem5-gpu Developers List, haoba...@gmail.com

Hi Hao,

You can add a new field to the message type which denotes which kind of controller it comes from. Similarly, you can add a parameter to the controller (by modifying the declaration in the .sm file) so you know when to include the new field in the message. Then, you can check that field at the L2 cache.

Jason

7933...@qq.com

Jun 16, 2016, 11:02:05 AM

to gem5-gpu Developers List, 7933...@qq.com, haoba...@gmail.com, ja...@lowepower.com

Hello Jason,

Thank you very much for your reply. Now I can distinguish which kind of L1 controller a message comes from. Then, in the file MESI_Two_Level-L2cache.sm, I added an if statement in in_port(L1RequestL2NetWork_in,RequestMsg,L1RequestToL2Cache,rank=0) to check whether the message comes from a GPU L1 cache controller and, if so, do the bypass. But how do I achieve this? I know that in this in_port() the L2 cache states are MT and SS, and the in_msg.type values are L1_GETX, L1_GETS, L1_GET_INSTR, and L1_UPGRADE. The action a_issueFetchToMemory is described as fetching data from memory, but I do not know how to trigger a transition that executes this action. Is my idea about how to bypass correct?

I look forward to your reply. Thank you!

Hao

On Wednesday, June 8, 2016 at 12:11:28 AM UTC+8, Jason Lowe-Power wrote:

Jason Lowe-Power

Jun 16, 2016, 11:30:29 AM

to 7933...@qq.com, gem5-gpu Developers List, haoba...@gmail.com

Hi Hao,

I think that sounds reasonable. Check out http://gem5.org/SLICC. That should help you understand how SLICC works. To execute an action, you have to have a transition from the current state you are in, which transitions on the event you trigger. I think the wiki documentation will help make that more clear.
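The relationship between states, events, transitions, and actions can be sketched as a lookup table. This is a plain-Python illustration of the idea, not SLICC syntax. Apart from a_issueFetchToMemory, which is mentioned earlier in the thread, the state, event, and action names here are illustrative placeholders.

```python
# (current state, event) -> (next state, [actions])
TRANSITIONS = {
    ("NP", "L1_GETS"): ("ISS", ["allocate_block", "a_issueFetchToMemory"]),
    # Hypothetical bypass event: fetch from memory but stay in the same state.
    ("NP", "Bypass_GETS"): ("NP", ["a_issueFetchToMemory"]),
}


def trigger(state, event, log):
    """Look up the transition for (state, event) and run its actions."""
    key = (state, event)
    if key not in TRANSITIONS:
        # An event with no transition from the current state is an error,
        # which is why every triggered event needs a matching transition.
        raise KeyError(f"no transition from {state} on {event}")
    next_state, actions = TRANSITIONS[key]
    log.extend(actions)  # stand-in for executing each action body
    return next_state


log = []
state = trigger("NP", "Bypass_GETS", log)
print(state, log)
```

The point of the sketch is that an action such as a_issueFetchToMemory only runs when it is listed in a transition whose state and event both match, so a new bypass event needs its own transition entry for every state in which it can arrive.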

Jason

7933...@qq.com

Jul 3, 2016, 10:07:58 PM

to gem5-gpu Developers List, 7933...@qq.com, haoba...@gmail.com, ja...@lowepower.com

Hi Jason,

After modifying the *.sm files and building successfully with scons, I ran many GPU benchmarks, and they all ran successfully. When I checked stats.txt, I found that only backprop's results had changed. The results of the other benchmarks did not change, except for host_inst_rate, host_op_rate, host_tick_rate, host_mem_usage, and host_seconds. I don't know what is wrong.
Thanks,
Hao
On Thursday, June 16, 2016 at 11:30:29 PM UTC+8, Jason Lowe-Power wrote:

Jason Lowe-Power

Jul 5, 2016, 11:57:48 AM

to 7933...@qq.com, gem5-gpu Developers List, haoba...@gmail.com

Well, it could be that your changes don't actually have much of an effect ;). I would first double check that you're running the configurations you expect by inspecting the config.ini file. Then, I would double check that you're using the right binary. Finally, I'd look into all of the stats, especially the Ruby stats, to see whether the changes you made to the SLICC files are being triggered or not.
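One way to look into the stats is to diff two stats.txt files while ignoring the host-dependent entries. The following Python sketch assumes each stat line has the form `name value ... # description`. The function names are made up for this example, and if a file contains several stat dumps only the last value of each stat is kept.

```python
def parse_stats(text):
    """Map stat name -> value string for lines of the form 'name value ...'."""
    stats = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing description
        parts = line.split()
        if len(parts) < 2 or line.startswith("-"):
            continue  # skip blank lines and the Begin/End banner lines
        stats[parts[0]] = " ".join(parts[1:])
    return stats


def changed_stats(before, after, ignore_prefix="host_"):
    """Return names whose values differ, skipping host-dependent stats."""
    a, b = parse_stats(before), parse_stats(after)
    names = sorted(set(a) | set(b))
    return [n for n in names
            if not n.startswith(ignore_prefix) and a.get(n) != b.get(n)]


baseline = """
---------- Begin Simulation Statistics ----------
sim_ticks      1000   # Number of ticks simulated
host_seconds   1.25   # Real time elapsed on the host
"""
modified = baseline.replace("1000", "900").replace("1.25", "1.40")
print(changed_stats(baseline, modified))
```

In practice the two strings would be read from the stats.txt files of the baseline run and the modified run. An empty result for a benchmark suggests the modified transitions were never exercised by that run.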

Hope this helps some.

Jason

7933...@qq.com

Jul 15, 2016, 5:36:33 AM

to gem5-gpu Developers List, 7933...@qq.com, haoba...@gmail.com, ja...@lowepower.com

Hi Jason,

Your reply was very useful to me. Thank you very much!
I modified the *.sm files again, and now the stats.txt of every benchmark has changed.

Thanks,
Hao

On Tuesday, July 5, 2016 at 11:57:48 PM UTC+8, Jason Lowe-Power wrote:

Super fall

Mar 15, 2017, 10:21:57 AM

to gem5-gpu Developers List

Hello guys,

May I ask how a protocol is defined as being heterogeneous in this simulator?

And is MESI_Two_Level heterogeneous in this sense?

regards,

Rokneddin

Trinayan Baruah

Mar 15, 2017, 8:33:43 PM

to Super fall, gem5-gpu Developers List

A protocol is heterogeneous if it treats the CPU and GPU cores differently. MESI_Two_Level does not, so it is not heterogeneous, although you can certainly model a heterogeneous system with it if you don't care about those specific aspects. The heterogeneous protocol in gem5-gpu is VI_hammer, with VI on the GPU side and MOESI_hammer on the CPU side. I think what it does is treat the GPU L1s differently from the CPU's.

Shashank Hegde

Apr 8, 2017, 8:42:12 PM

to gem5-gpu Developers List, 7933...@qq.com, haoba...@gmail.com, ja...@lowepower.com

My apologies if this question is redundant, but my doubt is this: when I look at a RubySlicc debug excerpt, I see every "Sender: " as L1Cache-*. If I want to tag the CPU and GPU L1s separately, what is a good place to do so? Basically, I'm trying to partition GPU access to the LLC with some bias based on CPU LLC miss rates.

Thanks

Jason Lowe-Power

Apr 11, 2017, 5:51:12 PM

to Shashank Hegde, gem5-gpu Developers List, 7933...@qq.com, haoba...@gmail.com

Hi,

You can modify the CPU L1 cache protocol file (CPUCache.sm in VI_hammer) and the GPU L2 cache protocol file (GPUL2cache.sm in VI_hammer). You can modify the protocol to send an extra bit which says "this is from the GPU".

Alternatively, if you're using a homogeneous protocol where the GPU and CPU are using the same cache controllers, you can modify the RubySequencer. Here, you would need to add a parameter to the SimObject which says "I am a GPU". Then, when the RubySequencer enqueues a request to the cache, you would add an additional field to the RubyRequest message which says "From GPU". Finally, you can propagate that bit throughout the cache hierarchy as you see fit.

Jason

shijianliu

Apr 12, 2017, 2:26:32 AM

to gem5-gpu Developers List, shasha...@gmail.com, haoba...@gmail.com

Hello Jason

I am also somewhat confused about this question. I am using the homogeneous MESI_Two_Level protocol, where the CPU and GPU have the same cache controllers. I added a parameter (cpu_gpu) to the controller by modifying the declaration in MESI_Two_Level-L1cache.sm. Then, through the configuration files (gem5-gpu/gem5/configs/ruby/MESI_Two_Level.py and gem5-gpu/gem5-gpu/configs/gpu_protocol/MESI_Two_Level_fusion.py), I set different values for this parameter, so I can distinguish a GPU L1 cache controller from a CPU L1 cache controller by its value. Next, in MESI_Two_Level-msg.sm, I added a flag (C_G) to the RequestMsg structure. In MESI_Two_Level-L1cache.sm, when the controller issues a GETS to the L2, I set the flag (out_msg.C_G) according to the value of the parameter (cpu_gpu). Then the in_port() in l2cache.sm can read in_msg.C_G to judge whether the request came from a CPU or GPU L1 cache. Is that right? I don't use the RubySequencer or RubyRequest, so I'm not sure whether what I did is right or wrong.

I look forward to your reply.

Thanks

On Wednesday, April 12, 2017 at 5:51:12 AM UTC+8, Jason Lowe-Power wrote:

Jason Lowe-Power

Apr 12, 2017, 10:08:29 AM

to shijianliu, gem5-gpu Developers List, shasha...@gmail.com, haoba...@gmail.com

Sorry. Yes, everything you described is correct. You can do it without modifying the sequencers. It's been a while since I tried to do something like this :).

Cheers,
Jason

venk...@umn.edu

Aug 1, 2017, 7:00:02 PM

to gem5-gpu Developers List, gao19...@163.com, shasha...@gmail.com, haoba...@gmail.com

Hi,

I want to bypass the L1 cache for the CPU, and I am using the MESI_Two_Level protocol with the Ruby/SLICC memory system. I am trying to make changes in MESI_Two_Level-L1Cache.sm. Can it be done by changing the Load and Store event transitions alone, or am I missing something else that I need to do?

Thanks,

Hari

Jason Lowe-Power

Aug 2, 2017, 11:53:24 AM

to venk...@umn.edu, gem5-gpu Developers List, gao19...@163.com, shasha...@gmail.com, haoba...@gmail.com

Hi Hari,

You can probably make the changes in MESI_Two_Level-L1cache.sm, but you'll probably have to add new transitions, new events, and new actions. I think what you'll want to do is add a new parameter (e.g., "is_gpu"). Then, in the in_port for the mandatory queue, if the controller type "is_gpu", then trigger a "bypass" event. Finally, you will need to add the appropriate transitions and actions to support bypassing.
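The event selection step in the mandatory-queue in_port can be sketched in Python as follows. This is only an illustration of the control flow, not SLICC code. The `is_gpu` parameter follows the outline above, while the request type strings and all event names are assumptions made for this example.

```python
# Illustrative event names for ordinary requests.
NORMAL_EVENTS = {"LD": "Load", "ST": "Store", "IFETCH": "Ifetch"}
# Hypothetical bypass events; each needs its own transitions and actions.
BYPASS_EVENTS = {"LD": "Bypass_Load", "ST": "Bypass_Store"}


def select_event(request_type, is_gpu):
    """Pick the event to trigger for a request from the mandatory queue."""
    if is_gpu and request_type in BYPASS_EVENTS:
        return BYPASS_EVENTS[request_type]
    # Everything else keeps the protocol's normal behaviour.
    return NORMAL_EVENTS[request_type]


print(select_event("LD", is_gpu=True))
print(select_event("LD", is_gpu=False))
```

Keeping the decision in one place means the rest of the controller only has to handle the new bypass events, and controllers created without the parameter set behave exactly as before.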

I hope this outline helps.

Jason

Harigem5gpu

Aug 16, 2017, 4:28:08 AM

to gem5-gpu Developers List, venk...@umn.edu, gao19...@163.com, shasha...@gmail.com, haoba...@gmail.com

Hi Jason,

Thanks for your reply. I made some changes, and I see all the transitions happening as I require, but at some point the program aborts with "panic: Tried to execute unmapped address 0.@ tick 2535500". I am not able to find the bug, as my ProtocolTrace and RubySlicc debug flags show the expected transitions right up until the program aborts. I added my own coherence request, actions, and states for the above.

Actually, my requirement is to bypass the L1 cache for all data accesses and send them to the L2; no data may be cached in the L1.

Is there a simpler way to do this?

Thank you,

Hari

Jason Lowe-Power

Aug 19, 2017, 9:38:21 PM

to Harigem5gpu, gem5-gpu Developers List, venk...@umn.edu, gao19...@163.com, shasha...@gmail.com, haoba...@gmail.com

That looks like you're getting the wrong data. This is surprising if you have access backing store on.

Jason

PS: sent from my phone. I'll have better access to email after Sept. 4th.

Jason Lowe-Power

Mar 28, 2018, 4:18:53 PM

to ichenh...@gmail.com, gem5-gpu Developers List

On Wed, Mar 28, 2018, 4:15 PM <ichenh...@gmail.com> wrote:

> Hi, Jason
>
> I have some questions about running gem5-gpu.
> 1) '--caches --l2cache' will use the classic coherence model, and '--ruby' will use Ruby. Which will be used when neither flag appears on the command line?

Using se/fs.py? Then no caches at all.

> 2) When I build gem5-gpu with MESI_Two_Level and the command line does not contain '--ruby', which coherence model will be used?

The coherence protocol is whatever you compiled. If you run without --ruby then there are no caches.

> 3) Does '--split' mean the CPU and GPU share one physical DRAM that is divided into two regions, one of which is allocated to the GPU, like NUMA?

No. See the implementation of cudamalloc in the libcuda implementation. It just allocates memory.

> 4) Does '--access-host-pagetable' mean the CPU and GPU share one physical DRAM supporting unified virtual access, like an APU's hUMA?

Yes.

> Chen

ichenh...@gmail.com

Apr 5, 2018, 12:32:17 AM

to gem5-gpu Developers List

Hi, Jason. Thank you for your reply.

Actually, I use se_fusion.py or fs_fusion.py.

I don't think I fully understand your second answer. Do you mean I must use --ruby when using the protocols supported by Ruby?

Matteo M. Fusi

Apr 11, 2018, 9:20:39 AM

to gem5-gpu Developers List

If you use the default versions of fs_fusion.py or se_fusion.py, then the Ruby model is used by default. If you look at the code of either script, you'll find the assignment options.ruby = True. This line forces the system to be built using Ruby.
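The effect of such an assignment can be shown with a small stand-alone Python example. It uses argparse purely for illustration; the actual fusion scripts may build their option parser differently, and only the `options.ruby = True` assignment is taken from the description above.

```python
import argparse

parser = argparse.ArgumentParser()
# Illustrative flag that is off unless given on the command line.
parser.add_argument("--ruby", action="store_true")

# Simulate a command line that does not contain --ruby.
options = parser.parse_args([])
print(options.ruby)

# Assigning after parsing overrides whatever the command line said.
options.ruby = True
print(options.ruby)
```

Because the assignment happens after parsing, passing or omitting the flag on the command line makes no difference to a script that contains it.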

Best regards,

Matteo
