Caches

Lukasz G. Szafaryn

Mar 30, 2010, 8:48:56 PM

to MV5sim

I am trying to understand how the cache coherence object works in MV5. If
I have a system with coherent L1 and L2, when and where (from what other
object) is the coherence object called? What objects are
contacted as it checks other L1s? Is it the directory? What file is it
located in? I know that caches in MV5 are based on the BaseCache object.
However, there is no reference to the coherence object there, nor
anywhere in the interconnect objects.

Also, if I decide to add an L2 to each core (we can assume that it's
private), can I just connect it between the L1 and the interconnect? I was
not able to do it. The functions that connect the CPU to the L1 caches and the L1
caches to the network are not port-to-port, and it seems like they would
not accept another level of cache there. Do I have to change the
interface in the CPU object?

Jiayuan Meng

Mar 31, 2010, 2:58:43 AM

to mv5...@googlegroups.com

On Tue, Mar 30, 2010 at 8:48 PM, Lukasz G. Szafaryn <lukasz.g...@gmail.com> wrote:

> I am trying to understand how the cache coherence object works in MV5. If
> I have a system with coherent L1 and L2, when and where (from what other
> object) is the coherence object called?

In mem/cache/dirbased/cache_impl.hh, when a cache receives a cpu-side request (either CPU towards L1, or L1 towards L2), it calls timingAccess(). The coherence protocol is then invoked through protocol->fromUpStream(), which later calls blkState->triggerCpuSide(), which in turn calls protocol-specific functions such as MSIBlkState::Shared_Read() (in mem/cache/dirbased/protocol/msi/blk_state.hh).

When the cache receives a memory-side packet (L2->L1, or mem->L2), it calls handleResponse(), which then calls protocol->fromDownStream(), which then calls blkState->triggerMemSide().
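
To make the two call chains above easier to follow, here is a minimal Python sketch. It is not MV5 code (MV5 implements this in C++). The method names mirror the ones mentioned in this reply, but the single-block cache, the string events, the Invalid_Fill handler, and the return values are simplified assumptions for illustration only.

```python
# Toy model of the call chain; NOT the MV5 implementation.

class MSIBlkState:
    """State of one block, with one handler per (state, event) pair."""

    def __init__(self):
        self.state = "Invalid"

    def trigger_cpu_side(self, event):
        # Pick the handler from the current state and the incoming event,
        # e.g. state "Shared" + event "Read" -> Shared_Read().
        return getattr(self, f"{self.state}_{event}")()

    def trigger_mem_side(self, event):
        return getattr(self, f"{self.state}_{event}")()

    def Invalid_Read(self):
        return "miss: forward read downstream"

    def Invalid_Fill(self):  # hypothetical handler name
        self.state = "Shared"
        return "fill: block now Shared"

    def Shared_Read(self):
        return "hit"


class Coherence:
    """Stands in for the 'protocol' member of the cache."""

    def __init__(self, blk_state):
        self.blk_state = blk_state

    def from_up_stream(self, event):
        return self.blk_state.trigger_cpu_side(event)

    def from_down_stream(self, event):
        return self.blk_state.trigger_mem_side(event)


class Cache:
    def __init__(self):
        self.protocol = Coherence(MSIBlkState())

    def timing_access(self, event):
        # cpu-side request (CPU -> L1, or L1 -> L2)
        return self.protocol.from_up_stream(event)

    def handle_response(self, event):
        # memory-side packet (L2 -> L1, or mem -> L2)
        return self.protocol.from_down_stream(event)


cache = Cache()
print(cache.timing_access("Read"))    # request enters on the cpu side
print(cache.handle_response("Fill"))  # data returns on the memory side
print(cache.timing_access("Read"))    # same block is now present
```

The point of the sketch is only the layering: the cache never inspects protocol states itself, it hands every event to the protocol object, which dispatches on the block's current state.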

> What objects are
> contacted as it checks other L1s? Is it the directory?

Yes. The directory that keeps all L1 information is located in the centralized L2.

> What file is it
> located in? I know that caches in MV5 are based on the BaseCache object.
> However, there is no reference to the coherence object there, nor
> anywhere in the interconnect objects.

You will find in mem/cache/dirbased/cache.hh that the directory-based coherent cache takes two template classes: BlkState and DirState. Together they define a coherence protocol. The directories are hosted by the DirState. For example, a MESI coherence protocol among L1s is constructed by setting the BlkState of each L1 to MESIBlkState and the DirState of the L2 to MESIDirState. If the L2 connects to memory, its BlkState is BaseBlkState (no coherence). If the L2 connects to an L3 and needs MSI coherence, its BlkState is set to MSIBlkState and the L3's DirState is set to MSIDirState.
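
The pairing rule described above can be written down as a small Python sketch. This is only an illustration of the rule, not MV5 code: the function name assign_states is made up, and using None for the DirState of the top level (which has no caches above it to track) is an assumption, since the reply does not say what the L1's DirState is.

```python
def assign_states(levels, protocols):
    """Pair each cache level with a (BlkState, DirState) tuple.

    levels:    cache levels ordered from the CPU outwards, e.g. ["L1", "L2"].
    protocols: protocols[i] is the protocol used between levels[i] and
               levels[i + 1], e.g. ["MESI"].
    """
    if len(protocols) != len(levels) - 1:
        raise ValueError("need exactly one protocol per adjacent pair of levels")

    result = {}
    for i, level in enumerate(levels):
        # BlkState follows the protocol spoken with the next level down.
        # The last level talks to memory, so it needs no coherence.
        if i < len(protocols):
            blk = f"{protocols[i]}BlkState"
        else:
            blk = "BaseBlkState"
        # DirState tracks the caches one level up. The top level has
        # nothing above it; None here is an assumption of this sketch.
        dir_state = f"{protocols[i - 1]}DirState" if i > 0 else None
        result[level] = (blk, dir_state)
    return result


# Two levels, MESI among the L1s, L2 connected to memory.
print(assign_states(["L1", "L2"], ["MESI"]))

# Three levels, MESI between L1 and L2, MSI between L2 and L3.
print(assign_states(["L1", "L2", "L3"], ["MESI", "MSI"]))
```

In the three-level case the L2 ends up with MSIBlkState and MESIDirState at the same time, which is the situation the last sentence of the reply describes.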

> Also, if I decide to add an L2 to each core (we can assume that it's
> private), can I just connect it between the L1 and the interconnect?

Yes.

> I was
> not able to do it. The functions that connect the CPU to the L1 caches and the L1
> caches to the network are not port-to-port, and it seems like they would
> not accept another level of cache there. Do I have to change the
> interface in the CPU object?

I don't think so. You can connect the CPU to the L1 caches the old way. I think you can also connect the L1's memory-side port to the L2's cpu-side port, and they should be able to interact. What makes you think it doesn't work?

Jiayuan

Lukasz G. Szafaryn

Apr 13, 2010, 7:25:19 PM

to MV5sim

Where (in a configuration file?) do you specify what the BlkState and
DirState are for the different levels of caches?

I saw that in your 3-level cache configuration file, you defined new
objects (L2wk, L2ctrl) and you have them working as parts of the System
object (System.L2wk, System.L2ctrl). I know how you defined these
objects, but how did you modify the System object to work with them (if
you even modified it)? I believe that I need to create a new object for
a distributed cache slice, say L2slice. How do I modify the System object
so that it takes L2slices as a new component?


Lukasz G. Szafaryn

Apr 13, 2010, 7:43:23 PM

to MV5sim

Is MV5 using the updated MOESI protocol that was added in version
2.0b4 of M5? What coherence features did you add in MV5 beyond that
(directory coherence?)?


Lukasz G. Szafaryn

Apr 13, 2010, 9:37:50 PM

to MV5sim

I was able to follow how CPU-L1 coherence requests go through
several files based on what you described. However, I still can't
figure out the general hierarchy of the cache files (especially after
looking at how the files #include one another in a circular fashion). Is
there any order in which they call each other? At least, what is the
top-level file? Maybe I could go from there if I knew that. Maybe you
can answer this when you come in, or help me make a diagram of it.


Jiayuan Meng

Apr 14, 2010, 1:25:53 AM

to mv5...@googlegroups.com

> Where (in a configuration file?) do you specify what the BlkState and
> DirState are for the different levels of caches?

In configs/fractal/frCommons.py, you will see something like:

    if options.protocol == "msi":
        block_protocol = 1  # MSI

That's how I define blkState and dirState.

> I saw that in your 3-level cache configuration file, you defined new
> objects (L2wk, L2ctrl) and you have them working as parts of the System
> object (System.L2wk, System.L2ctrl). I know how you defined these
> objects, but how did you modify the System object to work with them (if
> you even modified it)? I believe that I need to create a new object for
> a distributed cache slice, say L2slice. How do I modify the System object
> so that it takes L2slices as a new component?

I didn't have to modify the System object. I only introduced new objects and included them in the system. Say you create an object named "L2slice"; you can just include it in the system as:

system.l2slice = L2slice()


Jiayuan Meng

Apr 14, 2010, 1:27:10 AM

to mv5...@googlegroups.com

> Is MV5 using the updated MOESI protocol that was added in version
> 2.0b4 of M5? What coherence features did you add in MV5 beyond that
> (directory coherence?)?

No, MV5 does not use the coherence protocol in M5, because I believe those were snoopy protocols without directories.

Jiayuan Meng

Apr 14, 2010, 1:37:22 AM

to mv5...@googlegroups.com

Sure.

In src/mem/cache/dirbased/:

  cache_impl.hh: Cache::timingAccess()
    calls
  protocol/base_coherence_impl.hh: Coherence::fromUpStream()
    calls
  protocol/base_state.hh: BaseBlkState::triggerCpuSide()
    calls
  protocol/mesi (depends on the protocol): BlkState::Shared_ReadEx(...) (depends on the current state and the message)

Jiayuan

Lukasz G. Szafaryn

Apr 26, 2010, 6:23:06 PM

to MV5sim

I think I connected L2 slices properly. In each core I have
connections such as: CPU-L1-bus-L2slice. Each such core connects to a
crossbar or mesh. Would you confirm that in this setup I have
coherence between L1-L2slice at each core and L2slice(s) from
different cores? Does your directory coherence work with this setup
the way I connected it?

If the above works, I just need to implement data interleaving
properly. Where (in what files) would I specify data interleaving? I
may not be aware of everything that is involved in implementing a distributed L2;
is there something else? I thought there must be, since you were not
able to implement a fully distributed cache in limited time. What was
the reason?

Are the directories in your implementation distributed or centralized?
Where are they located conceptually (in each L1 I-cache and D-cache, somewhere
else for each core, or in some central place for all cores) and
physically (what files correspond to these)?


Jiayuan Meng

Apr 27, 2010, 12:14:03 PM

to mv5...@googlegroups.com

> I think I connected L2 slices properly. In each core I have
> connections such as: CPU-L1-bus-L2slice. Each such core connects to a
> crossbar or mesh. Would you confirm that in this setup I have
> coherence between L1-L2slice at each core and L2slice(s) from
> different cores? Does your directory coherence work with this setup
> the way I connected it?

Yes, assuming that by L2slice you mean a per-core L2 cache.

> If the above works, I just need to implement data interleaving
> properly. Where (in what files) would I specify data interleaving? I
> may not be aware of everything that is involved in implementing a distributed L2;
> is there something else? I thought there must be, since you were not
> able to implement a fully distributed cache in limited time. What was
> the reason?

I'm not sure what you mean by data interleaving. I didn't implement a fully distributed cache because of limited time.

> Are the directories in your implementation distributed or centralized?
> Where are they located conceptually (in each L1 I-cache and D-cache, somewhere
> else for each core, or in some central place for all cores) and
> physically (what files correspond to these)?

The directories are centralized, and they are located at the L2 for coherence among the L1 caches. Physically, they are located in src/mem/cache/dirbased/protocol/blkstate.cc and dirstate.cc.
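
As a rough picture of what a centralized directory at the L2 does, here is a minimal Python sketch. It is not taken from the MV5 sources: the Directory class, its method names, and the use of integer L1 ids are all assumptions made for illustration, and it ignores protocol states entirely.

```python
from collections import defaultdict


class Directory:
    """Toy centralized directory: per block address, which L1s hold a copy."""

    def __init__(self):
        # block address -> set of L1 ids currently sharing that block
        self.sharers = defaultdict(set)

    def read(self, l1_id, addr):
        """Record that an L1 fetched a copy of the block."""
        self.sharers[addr].add(l1_id)

    def write(self, l1_id, addr):
        """Make l1_id the only holder; return the L1s that must be invalidated."""
        to_invalidate = self.sharers[addr] - {l1_id}
        self.sharers[addr] = {l1_id}
        return sorted(to_invalidate)


directory = Directory()
directory.read(0, 0x40)
directory.read(1, 0x40)
# L1 number 2 writes the block, so the copies in L1s 0 and 1 must go away.
print(directory.write(2, 0x40))
```

Because every L1 request for a block passes through the one L2, the L2 is the natural place to keep this bookkeeping, which is why the directory lives there in the centralized design.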

Jiayuan

Lukasz G. Szafaryn

May 11, 2010, 12:47:42 PM

to MV5sim

You mentioned that after the initial discovery stage, each cache builds a
table with the addresses of the other caches. What file/function is that table
located in?

Also, you said that there is code (probably in lru.cc or lru.hh) that
determines what range of addresses each cache imports. What file/
function is it located in?

Jiayuan Meng

May 11, 2010, 1:10:25 PM

to mv5...@googlegroups.com

Hi Lukasz:

The following is related to constructing the port map:
in src/mem/cache/dirbased/base_dircache.cc
void
BaseDirCohCache::DirCohCachePort::recvStatusChange(Port::Status status) (the part of it with srcInfo)

The data structure that stores the map is
BaseDirCohCache::portMap

More information about the port map: src/mem/cache/dirbased/portmap.hh|cc

---------------------------------------------

The getDeviceAddressRanges function is related to setting the address range for each port. I haven't used it, so I'm not sure exactly how it works, but it might be relevant to your work:

in src/mem/cache/dirbased/base_dircache.cc
void
BaseDirCohCache::DirCohCachePort::recvStatusChange(Port::Status status) (the part of it with RangeChange)

in src/mem/cache/dirbased/cache_impl.hh
template <class TagStore, class BlkCoherence, class DirCoherence>
void
DirCohCache<TagStore, BlkCoherence, DirCoherence>::CpuSidePort::
getDeviceAddressRanges

You might ask M5's mailing list how to use the address ranges. My guess is that it works like this (but I don't know how it actually looks in the code):

1. At the initial stage, each port has an address range.
2. When a packet is sent to the interconnect, the interconnect figures out which port is responsible for the packet's address, given the address ranges of each port.
3. The interconnect sends the packet to the appropriate port.
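
The three guessed steps above could look roughly like this in Python. This is a sketch of the guess only, not of how M5 or MV5 actually route packets: the Interconnect class, its method names, and the half-open [start, end) ranges are all assumptions.

```python
class Interconnect:
    """Toy interconnect that routes by address range (illustrative only)."""

    def __init__(self):
        # Step 1: each port registers the address range it is responsible for.
        self.ranges = []  # list of (start, end_exclusive, port_name)

    def add_port(self, port_name, start, end):
        if start >= end:
            raise ValueError("empty address range")
        self.ranges.append((start, end, port_name))

    def route(self, addr):
        # Step 2: find the port whose range contains the packet's address.
        for start, end, port_name in self.ranges:
            if start <= addr < end:
                # Step 3: the packet would be sent to this port.
                return port_name
        raise LookupError(f"no port is responsible for address {addr:#x}")


bus = Interconnect()
bus.add_port("l2slice0", 0x0000, 0x8000)
bus.add_port("l2slice1", 0x8000, 0x10000)
print(bus.route(0x1234))
print(bus.route(0x9000))
```

With contiguous ranges like these, each slice owns one large region. Finer-grained interleaving would need a different mapping from address to port, which this sketch does not attempt.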

Hope that helps,

Jiayuan

Lukasz G. Szafaryn

Apr 20, 2011, 6:38:51 PM

to mv5...@googlegroups.com

Jiayuan,

I am trying to implement a directory-based coherent distributed L2 cache. In that model, each distributed L2 cache slice keeps its own directory (a subscription list of which L1s hold its data), so that it can invalidate them later. So far, I have constructed a mechanism that forwards each request to the appropriate distributed cache slice. It currently resides in the bus, because that is where all other redirecting takes place. Now, I want to leverage some of your code to build the directory in each distributed L2 slice.

I have questions about your model:

1) What logical module was the directory located in? Was it the L2 cache?

2) Where in the code (what file and function) is the directory included?

Thanks in advance,

Lukasz

Jiayuan Meng

Apr 20, 2011, 8:56:42 PM

to mv5...@googlegroups.com

Hi Lukasz,

The coherence structures are specified by the "protocol" variable in a DirCohCache. The protocol basically is the directory structure. It is located in src/mem/cache/dirbased/protocol/mesi

You can refer to the ISPASS tutorial in the MV5 website on how to configure directory-based caches.

You are right that the default CPUs connect L1s directly to interconnects with a python function call. But there is no such rule. You can add your own function to the python BaseCPU class so that the L1s connect to an L2. There is no need to modify the CPU's C++ class though.
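
To illustrate the idea of adding your own connection function on the Python side, here is a self-contained sketch with stand-in classes. Nothing in it is the real M5/MV5 API: Port, Cache, Bus, and BaseCPU below are toy classes, and add_private_l1_and_l2 is a hypothetical method name. The CPU-L1-bus-L2 layout follows the setup described earlier in this thread.

```python
class Port:
    """Toy port; connecting two ports makes them peers of each other."""

    def __init__(self, owner, name):
        self.owner, self.name, self.peer = owner, name, None

    def connect(self, other):
        self.peer, other.peer = other, self


class Cache:
    def __init__(self, name):
        self.name = name
        self.cpu_side = Port(self, "cpu_side")
        self.mem_side = Port(self, "mem_side")


class Bus:
    def __init__(self, name):
        self.name = name
        self.ports = []

    def new_port(self):
        port = Port(self, f"port{len(self.ports)}")
        self.ports.append(port)
        return port


class BaseCPU:
    def __init__(self):
        self.icache_port = Port(self, "icache_port")
        self.dcache_port = Port(self, "dcache_port")

    def add_private_l1_and_l2(self, icache, dcache, l2, interconnect):
        """Hypothetical helper: CPU -> L1s -> local bus -> private L2 -> interconnect."""
        self.icache_port.connect(icache.cpu_side)
        self.dcache_port.connect(dcache.cpu_side)
        # Both L1s reach the private L2 through a small per-core bus.
        self.local_bus = Bus("local_bus")
        icache.mem_side.connect(self.local_bus.new_port())
        dcache.mem_side.connect(self.local_bus.new_port())
        self.local_bus.new_port().connect(l2.cpu_side)
        # Only the L2's memory side faces the shared interconnect.
        l2.mem_side.connect(interconnect.new_port())


cpu = BaseCPU()
mesh = Bus("mesh")
l2 = Cache("l2")
cpu.add_private_l1_and_l2(Cache("icache"), Cache("dcache"), l2, mesh)
print(l2.mem_side.peer.owner.name)  # the L2 now faces the shared interconnect
```

The CPU class itself only gains one Python method here, which matches the point above that the C++ side does not need to change.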

Jiayuan

Lukasz G. Szafaryn

Apr 27, 2011, 11:48:15 PM

to mv5...@googlegroups.com

Hey Jiayuan,

As you know, I have been working with M5 due to compatibility issues. I will try to move the entire directory-based cache from MV5 to M5. Have you tried to do that? What issues would you anticipate?

First, I tried to run directory-based cache in MV5 to see how it works. There is a class BaseDirCohCache, but I do not see you using it in any of the python configuration files. How do you use it?

Lukasz


Jiayuan Meng

Apr 28, 2011, 10:20:54 AM

to mv5...@googlegroups.com

Hi Lukasz,

I haven't tried that yet. To use coherent caches, you use the BaseCache class defined in BaseCache.py and set "directory_based" to true. cache_builder.cc will take that parameter and create a DirCohCache class using the specified coherence protocol. See the definition of the L1 and L2 caches in configs/fractal/frCommon.py for an example.

Jiayuan

Lukasz G. Szafaryn

Oct 12, 2011, 2:07:11 PM

to mv5...@googlegroups.com

Hey Jiayuan,

Since I can’t get your directory-based cache working with current M5, I am considering using MV5 and porting my distributed cache to it. Generally, and in terms of cache, what does MV5 not have compared to M5? I am trying to get a picture of the tradeoffs before I switch.

Jiayuan Meng

Oct 13, 2011, 12:36:00 PM

to mv5...@googlegroups.com

Some uncacheable packets for full-system simulations are not handled in MV5 caches. Before you switch, you can also take a look at the tutorial slides to see what the limitations of MV5 are. The tutorial slides can be downloaded from the tutorial page of https://sites.google.com/site/mv5sim/.
