Ok, the best way to understand this is if I go pretty in depth. So we
know from the bullet points that the cache is direct-mapped, which
means each set in the cache only contains 1 block. Each block is 8
bytes and the total size of the cache is 16 bytes, so that means that
there are 2 sets in the cache, each with one 8 byte block, to make up
the total size of 16 bytes.
When I did the hw problem (6.23) I made a chart like we had to for 6.21.
m (number of bits in the address): you can make this up depending on
how many sets and blocks you have, for this problem let's go with 8
C (cache size): this is given to us, its 16 bytes
B (block size): this is also given to us, its 8 bytes
E (number of lines per set): because we know this cache is direct mapped, this means there is 1 line per set
S (number of sets): 2 since we have a 16 byte cache and each set has 1 block of 8 bytes
A handy thing to remember is that C = B x E x S so you can find one from the others
t (number of tag index bits): the number of tag bits is just the bits left over after the set and offset bits, so we have 8 - (3 + 1) = 4 tag bits
s (number of set index bits): since we only have 2 sets, they will be labeled as set 0 and set 1, so we only need 1 bit
b (number of block offset bits): since each block only holds 8 bytes, we will need offset values for 0-7, so we'll need 3 bits
What I did next was looked at the addresses for the rows of each array in memory. We know that src starts at 0 and dst starts at 16. Each array is made up of 2 rows, with 2 elements in each row for a total of 16 bytes. So this is what it looks like in memory
0 : 00000000 : src[0][0]
4 : 00000100
8 : 00001000 : src[1][0]
12 : 00001100
16 : 00010000 : dst[0][0]
20 : 00010100
24 : 00011000 : dst[1][0]
28 : 00011100
Next we know that the order of the bits in each address looks like
t bits : s bits : b bits
Let start on the loop:
First we will need row 0 from src, if we look at the 4th bit, its a 0, so it will be placed in set 0 in the cache, a miss. Next we need row 0 from dst, the 4th bit from that address is also 0 so row 0 of src in set 0 will be overwritten with dst row 0, a miss. Next we need row 0 from src again and we already know it belongs in set 0 and since dst row 0 is now there, it will be overwritten with row 0 from src, another miss. Next we need row 1 from dst and the 4th bit from that one is a 1, so it belongs in set 1 and since set 1 is still empty since nothing has been put in it yet, dst row 1 will be put there and its another miss.
Now we need row 1 from src and we see that the 4th bit there is a 1 so it belongs in set 1, but dst row 1 is already in set 1, so it will be overwritten with src row 1 and it will be a miss. Next we need dst row 0 which we know belongs in set 0 and src row 0 is still in set 0, so it will be overwritten with dst row 0 and we have another miss. Now we need to get src row 1, which belongs in set 1, we check set 1 and see that it is still there since we have not accessed anything that needed to be placed in set 1 since the last time we needed src row 1, so now we finally have a
hit. Last, we need dst row 1, which we know belongs in set 1, so it will overwrite src row 1 that is there and we have our last and final miss.
I hope that helps, it took me a while to figure it all out, but when I did all of that for 6.23 it made alot of sense. Let me know if you have any questions
Addison