Cache miss rates


Yatish Turakhia

Jul 5, 2012, 4:40:40 PM
to snip...@googlegroups.com
Hi

I tried running FFT simulation with large input size on sniper and I got the following output:

                              | Core 0    
  Instructions                |  2420169315
  Cycles                      |  1699774353
  Time                        |   639012915
Branch predictor stats        |           
  num correct                 |   190323630
  num incorrect               |    11118704
  misprediction rate          |       5.52%
  mpki                        |        4.59
Cache Summary                 |           
  Cache L1-I                  |           
    num cache accesses        |  2562792676
    num cache misses          |       18545
    miss rate                 |       0.00%
    mpki                      |        0.01
  Cache L1-D                  |           
    num cache accesses        |   594631405
    num cache misses          |    43991144
    miss rate                 |       7.40%
    mpki                      |       18.18
  Cache L2                    |           
    num cache accesses        |    44009689
    num cache misses          |    23507894
    miss rate                 |      53.42%
    mpki                      |        9.71
  Cache L3                    |           
    num cache accesses        |    14723638
    num cache misses          |    14696137
    miss rate                 |      99.81%
    mpki                      |        6.07
DRAM summary                  |           
  num dram accesses           |    19991597
  average dram access latency |       70.30
  average dram queueing delay |       16.87


I was surprised to see such high miss rates for the L2 and L3 caches. The values are even higher for small input sets. Is there an option for warming up the caches in Sniper?

Wim Heirman

Jul 9, 2012, 3:46:27 AM
to snip...@googlegroups.com
Hi Yatish,

What is the input set size you used here? Is this the application in
sniper/test/fft, or your own FFT application? Does it have ROI markers
(SimRoiStart/SimRoiEnd), and did you specify the --roi command-line
parameter?

By default the caches are warmed up during the period before the
region of interest, so usually, if the data is initialized there,
parts of it will remain in the cache. Warmup can be disabled but this
has to be done explicitly by adding the --no-cache-warming parameter
to the run-sniper command line. If you want to warm up the caches
further, for instance by doing a complete iteration of FFT before
starting a second, timed iteration, you can move the SimRoiStart
marker in the code.

Regards,
Wim
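
The effect of warming can be illustrated with a toy model. The sketch below is not Sniper code: it assumes a single fully associative LRU cache and a hypothetical working set that fits in it, and it counts misses only inside the "region of interest", after zero or one warm-up passes over the same data.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative LRU cache holding `capacity` lines."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.accesses = 0
        self.misses = 0

    def access(self, addr):
        """Touch one line; return True on a hit, False on a miss."""
        self.accesses += 1
        if addr in self.lines:
            self.lines.move_to_end(addr)  # mark as most recently used
            return True
        self.misses += 1
        self.lines[addr] = True
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict least recently used
        return False

    def reset_stats(self):
        """Clear counters but keep cache contents (start of the ROI)."""
        self.accesses = 0
        self.misses = 0


def roi_misses(warmup_passes, working_set=64, capacity=128):
    """Misses counted inside the ROI after `warmup_passes` untimed passes."""
    cache = LRUCache(capacity)
    trace = list(range(working_set))  # hypothetical access pattern
    for _ in range(warmup_passes):
        for addr in trace:
            cache.access(addr)
    cache.reset_stats()  # statistics are only collected inside the ROI
    for addr in trace:
        cache.access(addr)
    return cache.misses


cold = roi_misses(warmup_passes=0)
warm = roi_misses(warmup_passes=1)
print("ROI misses without warm-up:", cold)
print("ROI misses with one warm-up pass:", warm)
```

In this model the warm-up pass removes all cold misses because the working set fits in the cache; with a working set larger than the cache, warming would help much less.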
> --
> --
> You received this message because you are subscribed to the Google
> Groups "Sniper simulator" group.
> To post to this group, send email to snip...@googlegroups.com
> To unsubscribe from this group, send email to
> snipersim+...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/snipersim?hl=en

Yatish Turakhia

Jul 9, 2012, 10:38:15 PM
to snip...@googlegroups.com
Hi,

I was using Splash2-FFT with the small input set (it already has the
ROI markers). The miss rates are high even for the large input set. The
L1 miss rates are low (<0.1%), but can you still explain why the L2 miss
rates are so high?

Thanks!

-Yatish

Siddharth Garg

Jul 9, 2012, 10:42:47 PM
to snip...@googlegroups.com
Hi Yatish

I think it's because the L1 cache is in effect working so well. The
only time there is a miss in the L1 cache is when a data item has not
been seen before, which is therefore a miss in the L2 as well. Once the
data is in the L1 it's used repeatedly, adding to the denominator of
the L1 cache miss rate but not that of the L2.

My prediction is that if you make the L1 very small, its miss rate will
increase significantly but that of the L2 will go down.
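
A small experiment can make this filtering effect concrete. The following is only a sketch under simple assumptions (fully associative LRU caches, a made-up trace that sweeps 32 lines ten times, and no relation to Sniper's actual cache model). It compares an L1 large enough to hold the working set with one that is too small.

```python
from collections import OrderedDict


def simulate(l1_lines, l2_lines, trace):
    """Return (l1_miss_rate, l2_miss_rate) for a toy two-level LRU hierarchy."""
    l1, l2 = OrderedDict(), OrderedDict()
    l1_acc = l1_miss = l2_acc = l2_miss = 0

    def touch(cache, capacity, addr):
        """Access `addr` in `cache`; return True on a hit."""
        hit = addr in cache
        if hit:
            cache.move_to_end(addr)
        else:
            cache[addr] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
        return hit

    for addr in trace:
        l1_acc += 1
        if not touch(l1, l1_lines, addr):
            l1_miss += 1
            l2_acc += 1  # the L2 only sees what the L1 misses
            if not touch(l2, l2_lines, addr):
                l2_miss += 1
    return l1_miss / l1_acc, l2_miss / l2_acc


trace = list(range(32)) * 10  # 32 distinct lines, swept 10 times

big_l1 = simulate(l1_lines=64, l2_lines=256, trace=trace)
small_l1 = simulate(l1_lines=16, l2_lines=256, trace=trace)

print("big L1:   L1 miss rate %.2f, L2 miss rate %.2f" % big_l1)
print("small L1: L1 miss rate %.2f, L2 miss rate %.2f" % small_l1)
```

With the large L1, the L2 is accessed only for first-touch (cold) misses, so every L2 access misses. With the small L1, the repeated sweeps reach the L2, which then serves most of them as hits.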

Wim Heirman

Jul 10, 2012, 5:22:42 AM
to snip...@googlegroups.com
Yes, that sounds logical. If the L1 is big enough that it never
evicts anything, the L2 will only see cold misses but barely any
repeat loads of the same data (hence very few hits). You're better
off looking at MPKI (misses per 1000 instructions), as this at least
indicates the performance impact, which a % miss rate can't do.

-Wim
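
To make the difference between the two metrics concrete, here is a short sketch that recomputes them from the counts in the sim.out posted at the top of the thread. The helper names are made up for this illustration.

```python
def miss_rate(misses, accesses):
    """Miss rate in percent: misses relative to accesses at this cache level."""
    return 100.0 * misses / accesses


def mpki(misses, instructions):
    """Misses per 1000 instructions: misses relative to the work done."""
    return 1000.0 * misses / instructions


# Counts taken from the FFT sim.out posted earlier in this thread.
instructions = 2420169315
l2_accesses, l2_misses = 44009689, 23507894
l3_accesses, l3_misses = 14723638, 14696137

l2_rate = miss_rate(l2_misses, l2_accesses)
l2_mpki = mpki(l2_misses, instructions)
l3_rate = miss_rate(l3_misses, l3_accesses)
l3_mpki = mpki(l3_misses, instructions)

print("L2: %.2f%% miss rate, %.2f MPKI" % (l2_rate, l2_mpki))
print("L3: %.2f%% miss rate, %.2f MPKI" % (l3_rate, l3_mpki))
```

The L3 miss rate is close to 100% only because its denominator (L3 accesses) is small; relative to the instruction count, the L3 misses are about 6 per 1000 instructions.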

Wafa Benboubaker

Jul 10, 2012, 4:56:01 PM
to snip...@googlegroups.com
Hi,
How do I make the L1 bigger? What is the command line?

Trevor Carlson

Jul 11, 2012, 8:56:58 AM
to snip...@googlegroups.com
Wafa,

    If you take a look at Chapter 5 in the Sniper manual [1], you'll see the options available for configuring the memory hierarchy in Sniper. You can then edit a configuration file (*.cfg) or use a command-line option to change the settings of the memory hierarchy. See Chapter 4 in the manual [1] for more details on that.

Please ask if you have any other questions,
Trevor

[1] http://snipersim.org/w/Manual


----------------------------------------------
Trevor E. Carlson
Computer Systems Laboratory, Ghent University
Electronics and Information Systems Department
mobile: +32 474 545 987

Wafa Benboubaker

Jul 11, 2012, 10:34:35 AM
to snip...@googlegroups.com
Hi
Thanks for the answer, but when I run my application in Sniper it behaves the same as when I run ./main directly, and the sim.out file is not generated. I want to see the simulation results, and the difference when I configure, for example, 2 or 3 levels of cache. Also, how do I configure a shared or a private cache?
Please answer me.
I'm thankful for your help and I wish you the best.

2012/7/11 Trevor Carlson <trevor....@elis.ugent.be>

Trevor E. Carlson

Jul 11, 2012, 2:50:03 PM
to snip...@googlegroups.com
Wafa,

    The application running inside of Sniper (./run-sniper -- ./main) and outside of Sniper (./main) should perform in the same way.  The main difference is that when running inside of Sniper, we perform the timing simulation of the code to generate the sim.out file.

    To configure the caches, take a look at Chapter 5 - Configuration parameters, Section 5.1.2 - Caches, in the manual [1]. To change the number of cache levels, use the perf_model/cache/levels parameter. To change caches from shared to private, you can use the perf_model/l*_cache/shared_cores=1 parameter. Take a look at the sniper/config/nehalem.cfg file for a more detailed example.

-Trevor

[1] http://snipersim.org/w/Manual
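
As a rough sketch of how such overrides could be assembled into a command line: the helper below is hypothetical, and it assumes that run-sniper accepts per-option overrides in the form `-g --section/key=value`. Check the run-sniper help output and the manual before relying on that flag. The parameter names come from the message above (with l2_cache as one instance of the l*_cache pattern); the values are purely illustrative.

```python
import shlex


def sniper_command(app_args, overrides, run_sniper="./run-sniper"):
    """Build a run-sniper argument list with configuration overrides.

    ASSUMPTION: each override is passed as `-g --section/key=value`.
    Verify this against the run-sniper help text for your Sniper version.
    """
    cmd = [run_sniper]
    for key, value in overrides.items():
        cmd += ["-g", "--%s=%s" % (key, value)]
    cmd += ["--"] + list(app_args)  # everything after "--" is the application
    return cmd


overrides = {
    "perf_model/cache/levels": 2,           # illustrative value
    "perf_model/l2_cache/shared_cores": 1,  # 1 core per cache = private
}
cmd = sniper_command(["./main"], overrides)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run; editing a copy of a .cfg file, as suggested above, is the alternative when many parameters change together.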

Wafa Benboubaker

Jul 11, 2012, 5:27:48 PM
to snip...@googlegroups.com
thanks for answer
but I am still confused: when I run ./run-sniper -- ./main, the sim.out file is not generated, so I cannot see the results after configuring a shared or private cache or changing another parameter.
thanks Trevor

2012/7/11 Trevor E. Carlson <trevor....@elis.ugent.be>

Trevor E. Carlson

Jul 11, 2012, 5:34:11 PM
to snip...@googlegroups.com
Wafa,

    Can you try running the test application to make sure that everything is working correctly? Run 'cd sniper/test/fft' and then run 'make run' to compile and simulate the test program. This should create a sim.out file for you. If not, then there might be a problem somewhere else.

-Trevor

Wafa Benboubaker

Jul 11, 2012, 5:48:46 PM
to snip...@googlegroups.com
Running 'cd sniper/test/fft' and then 'make run' works correctly and generates the sim.out file, but running my own application does not generate it. What should I do?
thanks Trevor

Trevor E. Carlson

Jul 11, 2012, 6:03:28 PM
to snip...@googlegroups.com
Wafa,

    When I run a basic application, I see the following output. Could you try running /bin/true and send the output that you see on the screen? It should display '[SNIPER] Setting instrumentation mode to DETAILED' (see details below). Also, what files are added when you run (if any), and do you see any errors on your screen?

Trevor


Example output:

$ ./run-sniper -- /bin/true
[SNIPER] Start
...
[SNIPER] Enabling performance models
[SNIPER] Setting instrumentation mode to DETAILED
..
$ cat sim.out
                          | Core 0
  Instructions            |     112644
  Cycles                  |     275092
  Time                    |     103418
Branch predictor stats    |
...
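
For comparing runs, output in the format above can also be read programmatically. The parser below is a minimal sketch, not part of Sniper: it assumes a single core column, uses indentation to nest section headings, and the function name and key format are made up for this example.

```python
def parse_sim_out(text):
    """Parse 'name | value' rows into a dict keyed by 'Section/.../name'.

    ASSUMPTION: one value column (single core). Headings are rows with an
    empty value; nesting is inferred from leading indentation.
    """
    stats = {}
    path = []  # stack of (indent, heading) for the enclosing headings
    for line in text.splitlines():
        if "|" not in line:
            continue  # skip log lines and "..." elisions
        name, value = line.split("|", 1)
        indent = len(name) - len(name.lstrip())
        name, value = name.strip(), value.strip()
        if not name:
            continue  # column header row, e.g. "| Core 0"
        while path and path[-1][0] >= indent:
            path.pop()  # leave headings at the same or deeper indentation
        if value == "":
            path.append((indent, name))
            continue
        key = "/".join([heading for _, heading in path] + [name])
        stats[key] = float(value.rstrip("%"))
    return stats


sample = """\
                              | Core 0
  Instructions                |     115398
Cache Summary                 |
  Cache L2                    |
    num cache accesses        |       1924
    num cache misses          |       1819
    miss rate                 |     94.54%
DRAM summary                  |
  num dram accesses           |       1815
"""

stats = parse_sim_out(sample)
for key, value in stats.items():
    print(key, "=", value)
```

All values are returned as floats, and percent signs are dropped, so "miss rate" entries are in percent.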

Wafa Benboubaker

Jul 11, 2012, 6:14:35 PM
to snip...@googlegroups.com
Hi, this is the screen output when I run ./run-sniper -- /bin/true:
wafabenboubaker@wafabenboubaker-3000-N200:~/sniper-3.03$ ./run-sniper -- /bin/true
[SNIPER] Start
Running ['bash', '-c', 'export LD_LIBRARY_PATH="/home/wafabenboubaker/sniper-3.03/pin_kit/ia32/runtime:/home/wafabenboubaker/sniper-3.03/python_kit/lib:$LD_LIBRARY_PATH"; export PYTHONPATH=/home/wafabenboubaker/sniper-3.03/scripts:$PYTHONPATH; /home/wafabenboubaker/sniper-3.03/pin_kit/ia32/bin/pinbin -mt -injection child -xyzzy -enable_vsm 0 -t /home/wafabenboubaker/sniper-3.03/lib/pin_sim -c /home/wafabenboubaker/sniper-3.03/config/base.cfg --general/total_cores=1 --general/output_dir=. --config=/home/wafabenboubaker/sniper-3.03/config/nehalem.cfg --config=/home/wafabenboubaker/sniper-3.03/config/gainestown.cfg -- /bin/true']

[SNIPER] Enabling performance models
[SNIPER] Setting instrumentation mode to DETAILED
[SNIPER] Disabling performance models
[SNIPER] Leaving ROI after 5.75 seconds
[SNIPER] Simulated 0.1M instructions @ 20.1 KIPS (20.1 KIPS / target core - 49771.4ns/instr)
[SNIPER] Setting instrumentation mode to FAST_FORWARD
[SNIPER] End
[SNIPER] Elapsed time: 6.11 seconds
wafabenboubaker@wafabenboubaker-3000-N200:~/sniper-3.03$ cat sim.out
                              | Core 0   
  Instructions                |     115398
  Cycles                      |     260917
  Time                        |      98089
Branch predictor stats        |          
  num correct                 |      16690
  num incorrect               |       1587
  misprediction rate          |      8.68%
  mpki                        |      13.75
Cache Summary                 |          
  Cache L1-I                  |          
    num cache accesses        |     121533
    num cache misses          |        659
    miss rate                 |      0.54%
    mpki                      |       5.71
  Cache L1-D                  |          
    num cache accesses        |      55326
    num cache misses          |       1265
    miss rate                 |      2.29%
    mpki                      |      10.96
  Cache L2                    |          
    num cache accesses        |       1924
    num cache misses          |       1819
    miss rate                 |     94.54%
    mpki                      |      15.76
  Cache L3                    |          
    num cache accesses        |       1815
    num cache misses          |       1815
    miss rate                 |    100.00%
    mpki                      |      15.73
DRAM summary                  |          
  num dram accesses           |       1815
  average dram access latency |      68.83
  average dram queueing delay |      15.41
thanks Trevor

2012/7/12 Trevor E. Carlson <trevor....@elis.ugent.be>

Trevor E. Carlson

Jul 11, 2012, 6:16:24 PM
to snip...@googlegroups.com
Wafa,

    Okay, this is good, that means that it is working properly. Now, what happens when you run your custom application in Sniper?

Trevor