SMT support and mechanistic performance model

Yatish Turakhia

May 25, 2013, 11:53:56 PM

to snip...@googlegroups.com

Hello,

I have the following two questions about the Sniper simulator:

1. Does the current version 5.0 support SMT? What does "perf_model/core/logical_cpus" in the Sniper config files indicate?

2. The Sniper manual says that the simulator uses a mechanistic core model. Where can I read more about this core model? Is there any existing literature on it?

Thanks in advance,

Yatish

Wim Heirman

May 26, 2013, 3:35:28 AM

to snip...@googlegroups.com

Yatish,

Sniper does not support SMT. We have been working on something
internally, which is why the configuration variable exists, but it's
not yet working well enough for release.

The mechanistic core model means interval simulation. You can read all
the details in this publication [1].

Regards,
Wim

[1] http://snipersim.org/w/Paper:Hpca2010Genbrugge

T

Jun 20, 2014, 2:00:35 PM

to snip...@googlegroups.com

Has the SMT feature been released? And is there a way to model a node-to-node network, or is only a NoC model available?

Thanks,

T

Wim Heirman

Jun 20, 2014, 2:32:08 PM

to snip...@googlegroups.com

Yes, SMT modeling is possible in Sniper 6.0 using the ROB (instruction window centric) model.

We have examples for bus-based multi-socket machines (Gainestown), and NoC/mesh-based single-chip processors (KNC/Tilera-like). There is no support to model non-coherent multi-node setups.

Regards,

Wim

Ani

Oct 4, 2015, 11:04:25 PM

to Sniper simulator

Hi,

I am using Sniper 6.1, and I don't see any difference in IPC between enabling SMT and pre-emptive pinned scheduling of a multithreaded program on a single core. I have a multithreaded program that spawns two threads, and I run it on a nehalem configuration, once with a single logical CPU and once with 2 logical CPUs (SMT 2). I am using the ROB model by passing the -c rob flag to run-sniper. Is there another way to run SMT simulations?

Thanks

Wim Heirman

Oct 5, 2015, 3:03:01 AM

to snip...@googlegroups.com

Ani,

Can you post your complete run-sniper command line for both cases?

Thanks,

Wim

Ani

Oct 5, 2015, 10:49:52 AM

to Sniper simulator

Ah I see. I think I understand what was going on.

I was adding -c rob and just changing the number of logical CPUs. I think that to get preemptive scheduling of a multithreaded program on a single-core, single-thread out-of-order pipeline, the core model in rob.cfg should be commented out or changed to interval. To get the SMT model, the core model in rob.cfg should be rob, and the number of logical CPUs in nehalem.cfg should be increased (or the SMT models simply added).

Is this correct? I think it is because I can see the IPC difference and the trend in IPC change seems to be correct as well.

Wim Heirman

Oct 5, 2015, 10:54:46 AM

to snip...@googlegroups.com

No, you should be able to get both with the ROB model. If you use the interval model in one experiment, you'll get differences in how the core is modeled in addition to enabling/disabling SMT.

Did you increase the number of cores (-n or --general/total_cores)? To have a single core with two threads, you need to set -n 2 ("cores" in Sniper usually represent hardware contexts).

-Wim
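
As an illustration of the advice above, here is a minimal Python sketch that builds two run-sniper command lines differing only in the SMT setting, with the ROB model kept in both. The helper name `sniper_cmd`, the `./app` binary, and the `-g --section/key=value` override form are assumptions made for this example; check the run-sniper usage text for your version before relying on them.

```python
def sniper_cmd(hw_contexts, logical_cpus, app="./app"):
    """Build a run-sniper argument list (hypothetical helper, for illustration only)."""
    return [
        "./run-sniper",
        "-n", str(hw_contexts),       # "cores" here are hardware contexts
        "-c", "gainestown",
        "-c", "rob",                  # same core model in both experiments
        # Assumed override syntax: -g --<section>/<key>=<value>
        "-g", f"--perf_model/core/logical_cpus={logical_cpus}",
        "--", app,
    ]

# Two application threads time-sharing one single-threaded core.
no_smt = sniper_cmd(hw_contexts=1, logical_cpus=1)
# One core with two SMT threads.
smt2 = sniper_cmd(hw_contexts=2, logical_cpus=2)

# Only the context count and logical_cpus differ between the two runs.
changed = [(a, b) for a, b in zip(no_smt, smt2) if a != b]

print(" ".join(no_smt))
print(" ".join(smt2))
```

Keeping everything except the SMT-related options identical means any IPC difference can be attributed to SMT rather than to a change of core model.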

Ani

Oct 5, 2015, 1:27:06 PM

to Sniper simulator

Ok. So, let me recap what I did.

I have a multi-threaded application that spawns two threads.

First, I wanted to see what happens when the two threads are scheduled on a single out-of-order core with a single hardware thread. In other words, SMT 1, so the number of logical CPUs is set to 1. Below are the command line and the relevant config:

./run-sniper -c gainestown -c llc-qbs -c glp -c rob -s markers:stats --roi --no-cache-warming -d gd-interval-model/ -- ~/Programs/build/main-multithread

nehalem.cfg (gainestown includes nehalem)
-------------------
[perf_model/core]
logical_cpus = 1
type = interval
core_model=nehalem

rob.cfg (note that I commented out the type so that the interval model is used, but still included the file to get the OoO settings)
[perf_model/core]
#type = rob

[perf_model/core/rob_timer]
in_order = false
issue_contention = true
mlp_histogram = false           # Collect histogram of memory-level parallelism (slow)
issue_memops_at_issue = true    # Issue memops to the memory hierarchy at issue time (false = before dispatch)
outstanding_loads = 48
outstanding_stores = 32
store_to_load_forwarding = true # Forward data to loads from stores that are still in the store buffer
address_disambiguation = true   # Allow loads to bypass preceding stores with an unknown address
rob_repartition = true          # For SMT model with static ROB partitioning, whether to repartition the ROB
                                # across all active threads (true), or keep everyone fixed at a 1/nthreads share (false)
simultaneous_issue = true       # Whether two different threads can execute in a single cycle. true = simultaneous multi-threading, false = fine-grained multi-threading
commit_width = 128              # Commit bandwidth (instructions per cycle), per SMT thread
rs_entries = 36

# When issue_memops_at_issue is enabled, memory issue times will be correct and the memory subsystem can enable more detailed modeling
[perf_model/l1_dcache]
outstanding_misses = 10

Now, to enable SMT-2, I changed the number of logical CPUs to 2 in nehalem.cfg, changed type from interval to rob in nehalem.cfg, and used the same rob.cfg. I saw a marked performance improvement over the previous run, and this makes sense to me.

Are you suggesting adding -n 2 to the run-sniper command line? My understanding was that -n 2 would double the number of cores, so I would have 2 cores with SMT 2. But my intention is to model a single core with SMT 2. In other words, what is the difference between logical_cpus=2 and -n 2?

Wim Heirman

Oct 6, 2015, 7:07:32 AM

to snip...@googlegroups.com

-n 2 gives you two hardware contexts. When logical_cpus=1, that means two cores; when logical_cpus=2, it means one core with two threads. See the top of sim.out, which should list CoreX/ThreadY when SMT is enabled.

Don't mix rob and interval runs (try a single-threaded non-SMT run with interval vs. rob and you'll see the difference).

-Wim
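
The relationship between -n and logical_cpus described above can be sketched in a few lines of Python. The function name `topology` is hypothetical, and the assumption that consecutive hardware contexts are grouped onto the same physical core is mine, not something confirmed in this thread; the Core/Thread listing at the top of sim.out is the authoritative view.

```python
def topology(total_cores, logical_cpus):
    """Map hardware contexts (Sniper "cores") to (physical core, SMT thread) pairs.

    Assumes consecutive contexts share a physical core (illustrative only).
    """
    if logical_cpus < 1 or total_cores % logical_cpus != 0:
        raise ValueError("total_cores must be a positive multiple of logical_cpus")
    return [(ctx // logical_cpus, ctx % logical_cpus) for ctx in range(total_cores)]


def physical_cores(total_cores, logical_cpus):
    """Number of physical cores implied by the two settings."""
    return total_cores // logical_cpus


# -n 2 with logical_cpus=1: two single-threaded cores.
two_cores = topology(2, 1)
# -n 2 with logical_cpus=2: one core with two SMT threads.
one_core_smt2 = topology(2, 2)

for core, thread in one_core_smt2:
    print(f"Core{core}/Thread{thread}")
```

Under these assumptions, -n always counts hardware contexts, and logical_cpus decides how many of them are folded into each physical core.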

Ani

Oct 10, 2015, 6:45:38 PM

to Sniper simulator

I see. Yes, I see the difference in IPC between single-threaded runs with interval vs. rob. So, is the rob model more accurate than interval for both in-order and OoO cores? Which one would you suggest using, and in which cases would one be preferred over the other? Thanks

Wim Heirman

Oct 11, 2015, 6:58:29 AM

to snip...@googlegroups.com

In general, interval is faster but somewhat less accurate. It also does not support in-order or SMT; you need the ROB model if you want either of these features. For some more background on core models you can read: http://dl.acm.org/citation.cfm?id=2629677

-Wim
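
The rule of thumb above can be written down as a small Python helper. The function name `choose_core_model` and its parameters are hypothetical; the sketch only encodes what is stated in this thread (interval is faster, rob is needed for in-order or SMT) and makes no claim about how much accuracy differs.

```python
def choose_core_model(needs_smt=False, in_order=False):
    """Pick a value for perf_model/core/type based on required features.

    Returns "rob" when a feature the interval model lacks is needed,
    otherwise "interval" (the faster, somewhat less accurate option).
    """
    if needs_smt or in_order:
        return "rob"
    return "interval"


print(choose_core_model())                 # plain out-of-order, single-threaded core
print(choose_core_model(needs_smt=True))   # SMT requires the ROB model
print(choose_core_model(in_order=True))    # in-order requires the ROB model
```

Whichever model is chosen, it should stay the same across all runs that are compared against each other.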
