Hyperthreading - effect on Latency and Throughput

Rajiv Chauhan

Jun 3, 2014, 9:11:43 AM
to mechanica...@googlegroups.com
Where can I find more information about hyperthreading and its effect on latency and throughput? My understanding is that HT divides a CPU into two logical processors, and the two processors compete for CPU resources. Is that correct? If so, it seems natural that HT would have an adverse effect on latency while possibly improving throughput. Is there any case where it can improve latency but degrade throughput?

Matt Godbolt

Jun 3, 2014, 5:14:50 PM
to mechanica...@googlegroups.com
It's not clear how to answer your question about latency and throughput, as that depends on many things. However, it's possible to make some points about how hyperthreading works (at least on Intel CPUs):
  • Instruction fetches, decoders, etc. are co-operatively shared between HT threads. That is, the fetch/decode path alternates between the two threads on each CPU tick, halving the rate at which instructions can be fed into the top of the pipe.
  • HT threads compete for L1/L2 and TLB space, and for branch prediction resources.
  • My guess is the micro-op cache is also shared competitively, and that micro-ops are streamed from it on alternating cycles when it's used as a source.
  • Micro-operations are scheduled agnostically with respect to which HT stream they came from: micro-ops run when their inputs are ready and when the scheduler decides is best, regardless of which logical processor they belong to.
  • I'm not quite sure how instruction retirement is affected by HTs. I'd guess this is independent of HT thread too.
More information on what happens can be gleaned from Agner Fog's microarchitecture paper at www.agner.org/optimize/microarchitecture.pdf 
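If you want to experiment with any of the above, the first step is knowing which logical CPUs are HT siblings of the same physical core. A minimal sketch (Linux only; it just walks sysfs, and the output is whatever the kernel prints for each CPU):

#include <stdio.h>

int main(void)
{
    char path[128], line[64];

    /* Walk logical CPUs until sysfs runs out of them. */
    for (int cpu = 0; ; cpu++) {
        snprintf(path, sizeof path,
                 "/sys/devices/system/cpu/cpu%d/topology/thread_siblings_list",
                 cpu);
        FILE *f = fopen(path, "r");
        if (!f)
            break;                                   /* no such CPU: done */
        if (fgets(line, sizeof line, f))
            printf("cpu%-3d shares a core with: %s", cpu, line);
        fclose(f);
    }
    return 0;
}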

Latency is probably always a little worse when HT is on. Throughput may be improved only if you can multithread your workload and the workload was previously limited by something other than CPU execution resources. If CPU execution resources are the limiting factor, having two independent instruction streams competing for them won't help with either throughput or latency.
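To make that concrete, here is a rough sketch (not a definitive benchmark) of the kind of experiment you can run: the same execution-bound loop on two threads, pinned once to two HT siblings of one core and once to two separate cores. It assumes Linux/glibc (pthread_setaffinity_np, compile with -pthread), and the CPU numbers 0, 1 and 2 are placeholders to replace with sibling/non-sibling pairs from the topology listing above. If the loop really is limited by the core's execution resources the sibling pairing should be noticeably slower; a latency-bound loop may barely notice.

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define ITERS 200000000ULL

static void *spin(void *arg)
{
    /* Pin this thread to the requested logical CPU. */
    int cpu = *(int *)arg;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    pthread_setaffinity_np(pthread_self(), sizeof set, &set);

    /* Four independent multiply chains: enough instruction-level
     * parallelism to keep the core's execution ports busy. */
    uint64_t a = 1, b = 2, c = 3, d = 4;
    for (uint64_t i = 0; i < ITERS; i++) {
        a = a * 6364136223846793005ULL + i;
        b = b * 6364136223846793005ULL + i;
        c = c * 6364136223846793005ULL + i;
        d = d * 6364136223846793005ULL + i;
    }
    return (void *)(uintptr_t)(a ^ b ^ c ^ d);       /* keep the result live */
}

static double run_pair(int cpu_a, int cpu_b)
{
    pthread_t a, b;
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    pthread_create(&a, NULL, spin, &cpu_a);
    pthread_create(&b, NULL, spin, &cpu_b);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
    /* Placeholder CPU numbers: 0 and 1 assumed to be HT siblings,
     * 0 and 2 assumed to be on separate physical cores. */
    printf("HT siblings   : %.2fs\n", run_pair(0, 1));
    printf("separate cores: %.2fs\n", run_pair(0, 2));
    return 0;
}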

- matt

Nitsan Wakart

Jun 4, 2014, 5:51:40 AM
to mechanica...@googlegroups.com
HT improves the latency of sharing data between the two threads running on the same core, as the 'closest' shared location is the L1 cache.
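A rough sketch of how you might measure that (Linux/glibc, C11 atomics, compile with -pthread; the CPU numbers are placeholders, with 0 and 1 assumed to be HT siblings and 0 and 2 on different cores): two threads ping-pong a flag in a shared cache line. Pinned to siblings the line can stay in the core's L1; pinned to separate cores every handoff has to travel further.

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdio.h>
#include <time.h>

#define ROUNDS 5000000

static _Atomic int flag;                 /* 0: ping's turn, 1: pong's turn */

static void pin(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    pthread_setaffinity_np(pthread_self(), sizeof set, &set);
}

static void *pong(void *arg)
{
    pin(*(int *)arg);
    for (int i = 0; i < ROUNDS; i++) {
        while (atomic_load_explicit(&flag, memory_order_acquire) != 1)
            ;                            /* spin until ping hands over */
        atomic_store_explicit(&flag, 0, memory_order_release);
    }
    return NULL;
}

static void run(int cpu_ping, int cpu_pong, const char *label)
{
    pthread_t t;
    struct timespec t0, t1;
    atomic_store(&flag, 0);
    pthread_create(&t, NULL, pong, &cpu_pong);
    pin(cpu_ping);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ROUNDS; i++) {
        atomic_store_explicit(&flag, 1, memory_order_release);
        while (atomic_load_explicit(&flag, memory_order_acquire) != 0)
            ;                            /* spin until pong hands back */
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    pthread_join(t, NULL);
    double ns = ((t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec)) / ROUNDS;
    printf("%-16s %.0f ns per round trip\n", label, ns);
}

int main(void)
{
    run(0, 1, "HT siblings:");           /* placeholder sibling pair */
    run(0, 2, "separate cores:");        /* placeholder cross-core pair */
    return 0;
}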

Martin Thompson

Jun 4, 2014, 6:31:06 AM
to mechanica...@googlegroups.com
To give an overly simplistic answer:

Enabling HT can increase latency for single-threaded work due to the partitioning of some CPU resources, such as the TLB. For parallel workloads it has the potential to increase throughput by 40-100% because the extra threads can run concurrently.

If your workload can be made parallel, and is more memory than CPU intensive, then latency can potentially be reduced. 
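A sketch of that memory-intensive case (same Linux/glibc assumptions and placeholder CPU numbers as before; the sizes and constants are arbitrary): a dependent pointer chase over buffers much larger than the caches, so almost every step is a cache miss. One thread spends most of its time stalled on memory, so running a second chase on the HT sibling can overlap those stalls and finish the combined work sooner than running the two chases back to back.

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N     (1u << 23)                 /* 8M entries, ~64 MB per chain: well past the LLC */
#define STEPS 10000000L

struct job { size_t *chain; int cpu; };

/* Build a full-cycle permutation so the chase visits memory in a
 * cache-unfriendly order (a ≡ 1 mod 4, c odd: full-period LCG mod 2^k). */
static size_t *make_chain(void)
{
    size_t *c = malloc(N * sizeof *c);
    for (size_t i = 0; i < N; i++)
        c[i] = (i * 2654435761u + 12345) % N;
    return c;
}

static void *chase(void *arg)
{
    struct job *j = arg;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(j->cpu, &set);
    pthread_setaffinity_np(pthread_self(), sizeof set, &set);

    volatile size_t p = 0;               /* dependent loads: each one stalls the thread */
    for (long i = 0; i < STEPS; i++)
        p = j->chain[p];
    return NULL;
}

static double timed(struct job *jobs, int n)
{
    pthread_t t[2];
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < n; i++) pthread_create(&t[i], NULL, chase, &jobs[i]);
    for (int i = 0; i < n; i++) pthread_join(t[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
    size_t *chain_a = make_chain(), *chain_b = make_chain();
    struct job serial[] = { { chain_a, 0 }, { chain_b, 0 } };   /* both on one logical CPU */
    struct job sibs[]   = { { chain_a, 0 }, { chain_b, 1 } };   /* 0 and 1 assumed siblings */

    double one = timed(&serial[0], 1) + timed(&serial[1], 1);   /* chases run back to back */
    double two = timed(sibs, 2);                                /* chases run concurrently */
    printf("one logical CPU: %.2fs\ntwo HT siblings: %.2fs\n", one, two);
    return 0;
}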

Basically, there is no clear-cut answer and you have to understand your workload. However, the solution is simple: test and measure for your workload.