The answer is I have no idea. Someone needs to get hold of a box with one of these and run some benchmarks.
One issue is that you want the scheduler (both the OS scheduler and the Haskell scheduler) to know about the architecture, so that it can preferentially use cores from distinct pairs first. We haven't bothered doing this for hyperthreading so far, because in most cases trying to use hyperthreaded cores with GHC doesn't work well, so we use real cores only and assume that the OS scheduler gives us real cores in preference (which it usually does).
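The "real cores only" workaround can be approximated from Haskell itself. This is a minimal sketch, assuming the common layout where hyperthreading exposes two logical CPUs per physical core; `physicalOf` is a hypothetical helper, and real topology detection would need something like /proc/cpuinfo or hwloc rather than this guess:

```haskell
-- Sketch: restrict GHC to roughly one capability per physical core,
-- ASSUMING 2 logical CPUs per physical core (not detected, just assumed).
import GHC.Conc (getNumProcessors, setNumCapabilities, getNumCapabilities)

-- Hypothetical helper: guess the physical core count from the logical count.
physicalOf :: Int -> Int
physicalOf logical = max 1 (logical `div` 2)

main :: IO ()
main = do
  logical <- getNumProcessors          -- logical CPUs the OS reports
  setNumCapabilities (physicalOf logical)
  caps <- getNumCapabilities
  putStrLn ("Using " ++ show caps ++ " of " ++ show logical ++ " logical CPUs")
```

`setNumCapabilities` lets this be done at run time rather than baking a number into `+RTS -N`.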
Cheers,
Simon
> The answer is I have no idea. Someone needs to get hold of a box with one of these and run some benchmarks.
>
> One issue is that you want the scheduler (both the OS scheduler and the Haskell scheduler) to know about the architecture, so that it can preferentially use cores from distinct pairs first. We haven't bothered doing this for hyperthreading so far, because in most cases trying to use hyperthreaded cores with GHC doesn't work well, so we use real cores only and assume that the OS scheduler gives us real cores in preference (which it usually does).
My understanding from reading about the architecture over the last few
months is that it's all fine. It's not at all like hyperthreading.
Each core in the module has its own integer pipeline and L1 data
cache. What is shared between the two cores is the FP unit, the
instruction decoder, and the L1 instruction cache. In fact, I
understand that a good strategy for the OS scheduler is to fill up
both cores of a module, especially if the two threads share memory.
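That fill-up-a-module strategy can be expressed from the Haskell side with `forkOn`. A sketch, assuming capabilities 2k and 2k+1 map to the paired cores of module k (the actual mapping depends on the OS's CPU numbering, and pinning capabilities to CPUs additionally needs the RTS affinity option `+RTS -qa`); `moduleCaps` is a hypothetical helper:

```haskell
-- Sketch: co-schedule two threads that share data onto one module,
-- ASSUMING capabilities 2k and 2k+1 are the paired cores of module k.
import Control.Concurrent (forkOn, newEmptyMVar, putMVar, takeMVar)

-- Hypothetical helper: the two capabilities of module k under the
-- assumed numbering.
moduleCaps :: Int -> (Int, Int)
moduleCaps k = (2 * k, 2 * k + 1)

main :: IO ()
main = do
  let (a, b) = moduleCaps 0
  done1 <- newEmptyMVar
  done2 <- newEmptyMVar
  -- forkOn fixes each thread to a specific capability; the RTS (with
  -- +RTS -N -qa) maps capabilities onto OS CPUs.
  _ <- forkOn a (putMVar done1 (sum [1 .. 1000000 :: Int]))
  _ <- forkOn b (putMVar done2 (sum [1 .. 1000000 :: Int]))
  r1 <- takeMVar done1
  r2 <- takeMVar done2
  print (r1 + r2)
```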
So yes, of course someone needs to run benchmarks, but I think the
prospects are pretty positive. More worrying than the module sharing
is the increased cache and memory latency, and the deeper pipelines
and hence greater branch misprediction penalties (but that just makes
things slower in a single-threaded way; it doesn't affect threading).
Duncan