About SIMD code


daniel.tian

Nov 7, 2011, 3:52:48 AM
to MV5sim
Hi, Jiayuan:
I am confused about a basic concept. Is the SIMD core based on the
Alpha ISA with additional SIMD instructions, or is it a core that
executes only SIMD instructions?

With the default cmd:
build/ALPHA_SE/m5.fast configs/fractal/fractal_smp.py --rootdir=. \
  --bindir=./api/binsimd_blocktask/ --simd=True --CpuFrequency=1.00GHz \
  --DcacheAssoc=16 --DcacheBanks=4 --DcacheBlkSize=32 \
  --DcacheHWPFdegree=1 --DcacheHWPFpolicy=none \
  --DcacheLookupLatency=2ns --DcachePropagateLatency=2ns \
  --DcacheRepl=LRU --DcacheSize=16kB --IcacheAssoc=4 --IcacheBlkSize=32 \
  --IcacheHWPFdegree=1 --IcacheHWPFpolicy=none --IcacheSize=16kB \
  --IcacheUseCacti=False --L2NetBandWidthMbps=456000 \
  --L2NetFrequency=300MHz --L2NetPortOutQueueLength=4 \
  --L2NetRouterBufferSize=256 --L2NetRoutingLatency=1ns \
  --L2NetTimeOfFlight=13t --L2NetType=FullyNoC --L2NetWormHole=True \
  --MemNetBandWidthMbps=128000 --MemNetFrequency=266MHz \
  --MemNetPortOutQueueLength=4 --MemNetRouterBufferSize=2048 \
  --MemNetTimeOfFlight=130t --benchmark=FILTER --l2Assoc=16 \
  --l2Banks=16 --l2BlkSize=128 --l2HWPFDataOnly=False \
  --l2HWPFdegree=1 --l2HWPFpolicy=none --l2MSHRs=64 --l2Repl=LRU \
  --l2Size=4096kB --l2TgtsPerMSHR=32 --l2lookupLatency=2ns \
  --l2propagateLatency=12ns --l2tol1ratio=2 --localAddrPolicy=1 \
  --maxThreadBlockSize=0 --numHWTCs=16 --numSWTCs=2 --numcpus=4 \
  --physmemLatency=50ns --physmemSize=1024MB --portLookup=0 \
  --protocol=mesi --randStackOffset=True --restoreContextDelay=0 \
  --retryDcacheDelay=10 --stackAlloc=3 --switchOnDataAcc=True \
  --warpSize=8

1. There is a SIMD core, but how many SIMD cores exist?
2. Which part of the code (in filter.cpp, for example) will run on the
SIMD core? Is it the kernel part?
3. Why does "int nthreads = smp_query_hwthreads();" always return 0
for nthreads?

Thank you very much.
Xiaonan

Jiayuan Meng

Nov 7, 2011, 10:22:58 PM
to mv5...@googlegroups.com
> Is the SIMD core based on
> alpha ISA with additional SIMD instructions? or the SIMD code is pure
> SIMD instructions core?

The SIMD core is based on the Alpha ISA with additional SIMD
instructions (for handling control flow divergence; these are usually
inserted by SIMD compilers).

>
>    With the default cmd:
> build/ALPHA_SE/m5.fast configs/fractal/fractal_smp.py --rootdir=. \
>   --bindir=./api/binsimd_blocktask/ --simd=True --CpuFrequency=1.00GHz \
>   --DcacheAssoc=16 --DcacheBanks=4 --DcacheBlkSize=32 \
>   --DcacheHWPFdegree=1 --DcacheHWPFpolicy=none \
>   --DcacheLookupLatency=2ns --DcachePropagateLatency=2ns \
>   --DcacheRepl=LRU --DcacheSize=16kB --IcacheAssoc=4 --IcacheBlkSize=32 \
>   --IcacheHWPFdegree=1 --IcacheHWPFpolicy=none --IcacheSize=16kB \
>   --IcacheUseCacti=False --L2NetBandWidthMbps=456000 \
>   --L2NetFrequency=300MHz --L2NetPortOutQueueLength=4 \
>   --L2NetRouterBufferSize=256 --L2NetRoutingLatency=1ns \
>   --L2NetTimeOfFlight=13t --L2NetType=FullyNoC --L2NetWormHole=True \
>   --MemNetBandWidthMbps=128000 --MemNetFrequency=266MHz \
>   --MemNetPortOutQueueLength=4 --MemNetRouterBufferSize=2048 \
>   --MemNetTimeOfFlight=130t --benchmark=FILTER --l2Assoc=16 \
>   --l2Banks=16 --l2BlkSize=128 --l2HWPFDataOnly=False \
>   --l2HWPFdegree=1 --l2HWPFpolicy=none --l2MSHRs=64 --l2Repl=LRU \
>   --l2Size=4096kB --l2TgtsPerMSHR=32 --l2lookupLatency=2ns \
>   --l2propagateLatency=12ns --l2tol1ratio=2 --localAddrPolicy=1 \
>   --maxThreadBlockSize=0 --numHWTCs=16 --numSWTCs=2 --numcpus=4 \
>   --physmemLatency=50ns --physmemSize=1024MB --portLookup=0 \
>   --protocol=mesi --randStackOffset=True --restoreContextDelay=0 \
>   --retryDcacheDelay=10 --stackAlloc=3 --switchOnDataAcc=True \
>   --warpSize=8
>
> 1. There is SIMD core. But how many SIMD core exist?

If you specify simd=True, every core with hardware thread contexts
(numHWTCs>0) is a SIMD core.

In this case, all four cores are SIMD-capable. Each core has 16
hardware thread contexts for SIMD execution, grouped into two warps
(because warpSize=8, and 16/8 = 2). Each core also has two software
thread contexts (numSWTCs=2), which exist only to execute the
single-threaded, sequential portion of the code.
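
The arithmetic above can be sketched in a few lines of Python. This is
only an illustration of how the flags combine, not code from the
simulator: the CoreLayout class and its method names are made up for
this example, and it assumes numHWTCs is a multiple of warpSize, as it
is in the command quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreLayout:
    """Thread-context layout implied by the flags (hypothetical helper, not MV5 code)."""
    numcpus: int    # --numcpus
    numHWTCs: int   # --numHWTCs: hardware thread contexts per core
    numSWTCs: int   # --numSWTCs: software thread contexts per core
    warpSize: int   # --warpSize: hardware thread contexts per warp
    simd: bool      # --simd

    def simd_cores(self) -> int:
        # With simd=True, every core that has hardware thread contexts is a SIMD core.
        return self.numcpus if self.simd and self.numHWTCs > 0 else 0

    def warps_per_core(self) -> int:
        # Assumption: numHWTCs divides evenly by warpSize.
        return self.numHWTCs // self.warpSize

    def total_hw_contexts(self) -> int:
        # Hardware thread contexts summed over all cores.
        return self.numcpus * self.numHWTCs


layout = CoreLayout(numcpus=4, numHWTCs=16, numSWTCs=2, warpSize=8, simd=True)
print(layout.simd_cores(), layout.warps_per_core(), layout.total_hw_contexts())
# prints: 4 2 64
```

With the flags from this thread that gives 4 SIMD cores, 2 warps per
core, and 64 hardware thread contexts in total.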

> 2. Which part of the code , like in filter.cpp, will run on SIMD
> core?  Is the kernel part?

Yes, the kernel part will be executed in SIMD.

> 3. Why nthread all always return 0 in  "int nthreads =
> smp_query_hwthreads();"  ?

The number of hwthreads is only counted once HW thread contexts are in
use (i.e. registered in the emulated system). If you call
smp_query_hwthreads() after "launch()", it should no longer return 0.

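The ordering issue can be pictured with a toy model in Python. This is
not the MV5 API: ToyEmulatedSystem and its methods are invented
stand-ins, and both the count of 64 (4 cores x 16 contexts) and the
idea that launching registers every context at once are assumptions
made only for illustration.

```python
class ToyEmulatedSystem:
    """Toy stand-in for the emulated system (hypothetical, not the MV5 API)."""

    def __init__(self, hw_contexts_available: int):
        self._available = hw_contexts_available
        self._registered = 0  # nothing is registered before a launch

    def query_hwthreads(self) -> int:
        # Models the described behaviour of smp_query_hwthreads():
        # only registered hardware thread contexts are counted.
        return self._registered

    def launch(self) -> None:
        # Assumption: launching registers every available hardware thread context.
        self._registered = self._available


system = ToyEmulatedSystem(hw_contexts_available=64)
before = system.query_hwthreads()  # queried too early
system.launch()
after = system.query_hwthreads()   # queried after the launch
print(before, after)
# prints: 0 64
```
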
Jiayuan
