Hi Mike,
LDC can bind LLVM intrinsics to D functions if their signature is expressable
in D. Like for memcpy:
pragma(intrinsic, "llvm.memcpy.i32")
void llvm_memcpy_i32(void* dst, void* src, uint len, uint alignment);
As far as I remember, the problem with the SSE intrinsics is that there's no D
type that matches llvm's vector types
(http://www.llvm.org/docs/LangRef.html#t_vector). And that means getting
explicit access to the intrinsic from D code will not be easy.
I think Tomas looked into this at some point, maybe he can offer some advice
to get you started.
Note that the LLVM optimizer should already make use of the intrinsics.
Regards,
Christian
The problem was indeed introducing the vector types into the D type system.
Fixed-size array seems like a perfect fit, but in D1 they are treated
as reference types (and are not returnable).
I never found the best way to go about it... and adding a completely
new type into the frontend seemed like a lot of work
Tomas
LLVM provides these x86 simd intrinsics by default:
"llvm.x86.mmx.emms",
"llvm.x86.mmx.femms",
"llvm.x86.mmx.maskmovq",
"llvm.x86.mmx.movnt.dq",
"llvm.x86.mmx.packssdw",
"llvm.x86.mmx.packsswb",
"llvm.x86.mmx.packuswb",
"llvm.x86.mmx.padds.b",
"llvm.x86.mmx.padds.w",
"llvm.x86.mmx.paddus.b",
"llvm.x86.mmx.paddus.w",
"llvm.x86.mmx.pavg.b",
"llvm.x86.mmx.pavg.w",
"llvm.x86.mmx.pcmpeq.b",
"llvm.x86.mmx.pcmpeq.d",
"llvm.x86.mmx.pcmpeq.w",
"llvm.x86.mmx.pcmpgt.b",
"llvm.x86.mmx.pcmpgt.d",
"llvm.x86.mmx.pcmpgt.w",
"llvm.x86.mmx.pmadd.wd",
"llvm.x86.mmx.pmaxs.w",
"llvm.x86.mmx.pmaxu.b",
"llvm.x86.mmx.pmins.w",
"llvm.x86.mmx.pminu.b",
"llvm.x86.mmx.pmovmskb",
"llvm.x86.mmx.pmulh.w",
"llvm.x86.mmx.pmulhu.w",
"llvm.x86.mmx.pmulu.dq",
"llvm.x86.mmx.psad.bw",
"llvm.x86.mmx.psll.d",
"llvm.x86.mmx.psll.q",
"llvm.x86.mmx.psll.w",
"llvm.x86.mmx.pslli.d",
"llvm.x86.mmx.pslli.q",
"llvm.x86.mmx.pslli.w",
"llvm.x86.mmx.psra.d",
"llvm.x86.mmx.psra.w",
"llvm.x86.mmx.psrai.d",
"llvm.x86.mmx.psrai.w",
"llvm.x86.mmx.psrl.d",
"llvm.x86.mmx.psrl.q",
"llvm.x86.mmx.psrl.w",
"llvm.x86.mmx.psrli.d",
"llvm.x86.mmx.psrli.q",
"llvm.x86.mmx.psrli.w",
"llvm.x86.mmx.psubs.b",
"llvm.x86.mmx.psubs.w",
"llvm.x86.mmx.psubus.b",
"llvm.x86.mmx.psubus.w",
"llvm.x86.sse2.add.sd",
"llvm.x86.sse2.clflush",
"llvm.x86.sse2.cmp.pd",
"llvm.x86.sse2.cmp.sd",
"llvm.x86.sse2.comieq.sd",
"llvm.x86.sse2.comige.sd",
"llvm.x86.sse2.comigt.sd",
"llvm.x86.sse2.comile.sd",
"llvm.x86.sse2.comilt.sd",
"llvm.x86.sse2.comineq.sd",
"llvm.x86.sse2.cvtdq2pd",
"llvm.x86.sse2.cvtdq2ps",
"llvm.x86.sse2.cvtpd2dq",
"llvm.x86.sse2.cvtpd2ps",
"llvm.x86.sse2.cvtps2dq",
"llvm.x86.sse2.cvtps2pd",
"llvm.x86.sse2.cvtsd2si",
"llvm.x86.sse2.cvtsd2si64",
"llvm.x86.sse2.cvtsd2ss",
"llvm.x86.sse2.cvtsi2sd",
"llvm.x86.sse2.cvtsi642sd",
"llvm.x86.sse2.cvtss2sd",
"llvm.x86.sse2.cvttpd2dq",
"llvm.x86.sse2.cvttps2dq",
"llvm.x86.sse2.cvttsd2si",
"llvm.x86.sse2.cvttsd2si64",
"llvm.x86.sse2.div.sd",
"llvm.x86.sse2.lfence",
"llvm.x86.sse2.loadu.dq",
"llvm.x86.sse2.loadu.pd",
"llvm.x86.sse2.maskmov.dqu",
"llvm.x86.sse2.max.pd",
"llvm.x86.sse2.max.sd",
"llvm.x86.sse2.mfence",
"llvm.x86.sse2.min.pd",
"llvm.x86.sse2.min.sd",
"llvm.x86.sse2.movmsk.pd",
"llvm.x86.sse2.movnt.dq",
"llvm.x86.sse2.movnt.i",
"llvm.x86.sse2.movnt.pd",
"llvm.x86.sse2.mul.sd",
"llvm.x86.sse2.packssdw.128",
"llvm.x86.sse2.packsswb.128",
"llvm.x86.sse2.packuswb.128",
"llvm.x86.sse2.padds.b",
"llvm.x86.sse2.padds.w",
"llvm.x86.sse2.paddus.b",
"llvm.x86.sse2.paddus.w",
"llvm.x86.sse2.pavg.b",
"llvm.x86.sse2.pavg.w",
"llvm.x86.sse2.pcmpeq.b",
"llvm.x86.sse2.pcmpeq.d",
"llvm.x86.sse2.pcmpeq.w",
"llvm.x86.sse2.pcmpgt.b",
"llvm.x86.sse2.pcmpgt.d",
"llvm.x86.sse2.pcmpgt.w",
"llvm.x86.sse2.pmadd.wd",
"llvm.x86.sse2.pmaxs.w",
"llvm.x86.sse2.pmaxu.b",
"llvm.x86.sse2.pmins.w",
"llvm.x86.sse2.pminu.b",
"llvm.x86.sse2.pmovmskb.128",
"llvm.x86.sse2.pmulh.w",
"llvm.x86.sse2.pmulhu.w",
"llvm.x86.sse2.pmulu.dq",
"llvm.x86.sse2.psad.bw",
"llvm.x86.sse2.psll.d",
"llvm.x86.sse2.psll.dq",
"llvm.x86.sse2.psll.dq.bs",
"llvm.x86.sse2.psll.q",
"llvm.x86.sse2.psll.w",
"llvm.x86.sse2.pslli.d",
"llvm.x86.sse2.pslli.q",
"llvm.x86.sse2.pslli.w",
"llvm.x86.sse2.psra.d",
"llvm.x86.sse2.psra.w",
"llvm.x86.sse2.psrai.d",
"llvm.x86.sse2.psrai.w",
"llvm.x86.sse2.psrl.d",
"llvm.x86.sse2.psrl.dq",
"llvm.x86.sse2.psrl.dq.bs",
"llvm.x86.sse2.psrl.q",
"llvm.x86.sse2.psrl.w",
"llvm.x86.sse2.psrli.d",
"llvm.x86.sse2.psrli.q",
"llvm.x86.sse2.psrli.w",
"llvm.x86.sse2.psubs.b",
"llvm.x86.sse2.psubs.w",
"llvm.x86.sse2.psubus.b",
"llvm.x86.sse2.psubus.w",
"llvm.x86.sse2.sqrt.pd",
"llvm.x86.sse2.sqrt.sd",
"llvm.x86.sse2.storel.dq",
"llvm.x86.sse2.storeu.dq",
"llvm.x86.sse2.storeu.pd",
"llvm.x86.sse2.sub.sd",
"llvm.x86.sse2.ucomieq.sd",
"llvm.x86.sse2.ucomige.sd",
"llvm.x86.sse2.ucomigt.sd",
"llvm.x86.sse2.ucomile.sd",
"llvm.x86.sse2.ucomilt.sd",
"llvm.x86.sse2.ucomineq.sd",
"llvm.x86.sse3.addsub.pd",
"llvm.x86.sse3.addsub.ps",
"llvm.x86.sse3.hadd.pd",
"llvm.x86.sse3.hadd.ps",
"llvm.x86.sse3.hsub.pd",
"llvm.x86.sse3.hsub.ps",
"llvm.x86.sse3.ldu.dq",
"llvm.x86.sse3.monitor",
"llvm.x86.sse3.mwait",
"llvm.x86.sse41.blendpd",
"llvm.x86.sse41.blendps",
"llvm.x86.sse41.blendvpd",
"llvm.x86.sse41.blendvps",
"llvm.x86.sse41.dppd",
"llvm.x86.sse41.dpps",
"llvm.x86.sse41.extractps",
"llvm.x86.sse41.insertps",
"llvm.x86.sse41.movntdqa",
"llvm.x86.sse41.mpsadbw",
"llvm.x86.sse41.packusdw",
"llvm.x86.sse41.pblendvb",
"llvm.x86.sse41.pblendw",
"llvm.x86.sse41.pcmpeqq",
"llvm.x86.sse41.pextrb",
"llvm.x86.sse41.pextrd",
"llvm.x86.sse41.pextrq",
"llvm.x86.sse41.phminposuw",
"llvm.x86.sse41.pmaxsb",
"llvm.x86.sse41.pmaxsd",
"llvm.x86.sse41.pmaxud",
"llvm.x86.sse41.pmaxuw",
"llvm.x86.sse41.pminsb",
"llvm.x86.sse41.pminsd",
"llvm.x86.sse41.pminud",
"llvm.x86.sse41.pminuw",
"llvm.x86.sse41.pmovsxbd",
"llvm.x86.sse41.pmovsxbq",
"llvm.x86.sse41.pmovsxbw",
"llvm.x86.sse41.pmovsxdq",
"llvm.x86.sse41.pmovsxwd",
"llvm.x86.sse41.pmovsxwq",
"llvm.x86.sse41.pmovzxbd",
"llvm.x86.sse41.pmovzxbq",
"llvm.x86.sse41.pmovzxbw",
"llvm.x86.sse41.pmovzxdq",
"llvm.x86.sse41.pmovzxwd",
"llvm.x86.sse41.pmovzxwq",
"llvm.x86.sse41.pmuldq",
"llvm.x86.sse41.pmulld",
"llvm.x86.sse41.ptestc",
"llvm.x86.sse41.ptestnzc",
"llvm.x86.sse41.ptestz",
"llvm.x86.sse41.round.pd",
"llvm.x86.sse41.round.ps",
"llvm.x86.sse41.round.sd",
"llvm.x86.sse41.round.ss",
"llvm.x86.sse42.crc32.16",
"llvm.x86.sse42.crc32.32",
"llvm.x86.sse42.crc32.64",
"llvm.x86.sse42.crc32.8",
"llvm.x86.sse42.pcmpestri128",
"llvm.x86.sse42.pcmpestria128",
"llvm.x86.sse42.pcmpestric128",
"llvm.x86.sse42.pcmpestrio128",
"llvm.x86.sse42.pcmpestris128",
"llvm.x86.sse42.pcmpestriz128",
"llvm.x86.sse42.pcmpestrm128",
"llvm.x86.sse42.pcmpgtq",
"llvm.x86.sse42.pcmpistri128",
"llvm.x86.sse42.pcmpistria128",
"llvm.x86.sse42.pcmpistric128",
"llvm.x86.sse42.pcmpistrio128",
"llvm.x86.sse42.pcmpistris128",
"llvm.x86.sse42.pcmpistriz128",
"llvm.x86.sse42.pcmpistrm128",
"llvm.x86.sse.add.ss",
"llvm.x86.sse.cmp.ps",
"llvm.x86.sse.cmp.ss",
"llvm.x86.sse.comieq.ss",
"llvm.x86.sse.comige.ss",
"llvm.x86.sse.comigt.ss",
"llvm.x86.sse.comile.ss",
"llvm.x86.sse.comilt.ss",
"llvm.x86.sse.comineq.ss",
"llvm.x86.sse.cvtpd2pi",
"llvm.x86.sse.cvtpi2pd",
"llvm.x86.sse.cvtpi2ps",
"llvm.x86.sse.cvtps2pi",
"llvm.x86.sse.cvtsi2ss",
"llvm.x86.sse.cvtsi642ss",
"llvm.x86.sse.cvtss2si",
"llvm.x86.sse.cvtss2si64",
"llvm.x86.sse.cvttpd2pi",
"llvm.x86.sse.cvttps2pi",
"llvm.x86.sse.cvttss2si",
"llvm.x86.sse.cvttss2si64",
"llvm.x86.sse.div.ss",
"llvm.x86.sse.ldmxcsr",
"llvm.x86.sse.loadu.ps",
"llvm.x86.sse.max.ps",
"llvm.x86.sse.max.ss",
"llvm.x86.sse.min.ps",
"llvm.x86.sse.min.ss",
"llvm.x86.sse.movmsk.ps",
"llvm.x86.sse.movnt.ps",
"llvm.x86.sse.mul.ss",
"llvm.x86.sse.rcp.ps",
"llvm.x86.sse.rcp.ss",
"llvm.x86.sse.rsqrt.ps",
"llvm.x86.sse.rsqrt.ss",
"llvm.x86.sse.sfence",
"llvm.x86.sse.sqrt.ps",
"llvm.x86.sse.sqrt.ss",
"llvm.x86.sse.stmxcsr",
"llvm.x86.sse.storeu.ps",
"llvm.x86.sse.sub.ss",
"llvm.x86.sse.ucomieq.ss",
"llvm.x86.sse.ucomige.ss",
"llvm.x86.sse.ucomigt.ss",
"llvm.x86.sse.ucomile.ss",
"llvm.x86.sse.ucomilt.ss",
"llvm.x86.sse.ucomineq.ss",
"llvm.x86.ssse3.pabs.b",
"llvm.x86.ssse3.pabs.b.128",
"llvm.x86.ssse3.pabs.d",
"llvm.x86.ssse3.pabs.d.128",
"llvm.x86.ssse3.pabs.w",
"llvm.x86.ssse3.pabs.w.128",
"llvm.x86.ssse3.palign.r",
"llvm.x86.ssse3.palign.r.128",
"llvm.x86.ssse3.phadd.d",
"llvm.x86.ssse3.phadd.d.128",
"llvm.x86.ssse3.phadd.sw",
"llvm.x86.ssse3.phadd.sw.128",
"llvm.x86.ssse3.phadd.w",
"llvm.x86.ssse3.phadd.w.128",
"llvm.x86.ssse3.phsub.d",
"llvm.x86.ssse3.phsub.d.128",
"llvm.x86.ssse3.phsub.sw",
"llvm.x86.ssse3.phsub.sw.128",
"llvm.x86.ssse3.phsub.w",
"llvm.x86.ssse3.phsub.w.128",
"llvm.x86.ssse3.pmadd.ub.sw",
"llvm.x86.ssse3.pmadd.ub.sw.128",
"llvm.x86.ssse3.pmul.hr.sw",
"llvm.x86.ssse3.pmul.hr.sw.128",
"llvm.x86.ssse3.pshuf.b",
"llvm.x86.ssse3.pshuf.b.128",
"llvm.x86.ssse3.psign.b",
"llvm.x86.ssse3.psign.b.128",
"llvm.x86.ssse3.psign.d",
"llvm.x86.ssse3.psign.d.128",
"llvm.x86.ssse3.psign.w",
"llvm.x86.ssse3.psign.w.128",