[LLVMbugs] [Bug 12359] New: byte shuffles generate inefficient code for swapping in zeros with sse3

bugzill...@llvm.org

Mar 26, 2012, 11:25:44 AM

to llvm...@cs.uiuc.edu

http://llvm.org/bugs/show_bug.cgi?id=12359

Bug #: 12359
Summary: byte shuffles generate inefficient code for swapping
in zeros with sse3
Product: libraries
Version: trunk
Platform: PC
OS/Version: Linux
Status: NEW
Severity: enhancement
Priority: P
Component: Backend: X86
AssignedTo: unassig...@nondot.org
ReportedBy: sro...@vmware.com
CC: llvm...@cs.uiuc.edu
Classification: Unclassified

This code:
define <16 x i8> @shuf(<16 x i8> %inval1) {
entry:
  %0 = shufflevector <16 x i8> %inval1, <16 x i8> zeroinitializer,
       <16 x i32> <i32 0, i32 4, i32 3, i32 2, i32 16, i32 16, i32 3, i32 4,
                   i32 0, i32 4, i32 3, i32 2, i32 16, i32 16, i32 3, i32 4>
  ret <16 x i8> %0
}

gets compiled to:
    pxor    %xmm1, %xmm1
    pshufb  .LCPI0_0(%rip), %xmm1
    pshufb  .LCPI0_1(%rip), %xmm0
    por     %xmm1, %xmm0
    ret

(I didn't include the .LCPI constants here, but note that each one holds 0x80
in the byte positions whose value comes from the "other" vector, so pshufb
zeroes those lanes and the two results can be ORed together.)

This is inefficient: every value taken from the zeroinitializer vector is zero,
and the pshufb on %xmm0 already writes zero to each lane whose control byte is
0x80. The pxor, the pshufb on %xmm1, and the por are therefore redundant, and
the code could just be (with the exact same constant, even):
pshufb .LCPI0_1(%rip), %xmm0
ret
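
To make the equivalence concrete, here is a small scalar emulation in Python.
It is only a sketch: the control masks are reconstructed from the shuffle
indices following the 0x80 rule described above, not copied from the actual
.LCPI0_0/.LCPI0_1 constants, and the helper names (pshufb, build_masks, and
so on) are hypothetical.

```python
# Shuffle indices from the report: 0-15 select a byte of %inval1,
# 16-31 select a byte of the zeroinitializer vector.
INDICES = [0, 4, 3, 2, 16, 16, 3, 4, 0, 4, 3, 2, 16, 16, 3, 4]


def pshufb(src, ctrl):
    """Emulate pshufb on 16-byte lists.

    A control byte with the high bit set (0x80) yields 0 in that lane;
    otherwise its low 4 bits index into src.
    """
    return [0 if c & 0x80 else src[c & 0x0F] for c in ctrl]


def build_masks(indices):
    """Reconstruct the two control masks (assumed layout, not real output).

    mask_input plays the role of the constant applied to %xmm0,
    mask_zero the one applied to the zeroed %xmm1.
    """
    mask_input = [i if i < 16 else 0x80 for i in indices]
    mask_zero = [i - 16 if i >= 16 else 0x80 for i in indices]
    return mask_input, mask_zero


def two_pshufb_or(inval, indices):
    """The sequence currently generated: pxor, pshufb, pshufb, por."""
    mask_input, mask_zero = build_masks(indices)
    zero = [0] * 16                       # pxor %xmm1, %xmm1
    from_zero = pshufb(zero, mask_zero)   # pshufb on %xmm1
    from_input = pshufb(inval, mask_input)  # pshufb on %xmm0
    return [a | b for a, b in zip(from_input, from_zero)]  # por


def single_pshufb(inval, indices):
    """The proposed sequence: one pshufb with the same input mask."""
    mask_input, _ = build_masks(indices)
    return pshufb(inval, mask_input)


if __name__ == "__main__":
    sample = list(range(0x10, 0x20))  # arbitrary distinct input bytes
    assert two_pshufb_or(sample, INDICES) == single_pshufb(sample, INDICES)
    print([hex(b) for b in single_pshufb(sample, INDICES)])
```

Because shuffling an all-zero vector can only produce zeros, the por never
changes the result of the pshufb on %xmm0, whatever the input bytes are.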

--
Configure bugmail: http://llvm.org/bugs/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
_______________________________________________
LLVMbugs mailing list
LLVM...@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmbugs