best way to express bit rotation?

0 views
Skip to first unread message

Jason Sachs

unread,
Nov 11, 2017, 2:16:32 PM11/11/17
to Numba Public Discussion - Public
What's the best way to express bit rotation in numba?

The accepted way to do so in C is described in https://blog.regehr.org/archives/1063

uint32_t rotl32c (uint32_t x, uint32_t n)
{
  assert (n<32);
  return (x<<n) | (x>>(-n&31));
}

and variants like this are very efficient in C, especially for fixed values of n:

#include <stdint.h>

uint64_t rol1(uint64_t x)
{
return (x << 1) | (x >> 63);
}

uint64_t ror1(uint64_t x)
{
return (x >> 1) | (x << 63);
}

uint64_t ror3(uint64_t x)
{
return (x >> 3) | (x << 61);
}

which yields (see https://godbolt.org/g/7wkZ36):

rol1(unsigned long):
mov rax, rdi
rol rax
ret
ror1(unsigned long):
mov rax, rdi
ror rax
ret
ror3(unsigned long):
mov rax, rdi
ror rax, 3
ret

but if I try to do it in numba, I get this:

@numba.njit(numba.u8(numba.u8))
def rol1(x):
    return (x << 1) | (x >> 63)

print rol1.inspect_asm().values()[0]

	.section	__TEXT,__text,regular,pure_instructions
	.macosx_version_min 10, 11
	.globl	___main__.rol1$1.uint64
	.align	4, 0x90
___main__.rol1$1.uint64:
	leaq	(%rcx,%rcx), %rax
	sarq	$63, %rcx
	orq	%rcx, %rax
	movq	%rax, (%rdi)
	xorl	%eax, %eax
	retq


I suppose with a very short function like that it doesn't matter due to the interfacing between Python and numba-compiled code, but if the rotate code is within a hot loop in larger section of numba-compiled code, then it would helpful to understand how to write it efficiently.
Reply all
Reply to author
Forward
0 new messages