Bit manipulation instructions


Gnanasekar R

Jun 27, 2018, 12:01:53 AM
to isa...@groups.riscv.org
In the RISC-V ISA I don't see any bit manipulation instructions. E.g., clearing a bit cannot be done with a single instruction; it currently requires more than one instruction to create a mask and then 'and' it. Are such instructions being considered, or is there no plan to introduce them? I'm just trying to understand the rationale behind not having them.

Embedded applications usually contain a lot of bit manipulation code, so not having such instructions may increase code size considerably. In the mailing list archives I saw some old discussions on bit manipulation instructions. Is this being actively pursued? Does that mean RISC-V is not yet widely used in the embedded space (where code density also matters)?
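
For readers following along, here is a minimal C sketch of the operation being asked about. The bit position (17) and the names STATUS_READY_BIT, set_ready and clear_ready are made up for illustration; they do not come from the thread. Without a dedicated instruction, a compiler targeting the base ISA has to build the mask in a register first and then apply `or` or `and`.

```c
#include <stdint.h>

/* Hypothetical bit position, chosen for illustration only. */
#define STATUS_READY_BIT 17u

/* Set one statically known bit: OR with a single-bit mask. */
uint32_t set_ready(uint32_t reg)
{
    return reg | (UINT32_C(1) << STATUS_READY_BIT);
}

/* Clear the same bit: AND with the inverted mask. */
uint32_t clear_ready(uint32_t reg)
{
    return reg & ~(UINT32_C(1) << STATUS_READY_BIT);
}
```

Calling clear_ready(set_ready(x)) returns x with bit 17 cleared and all other bits untouched.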

Bruce Hoult

Jun 27, 2018, 12:05:59 AM
to Gnanasekar R, RISC-V ISA Dev
Yes, there are plans to add bit manipulation instructions.

At the moment the best resource for possible instructions is probably https://github.com/cliffordwolf/xbitmanip



--
You received this message because you are subscribed to the Google Groups "RISC-V ISA Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isa-dev+unsubscribe@groups.riscv.org.
To post to this group, send email to isa...@groups.riscv.org.
Visit this group at https://groups.google.com/a/groups.riscv.org/group/isa-dev/.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/isa-dev/CANUXKs-Bw3xC94AqPrvOq%2BhgZQ70Ci%2BU72o9mQuT6wy8O5_d8w%40mail.gmail.com.

Gnanasekar R

Jun 27, 2018, 12:14:56 AM
to Bruce Hoult, RISC-V ISA Dev
Ok, thanks for the info. May I know why it is independently maintained and not part of the workgroup (as per the GitHub link)?


Luke Kenneth Casson Leighton

Jun 27, 2018, 5:29:59 AM
to Gnanasekar R, Bruce Hoult, RISC-V ISA Dev
On Wed, Jun 27, 2018 at 5:14 AM, Gnanasekar R <gnanase...@gmail.com> wrote:

> Ok, thanks for the info. May I know why it is independently
> maintained and not part of the workgroup (as per the GitHub link)?

an AMD employee was the chair of the WG. we do not have the full
details, but gathering information third-hand from different sources,
it *appears* - from the outside - that AMD was allowed to join (or
participate in) the RISC-V Foundation without having signed something
to do with patent indemnification. once they were told "um no you
really do have to agree to the same conditions as every other RISC-V
Member", AMD pulled out of the Foundation.

this resulted in the immediate termination of the Working Group
(without notice). documentation and access by all members to the
bitmanip WG's mailing list was terminated without notice. it took
some effort and complaints to re-gain access to the documentation that
had been developed and stored on the RISC-V proprietary server
infrastructure.

lots of lessons there, for everyone involved in *open* collaborative
development of RISC-V.

l.

Bruce Hoult

Jun 27, 2018, 12:54:42 PM
to Gnanasekar R, RISC-V ISA Dev
On a 32 bit RISC-V, setting an arbitrary statically-known bit in a register can be done with two instructions (or one if it's in the lowest 12 bits), and clearing a bit with three instructions (or one if it's in the lowest 12 bits). Both are no more than six bytes of code except for clearing the MSB, which takes eight bytes.

Of course that's worse than having a dedicated instruction (which would certainly be a four byte instruction), but it's not awful and will probably have a pretty minor impact on the size of any real program that is big enough that you're worried about it fitting into a ROM. The execution time is also unlikely to be a problem.


The dynamic case takes three or four instructions (and eight or ten bytes of code, respectively), but can usefully be done as a function which would save a bit of size if the compiler doesn't have to shuffle the arguments around too much.
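
A minimal sketch of the dynamic case written as out-of-line helper functions, as suggested above. The names bit_set and bit_clear are hypothetical. The `& 31u` is only there to keep the C shift well defined for out-of-range positions; it is an assumption of this sketch, not something stated in the thread.

```c
#include <stdint.h>

/* Set bit n of x, where n is only known at run time.
   Only the low five bits of n are used. */
uint32_t bit_set(uint32_t x, unsigned n)
{
    return x | (UINT32_C(1) << (n & 31u));
}

/* Clear bit n of x: build the single-bit mask, invert it, then AND. */
uint32_t bit_clear(uint32_t x, unsigned n)
{
    return x & ~(UINT32_C(1) << (n & 31u));
}
```

The clear variant needs one more operation than the set variant because the mask has to be inverted, which is consistent with the "three or four instructions" count given above.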

So I think bit setting and clearing (and bitfield extraction and insertion) are nice-to-have but not really essential.

Things such as clz and byte swapping seem much more likely to be a problem for programs that need them.
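
To give a feel for why these are costlier, here is a sketch of portable software fallbacks for both operations. The function names clz32 and bswap32 are hypothetical and this is one common way to write them, not necessarily what any particular compiler or library emits.

```c
#include <stdint.h>

/* Count leading zeros by binary search; returns 32 for x == 0. */
unsigned clz32(uint32_t x)
{
    unsigned n = 0;
    if (x == 0) return 32;
    if ((x & 0xFFFF0000u) == 0) { n += 16; x <<= 16; }
    if ((x & 0xFF000000u) == 0) { n += 8;  x <<= 8;  }
    if ((x & 0xF0000000u) == 0) { n += 4;  x <<= 4;  }
    if ((x & 0xC0000000u) == 0) { n += 2;  x <<= 2;  }
    if ((x & 0x80000000u) == 0) { n += 1; }
    return n;
}

/* Reverse the byte order of a 32-bit word with shifts and masks. */
uint32_t bswap32(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}
```

Each of these expands to many more base-ISA instructions than the two or three needed for a single-bit set or clear.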


Clifford Wolf

Jun 27, 2018, 1:02:04 PM
to Bruce Hoult, Gnanasekar R, RISC-V ISA Dev
Hi,


On Wed, Jun 27, 2018, 09:54 Bruce Hoult <bruce...@sifive.com> wrote:
 (or one if it's in the lowest 12 bits), and clearing a bit with three instructions (or one if it's in the lowest 12 bits).

Minor nitpick: lowest 11 bits. (Immediates are always sign-extended. So you can't just have bit 11 set in the immediate of an i-type instruction.)
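
The nitpick can be checked with a small C sketch. It assumes RV32 and the 12-bit sign-extended I-type immediate, whose range is -2048 to 2047. The helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if v fits a sign-extended 12-bit immediate. */
bool fits_imm12(int64_t v)
{
    return v >= -2048 && v <= 2047;
}

/* Could `ori` with the mask 1<<bit set this bit in one instruction?
   Bit 31 is rejected up front: as a 32-bit value its mask is -2^31. */
bool set_bit_fits_ori(unsigned bit)
{
    return bit < 31 && fits_imm12((int64_t)1 << bit);
}

/* Could `andi` with the mask ~(1<<bit) clear this bit in one
   instruction? Bit 31 is rejected: its mask is 0x7fffffff. */
bool clear_bit_fits_andi(unsigned bit)
{
    return bit < 31 && fits_imm12(~((int64_t)1 << bit));
}
```

Both helpers return true for bits 0 through 10 and false from bit 11 upward: 1<<11 is 2048, one past the largest positive immediate, and ~(1<<11) is -2049, one below the smallest.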

Bruce Hoult

Jun 27, 2018, 1:14:50 PM
to Clifford Wolf, Gnanasekar R, RISC-V ISA Dev
You are correct.

Setting or clearing bit 11 is I think best done by building the mask with lui plus addi: lui 1 and then adding -2048 gives 0x800 for the or, and lui 0xfffff and then adding 2047 gives 0xfffff7ff for the and. Both of which take eight bytes of instructions, the same as for bit 31.

kr...@berkeley.edu

Jun 27, 2018, 10:28:50 PM
to Bruce Hoult, Gnanasekar R, RISC-V ISA Dev

>>>>> On Wed, 27 Jun 2018 09:54:39 -0700, Bruce Hoult <bruce...@sifive.com> said:
| Both are no more than six bytes of code except for clearing the MSB, which
| takes eight bytes.

# clears MSB in four bytes if rd in x8-x15
c.slli x8, x8, 1
c.srli x8, x8, 1

Krste

Bruce Hoult

Jun 27, 2018, 10:53:05 PM
to Krste Asanovic, Gnanasekar R, RISC-V ISA Dev
Good point.

Someone should teach that to gcc because...

unsigned clr31_1(unsigned i){return i & 0x7fffffff;}
unsigned clr31_2(unsigned i){return (i<<1)>>1;}

... both produce the same code and it's ten bytes ...

00000000 <clr31_1>:
   0:   800007b7    lui   a5,0x80000
   4:   fff7c793    not   a5,a5
   8:   8d7d        and   a0,a0,a5
   a:   8082        ret

0000000c <clr31_2>:
   c:   800007b7    lui   a5,0x80000
  10:   fff7c793    not   a5,a5
  14:   8d7d        and   a0,a0,a5
  16:   8082        ret

(my 8 bytes was two shifts, but stupidly forgetting immediate shifts were compressible)

Gnanasekar R

Jun 28, 2018, 12:07:49 AM
to Bruce Hoult, Krste Asanovic, RISC-V ISA Dev
Thanks for the inputs. I do understand that setting/clearing can be done with 4,6 or 8 bytes depending on the bit position. But my point is, it can never be done with 2 bytes because there is no such instruction, whereas it is possible in other architectures. If the code does a lot of bit(field) manipulation, this adds to code size because the instruction footprint is doubled, tripled or more. We have a lot of operations like this, and they show an increase in code size compared to our current architecture. Maybe it is a specific case, but having clear/set instructions does make sense to me. I agree on the clz and swapping part too.

Bruce Hoult

Jun 28, 2018, 12:45:55 AM
to Gnanasekar R, Krste Asanovic, RISC-V ISA Dev
I'm curious which architecture you're thinking of that can set or clear an arbitrary bit in a 32 bit variable in a register with a 2-byte instruction.

i386 and amd64 both use a 5-byte instruction
Thumb2, ARM, and Aarch64 all use a 4-byte instruction
sh4 needs two 2-byte instructions, plus a PC-relative 2-byte literal
m68k uses a 4-byte instruction

Ok, avr8 can always use a 2-byte instruction because every 8 bit chunk of a larger variable is in a different register e.g. 90 62 ori r25, 0x20  or  9f 7d andi r25, 0xDF

And I guess the same is true of 6502, 6800, and I guess 8080/z80 too.

I don't think we're really trying to compete against them.

Samuel Falvo II

Jun 28, 2018, 12:54:12 AM
to Bruce Hoult, Gnanasekar R, Krste Asanovic, RISC-V ISA Dev
On Wed, Jun 27, 2018 at 9:45 PM, Bruce Hoult <bruce...@sifive.com> wrote:
> I don't think we're really trying to compete against them.

You could always play games with how memory is decoded as well. This
works regardless of ISA capabilities. Many ARM MCUs, for example,
provide the ability to directly set or clear bits in memory through
specialized regions of the memory map. E.g., if you have a 64KB chunk
of memory mapped from $10000-$1FFFF, then it's conceivable that
$100000-$17FFFF contains 1-bit mirrors of every bit in the 64KB RAM
bank, letting you set or clear these bits depending on the value
stored there. I forget what this technique is called, but it has a
specific name which escapes me at the moment and Google's not helpful
right now.

--
Samuel A. Falvo II

Peter Ashenden

Jun 28, 2018, 12:56:48 AM
to isa...@groups.riscv.org
It's called bit banding
Peter Ashenden, CTO IC Design, ASTC
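
As an illustration of the address arithmetic behind bit banding, here is a sketch using the Cortex-M3 SRAM layout (bit-band region at 0x20000000, alias region at 0x22000000), where each bit of the region gets its own 32-bit alias word. The macro and function names are hypothetical, and the code only computes addresses, so it runs on any host.

```c
#include <stdint.h>

/* Cortex-M3 style SRAM bit-band region and its alias region. */
#define BITBAND_BASE  UINT32_C(0x20000000)
#define BITBAND_ALIAS UINT32_C(0x22000000)

/* Address of the alias word that maps to bit `bit` (0..7) of the
   byte at `byte_addr` inside the bit-band region. Each byte of the
   region occupies 32 bytes of alias space, 4 bytes per bit. */
uint32_t bitband_alias_addr(uint32_t byte_addr, unsigned bit)
{
    uint32_t byte_offset = byte_addr - BITBAND_BASE;
    return BITBAND_ALIAS + byte_offset * 32u + bit * 4u;
}
```

On such a device, writing 1 or 0 to the word at the returned address sets or clears that single bit without a read-modify-write sequence in software.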

Krste Asanovic

Jun 28, 2018, 1:08:45 AM
to Peter Ashenden, isa...@groups.riscv.org
We use AMOs in the A extension to atomically set/clear memory bits on RISC-V without needing memory-map tricks.

The AMOs can reduce instruction count to set/clear bits in memory-mapped device registers, with atomicity to allow use when registers are shared by different interrupt routines for example.

Krste
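
From C, the usual way to reach these operations is through C11 atomics. The sketch below uses hypothetical helper names. On a RISC-V target with the A extension a compiler would be expected to lower the two calls to amoor.w and amoand.w, but that depends on the toolchain and flags. A real device register would normally also be declared volatile, which is left out here for brevity.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically set the bits in `mask`; returns the previous value. */
uint32_t reg_set_bits(_Atomic uint32_t *reg, uint32_t mask)
{
    return atomic_fetch_or_explicit(reg, mask, memory_order_relaxed);
}

/* Atomically clear the bits in `mask`; returns the previous value. */
uint32_t reg_clear_bits(_Atomic uint32_t *reg, uint32_t mask)
{
    return atomic_fetch_and_explicit(reg, ~mask, memory_order_relaxed);
}
```

Because the read-modify-write is a single atomic operation, an interrupt handler touching other bits of the same word cannot lose an update.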



Jim Wilson

Jun 28, 2018, 1:51:44 AM
to Bruce Hoult, Krste Asanovic, Gnanasekar R, RISC-V ISA Dev
On Wed, Jun 27, 2018 at 7:53 PM, Bruce Hoult <bruce...@sifive.com> wrote:
> Someone should teach that to gcc because...
> unsigned clr31_1(unsigned i){return i & 0x7fffffff;}
> unsigned clr31_2(unsigned i){return (i<<1)>>1;}
> ... both produce the same code and it's ten bytes ...
> 00000000 <clr31_1>:
> 0: 800007b7 lui a5,0x80000
> 4: fff7c793 not a5,a5
> 8: 8d7d and a0,a0,a5
> a: 8082 ret
> ...

Yes, this looks like a generic problem. We canonicalize shifts to
logical operations when we can, but we aren't converting back when
that would help. Some targets have an instruction that can do this
and so are OK. Sparc has and-not so can do it in two instructions.
MIPS and RISC-V take 3 instructions. We do get the 0xffff and 0xff
cases right, as we have special patterns for zero extending short and
char. This should be easy enough to fix for RISC-V by adding a
combiner pattern to emit two shifts for and-not w/ a constant when
that would be faster. I'll take a look at implementing this.

Jim
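
The equivalence the combiner pattern relies on can be checked in plain C. This is a sketch with hypothetical function names. It assumes a shift count between 1 and 31 and is not taken from the GCC sources.

```c
#include <stdint.h>

/* Clear the top n bits (1 <= n <= 31) by ANDing with a constant. */
uint32_t clear_top_mask(uint32_t x, unsigned n)
{
    return x & (UINT32_MAX >> n);
}

/* Same result with a shift pair: shift the unwanted bits out at
   the top, then shift zeros back in with a logical right shift. */
uint32_t clear_top_shifts(uint32_t x, unsigned n)
{
    return (uint32_t)(x << n) >> n;
}
```

With n equal to 1 this is the clear-the-MSB case from earlier in the thread.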

Gnanasekar R

Jun 28, 2018, 4:10:36 AM
to Bruce Hoult, Krste Asanovic, RISC-V ISA Dev
ARC has that I believe.

http://me.bios.io/images/d/dd/ARCompactISA_ProgrammersReference.pdf

Luke Kenneth Casson Leighton

Jun 28, 2018, 4:25:10 AM
to Bruce Hoult, Gnanasekar R, Krste Asanovic, RISC-V ISA Dev
On Thu, Jun 28, 2018 at 5:45 AM, Bruce Hoult <bruce...@sifive.com> wrote:

> I'm curious which architecture you're thinking of that can set or clear an
> arbitrary bit in a 32 bit variable in a register with a 2-byte instruction.
>
> i386 and amd64 both use a 5-byte instruction
> Thumb2, ARM, and Aarch64 all use a 4-byte instruction

http://users.ece.utexas.edu/~valvano/EE345M/CortexM3InstructionSet.pdf
table 6.1, p68.

that's 1 instruction (damn important ones on an EC) oh! i am
guessing, gnanasekar, you meant "4,6, or 8 instructions" not "4,6 or 8
bytes"?

> On Wed, Jun 27, 2018 at 9:07 PM, Gnanasekar R <gnanase...@gmail.com>
> wrote:
>>
>> Thanks for the inputs. I do understand that setting/clearing can be done
>> with 4,6 or 8 bytes depending on the bit position. But my point is, it can

Samuel Falvo wrote:

> Many ARM MCUs, for example,
> provide the ability to directly set or clear bits in memory through
> specialized regions of the memory map.

RVV does something similar (which is really neat), you can configure
the vector CSRs all at once into the most common configurations by
writing to a single CSR... and then for anything out-of-the-ordinary
follow up with further (specific) writes to other CSRs (addresses).

however, samuel, these are generally for memory-mapped peripherals,
not for general-purpose use. i'd find it very strange if a virtual
(or real) memory page was re-mapped to 32 or 64 times its actual size
(into a bit-field) in *general-purpose* memory. aside from anything
it would require the memory-accessing side of the memory bus
architecture to have 5 or 6 extra bits, internally. still, it's a
neat idea, not to be totally ruled out.

l.

Luke Kenneth Casson Leighton

Jun 28, 2018, 4:31:34 AM
to Gnanasekar R, Bruce Hoult, Krste Asanovic, RISC-V ISA Dev
On Thu, Jun 28, 2018 at 9:10 AM, Gnanasekar R <gnanase...@gmail.com> wrote:

> ARC has that I believe.
>
> http://me.bios.io/images/d/dd/ARCompactISA_ProgrammersReference.pdf

TST_S (page 98), CMP_S (99), BCLR_S/BTST_S/BMSK_S (102) - they all
use a reduced set of registers. format's listed on p161.

i guess you did mean 2-byte :)

l.

Luke Kenneth Casson Leighton

Jun 28, 2018, 4:56:02 AM
to Gnanasekar R, Bruce Hoult, Krste Asanovic, RISC-V ISA Dev
On Thu, Jun 28, 2018 at 9:31 AM, Luke Kenneth Casson Leighton
<lk...@lkcl.net> wrote:

> i guess you did mean 2-byte :)

https://github.com/cliffordwolf/xbitmanip/blob/master/xbitmanip-draft.pdf

(at time of writing) section 2.10 p19, xBitManip Compressed Instructions

recommends c.not, c.neg and c.brev which fit neatly into brownfield
space of C.LUI (and still have one more single-reg opcode spare).
further analysis on p28, section 4.1.

l.
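
For reference, a C sketch of what the three proposed single-register operations would compute, assuming that c.not, c.neg and c.brev denote bitwise complement, two's-complement negation and full 32-bit bit reversal. That reading is an assumption here and the draft itself is the authority. The op_ names are hypothetical.

```c
#include <stdint.h>

/* Bitwise complement. */
uint32_t op_not(uint32_t x) { return ~x; }

/* Two's-complement negation, done in unsigned arithmetic. */
uint32_t op_neg(uint32_t x) { return 0u - x; }

/* Reverse the order of all 32 bits by swapping progressively
   larger groups: single bits, pairs, nibbles, bytes, halfwords. */
uint32_t op_brev(uint32_t x)
{
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    return (x >> 16) | (x << 16);
}
```

The bit reversal in particular takes a long sequence of base-ISA shifts, masks and ors, which is the kind of saving a single instruction would offer.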

Luke Kenneth Casson Leighton

Jun 28, 2018, 5:08:21 AM
to Gnanasekar R, RISC-V ISA Dev
On Wed, Jun 27, 2018 at 5:01 AM, Gnanasekar R <gnanase...@gmail.com> wrote:

> Embedded applications usually contain a lot of bit manipulation code, so not
> having such instructions may increase code size considerably. In the
> mailing list archives I saw some old discussions on bit manipulation
> instructions. Is this being actively pursued?

i learned something new from reading a more recent version of
clifford's xbitmanip-draft spec:
https://github.com/cliffordwolf/xbitmanip/blob/master/xbitmanip-draft.pdf

xbitmanip-draft, section 5.8 p40 (at time of writing), the RI5CY
project has *already implemented* quite extensive bitmanipulation,
which was a real surprise (a good one).
https://www.pulp-platform.org/documentation/ shows that it's an
"enhanced" part of the pulp instruction set... here is the full
manual:
https://pulp-platform.org//wp-content/uploads/2017/11/ri5cy_user_manual.pdf

p42 section 14.3.1 shows a compact description and the encoding.

l.

Gnanasekar R

Jun 28, 2018, 5:20:25 AM
to Luke Kenneth Casson Leighton, RISC-V ISA Dev
Thank you for all the pointers Luke! :) Yeah, these instructions that some of them have implemented are definitely helpful I believe

Luke Kenneth Casson Leighton

Jun 28, 2018, 5:46:47 AM
to Gnanasekar R, RISC-V ISA Dev
On Thu, Jun 28, 2018 at 10:20 AM, Gnanasekar R <gnanase...@gmail.com> wrote:

> Thank you for all the pointers Luke! :) Yeah, these instructions that some
> of them have implemented are definitely helpful I believe

no problem. i had to look... saves time for other people.
l.

Samuel Falvo II

Jun 28, 2018, 5:10:46 PM
to Luke Kenneth Casson Leighton, Bruce Hoult, Gnanasekar R, Krste Asanovic, RISC-V ISA Dev
On Thu, Jun 28, 2018 at 1:24 AM, Luke Kenneth Casson Leighton
<lk...@lkcl.net> wrote:
> however, samuel, these are generally for memory-mapped peripherals,
> not for general-purpose use. i'd find it very strange if a virtual
> (or real) memory page was re-mapped to 32 or 64 times its actual size
> (into a bit-field) in *general-purpose* memory. aside from anything
> it would require the memory-accessing side of the memory bus
> architecture to have 5 or 6 extra bits, internally. still, it's a
> neat idea, not to be totally ruled out.

These ARM MCUs are microcontrollers working with scratchpad RAM, often
without access to external RAM resources (or if they have external RAM
resources, they fall outside the bit-banding address spaces).
Regardless, yes, these tricks would have to sit in front of the MMU,
not after it.

Jim Wilson

Jun 30, 2018, 5:59:34 PM
to Bruce Hoult, Krste Asanovic, Gnanasekar R, RISC-V ISA Dev
On Wed, Jun 27, 2018 at 7:53 PM, Bruce Hoult <bruce...@sifive.com> wrote:
> Someone should teach that to gcc because...
>
> unsigned clr31_1(unsigned i){return i & 0x7fffffff;}
> unsigned clr31_2(unsigned i){return (i<<1)>>1;}
>
> ... both produce the same code and it's ten bytes ...

Fixed upstream. It now generates 2 immediate shifts. I also noticed
the same problem could occur with (i>>33)<<33 on riscv64 and fixed
that too.
https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01924.html

Jim

Bruce Hoult

Jun 30, 2018, 6:09:34 PM
to Jim Wilson, Krste Asanovic, Gnanasekar R, RISC-V ISA Dev
\o/
