More 6502 bit twiddling

Paul Guertin

unread,

Mar 18, 2004, 10:24:57 PM3/18/04

to

Challenge: write the fastest (average number of cycles) or
shortest 6502 routine (smallest number of bytes, including
executable code and data tables) that will reverse the order
of bits in a byte. For instance, 10000100 should become
00100001, and 11010111 should become 11101011.

The byte to reverse is initially in the accumulator, and the
result should also be in the accumulator.

http://graphics.stanford.edu/~seander/bithacks.html
has interesting techniques for doing this, but they might
not be so useful on a 6502.

Paul Guertin
p...@sff.net

lsch...@d.umn.edu

unread,

Mar 19, 2004, 1:56:29 PM3/19/04

to

Probably not the fastest or smallest, but something to kick the
competition off with. 27 cycles, 48 bytes

pha
and #$0F
tax
pla
lsr
lsr
lsr
lsr
tay
lda tbl1,x
ora tbl2,y

tbl1 dc '$00, $80, $40, $C0, $20, $A0, $60, $E0'
dc '$10, $90, $50, $D0, $30, $B0, $70, $F0'
tbl2 dc '$00, $08, $04, $0C, $02, $0A, $06, $0E'
dc '$01, $09, $05, $0D, $03, $0B, $07, $0F'

-Lucas

Lucas

unread,

Mar 20, 2004, 7:30:27 PM3/20/04

to

lsch...@d.umn.edu wrote in message news:<c3ffot$dk4$1...@lenny.tc.umn.edu>...

> Probably not the fastest or smallest, but something to kick the
> competition off with. 27 cycles, 48 bytes

Here's a slow but short version that uses recursion and must be called
as a subroutine. Up to 208 cycles, but only 12 bytes

bitrev anop
beq .2
pha
asl
jsr bitrev
plx
cpx #$80
rol
.2 rts

-Lucas

Jerry Penner

unread,

Mar 20, 2004, 10:38:20 PM3/20/04

to

26 bytes, using a zero-page variable
sta val
asl val
ror
asl val
ror
asl val
ror
asl val
ror
asl val
ror
asl val
ror
asl val
ror
asl val
ror

10 bytes, using X and zero-page
sta val
ldx #8
.1 asl val
ror
dex
bne .1

Lucas

unread,

Mar 21, 2004, 2:46:30 PM3/21/04

to

> 26 bytes, using a zero-page variable
>

> 10 bytes, using X and zero-page

56 bytes, 42.5 cycles (average), inlined, using no other registers, or
memory. This one uses the fact, that, to swap bits X and Y in a
bytes, we can perform the operation

X = X xor (X xor Y)
Y = Y xor (X xor Y)

The three BIT/BEQ pairs are what computes this value. If someone can
figure out a way to compute the XOR between any two bits in a byte
more efficiently that what I've got, the code can be improved. Try to
come up with something of the semantics "If (bit X xor bit Y) == 0,
then branch".

bit #$81
beq .2
bit #$80
beq .1
bit #$01
beq .2
.1 eor #$81
.2 bit #$42
beq .4
bit #$40
beq .3
bit #$02
beq .4
.3 eor #$42
.4 bit #$24
beq .6
bit #$20
beq .5
bit #$04
beq .6
.5 eor #$24
.6 bit #$18
beq .8
bit #$10
beq .7
bit #$08
beq .8
.7 eor #$18
.8 done

-Lucas

Matthew Montchalin

unread,

Mar 22, 2004, 5:42:09 AM3/22/04

to

plx? Don't you mean pla? After all, he did say 6502, not 65816.
And if you use pla, you are going to have to change cpx #$80 to
cmp #$80. And if you just want bit 7 moved into the carry, you
could always save one byte by changing it to asl.

Paul Guertin

unread,

Mar 22, 2004, 7:59:37 AM3/22/04

to

On 21 Mar 2004 11:46:30 -0800, lsch...@d.umn.edu (Lucas) wrote:

> > 26 bytes, using a zero-page variable
> >
> > 10 bytes, using X and zero-page
>
> 56 bytes, 42.5 cycles (average), inlined, using no other registers,
> or memory

10 bytes, inlined, using only stack memory, 136 cycles maximum.

bit #$0
.1 php
asl
bne .1
.2 rol
plp
bne .2

(Note: bit immediate is a 65C02 instruction; on a 6502, you need
to bit $zp where zp contains zero).

Paul Guertin
p...@sff.net

Lucas

unread,

Mar 22, 2004, 6:57:42 PM3/22/04

to

Matthew Montchalin <mmon...@OregonVOS.net> wrote in message news:<Pine.LNX.4.44.040322...@lab.oregonvos.net>...

Ack! I've been caught. Sorry for getting my 6502 vs. 65C02 mixed up.
I guess this (and my previous entry using BIT #Imm are disqualified.
I'll have to be more careful in the future.

Thanks for catching this.

-Lucas

Lucas

unread,

Mar 22, 2004, 7:04:24 PM3/22/04

to

Paul Guertin <p...@sff.net> wrote in message news:<loot50pe4revpbqpl...@4ax.com>...

That is brilliant! I had forgotten all about the php/plpl
instructions. They certainly do the trick here. Very elegant, I
especially like the use of BIT to guarantee that P=0. I've not seen
that trick before.

-Lucas

Paul Guertin

unread,

Mar 22, 2004, 9:28:10 PM3/22/04

to

I tried implementing a classic hacker's trick for flipping bits
"in parallel". To my surprise, it was reasonably efficient.

36 bytes, 56 cycles, uses X and one zero page byte.

Replace TAX -> PHA and TXA -> PLA throughout to free X at the
cost of 6 cycles.

tax ;swap contiguous bits
and #$55
asl
sta $00
txa
and #$AA
lsr
ora $00

tax ;swap contiguous pairs
and #$33
asl
asl
sta $00
txa
and #$CC
lsr
lsr
ora $00

asl ;swap contiguous nibbles
rol ;(see previous thread)
rol
rol
sta $00
and #$7
adc $00

Paul Guertin
p...@sff.net

Matthew Montchalin

unread,

Mar 24, 2004, 1:52:25 AM3/24/04

to

On 22 Mar 2004, Lucas wrote:
|Paul Guertin <p...@sff.net> wrote in message news:<loot50pe4revpbqpl...@4ax.com>...
|> On 21 Mar 2004 11:46:30 -0800, lsch...@d.umn.edu (Lucas) wrote:

<snip>

|> bit #$0
|> .1 php
|> asl
|> bne .1
|> .2 rol
|> plp
|> bne .2
|>
|> (Note: bit immediate is a 65C02 instruction; on a 6502, you need
|> to bit $zp where zp contains zero).
|
|That is brilliant! I had forgotten all about the php/plpl
|instructions. They certainly do the trick here. Very elegant, I
|especially like the use of BIT to guarantee that P=0. I've not seen
|that trick before.

Yes, that certainly looks very elegant!

(Dare I wonder how it would be done on a Z80?)