The byte to reverse is initially in the accumulator, and the
result should also be in the accumulator.
http://graphics.stanford.edu/~seander/bithacks.html
has interesting techniques for doing this, but they might
not be so useful on a 6502.
Paul Guertin
p...@sff.net
pha
and #$0F
tax
pla
lsr
lsr
lsr
lsr
tay
lda tbl1,x
ora tbl2,y
tbl1 dc '$00, $80, $40, $C0, $20, $A0, $60, $E0'
dc '$10, $90, $50, $D0, $30, $B0, $70, $F0'
tbl2 dc '$00, $08, $04, $0C, $02, $0A, $06, $0E'
dc '$01, $09, $05, $0D, $03, $0B, $07, $0F'
-Lucas
Here's a slow but short version that uses recursion and must be called
as a subroutine. Up to 208 cycles, but only 12 bytes
bitrev anop
beq .2
pha
asl
jsr bitrev
plx
cpx #$80
rol
.2 rts
-Lucas
26 bytes, using a zero-page variable
sta val
asl val
ror
asl val
ror
asl val
ror
asl val
ror
asl val
ror
asl val
ror
asl val
ror
asl val
ror
10 bytes, using X and zero-page
sta val
ldx #8
.1 asl val
ror
dex
bne .1
56 bytes, 42.5 cycles (average), inlined, using no other registers, or
memory. This one uses the fact, that, to swap bits X and Y in a
bytes, we can perform the operation
X = X xor (X xor Y)
Y = Y xor (X xor Y)
The three BIT/BEQ pairs are what computes this value. If someone can
figure out a way to compute the XOR between any two bits in a byte
more efficiently that what I've got, the code can be improved. Try to
come up with something of the semantics "If (bit X xor bit Y) == 0,
then branch".
bit #$81
beq .2
bit #$80
beq .1
bit #$01
beq .2
.1 eor #$81
.2 bit #$42
beq .4
bit #$40
beq .3
bit #$02
beq .4
.3 eor #$42
.4 bit #$24
beq .6
bit #$20
beq .5
bit #$04
beq .6
.5 eor #$24
.6 bit #$18
beq .8
bit #$10
beq .7
bit #$08
beq .8
.7 eor #$18
.8 done
-Lucas
plx? Don't you mean pla? After all, he did say 6502, not 65816.
And if you use pla, you are going to have to change cpx #$80 to
cmp #$80. And if you just want bit 7 moved into the carry, you
could always save one byte by changing it to asl.
> > 26 bytes, using a zero-page variable
> >
> > 10 bytes, using X and zero-page
>
> 56 bytes, 42.5 cycles (average), inlined, using no other registers,
> or memory
10 bytes, inlined, using only stack memory, 136 cycles maximum.
bit #$0
.1 php
asl
bne .1
.2 rol
plp
bne .2
(Note: bit immediate is a 65C02 instruction; on a 6502, you need
to bit $zp where zp contains zero).
Paul Guertin
p...@sff.net
Ack! I've been caught. Sorry for getting my 6502 vs. 65C02 mixed up.
I guess this (and my previous entry using BIT #Imm are disqualified.
I'll have to be more careful in the future.
Thanks for catching this.
-Lucas
That is brilliant! I had forgotten all about the php/plpl
instructions. They certainly do the trick here. Very elegant, I
especially like the use of BIT to guarantee that P=0. I've not seen
that trick before.
-Lucas
36 bytes, 56 cycles, uses X and one zero page byte.
Replace TAX -> PHA and TXA -> PLA throughout to free X at the
cost of 6 cycles.
tax ;swap contiguous bits
and #$55
asl
sta $00
txa
and #$AA
lsr
ora $00
tax ;swap contiguous pairs
and #$33
asl
asl
sta $00
txa
and #$CC
lsr
lsr
ora $00
asl ;swap contiguous nibbles
rol ;(see previous thread)
rol
rol
sta $00
and #$7
adc $00
Paul Guertin
p...@sff.net
<snip>
|> bit #$0
|> .1 php
|> asl
|> bne .1
|> .2 rol
|> plp
|> bne .2
|>
|> (Note: bit immediate is a 65C02 instruction; on a 6502, you need
|> to bit $zp where zp contains zero).
|
|That is brilliant! I had forgotten all about the php/plpl
|instructions. They certainly do the trick here. Very elegant, I
|especially like the use of BIT to guarantee that P=0. I've not seen
|that trick before.
Yes, that certainly looks very elegant!
(Dare I wonder how it would be done on a Z80?)