Dwight Elvey wrote:
[stuff about using tables to do CRCs 4 bits at a time deleted]
> Now back to the original questions. I haven't
> totally figured out how the clever code by Jon Buller
> works, but it looks like it takes advantage
> of the fact that the polynomial x^16+x^15+x^2+1
> has no terms in the range of x^7 to x^14. The
> polynomial x^16+x^12+x^5+1 has the term x^12
> that looks like it would destroy much chance
> of making such a simple algorithm. When I figure
> out Jon Buller's algorithm, I'll know better if
> what I suspect is true and if a simple code
> sequence as his can be applied to the polynomial
> x^16+x^12+x^5+1. If so I'll publish it here for
> all to use.
Well, it's not really that hard... What I did (after
someone pointed out the trick to me) was to run 8 bits
of the specific CRC in question symbolically, instead of
on real data. I labeled the low CRC bits as A,B,C...H,
the high bits as P,Q...X, and the incoming data as 0,1...7
I used a Turbo-Pascal program to spit out the table, since I
had that available at the time. it can be done with some
spreadsheet macros as well, because I did it that way once too.
You can do it by hand, but it gets real tedious, and it's semi
prone to making small mistakes, which will blow the thing out
of the water before you get your toes wet.
After 8 bits of shifting and XORing, you get the following
code. (for CRC-CCITT) Of course, once you have the table of
which bits you need to XOR together, you need to figure out
a way to collect them into the same bytes, in the proper
order, and fast helps too 8-) And, yes, it is a bit worse
than the CRC-16 code (Which I included below as well, for
completeness mostly, but I discuss it a tiny bit after the
CCITT code...)
It really is relatively automatic, except for the part about
going over your code several times to squeeze out every last
cycle. Having a friend doing the same thing really helps,
because Jeff Owens (in this case) would take my code and
squeeze about 2 bytes and 1 cycle out of my code, then I came
back and squeezed 1 byte and 3 cycles out of his. Do this for
an hour or two every day for a week or two, and you get this...
(BTW, in this particular contest I won, but not by much, and I
had to beat Jeff about 3 or 4 different times, most of which
were from him beating my previous try.)
To figure out the code, take the table, and compare groups of
bits added into the result with small blocks of instructions.
In general the code shifts and rolls some data a bit, then XORs
it, then does it again to get another block of the table into
the CRC result.
;------------------------------------------------------------------------
;
; A DS5000 subroutine to generate CCITT-CRC data
; By Jon Buller Feb 27, 1991
;
; CRC Input: Data: CRC Result:
; A 0 P A0 E4
; B 1 Q B1 F5
; C 2 R C2 G6
; D 3 S D3 H7 A0
; E 4 T B1
; F 5 U C2
; G 6 W D3
; H 7 X A0 E4
; P A0 B1 F5
; Q B1 C2 G6
; R C2 D3 H7
; S D3
; T E4 A0
; U F5 B1
; W G6 C2
; X H7 D3
;
; All bits listed in the result should be xor'ed together. Bit '0' and
bit
; 'A' are the least significant bits of incoming data. The CRC input
data
; will be changed into the CRC result by this routine. If there is some
; reason that the previous CRC data must be saved, it must be done by
the
; calling routine. (The same thing goes for the 'psw' register.)
;
; This routine uses 28 instruction cycles per call.
;
;------------------------------------------------------------------------
lo equ 40h
hi equ 41h
ccitt:
push acc ; save some state information for caller
xrl a, lo ; data bits always appear with these crc
bits
mov lo, hi ; start accumulating crc information
mov hi, a
anl a, #0fh
swap a ; add this nibble to the crc hi byte
xrl a, hi
mov hi, a ; this is used everywhere
swap a
anl a, #0fh ; take the hi nibble just made
xrl lo, a ; and add it to the low byte
mov a, hi
rr a ; now shift everything 3 bits
swap a
push acc ; and save it for later
anl a, #0f8h ; take these 5 bits
xrl lo, a ; and finish of the lo byte with them
pop acc ; get the shifted byte back
anl a, #07h ; take 3 of the bits this time
xrl hi, a ; and finish the hi byte
pop acc ; restore this for the caller
ret ; and return the results
end
Now for all the people who didn't see the original CRC16 code with
comments here it is, in hopes that the comments will help a bit, more
than they will hinder. (I'd put even money on that at this point, so
if they confuse you, strip them out.) The main trick to this one is
to recognize that the real work needed is a parity calculation, and
that is readily available on the 8051. This does make it simpler
than the CCITT code, as you can see from the table, in fact, if I
hadn't done this one first, I may not have had any try at CCITT,
thinking that it was way too complicated to be worth the bother.
;------------------------------------------------------------------------
;
; A DS5000 subroutine to generate CRC16 data
;
; CRC Input: Data: CRC Result:
; A 0 P ABCDEFGH01234567
; B 1 Q
; C 2 R
; D 3 S
; E 4 T
; F 5 U
; G 6 W A0
; H 7 X A0 B1
; P B1 C2
; Q C2 D3
; R D3 E4
; S 4F E5
; T F5 G6
; U G6 H7
; W H7 ABCDEFGH01234567
; X ABCDEFGH01234567
;
; All bits listed in the result should be xor'ed together. Bit '0' and
bit
; 'A' are the least significant bits of incoming data. The CRC input
data
; will be changed into the CRC result by this routine. If there is some
; reason that the previous CRC data must be saved, it must be done by
the
; calling routine. (The same thing goes for the 'psw' register.)
;
; Notice that the sequence ABCDEFGH01234567 is the parity of the data
and
; the low byte of the crc input. The code that follows tries to put
large
; pieces of the CRC into place, and then clean up the bits that need
fixing.
;
; This routine will execute in 23.5 cycles on average (22 best case, 25
worst)
; The 23.5 cycle time average is based on all 3 conditional branches
being taken
; half of the time. CRC's usually produce a uniform distribution of
outputs
; for any change of the input, so it should be a fair to guess that the
3
; conditional branches will, in fact, be taken half the time, although
this
; has not been verified experimentally.
;
; The variable 'hi' does not need to be bit addressable, and if memory
is so
; tight that 1 byte in the bit addressable space is not desirable and/or
; available, the three instructions that cpl a bit in 'lo' could be
changed
; to xrl instructions, at the cost of 3 bytes of code and 1.5 cycles on
; average.
;
;------------------------------------------------------------------------
lo equ 20h ; low byte of crc
hi equ 21h ; high byte of crc
crc16:
push acc ; save this in case the caller needs it
xrl a, lo
mov lo, hi ; put the high byte of the crc in its
dest.
mov hi, a ; save data xor low(crc) for later
mov c, p
jnc crc0
cpl lo.0 ; add the parity to crc bit 0
crc0:
rrc a ; get the low bit in c
jnc crc1
cpl lo.6 ; need to fix bit 6 of the result
crc1:
mov c, acc.7
xrl a, hi ; compute the results for bits P...U
rrc a ; shift them into place
mov hi, a ; and save them
jnc crc2
cpl lo.7 ; now clean up bit 7
crc2:
pop acc ; restore everything and return
ret
end
Hope you all enjoy, hope it helps too...
Jon Buller
.include lawyer.txt
.include quote.txt