Albert van der Horst wrote:
> I didn't claim that it was "the" fastest method, as you do.
> That is hard to tell without a comprehensive survey.
> I was much more cautious with "techniques like".
> If your table has time to get swapped out of the cache before the next
> square root is called, it may not be so fast compared to an inline
> version of e.g. the above code.
Well, this uses two 255-byte tables, 510 bytes in all, so I doubt it will break the bank.
Note that it WILL require changes for 64-bit, but those are trivial: set /cell to 8 and
repeat the inner lines four more times. Tested with Gforth.
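For comparison, here is a minimal sketch of a variant that avoids the 64-bit edit by looping over the bytes of a cell instead of unrolling them. It is not the code from this post: the names bits/byte, msb-tab, byte-msb, msb-init and lastbit* are hypothetical, the table is filled at load time rather than typed in, and 8-bit address units are assumed. The loop costs a few extra branches compared with the unrolled version.

```forth
\ Sketch only: all names here are hypothetical, not taken from the post.
8 constant bits/byte                   \ assumes 8-bit address units

create msb-tab 255 chars allot         \ entry b-1: highest set bit (1..8) of byte b

: byte-msb ( byte -- pos )             \ slow bit count, used only to fill the table
  0 swap begin dup while 1 rshift swap 1+ swap repeat drop ;

: msb-init ( -- )                      \ fill entries for byte values 1..255
  256 1 do  i byte-msb  msb-tab i 1- chars + c!  loop ;
msb-init

: lastbit* ( n -- pos )                \ 1-based highest set bit, 0 if n = 0
  dup 0= if exit then                  \ n = 0 is already the answer
  1 cells 1- bits/byte *               \ n shift   start at the top byte
  begin
    2dup rshift 255 and dup 0=         \ n shift byte flag
  while
    drop bits/byte -                   \ empty byte: step down to the next one
  repeat
  1- chars msb-tab + c@ + nip ;        \ shift + table value
```

With this loaded, `256 lastbit* .` prints 9 and `0 lastbit* .` prints 0, independent of the cell size.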
Hans Bezemer
8 constant char-bits  \ bits per byte
4 constant /cell      \ bytes per cell; 32-bit only, 64-bit needs 8 plus extra lines below
\ (msb) ( index -- pos)  entry b-1 is the 1-based position of the highest set bit of byte b
create (msb)
1 c, 2 c, 2 c, 3 c, 3 c, 3 c, 3 c, 4 c, 4 c, 4 c, 4 c, 4 c, 4 c, 4 c, 4 c,
5 c, 5 c, 5 c, 5 c, 5 c, 5 c, 5 c, 5 c, 5 c, 5 c, 5 c, 5 c, 5 c, 5 c, 5 c,
5 c, 6 c, 6 c, 6 c, 6 c, 6 c, 6 c, 6 c, 6 c, 6 c, 6 c, 6 c, 6 c, 6 c, 6 c,
6 c, 6 c, 6 c, 6 c, 6 c, 6 c, 6 c, 6 c, 6 c, 6 c, 6 c, 6 c, 6 c, 6 c, 6 c,
6 c, 6 c, 6 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c,
7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c,
7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c,
7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c,
7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 7 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c,
8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c,
8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c,
8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c,
8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c,
8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c,
8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c,
8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c,
8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c, 8 c,
does> swap chars + c@ ;
\ (lsb) ( index -- pos)  entry b-1 is the 1-based position of the lowest set bit of byte b
create (lsb)
1 c, 2 c, 1 c, 3 c, 1 c, 2 c, 1 c, 4 c, 1 c, 2 c, 1 c, 3 c, 1 c, 2 c, 1 c,
5 c, 1 c, 2 c, 1 c, 3 c, 1 c, 2 c, 1 c, 4 c, 1 c, 2 c, 1 c, 3 c, 1 c, 2 c,
1 c, 6 c, 1 c, 2 c, 1 c, 3 c, 1 c, 2 c, 1 c, 4 c, 1 c, 2 c, 1 c, 3 c, 1 c,
2 c, 1 c, 5 c, 1 c, 2 c, 1 c, 3 c, 1 c, 2 c, 1 c, 4 c, 1 c, 2 c, 1 c, 3 c,
1 c, 2 c, 1 c, 7 c, 1 c, 2 c, 1 c, 3 c, 1 c, 2 c, 1 c, 4 c, 1 c, 2 c, 1 c,
3 c, 1 c, 2 c, 1 c, 5 c, 1 c, 2 c, 1 c, 3 c, 1 c, 2 c, 1 c, 4 c, 1 c, 2 c,
1 c, 3 c, 1 c, 2 c, 1 c, 6 c, 1 c, 2 c, 1 c, 3 c, 1 c, 2 c, 1 c, 4 c, 1 c,
2 c, 1 c, 3 c, 1 c, 2 c, 1 c, 5 c, 1 c, 2 c, 1 c, 3 c, 1 c, 2 c, 1 c, 4 c,
1 c, 2 c, 1 c, 3 c, 1 c, 2 c, 1 c, 8 c, 1 c, 2 c, 1 c, 3 c, 1 c, 2 c, 1 c,
4 c, 1 c, 2 c, 1 c, 3 c, 1 c, 2 c, 1 c, 5 c, 1 c, 2 c, 1 c, 3 c, 1 c, 2 c,
1 c, 4 c, 1 c, 2 c, 1 c, 3 c, 1 c, 2 c, 1 c, 6 c, 1 c, 2 c, 1 c, 3 c, 1 c,
2 c, 1 c, 4 c, 1 c, 2 c, 1 c, 3 c, 1 c, 2 c, 1 c, 5 c, 1 c, 2 c, 1 c, 3 c,
1 c, 2 c, 1 c, 4 c, 1 c, 2 c, 1 c, 3 c, 1 c, 2 c, 1 c, 7 c, 1 c, 2 c, 1 c,
3 c, 1 c, 2 c, 1 c, 4 c, 1 c, 2 c, 1 c, 3 c, 1 c, 2 c, 1 c, 5 c, 1 c, 2 c,
1 c, 3 c, 1 c, 2 c, 1 c, 4 c, 1 c, 2 c, 1 c, 3 c, 1 c, 2 c, 1 c, 6 c, 1 c,
2 c, 1 c, 3 c, 1 c, 2 c, 1 c, 4 c, 1 c, 2 c, 1 c, 3 c, 1 c, 2 c, 1 c, 5 c,
1 c, 2 c, 1 c, 3 c, 1 c, 2 c, 1 c, 4 c, 1 c, 2 c, 1 c, 3 c, 1 c, 2 c, 1 c,
does> swap chars + c@ ;
: (msb@) ( n shift byte -- bit) 1- (msb) + nip ;
: (lsb@) ( n shift byte -- bit) 1- (lsb) + nip ;
: lastbit ( n1 -- n2)  \ n2 = 1-based position of the highest set bit, 0 if n1 = 0
/cell 1- char-bits * over over rshift dup if (msb@) exit then  \ top byte
drop char-bits - over over rshift 255 and dup if (msb@) exit then
drop char-bits - over over rshift 255 and dup if (msb@) exit then
drop char-bits - over 255 and dup if (msb@) exit then  \ bottom byte, shift is 0
drop drop dup xor  \ no bit set: leave 0
;
: firstbit ( n1 -- n2)  \ n2 = 1-based position of the lowest set bit, 0 if n1 = 0
0 over 255 and dup if (lsb@) exit then  \ bottom byte, shift is 0
drop char-bits + over over rshift 255 and dup if (lsb@) exit then
drop char-bits + over over rshift 255 and dup if (lsb@) exit then
drop char-bits + over over rshift dup if (lsb@) exit then  \ top byte
drop drop dup xor  \ no bit set: leave 0
;
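
Hand-typed tables are easy to get wrong, so a quick cross-check against a bit-at-a-time reference may be worthwhile. The following is a sketch under assumptions, not part of the posted code: naive-lastbit, naive-firstbit and agree? are hypothetical names, and the check only covers 0..65535, i.e. the two low bytes.

```forth
\ Sketch only: all names here are hypothetical, not taken from the post.
: naive-lastbit ( n -- pos )           \ 1-based highest set bit, 0 if n = 0
  0 swap begin dup while 1 rshift swap 1+ swap repeat drop ;

: naive-firstbit ( n -- pos )          \ 1-based lowest set bit, 0 if n = 0
  dup 0= if exit then
  1 swap begin dup 1 and 0= while 1 rshift swap 1+ swap repeat drop ;

: agree? ( xt1 xt2 -- flag )           \ true if both words match on 0..65535
  65536 0 do
    2dup                               \ xt1 xt2 xt1 xt2
    i swap execute                     \ xt1 xt2 xt1 r2
    swap i swap execute                \ xt1 xt2 r2 r1
    <> if 2drop false unloop exit then
  loop 2drop true ;
```

With the definitions from the post loaded as well, `' lastbit ' naive-lastbit agree? .` and `' firstbit ' naive-firstbit agree? .` should each print -1 (true).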