Joe Keane wrote:
> In article <9kmj69-...@ntp6.tmsw.no>,
> Terje Mathisen <"terje.mathisen at tmsw.no"> wrote:
>> When did you last time the uint16->float->exponent trick?
>
> just now
>
> time in us
>
> 'server'
> 239 testfx
> 270 testfy
>
> 'PC'
> 211 testfx
> 186 testfy

OK, so the 'x' version is 64K tests of the table lookup, while 'y' is
the same number of fp conversions.

Problem one: The table lookups will have 100% cache hits since you are
doing nothing else, so you don't measure the misses you'll get in a real
program.

Problem two: The tests use a really funky way to randomize the inputs
(multiply by 31), which means that the table lookup version will have
very close to 100% branch prediction hits for the pattern of switching
between the top and low byte.

Problem three: How common will zero be?
This affects the branch hit ratio for the fp version, which needs an
explicit zero test: as posted, kk_fhsy(0) returns -127 where the table
version returns 99, and the test loop never passes in a zero.
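
To make the zero case concrete, here is a minimal sketch of the fp
version with the extra test in front. It is not the code that was timed
above. The names fhs_fp_checked, fhs_ref and fhs_check_all are made up
for illustration, the 99 for zero is simply borrowed from fhstab[0], and
the whole thing assumes float is a 32-bit IEEE 754 single.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical variant of the fp trick with an explicit zero test.
 * Returns the index of the highest set bit of a 16-bit mask, or 99
 * for zero (the value fhstab[0] holds in the table version).
 * Assumes float is a 32-bit IEEE 754 single. */
int fhs_fp_checked(unsigned mask)
{
    float f;
    uint32_t bits;

    assert(sizeof(float) == sizeof(uint32_t));

    mask &= 0xFFFFu;
    if (mask == 0)              /* this is the branch whose hit ratio matters */
        return 99;

    f = (float) mask;           /* exact: 16 bits fit in the 24-bit significand */
    memcpy(&bits, &f, sizeof bits);

    /* Sign bit is 0 for a positive value, so bits >> 23 is the biased
     * exponent; subtracting the bias of 127 gives floor(log2(mask)). */
    return (int) (bits >> 23) - 127;
}

/* Plain shift loop used only as a reference for checking. */
int fhs_ref(unsigned mask)
{
    int n = -1;

    while (mask != 0) {
        mask >>= 1;
        n++;
    }
    return n;
}

/* Counts mismatches between the two versions over all nonzero
 * 16-bit masks. */
int fhs_check_all(void)
{
    unsigned m;
    int bad = 0;

    for (m = 1; m <= 0xFFFFu; m++)
        if (fhs_fp_checked(m) != fhs_ref(m))
            bad++;
    return bad;
}
```

Calling fhs_check_all() from a small main is enough to confirm that the
two versions agree on every nonzero input. How much the added branch
costs depends entirely on how often zero shows up in the real input.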

All that said, you are getting effectively the same speed for both
versions, and the fp code needs no table space, so it has better cache
behavior.

Terje
>
> -- fhsx.c
> static const char fhstab[256] =
> {
> 99, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3,
> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
> 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
> 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
> [...]
> };
>
>
> extern int kk_fhsx(int mask)
> {
> int ret;
>
> ret = (mask & 0xFF00) != 0 ? fhstab[mask >> 8 & 0xFF] + 8 :
> fhstab[mask & 0xFF];
> return ret;
> }
>
> -- fhsy.c
> union u
> {
> float f;
> unsigned int u;
> };
>
>
> extern int kk_fhsy(int mask)
> {
> union u t;
> int bex;
> int ret;
>
> t.f = (float) mask;
> bex = t.u >> 23;
> ret = bex - 127;
> return ret;
> }
>
> -- testfx.c
> #include <stdio.h>
>
> static void testfx(int *sump);
> extern int kk_fhsx(int mask);
>
>
> [main]
>
>
> static void testfx(int *sump)
> {
> int m;
>
> for (m = 0x1; m <= 0xFFFF; m++)
> {
> int mask;
> int ret;
>
> mask = 31 * m & 0xFFFF;
> ret = kk_fhsx(mask);
> /* *sump += ret; */
> }
> }
>
> -- testfy.c
> #include <stdio.h>
>
> static void testfy(int *sump);
> extern int kk_fhsy(int mask);
>
>
> [main]
>
>
> static void testfy(int *sump)
> {
> int m;
>
> for (m = 0x1; m <= 0xFFFF; m++)
> {
> int mask;
> int ret;
>
> mask = 31 * m & 0xFFFF;
> ret = kk_fhsy(mask);
> /* *sump += ret; */
> }
> }