unsigned int R = 0, G = 0, B = 0, A = 0;
void SetColor(float r, float g, float b, float a)
{
    // Each float-to-integer conversion below compiles to a call to _ftol2.
    R = (unsigned int)(r * 255.f);
    if(R > 255) R = 255;
    G = (unsigned int)(g * 255.f);
    if(G > 255) G = 255;
    B = (unsigned int)(b * 255.f);
    if(B > 255) B = 255;
    A = (unsigned int)(a * 255.f);
    if(A > 255) A = 255;
}
This was horribly slow, mostly because of _ftol2, the new safe float-to-long
conversion routine in Visual Studio .NET. So I did this:
inline unsigned int FloatToLong(float a) {
unsigned int retval;
__asm fld a
__asm fistp retval
return retval;
}
unsigned int R = 0, G = 0, B = 0, A = 0;
void SetColor(float r, float g, float b, float a)
{
R = FloatToLong(r * 255.f);
if(R > 255) R = 255;
G = FloatToLong(g * 255.f);
if(G > 255) G = 255;
B = FloatToLong(b * 255.f);
if(B > 255) B = 255;
A = FloatToLong(a * 255.f);
if(A > 255) A = 255;
}
While this was definitely better (about 3-4x faster), when I looked at the
assembly code coming out, it seemed that I could do better with manually
written assembly code, so I did this:
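unsigned int R = 0, G = 0, B = 0, A = 0;
int Scale = 0xff; // fimul needs an integer operand in memory, not an immediate
void SetColor(float r, float g, float b, float a)
{
    __asm
    {
        fld r
        fimul Scale     // st(0) = r * 255
        fistp R         // store as integer using the current rounding mode
        cmp R, 0ffh
        jbe green       // unsigned compare: anything above 255 is clamped
        mov R, 0ffh
    green:
        fld g
        fimul Scale
        fistp G
        cmp G, 0ffh
        jbe blue
        mov G, 0ffh
    blue:
        fld b
        fimul Scale
        fistp B
        cmp B, 0ffh
        jbe alpha
        mov B, 0ffh
    alpha:
        fld a
        fimul Scale
        fistp A
        cmp A, 0ffh
        jbe end
        mov A, 0ffh
    end:
    }
}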
This was a very slight improvement, mostly because I no longer need to call
FloatToLong, saving the pushes and pops involved, but also because the
assembly code Visual Studio .NET generated wasn't re-using the register
containing 0xff.
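For comparison, here is a minimal sketch of the same idea without inline
assembly. It assumes a compiler that provides std::lrint, which converts
using the current rounding mode much as fistp does. The helper name ToByte
is hypothetical, and clamping negative inputs to 0 is an assumption of this
sketch (the unsigned compare in the code above sends them to 255 instead).
Note that rounding to nearest can differ by one from the truncating cast in
the first version. Whether this is actually faster on a given compiler
would have to be measured.

```cpp
#include <cmath>

// Hypothetical helper (not from the original post): scale a float channel
// to 0..255, round using the current rounding mode, then clamp.
inline unsigned int ToByte(float v)
{
    long n = std::lrint(v * 255.0f);  // rounds per current mode, no truncation
    if (n < 0)   n = 0;               // assumption: negative inputs clamp to 0
    if (n > 255) n = 255;
    return static_cast<unsigned int>(n);
}

unsigned int R = 0, G = 0, B = 0, A = 0;

void SetColor(float r, float g, float b, float a)
{
    R = ToByte(r);
    G = ToByte(g);
    B = ToByte(b);
    A = ToByte(a);
}
```
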
This function is called tens of thousands of times per frame, and accounts
for almost 30% of the execution time after optimization (it was more like 70%
before).
I'm currently thinking about using SIMD (i.e. MMX regs) to do the
conversion, though the overhead on a single call will probably wipe out any
performance gains I get from doing 4 conversions at once. I'd probably have
to batch up as many SetColor calls as possible in order to get any
improvement from MMX.
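A batched version along those lines might look roughly like the sketch
below. It swaps in SSE2 intrinsics instead of MMX registers, which is an
assumption about the target CPU and compiler. The function name
ConvertColors, the helper ChannelToByte, and the packed layout (four floats
in, four bytes out per color) are all hypothetical. The pack instructions
saturate, so the clamp to 0..255 comes for free in the vector loop, and a
scalar loop handles leftovers and builds without SSE2.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define HAVE_SSE2 1
#endif

// Hypothetical scalar helper: scale, round (current rounding mode), clamp.
static inline std::uint8_t ChannelToByte(float v)
{
    long n = std::lrint(v * 255.0f);
    if (n < 0)   n = 0;
    if (n > 255) n = 255;
    return static_cast<std::uint8_t>(n);
}

// Hypothetical batch routine: src holds count*4 floats (RGBA, nominally in
// 0..1), dst receives count*4 bytes in the same order.
void ConvertColors(const float* src, std::uint8_t* dst, std::size_t count)
{
    std::size_t i = 0;
#ifdef HAVE_SSE2
    const __m128 scale = _mm_set1_ps(255.0f);
    // Four colors (16 floats) per iteration.
    for (; i + 4 <= count; i += 4) {
        const float* p = src + i * 4;
        __m128i c0 = _mm_cvtps_epi32(_mm_mul_ps(_mm_loadu_ps(p),      scale));
        __m128i c1 = _mm_cvtps_epi32(_mm_mul_ps(_mm_loadu_ps(p + 4),  scale));
        __m128i c2 = _mm_cvtps_epi32(_mm_mul_ps(_mm_loadu_ps(p + 8),  scale));
        __m128i c3 = _mm_cvtps_epi32(_mm_mul_ps(_mm_loadu_ps(p + 12), scale));
        __m128i lo = _mm_packs_epi32(c0, c1);      // 8 x int16, signed saturation
        __m128i hi = _mm_packs_epi32(c2, c3);
        __m128i bytes = _mm_packus_epi16(lo, hi);  // 16 x uint8, clamps to 0..255
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + i * 4), bytes);
    }
#endif
    // Scalar tail (or the whole job when SSE2 is unavailable).
    for (; i < count; ++i)
        for (std::size_t k = 0; k < 4; ++k)
            dst[i * 4 + k] = ChannelToByte(src[i * 4 + k]);
}
```

The per-call overhead only pays off if the callers can queue their colors
and flush them in one go, so this fits the "batch up the SetColor calls"
approach rather than a drop-in replacement for a single call.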
I cannot remove this function by using floating point colors directly due to
API limitations outside of my control, so a high-level redesign isn't
possible.
Any suggestions on how to make this faster are welcome!
Thanks,
Rob
If the numbers are in the range 0 to 1, why do you need to check each
component for the result being over 255?
I'd still suggest trying to get what's outside your control into your
control. Sounds like you're doing a float-to-int RGBA conversion for a full
buffer each frame, which would be a bit of work any way you look at it.
Eyal
Thanks,
Rob
"Eyal Teler" <e...@nospam-et3d.com> wrote in message
news:%23CkGK8Z...@TK2MSFTNGP10.phx.gbl...
>I'm trying to implement an algorithm to convert floating point colors in the
>range 0.0 to 1.0 to integer colors in the range 0x0-0xff. The first attempt
>was something like this (please ignore any typos, I'm paraphrasing...):
This of any use?
http://www.gamasutra.com/features/19990326/katmai_06.htm
-- Mat.