I have a question regarding fixed-point addition.
For example, I have two fixed-point numbers:
a = unsigned Q7.8 format (7-bit integer, 8-bit fractional).
b = unsigned Q7.8 format (same as a).
Now a + b = c, where c is an unsigned Q8.8 result.
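
To make the setup concrete, here is a minimal Python sketch of the addition, assuming each value is stored as a plain integer equal to the real value times 2^8. The helper names to_q7_8 and add_q7_8 are hypothetical, chosen only for this illustration.

```python
FRAC_BITS = 8  # fractional bits shared by Q7.8 and Q8.8


def to_q7_8(x):
    """Quantize a non-negative real number to an unsigned Q7.8 raw integer."""
    raw = int(round(x * (1 << FRAC_BITS)))
    if not 0 <= raw < (1 << 15):
        raise ValueError("value does not fit in unsigned Q7.8 (15 bits)")
    return raw


def add_q7_8(a, b):
    """Add two Q7.8 raw values. The sum can need one more integer bit,
    so the result is treated as Q8.8 (16 bits) with the same scaling."""
    return a + b


a = to_q7_8(100.5)
b = to_q7_8(60.25)
c = add_q7_8(a, b)
print(c / (1 << FRAC_BITS))  # 160.75
```

The largest possible sum, 0x7FFF + 0x7FFF = 0xFFFE, still fits in 16 bits, which is why one extra integer bit is enough for c.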
Question: How do I transform c into d, where d is an unsigned Q7.9 result?
The way I have tried to approach it is as follows:
Integer part
------------
My thinking for the integer part is that if bit [15] of the result c
is '1', then bits [14:8] of d are b"111_1111" (i.e. the integer part
saturates); otherwise d[14:8] = c[14:8].
Is this correct?
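
The integer-part rule above can be sketched in Python as follows. This is only an illustration under stated assumptions: c is held in a 16-bit word, and the function name saturate_integer_part is hypothetical. It returns the 7-bit integer field on its own, without placing it at any particular bit position in d.

```python
def saturate_integer_part(c):
    """Return the 7-bit integer field for d from a 16-bit Q8.8 value c."""
    c &= 0xFFFF                # keep only the 16 bits of c
    if (c >> 15) & 1:          # c[15] set: value >= 128, too big for 7 integer bits
        return 0b111_1111      # clamp the integer field to 127
    return (c >> 8) & 0x7F     # otherwise c[14:8] passes through unchanged


print(saturate_integer_part(0x1280))  # 18
print(saturate_integer_part(0xA0C0))  # 127
```

Note that clamping only the integer field leaves c's fractional bits as they are; a saturation to the largest representable Q7.9 value would also set the fractional bits to all ones.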
Fractional part
---------------
For the fractional part, I want d to have one extra fractional bit to
increase the fractional precision.
The obvious way to me seems to be to append an extra bit at the LSB end,
i.e. d[8:0] = c[7:0] & 1'b0.
Is this correct?
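
As a rough check of the fractional-part idea, here is a minimal Python sketch that appends a zero at the LSB end and then assembles a full 16-bit d. It assumes d's 7 integer bits sit directly above the 9 fractional bits, in d[15:9]; that placement and the names extend_fraction and q8_8_to_q7_9 are assumptions made for this illustration.

```python
def extend_fraction(c):
    """Return the 9-bit fractional field: c[7:0] with a 0 appended at the LSB end."""
    return (c & 0xFF) << 1


def q8_8_to_q7_9(c):
    """Convert a 16-bit Q8.8 value to Q7.9, clamping the integer field at 127."""
    c &= 0xFFFF
    # Integer field: all ones if c[15] is set, otherwise c[14:8].
    int_part = 0x7F if (c >> 15) & 1 else (c >> 8) & 0x7F
    # Assumed layout: integer field in d[15:9], fraction in d[8:0].
    return (int_part << 9) | extend_fraction(c)


d = q8_8_to_q7_9(0x1280)    # 0x1280 is 18.5 in Q8.8
print(hex(d), d / 512)      # 0x2500 18.5
```

The appended zero does not add information: the represented value stays the same and only the scaling changes from 2^-8 to 2^-9, so for inputs below 128 the conversion is equivalent to a left shift by one bit.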
Question: Can anyone recommend a good book on fixed-point and
floating-point arithmetic?
Thanks in advance,
J