I know... I'm a newbie, but I'm trying the best I can. So if someone has a
better idea, it will be appreciated
:-)
P.S.: Sorry for my poor English; I usually speak French.
Frédéric Landry
Sounds like some homework. And I'm not sure why you mention printf()
in the context of assembly. But I'll try and help. You've tried to
give it some glancing thought, at least.
A value can be imagined as this series:
A = 2^(0*N) * a[0]
+ 2^(1*N) * a[1]
+ 2^(2*N) * a[2]
+ 2^(3*N) * a[3]
+ ...
+ 2^((M-1)*N) * a[M-1]
where a[0..M-1] is the array of M words, each containing N bits.
Dividing by 10 gives:
A / 10 = 2^(0*N) * a[0] / 10
+ 2^(1*N) * a[1] / 10
+ 2^(2*N) * a[2] / 10
+ 2^(3*N) * a[3] / 10
+ ...
+ 2^((M-1)*N) * a[M-1] / 10
The result of the division has two parts, the quotient and the
remainder. Let's assume that dividing a[i] by 10 results in q[i] and
r[i], the respective quotient and remainder. Let's rephrase it in
that form:
A / 10 = 2^(0*N) * (q[0] + r[0] / 10)
+ 2^(1*N) * (q[1] + r[1] / 10)
+ 2^(2*N) * (q[2] + r[2] / 10)
+ 2^(3*N) * (q[3] + r[3] / 10)
+ ...
+ 2^((M-1)*N) * (q[M-1] + r[M-1] / 10)
or
A / 10 = 2^(0*N) * q[0] + 2^(0*N) * r[0] / 10
+ 2^(1*N) * q[1] + 2^(1*N) * r[1] / 10
+ 2^(2*N) * q[2] + 2^(2*N) * r[2] / 10
+ 2^(3*N) * q[3] + 2^(3*N) * r[3] / 10
+ ...
+ 2^((M-1)*N) * q[M-1] + 2^((M-1)*N) * r[M-1] / 10
Let's say that the result of A/10 is an array of b[i]. Starting from
the most significant word, this means that b[M-1] = q[M-1]. The
remainder r[M-1] still needs to be divided by 10, but that can be
handled by letting it become the high order word of the next division
by 10, with the low order coming from a[M-2]. The quotient of that is
b[M-2] and the remainder is used again, as before. In the end, the
last remainder is the digit.
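To make the carry-the-remainder step concrete, here is a minimal C sketch of one pass over the array. It assumes N = 16 (16-bit words) with a[0] as the least significant word; the function name div10_pass is made up for illustration and is not from the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* One long-division pass: divide the multiword number a[0..m-1]
   (16-bit words, a[0] least significant) by 10 in place and return
   the remainder, which is the next decimal digit (0..9). */
unsigned div10_pass(uint16_t *a, size_t m)
{
    uint32_t rem = 0;                        /* plays the role of DX */

    for (size_t i = m; i-- > 0; ) {          /* most significant word first */
        uint32_t cur = (rem << 16) | a[i];   /* remainder becomes the high word */
        a[i] = (uint16_t)(cur / 10);         /* quotient word b[i]; fits because rem < 10 */
        rem  = cur % 10;                     /* carried into the next lower word */
    }
    return (unsigned)rem;
}
```

Each call strips one decimal digit off the low end of the number, so the digits come out least significant first.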
Let's look at a specific example, to make it concrete:
24687168225 (in hex, 5BF784AE1)
Let's keep the values as decimal, though. And rather than handle this
in 32-bit words which would trivialize it, let's handle it using
16-bit words as if we had to do this on an 8088 processor.
The memory words would look like:
a[2] = 5
a[1] = 49016
a[0] = 19169
Although we can easily see by inspection that a[2] is less than 10 and
that we might "cheat" a little here, let's assume that we don't know
this fact.
Load up DX as 0, AX from the highest order word a[2], and divide by
10. The result is 0r5. Store the quotient 0 back in a[2], leave DX
as 5, and load AX from a[1]. (The number held in DX:AX now is
376696.) Divide by 10 giving 37669r6. Store 37669 into a[1], leave
DX as 6, and load AX from a[0]. (The number held in DX:AX now is
412385.) Divide by 10 giving 41238r5. Store 41238 in a[0]. Your
digit is 5, at the end of this phase. Here is what is in a[] now:
a[2] = 0
a[1] = 37669
a[0] = 41238
Here's the array at the starting point and after each step:

a[2]    a[1]    a[0]    digit
--------------------------------
   5   49016   19169
   0   37669   41238      5
   0    3766   63106      2
   0     376   45632      2
   0      37   43884      8
   0       3   50263      6
   0       0   24687      1
At this point, I think you can see that the result will be 24687168225
(a[0] at the end has the first 5 digits, which are combined with the
digits already computed, read in reverse order) and that this exactly
matches the desired result.
Note that Intel's decision to place the remainder into DX is exactly
what is needed to avoid shuffling registers around during this
process.
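For reference, the whole procedure could be sketched in C roughly as follows. This is only an illustration under the same assumptions as the example (16-bit words, a[0] least significant). The name words_to_decimal is hypothetical, and unlike the table above it keeps dividing until every word is zero instead of stopping once the rest fits in a[0].

```c
#include <stddef.h>
#include <stdint.h>

/* Convert a[0..m-1] (16-bit words, a[0] least significant) to a decimal
   string in out. The array is destroyed. out must have room for
   5*m + 1 chars. Returns the number of digits written. */
size_t words_to_decimal(uint16_t *a, size_t m, char *out)
{
    size_t n = 0;
    int nonzero;

    do {
        uint32_t rem = 0;
        nonzero = 0;
        for (size_t i = m; i-- > 0; ) {       /* one divide-by-10 pass */
            uint32_t cur = (rem << 16) | a[i];
            a[i] = (uint16_t)(cur / 10);
            rem  = cur % 10;
            nonzero |= (a[i] != 0);
        }
        out[n++] = (char)('0' + rem);          /* digits arrive low end first */
    } while (nonzero);

    /* Reverse so the most significant digit comes first. */
    for (size_t i = 0, j = n - 1; i < j; i++, j--) {
        char t = out[i];
        out[i] = out[j];
        out[j] = t;
    }
    out[n] = '\0';
    return n;
}
```

Called on the words {19169, 49016, 5} from the example, it should produce the string "24687168225".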
Hope that helps you understand one possible approach to the problem.
Jon
A 128-bit number can have up to 39 digits, so you'll have to split it
into smaller parts to make it possible to fit the pieces in a 32-bit
register. 5 parts each with 8 digits will do:
repeat
Divide the number by 100000000 (i.e. 100 million).
The remainder is the last 8 digits of the result.
Convert this to ASCII
until (number < 1e8);
Something like this:
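void split128(t_uint128 *a, t_uint32 *parts)
{
    mov esi,[a]             ; esi -> the number, 4 dwords, low dword first
    mov edi,[parts]         ; edi -> output array of 5 dwords
    push ebp
    lea ebp,[edi+4*4]       ; Do 4 long divisions
    mov ebx,100000000       ; 8 digits per iteration
next_outer:
    mov ecx,3               ; start with the most significant dword
    xor edx,edx             ; no remainder carried in yet
next_inner:
    mov eax,[esi+ecx*4]
    div ebx                 ; edx:eax / 1e8 -> eax = quotient, edx = remainder
    mov [esi+ecx*4],eax
    sub ecx,1               ; sub rather than dec, because dec leaves CF alone
    jae next_inner          ; loop until ecx goes below zero
    mov [edi],edx           ; Final remainder after long div
    add edi,4
    cmp edi,ebp
    jb next_outer
    pop ebp
    mov [edi],eax           ; Result after last iteration will fit!
}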
At this point you have an array of 5 32-bit binary numbers, each of
which will contribute up to 8 digits to the final answer.
For a fast way to convert each of those numbers to ASCII, search Google
for code I've posted here previously, or look at AMD's optimization
guide which seems to have "stolen" my algorithm. :-)
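As a rough portable illustration of the same split, here is a C sketch. The names split128_c and format128 are made up, the 128-bit value is assumed to be stored as four 32-bit words with the low word first, and the chunks are turned into text with plain sprintf() rather than the fast conversion mentioned above.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* a[0..3]: 128-bit value, least significant word first (destroyed).
   parts[0..4]: chunks of 8 decimal digits, least significant first. */
void split128_c(uint32_t a[4], uint32_t parts[5])
{
    for (int k = 0; k < 4; k++) {             /* 4 long divisions */
        uint64_t rem = 0;
        for (int i = 3; i >= 0; i--) {        /* most significant word first */
            uint64_t cur = (rem << 32) | a[i];
            a[i] = (uint32_t)(cur / 100000000u);
            rem  = cur % 100000000u;
        }
        parts[k] = (uint32_t)rem;             /* next 8 digits */
    }
    parts[4] = a[0];                          /* what is left fits in one word */
}

/* Write the decimal text of a[] into out (at least 40 bytes).
   Returns the number of digits written. */
int format128(uint32_t a[4], char out[40])
{
    uint32_t parts[5];
    int top = 4, n;

    split128_c(a, parts);
    while (top > 0 && parts[top] == 0)        /* skip leading zero chunks */
        top--;
    n = sprintf(out, "%" PRIu32, parts[top]); /* leading chunk, no padding */
    for (int i = top - 1; i >= 0; i--)        /* the rest, zero-padded to 8 */
        n += sprintf(out + n, "%08" PRIu32, parts[i]);
    return n;
}
```

Only the leading chunk is printed without padding; every later chunk has to be zero-padded to 8 digits or interior zeros would be lost.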
Terje
--
- <Terje.M...@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
> Hello,
> I have to print a 128-bit number on the screen in asm, but printf()
> can't take more than a 64-bit format. The only solution I have found for
> the moment is to subtract the biggest possible power of 10 as many times
> as possible; the number of subtractions gives me the most significant
> digit. After that, repeating with the next smaller power of 10 gives me
> the second most significant digit, etc...
>
> I know... I'm a newbie, but I'm trying the best I can. So if someone has a
> better idea, it will be appreciated
How about the classic algorithm using DAA (it was posted here before)?
--------8<--------
You may double a BCD number by means of:
MOV AL, [BCD_BYTE1]
ADD AL, AL
DAA
MOV [BCD_BYTE1], AL
MOV AL, [BCD_BYTE2]
ADC AL, AL
DAA
MOV [BCD_BYTE2], AL
...
MOV AL, [BCD_BYTEN]
ADC AL, AL
DAA
MOV [BCD_BYTEN], AL
Now, how you are going to use the above...
1. You double the BCD number as shown above (to convert a 64-bit binary
number you'll need space for 20 decimal digits, i.e. 10 bytes)
2. You shift your binary (not BCD, but initial binary) number left one bit.
3. You add the bit, which you just shifted out, to the BCD number (obviously
by just copying the bit, e.g. you may use either of the ADD and OR instructions
for that)
4. Repeat steps 1...3 for all 64 bits
--------8<--------
Simply extend this to the 128-bit case.
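A minimal C sketch of the same shift-and-double idea, for anyone who wants to check it outside of assembly. Everything here is an assumption for illustration: the name bin_to_bcd, the binary input stored as bytes with the most significant byte first, and the packed BCD output stored least significant byte first. Instead of doubling and then adding the bit as a separate step, it feeds the shifted-out bit in as the carry for the lowest digit, which gives the same BCD*2 + bit result. The per-digit adjust stands in for what ADC + DAA do per byte.

```c
#include <stddef.h>
#include <stdint.h>

/* bin[0..nbin-1]: binary number, most significant byte first.
   bcd[0..nbcd-1]: packed BCD result, least significant byte first,
   two decimal digits per byte. For 128 bits use nbin = 16, nbcd = 20. */
void bin_to_bcd(const uint8_t *bin, size_t nbin, uint8_t *bcd, size_t nbcd)
{
    for (size_t j = 0; j < nbcd; j++)
        bcd[j] = 0;

    for (size_t i = 0; i < nbin; i++) {
        for (int bit = 7; bit >= 0; bit--) {
            /* The bit "shifted out" of the binary number. */
            unsigned carry = (bin[i] >> bit) & 1u;

            /* BCD = BCD * 2 + bit, one decimal digit at a time. */
            for (size_t j = 0; j < nbcd; j++) {
                unsigned lo = (bcd[j] & 0x0Fu) * 2u + carry;
                carry = lo > 9u;
                if (carry)
                    lo -= 10u;
                unsigned hi = (bcd[j] >> 4) * 2u + carry;
                carry = hi > 9u;
                if (carry)
                    hi -= 10u;
                bcd[j] = (uint8_t)((hi << 4) | lo);
            }
        }
    }
}
```

With 20 BCD bytes there is room for 40 digits, one more than the 39 a 128-bit number can need, so the final carry is always zero.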
Good Luck
Alexei A. Frounze
http://alexfru.narod.ru
http://www.members.tripod.com/protected_mode/
Very much appreciated... ;)
--
Frédéric Landry