integer divide in ml

Aaron Wallace

unread,

Jun 3, 1999, 3:00:00 AM6/3/99

to

ok, it's me again..

I saw how you can do multiply with shifts and adds, but what about divide?
Do you have to setup a loop and just subtract until you underflow and just
keep count of how many loops you make? Or is there a faster/more
efficient way?

--
Aaron.

Keith Farmer

unread,

Jun 3, 1999, 3:00:00 AM6/3/99

to

Aaron Wallace wrote:

> keep count of how many loops you make? Or is there a faster/more
> efficient way?

Why not implement the equivalent of long division? This is just
pseudocode and not particularly efficient -- I'm not sure it's entirely
correct -- but should give an idea.

Num = (given)
Den = (given)

N = Num
D = Den
Cnt = 1
Rem = 0
Res = 0

While

D < N

Repeat

While

D < N

Repeat

Left Shift D *** How many times can D be multiplied by two?
Left Shift Cnt

End

Right Shift D *** Undoes the last multiplication
Right Shift Cnt *** again

Res = Res + Cnt
Cnt = 1
N = N - D
D = Den

End

Return Res

Joseph K. Horn

unread,

Jun 4, 1999, 3:00:00 AM6/4/99

to

Aaron Wallace wrote:

> I saw how you can do multiply with shifts and adds, but what about
> divide?

This is almost like cheating, but there are two division routines
already built into the ROM of the HP48 which you can disassemble
with Jazz (or your favorite disassembler) if you want to see how
HP implemented division in assembly.

IDIV (Integer Divide) divides the entire A register by the entire
C register, in HEX or DEC depending on mode, and places the
quotient into A (and a copy into C), and the remainder into B.
The entry point for IDIV is #65807h. Crashes if denominator is 0,
so Don't Do That.

DIVF (Divide Finite arguments) divides two floating-point numbers
in "split form" (also known as "15-form"). The numerator is split
between registers A and B, and the denominator is split between
C and D. The result is returned in A and B (in split form).
The entry point for DIVF is #2B9C4h. Does not crash if denominator
is 0, but indicates that by exiting with XM set and P=3 or 4.

-Joe-

Sent via Deja.com http://www.deja.com/
Share what you know. Learn what you don't.

Robert Tiismus

unread,

Jun 4, 1999, 3:00:00 AM6/4/99

to Aaron Wallace

Aaron Wallace wrote:
>
> ok, it's me again..
>

> I saw how you can do multiply with shifts and adds, but what about divide?

> Do you have to setup a loop and just subtract until you underflow and just

> keep count of how many loops you make? Or is there a faster/more
> efficient way?

You may want to check the Dr. Dobbs journal, February 1991. Optimizing
Integer Division by a Constant Divisor.
The algorithm is like this:
One wants to divide x by y.
We take some z=2^n so that z is twice as long as x.
Then define a=z/y and r=z mod y.
Then define b=a+r-1.
Dividing by y can now be done by (a*x+b)/z and knowing that z=2^n, its
mutch more simple than original problem :)
Kind of this algorithm is used in HP48 for dividing by 5 but using the
above algorithm gives somewhat faster and shorter routine (tested it :)
One "but": with some divisors you get sometimes answers which are off by
1. (Five is "safe" divisor.) So you might want to check it out for range
of numbers you want to divide on some faster computer, before
implementing in HP48.

Best wishes from,

--
Robert Tiismus
http://www.physic.ut.ee/~robert

Werner Huysegoms

unread,

Jun 4, 1999, 3:00:00 AM6/4/99

to

In article <37576539...@fi.tartu.ee>,

Robert Tiismus <rob...@fi.tartu.ee> wrote:
> You may want to check the Dr. Dobbs journal, February 1991. Optimizing
> Integer Division by a Constant Divisor.
> The algorithm is like this:
> One wants to divide x by y.
> We take some z=2^n so that z is twice as long as x.
> Then define a=z/y and r=z mod y.
> Then define b=a+r-1.
> Dividing by y can now be done by (a*x+b)/z and knowing that z=2^n, its
> mutch more simple than original problem :)
> Kind of this algorithm is used in HP48 for dividing by 5 but
using the
> above algorithm gives somewhat faster and shorter routine (tested it
:)

I'd like to see that routine of yours!
The DIV5 routine in the 48 ROM is actually implemented
exactly the way you discribed it - but it can indeed be
improved very easily - as there is no need to work with
10 digits, 6 will do (and shift after every addition
instead of all the shifts at the end, like DIV5 does),
as follows:

*
* C.A = C.A DIV 5
* uses D.A, P
*
=DIV5
P= 5 * C = n
C=0 P
D=C WP
D=D+C WP
D=D+C WP * D = 3*n
C=C+D WP * C = 4*n
CSR WP * C = n * 0.4h
C=C+D WP
CSR WP * C = n * 0.34h
C=C+D WP
CSR WP * C = n * 0.334h
C=C+D WP
CSR WP * C = n * 0.3334h
C=C+D WP
CSR WP * C = n * 0.33334h
C=C+D WP
CSR WP * C = n * 0.333334h
P= 0
RTN

--
Best Regards,
Werner Huysegoms
Reply-To: werner.h...@dgx11.cec.be
remove the x before replying

Aaron Wallace

unread,

Jun 4, 1999, 3:00:00 AM6/4/99

to

On Thu, 3 Jun 1999, Keith Farmer wrote:

> Aaron Wallace wrote:
>
> > keep count of how many loops you make? Or is there a faster/more
> > efficient way?
>

> Why not implement the equivalent of long division? This is just
> pseudocode and not particularly efficient -- I'm not sure it's entirely
> correct -- but should give an idea.

this is the way I have done it, but I was just wondering if there was a
way to get around using a loop. Right now, I can do a divide by 10 on
#800d and it takes 1.19ms (which isn't too bad).

--
Aaron.

Aaron Wallace

unread,

Jun 4, 1999, 3:00:00 AM6/4/99

to

wow. this is great. did you come up with this, or is this a "common"
trick?

I did a timing test with this method vs. doing a CSRB.F A and then
dividing by 5 using a loop to do subtracts until I get an underflow. For
the largest number (#7FFFF) it was about 5,000 times faster!

Thanks!

--
Aaron.

Keith Farmer

unread,

Jun 4, 1999, 3:00:00 AM6/4/99

to

Aaron Wallace wrote:
> this is the way I have done it, but I was just wondering if there was a

I wouldn't know.. Haven't written ML before.. ;)

Actually it was fun thinking it up on the fly. I'd always wondered how
they did integer divide efficiently, and this forced the issue.

Aaron Wallace

unread,

Jun 4, 1999, 3:00:00 AM6/4/99

to

On Fri, 4 Jun 1999, Aaron Wallace wrote:

> > I'd like to see that routine of yours!
> > The DIV5 routine in the 48 ROM is actually implemented
> > exactly the way you discribed it - but it can indeed be
> > improved very easily - as there is no need to work with
> > 10 digits, 6 will do (and shift after every addition
> > instead of all the shifts at the end, like DIV5 does),
> > as follows:

ahh.. I see now that this is in ROM.. should have read more carefully. ;-)

> >
> > *
> > * C.A = C.A DIV 5
> > * uses D.A, P
> > *
> > =DIV5
> > P= 5 * C = n
> > C=0 P
> > D=C WP
> > D=D+C WP
> > D=D+C WP * D = 3*n
> > C=C+D WP * C = 4*n
> > CSR WP * C = n * 0.4h
> > C=C+D WP
> > CSR WP * C = n * 0.34h
> > C=C+D WP
> > CSR WP * C = n * 0.334h
> > C=C+D WP
> > CSR WP * C = n * 0.3334h
> > C=C+D WP
> > CSR WP * C = n * 0.33334h
> > C=C+D WP
> > CSR WP * C = n * 0.333334h
> > P= 0
> > RTN

--
Aaron.

Werner Huysegoms

unread,

Jun 7, 1999, 3:00:00 AM6/7/99

to

In article
<Pine.GSO.3.96.99060...@infoserv.utdallas.edu>,

Aaron Wallace <awal...@utdallas.edu> wrote:
>
>
> On Fri, 4 Jun 1999, Aaron Wallace wrote:
>
> > > I'd like to see that routine of yours!
> > > The DIV5 routine in the 48 ROM is actually implemented
> > > exactly the way you discribed it - but it can indeed be
> > > improved very easily - as there is no need to work with
> > > 10 digits, 6 will do (and shift after every addition
> > > instead of all the shifts at the end, like DIV5 does),
> > > as follows:
> ahh.. I see now that this is in ROM.. should have read more carefully.
;-)
>

No, it isn't. The one in the ROM is quite different,
this was a way to improve upon it.
It's a by-product from an MC to do an ML DIV12, a while (years?)
ago, and the DIV5 here (although written from memory) originates
from a post by Mika Heiskanen at that time.

Aaron Wallace

unread,

Jun 7, 1999, 3:00:00 AM6/7/99

to

On Mon, 7 Jun 1999, Werner Huysegoms wrote:

> In article
> <Pine.GSO.3.96.99060...@infoserv.utdallas.edu>,
> Aaron Wallace <awal...@utdallas.edu> wrote:
> >
> >
> > On Fri, 4 Jun 1999, Aaron Wallace wrote:
> >
> > > > I'd like to see that routine of yours!
> > > > The DIV5 routine in the 48 ROM is actually implemented
> > > > exactly the way you discribed it - but it can indeed be
> > > > improved very easily - as there is no need to work with
> > > > 10 digits, 6 will do (and shift after every addition
> > > > instead of all the shifts at the end, like DIV5 does),
> > > > as follows:
> > ahh.. I see now that this is in ROM.. should have read more carefully.
> ;-)
> >
>
> No, it isn't. The one in the ROM is quite different,
> this was a way to improve upon it.
> It's a by-product from an MC to do an ML DIV12, a while (years?)
> ago, and the DIV5 here (although written from memory) originates
> from a post by Mika Heiskanen at that time.
>

so what about the DIV5 I saw in ROM (6A9E)? It used the same procedure,
but there were more shifting and adding sets (for higher precision, I
suppose since it uses P=10).

--
Aaron.

Mika Heiskanen

unread,

Jun 8, 1999, 3:00:00 AM6/8/99

to

Aaron Wallace <awal...@utdallas.edu> writes:

>so what about the DIV5 I saw in ROM (6A9E)? It used the same procedure,
>but there were more shifting and adding sets (for higher precision, I
>suppose since it uses P=10).

DIV5 in HP48 ROM uses P=10 because the code is not optimized. The
algorithm is the same in both implementations, however the one in the
ROM delays division by #100000 until the very end, whereas the
optimized version does it ASAP.

The accuracy depends only on the number of shift'n'add cycles.
6 is enough to obtain full accuracy, 5 for integers up to #7FFFF,
4 for #7FFF etc.

--
---
--> Mika Heiskanen mhei...@gamma.hut.fi http://www.hut.fi/~mheiskan

Robert Tiismus

unread,

Aug 3, 1999, 3:00:00 AM8/3/99

to Werner Huysegoms

Werner Huysegoms wrote:

> In article <37576539...@fi.tartu.ee>,
> Robert Tiismus <rob...@fi.tartu.ee> wrote:

<snip>

> > One wants to divide x by y.
> > We take some z=2^n so that z is twice as long as x.
> > Then define a=z/y and r=z mod y.
> > Then define b=a+r-1.
> > Dividing by y can now be done by (a*x+b)/z and knowing that z=2^n, its
> > mutch more simple than original problem :)

> > Kind of this algorithm is used in HP48 for dividing by 5 but
> using the
> > above algorithm gives somewhat faster and shorter routine (tested it
> :)
>

> I'd like to see that routine of yours!
> The DIV5 routine in the 48 ROM is actually implemented
> exactly the way you discribed it - but it can indeed be
> improved very easily - as there is no need to work with
> 10 digits, 6 will do (and shift after every addition
> instead of all the shifts at the end, like DIV5 does),
> as follows:
>

Well, as it happens sometimes, I found my routine amongst some others on
one almost faded out printout when searching for some totally different
documents. It is 13% shorter than HP original and 19% faster, but does not
beat your optimized routine. Here it is:

aDIV5 P= 8
C=0 WP
C=A A
C=C+1 WP
A=C WP
C=C+C WP
A=A+C WP
B=A WP
BSR WP
C=A WP
CSL WP
A=A+C WP
C=A WP
CSL WP
CSL WP
A=A+C WP
A=A+B WP
ASR WP
ASR WP
ASR WP
ASR WP
P= 0
RTNCC

But based on your article, I came up with another one which is 3.7% faster
and 4.7% shorter than yours:

aDIV5 P= 5
A=0 P
A=A+1 WP
C=A WP
A=A+C WP
A=A+C WP
C=A WP
CSR WP
C=C+A WP
CSR WP
C=C+A WP
CSR WP
C=C+A WP
CSR WP
C=C+A WP
CSR WP
P= 0
RTNCC

as it can be seen, the HP routine is not based exactly on the method I
described, and by using this method plus some optimization still gives
shorter/faster routine.

Best wishes from,
Robert Tiismus
rob...@physic.ut.ee

Robert Tiismus

unread,

Aug 3, 1999, 3:00:00 AM8/3/99

to Werner_h...@my-deja.com

Robert Tiismus wrote:

>
> aDIV5 P= 5
> A=0 P
> A=A+1 WP
> C=A WP
> A=A+C WP
> A=A+C WP
> C=A WP
> CSR WP
> C=C+A WP
> CSR WP
> C=C+A WP
> CSR WP
> C=C+A WP
> CSR WP
> C=C+A WP
> CSR WP
> P= 0
> RTNCC
>

Sorry, it was not the last version I posted, it should of course be:

aDIV5
P= 5
A=0 P
A=A+1 WP
C=A WP
A=A+C WP
A=A+C WP
C=A WP

ASR WP
A=C+A WP
ASR WP
A=C+A WP
ASR WP
A=C+A WP
ASR WP
A=C+A WP

ASR WP
P= 0
RTNCC

--
Robert Tiismus | PGP public key for 'rob...@fi.tartu.ee'
http://www.physic.ut.ee/~robert | http://www.physic.ut.ee/~robert/pgp.pub

Aaron Wallace

unread,

Aug 3, 1999, 3:00:00 AM8/3/99

to

Just a quick question.. why did you use P=5 and then WP? why not just use
A field? Or did you just use WP so that one can specify how many nibble
to use?

--
Aaron.

Raymond Hellstern

unread,

Aug 3, 1999, 3:00:00 AM8/3/99

to

Hi,

P=5 means the rightmost SIX nibs,
whereas the A field consist of the rightmost FIVE nibs;-)

Greetings

Raymond

On Tue, 3 Aug 1999 09:16:05 -0500,
Aaron Wallace <awal...@utdallas.edu> wrote:

> Just a quick question.. why did you use P=5 and then WP? why not just use
> A field? Or did you just use WP so that one can specify how many nibble
> to use?
>

Raymond Hellstern -Magic48ges- Email: Raymond....@gmx.de

Aaron Wallace

unread,

Aug 3, 1999, 3:00:00 AM8/3/99

to

oh yeah, I should know that... ;-)

--
Aaron.