I saw how you can do multiply with shifts and adds, but what about divide?
Do you have to setup a loop and just subtract until you underflow and just
keep count of how many loops you make? Or is there a faster/more
efficient way?
--
Aaron.
> keep count of how many loops you make? Or is there a faster/more
> efficient way?
Why not implement the equivalent of long division? This is just
pseudocode and not particularly efficient -- I'm not sure it's entirely
correct -- but should give an idea.
Num = (given)
Den = (given)
N = Num
D = Den
Cnt = 1
Rem = 0
Res = 0
While
D < N
Repeat
While
D < N
Repeat
Left Shift D *** How many times can D be multiplied by two?
Left Shift Cnt
End
Right Shift D *** Undoes the last multiplication
Right Shift Cnt *** again
Res = Res + Cnt
Cnt = 1
N = N - D
D = Den
End
Return Res
> I saw how you can do multiply with shifts and adds, but what about
> divide?
This is almost like cheating, but there are two division routines
already built into the ROM of the HP48 which you can disassemble
with Jazz (or your favorite disassembler) if you want to see how
HP implemented division in assembly.
IDIV (Integer Divide) divides the entire A register by the entire
C register, in HEX or DEC depending on mode, and places the
quotient into A (and a copy into C), and the remainder into B.
The entry point for IDIV is #65807h. Crashes if denominator is 0,
so Don't Do That.
DIVF (Divide Finite arguments) divides two floating-point numbers
in "split form" (also known as "15-form"). The numerator is split
between registers A and B, and the denominator is split between
C and D. The result is returned in A and B (in split form).
The entry point for DIVF is #2B9C4h. Does not crash if denominator
is 0, but indicates that by exiting with XM set and P=3 or 4.
-Joe-
Sent via Deja.com http://www.deja.com/
Share what you know. Learn what you don't.
Aaron Wallace wrote:
>
> ok, it's me again..
>
> I saw how you can do multiply with shifts and adds, but what about divide?
> Do you have to setup a loop and just subtract until you underflow and just
> keep count of how many loops you make? Or is there a faster/more
> efficient way?
You may want to check the Dr. Dobbs journal, February 1991. Optimizing
Integer Division by a Constant Divisor.
The algorithm is like this:
One wants to divide x by y.
We take some z=2^n so that z is twice as long as x.
Then define a=z/y and r=z mod y.
Then define b=a+r-1.
Dividing by y can now be done by (a*x+b)/z and knowing that z=2^n, its
mutch more simple than original problem :)
Kind of this algorithm is used in HP48 for dividing by 5 but using the
above algorithm gives somewhat faster and shorter routine (tested it :)
One "but": with some divisors you get sometimes answers which are off by
1. (Five is "safe" divisor.) So you might want to check it out for range
of numbers you want to divide on some faster computer, before
implementing in HP48.
Best wishes from,
--
Robert Tiismus
http://www.physic.ut.ee/~robert
I'd like to see that routine of yours!
The DIV5 routine in the 48 ROM is actually implemented
exactly the way you discribed it - but it can indeed be
improved very easily - as there is no need to work with
10 digits, 6 will do (and shift after every addition
instead of all the shifts at the end, like DIV5 does),
as follows:
*
* C.A = C.A DIV 5
* uses D.A, P
*
=DIV5
P= 5 * C = n
C=0 P
D=C WP
D=D+C WP
D=D+C WP * D = 3*n
C=C+D WP * C = 4*n
CSR WP * C = n * 0.4h
C=C+D WP
CSR WP * C = n * 0.34h
C=C+D WP
CSR WP * C = n * 0.334h
C=C+D WP
CSR WP * C = n * 0.3334h
C=C+D WP
CSR WP * C = n * 0.33334h
C=C+D WP
CSR WP * C = n * 0.333334h
P= 0
RTN
--
Best Regards,
Werner Huysegoms
Reply-To: werner.h...@dgx11.cec.be
remove the x before replying
On Thu, 3 Jun 1999, Keith Farmer wrote:
> Aaron Wallace wrote:
>
> > keep count of how many loops you make? Or is there a faster/more
> > efficient way?
>
> Why not implement the equivalent of long division? This is just
> pseudocode and not particularly efficient -- I'm not sure it's entirely
> correct -- but should give an idea.
this is the way I have done it, but I was just wondering if there was a
way to get around using a loop. Right now, I can do a divide by 10 on
#800d and it takes 1.19ms (which isn't too bad).
--
Aaron.
wow. this is great. did you come up with this, or is this a "common"
trick?
I did a timing test with this method vs. doing a CSRB.F A and then
dividing by 5 using a loop to do subtracts until I get an underflow. For
the largest number (#7FFFF) it was about 5,000 times faster!
Thanks!
--
Aaron.
I wouldn't know.. Haven't written ML before.. ;)
Actually it was fun thinking it up on the fly. I'd always wondered how
they did integer divide efficiently, and this forced the issue.
On Fri, 4 Jun 1999, Aaron Wallace wrote:
> > I'd like to see that routine of yours!
> > The DIV5 routine in the 48 ROM is actually implemented
> > exactly the way you discribed it - but it can indeed be
> > improved very easily - as there is no need to work with
> > 10 digits, 6 will do (and shift after every addition
> > instead of all the shifts at the end, like DIV5 does),
> > as follows:
ahh.. I see now that this is in ROM.. should have read more carefully. ;-)
> >
> > *
> > * C.A = C.A DIV 5
> > * uses D.A, P
> > *
> > =DIV5
> > P= 5 * C = n
> > C=0 P
> > D=C WP
> > D=D+C WP
> > D=D+C WP * D = 3*n
> > C=C+D WP * C = 4*n
> > CSR WP * C = n * 0.4h
> > C=C+D WP
> > CSR WP * C = n * 0.34h
> > C=C+D WP
> > CSR WP * C = n * 0.334h
> > C=C+D WP
> > CSR WP * C = n * 0.3334h
> > C=C+D WP
> > CSR WP * C = n * 0.33334h
> > C=C+D WP
> > CSR WP * C = n * 0.333334h
> > P= 0
> > RTN
--
Aaron.
No, it isn't. The one in the ROM is quite different,
this was a way to improve upon it.
It's a by-product from an MC to do an ML DIV12, a while (years?)
ago, and the DIV5 here (although written from memory) originates
from a post by Mika Heiskanen at that time.
On Mon, 7 Jun 1999, Werner Huysegoms wrote:
> In article
> <Pine.GSO.3.96.99060...@infoserv.utdallas.edu>,
> Aaron Wallace <awal...@utdallas.edu> wrote:
> >
> >
> > On Fri, 4 Jun 1999, Aaron Wallace wrote:
> >
> > > > I'd like to see that routine of yours!
> > > > The DIV5 routine in the 48 ROM is actually implemented
> > > > exactly the way you discribed it - but it can indeed be
> > > > improved very easily - as there is no need to work with
> > > > 10 digits, 6 will do (and shift after every addition
> > > > instead of all the shifts at the end, like DIV5 does),
> > > > as follows:
> > ahh.. I see now that this is in ROM.. should have read more carefully.
> ;-)
> >
>
> No, it isn't. The one in the ROM is quite different,
> this was a way to improve upon it.
> It's a by-product from an MC to do an ML DIV12, a while (years?)
> ago, and the DIV5 here (although written from memory) originates
> from a post by Mika Heiskanen at that time.
>
so what about the DIV5 I saw in ROM (6A9E)? It used the same procedure,
but there were more shifting and adding sets (for higher precision, I
suppose since it uses P=10).
--
Aaron.
>so what about the DIV5 I saw in ROM (6A9E)? It used the same procedure,
>but there were more shifting and adding sets (for higher precision, I
>suppose since it uses P=10).
DIV5 in HP48 ROM uses P=10 because the code is not optimized. The
algorithm is the same in both implementations, however the one in the
ROM delays division by #100000 until the very end, whereas the
optimized version does it ASAP.
The accuracy depends only on the number of shift'n'add cycles.
6 is enough to obtain full accuracy, 5 for integers up to #7FFFF,
4 for #7FFF etc.
--
---
--> Mika Heiskanen mhei...@gamma.hut.fi http://www.hut.fi/~mheiskan
> In article <37576539...@fi.tartu.ee>,
> Robert Tiismus <rob...@fi.tartu.ee> wrote:
<snip>
> > One wants to divide x by y.
> > We take some z=2^n so that z is twice as long as x.
> > Then define a=z/y and r=z mod y.
> > Then define b=a+r-1.
> > Dividing by y can now be done by (a*x+b)/z and knowing that z=2^n, its
> > mutch more simple than original problem :)
> > Kind of this algorithm is used in HP48 for dividing by 5 but
> using the
> > above algorithm gives somewhat faster and shorter routine (tested it
> :)
>
> I'd like to see that routine of yours!
> The DIV5 routine in the 48 ROM is actually implemented
> exactly the way you discribed it - but it can indeed be
> improved very easily - as there is no need to work with
> 10 digits, 6 will do (and shift after every addition
> instead of all the shifts at the end, like DIV5 does),
> as follows:
>
Well, as it happens sometimes, I found my routine amongst some others on
one almost faded out printout when searching for some totally different
documents. It is 13% shorter than HP original and 19% faster, but does not
beat your optimized routine. Here it is:
aDIV5 P= 8
C=0 WP
C=A A
C=C+1 WP
A=C WP
C=C+C WP
A=A+C WP
B=A WP
BSR WP
C=A WP
CSL WP
A=A+C WP
C=A WP
CSL WP
CSL WP
A=A+C WP
A=A+B WP
ASR WP
ASR WP
ASR WP
ASR WP
P= 0
RTNCC
But based on your article, I came up with another one which is 3.7% faster
and 4.7% shorter than yours:
aDIV5 P= 5
A=0 P
A=A+1 WP
C=A WP
A=A+C WP
A=A+C WP
C=A WP
CSR WP
C=C+A WP
CSR WP
C=C+A WP
CSR WP
C=C+A WP
CSR WP
C=C+A WP
CSR WP
P= 0
RTNCC
as it can be seen, the HP routine is not based exactly on the method I
described, and by using this method plus some optimization still gives
shorter/faster routine.
Best wishes from,
Robert Tiismus
rob...@physic.ut.ee
>
> aDIV5 P= 5
> A=0 P
> A=A+1 WP
> C=A WP
> A=A+C WP
> A=A+C WP
> C=A WP
> CSR WP
> C=C+A WP
> CSR WP
> C=C+A WP
> CSR WP
> C=C+A WP
> CSR WP
> C=C+A WP
> CSR WP
> P= 0
> RTNCC
>
Sorry, it was not the last version I posted, it should of course be:
aDIV5
P= 5
A=0 P
A=A+1 WP
C=A WP
A=A+C WP
A=A+C WP
C=A WP
ASR WP
A=C+A WP
ASR WP
A=C+A WP
ASR WP
A=C+A WP
ASR WP
A=C+A WP
ASR WP
P= 0
RTNCC
--
Robert Tiismus | PGP public key for 'rob...@fi.tartu.ee'
http://www.physic.ut.ee/~robert | http://www.physic.ut.ee/~robert/pgp.pub
--
Aaron.
P=5 means the rightmost SIX nibs,
whereas the A field consist of the rightmost FIVE nibs;-)
Greetings
Raymond
On Tue, 3 Aug 1999 09:16:05 -0500,
Aaron Wallace <awal...@utdallas.edu> wrote:
> Just a quick question.. why did you use P=5 and then WP? why not just use
> A field? Or did you just use WP so that one can specify how many nibble
> to use?
>
Raymond Hellstern -Magic48ges- Email: Raymond....@gmx.de
--
Aaron.