16 bit signed division

Robb Bates

Dec 17, 2025, 4:38:20 PM
to RC2014-Z80
I don't know why, but it's almost impossible to find a fast, compact 16-bit signed division assembly routine.

Yes, I know: just use an unsigned one after taking the ABS of both values, and negate the result if needed.

But mine is a slow, big, bloated brute-force method.  If anyone has a nice small routine already crafted, could you post it?

Ideally, HL=DE/HL with DE=remainder.  But I'll take anything.
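
The approach described above (take absolute values, divide unsigned, fix up the signs) can be sketched in Python as a model of the 16-bit register arithmetic. This is only an illustration, not a Z80 routine: the names `udiv16` and `sdiv16` are made up for this sketch, the shift-and-subtract loop is just one common way to do the unsigned step, and the remainder is assumed to take the sign of the dividend.

```python
MASK16 = 0xFFFF

def to_signed16(word):
    """Interpret a 16-bit word as a two's complement value."""
    word &= MASK16
    return word - 0x10000 if word & 0x8000 else word

def udiv16(dividend, divisor):
    """Unsigned 16-bit restoring division: one shift and trial subtract per bit."""
    quotient, remainder = 0, 0
    for _ in range(16):
        # Shift the next dividend bit (MSB first) into the remainder.
        remainder = (remainder << 1) | ((dividend >> 15) & 1)
        dividend = (dividend << 1) & MASK16
        quotient = (quotient << 1) & MASK16
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder

def sdiv16(de, hl):
    """Model of HL = DE / HL with DE = remainder; returns (quotient, remainder) words."""
    de &= MASK16
    hl &= MASK16
    if hl == 0:
        raise ZeroDivisionError("divisor is zero")
    negate_quotient = bool((de ^ hl) & 0x8000)  # operand signs differ
    negate_remainder = bool(de & 0x8000)        # assumption: remainder follows the dividend
    if de & 0x8000:
        de = -de & MASK16
    if hl & 0x8000:
        hl = -hl & MASK16
    quotient, remainder = udiv16(de, hl)
    if negate_quotient:
        quotient = -quotient & MASK16
    if negate_remainder:
        remainder = -remainder & MASK16
    return quotient, remainder

q, r = sdiv16(-7, 2)
print(to_signed16(q), to_signed16(r))  # -3 -1
```

In this model -32768 / -1 wraps back to -32768, because +32768 does not fit in a signed 16-bit word; a real routine has to decide what to do in that case.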

Thanks,
Robb

Willy De la Court

Dec 17, 2025, 4:43:11 PM
to RC2014-Z80
https://wikiti.brandonw.net/index.php?title=Z80_Routines:Math:Division
I haven't tested these, but I think they are valid.

Robb Bates

Dec 17, 2025, 5:54:02 PM
to RC2014-Z80
Yeah, I've seen those and I'll probably end up using one of them.  I was hoping someone had one already pre-rolled and ready to go.  All the register shuffling that I have to do adds bloat.

The ac/de one is nice and small, but I'll have to see if it leaves the remainder somewhere.  I need a MOD function as well.

Robb

Phillip Stevens

Dec 18, 2025, 12:08:19 AM
to RC2014-Z80
Here is the z88dk small implementation, which follows the C standard of the remainder taking the sign of the dividend.
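
To make the sign convention concrete: in C (since C99) integer division truncates toward zero, so the remainder has the sign of the dividend and `q * d + r == n` always holds. The sketch below contrasts this with Python's built-in `divmod`, which floors instead. `c_divmod` is a name invented for this illustration, not a z88dk function.

```python
def c_divmod(n, d):
    """Truncating division: quotient rounds toward zero, remainder follows n."""
    q = abs(n) // abs(d)
    if (n < 0) != (d < 0):
        q = -q
    return q, n - q * d

for n, d in [(7, 2), (-7, 2), (7, -2), (-7, -2)]:
    q, r = c_divmod(n, d)
    assert q * d + r == n  # the defining identity holds in every sign case
    print(f"{n:>3} / {d:>2}: truncating {(q, r)}, flooring {divmod(n, d)}")
```

For -7 / 2 the truncating result is (-3, -1), while flooring gives (-4, 1), which is why a MOD built on a flooring routine can disagree with one that follows the C convention.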
 

Phillip Stevens

Dec 18, 2025, 12:10:55 AM
to RC2014-Z80
There are fast versions too, if you prefer that path.

free...@gmail.com

Dec 29, 2025, 7:23:11 PM
to RC2014-Z80
We could always try this on a card: https://www.nxp.com/products/no-longer-manufactured/math-coprocessor:MC68882. Not long ago I was able to source some from AliExpress.

Justin 

Alan Cox

Dec 30, 2025, 5:31:32 AM
to rc201...@googlegroups.com
For integer maths only, the CDP1855 might be simpler, and it would certainly be easier to build a modern programmable-logic copy of?

Alan



--
You received this message because you are subscribed to the Google Groups "RC2014-Z80" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rc2014-z80+...@googlegroups.com.
To view this discussion, visit https://groups.google.com/d/msgid/rc2014-z80/79c92d5b-8d9a-4e77-b75e-7609c71a48b0n%40googlegroups.com.

Phillip Stevens

Dec 30, 2025, 7:26:27 AM
to RC2014-Z80
On Tuesday, 30 December 2025 Justin wrote:
> We could always try this on a card, https://www.nxp.com/products/no-longer-manufactured/math-coprocessor:MC68882 not long ago I was able to source some from Aliexpress.

Usually I’d have the RC2014 APU Module PCBs for sale, but at the moment they’re packed away. But, I’m pretty sure the Gerbers are in this thread somewhere.

The APU Module supports the AMD Am9511A APU, which provides the fastest 16-bit signed division available for 8-bit machines. A 16-bit signed division takes between 84 and 94 APU cycles. To that the overhead of loading and unloading the operands must be added, but it is still quite fast.
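
As a rough feel for what 84 to 94 APU cycles means in time, the arithmetic can be sketched as below. The 2 MHz APU clock is an assumption chosen purely for illustration (substitute the clock your module actually uses), and the figures exclude the operand load and unload overhead mentioned above.

```python
APU_CLOCK_HZ = 2_000_000  # assumption: illustrative clock, not taken from the thread

def cycles_to_us(cycles, clock_hz=APU_CLOCK_HZ):
    """Convert a cycle count to microseconds at the given clock."""
    return cycles * 1_000_000 / clock_hz

for cycles in (84, 94):
    print(f"{cycles} cycles -> {cycles_to_us(cycles):.1f} us")
```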

The Motorola FPU is closely tied to their CPU, which makes it more difficult to use with Z80 CPUs, and it doesn’t support 16-bit arithmetic to my knowledge. So it could be less useful.

There’s a collection of the Am9511A technical documents on the z88dk GitHub doc site. The Am9511A is supported from C and assembly for 8085 and Z80 (Z180, etc.) processors, using the -am9511 math library.

If you want to have fast math (long and float) then the Am9511A is the best bet, imho.

Cheers, Phillip

Marten Feldtmann

Dec 30, 2025, 8:36:51 AM
to rc201...@googlegroups.com
On 30.12.25 at 13:26, Phillip Stevens wrote:
>
> The Motorola FPU is closely tied to their CPU, which makes it more
> difficult to use with Z80 CPUs, and it doesn’t support 16-bit
> arithmetic to my knowledge. So it could be less useful.
>

Here are some postings (from 2020)  regarding the MC68882 and a Z180:


https://schrievkrom.wordpress.com/2020/07/03/mskzio-mc68882-fpu-working-revision/

https://schrievkrom.wordpress.com/2020/07/16/mc68882-as-a-peripheral-the-overhead/


Marten

free...@gmail.com

Jan 2, 2026, 6:46:47 PM
to RC2014-Z80
Can I interest you in a ROM-based lookup table? It would use a pair of 27C322 EPROMs plus some latches and bus switches. The low byte of each ROM is the output byte, and the high byte of each holds the carry flags. The first 16 address lines on each ROM become the A and B byte inputs. The additional address lines are the carry bits and the operand selector inputs.
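
One way to picture the addressing described here is as bit fields packed into the ROM address, with each 16-bit ROM word holding a result byte and flag bits. The sketch below assumes a part with 21 address lines and a made-up field layout (B on A0-A7, A on A8-A15, carry-in on A16, a 4-bit operation code on A17-A20); the operation numbering and the function names are hypothetical.

```python
OP_ADD = 0  # hypothetical operation code

def lut_address(a, b, carry_in=0, op=OP_ADD):
    """Pack the inputs into a ROM address using the assumed field layout."""
    return ((op & 0xF) << 17) | ((carry_in & 1) << 16) | ((a & 0xFF) << 8) | (b & 0xFF)

def lut_word(result, flags):
    """Pack one ROM word: low byte = result byte, high byte = flag bits."""
    return ((flags & 0xFF) << 8) | (result & 0xFF)

def add_entry(a, b, carry_in):
    """ROM contents for an 8-bit add with carry; flag bit 0 is the carry out."""
    total = a + b + carry_in
    return lut_word(total & 0xFF, total >> 8)

# Fill the part of the table that belongs to OP_ADD.
rom = {}
for a in range(256):
    for b in range(256):
        for c in (0, 1):
            rom[lut_address(a, b, c, OP_ADD)] = add_entry(a, b, c)

word = rom[lut_address(0xFF, 0x01)]
print(hex(word & 0xFF), word >> 8)  # 0x0 1 (result byte, carry out)
```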

Justin  

Phillip Stevens

Jan 3, 2026, 10:17:45 AM
to rc201...@googlegroups.com
It works for 8x8 multiply and other lookups, so it's reasonable to expect it would also work for signed divide.


I’m sure Gerbers for the 8x8 LUT Module are still around here somewhere, which could be extended and adapted for what you’re suggesting.

Tom Storey

Jan 3, 2026, 9:17:21 PM
to rc201...@googlegroups.com
16-bit multiply and divide are part of the original 68k instruction set, so that might explain why they aren't included in the coprocessor.

free...@gmail.com

Jan 4, 2026, 7:43:34 AM
to RC2014-Z80
After seeing the 8x8 LUT board for the RC2014, I have to ask whether a 16x16 LUT could be made with a pair of ROMs. The question in my mind is how to combine the outputs of the ROMs into a single output, i.e. how does one handle the carries forwards and backwards between the pair of ROM chips? Some further thought leads me to realize that something like this for the 68k would involve 16x16 inputs for a 32-bit output.

Justin 

Phillip Stevens

Jan 5, 2026, 5:58:15 AM
to RC2014-Z80
On Sunday, 4 January 2026 Justin wrote:
> after seeing the 8x8 LUT board for the RC2014, I have to ask if a 16x16 LUT could be made with a pair of ROMs? The question in my mind is how to handle the outputs of the ROMs into a single output. IE how does one handle the carry forwards and backwards between the pair of ROM chips.  Some further thought on this leads me to realize something like this for the 68k would involve 16x16 inputs for a 32bit output.

I’d just say that with 16x16 > 32 you have quite a bit of complexity to latch the inputs and get the results out. As an intermediate step, order some 8x8 > 16 bit PCBs using the Gerbers shared at OSH Park, and then experiment with partial solutions using some assembly glue code. See if that scratches the itch. If yes, then you can design the full solution knowing you’re on the right path.
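
The kind of glue code meant here can be modelled as follows: a 16x16 multiply is built from four 8x8 > 16 lookups whose partial products are shifted and added. In this sketch the Python table `MUL8` merely stands in for the LUT hardware, and the function name is made up; on a Z80 the additions would be done with add and add-with-carry instructions on register pairs.

```python
# Stand-in for the 8x8 > 16 LUT hardware: MUL8[a][b] == a * b.
MUL8 = [[a * b for b in range(256)] for a in range(256)]

def mul16_via_lut(x, y):
    """Unsigned 16x16 > 32 multiply from four 8x8 partial products."""
    xl, xh = x & 0xFF, (x >> 8) & 0xFF
    yl, yh = y & 0xFF, (y >> 8) & 0xFF
    low = MUL8[xl][yl]                 # weight 2**0
    mid = MUL8[xl][yh] + MUL8[xh][yl]  # weight 2**8, may need 17 bits
    high = MUL8[xh][yh]                # weight 2**16
    return (low + (mid << 8) + (high << 16)) & 0xFFFFFFFF

print(hex(mul16_via_lut(0xFFFF, 0xFFFF)))  # 0xfffe0001
```

Signed operands would still need the same absolute-value and sign fix-up as in software, so the lookup only replaces the inner 8x8 steps.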

In my use case I found it better to a) write a better software based 24x8 multiplication solution and b) build the APU Module hardware which is much more flexible for integer, long, and float calculations. These grew in demand for different options. The LUT Module fell by the wayside.

P. 

Mark T

Jan 5, 2026, 10:52:56 AM
to RC2014-Z80
I think 16x16>32 would be 4,096 x SST39SF040s, so not feasible.
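
The arithmetic behind an estimate like this can be sketched as follows, assuming a single flat table with one 32-bit result for every pair of 16-bit inputs and no cascading or decomposition. The capacities used are the nominal ones: 4 Mbit for an SST39SF040 and 32 Mbit for a 27C322. The function name is invented for the sketch.

```python
def chips_needed(input_bits, output_bits, chip_bits):
    """Number of memory chips for a flat lookup table (rounded up)."""
    table_bits = (2 ** input_bits) * output_bits
    return -(-table_bits // chip_bits)  # ceiling division

MBIT = 2 ** 20
print(chips_needed(32, 32, 4 * MBIT))   # SST39SF040-sized parts: 32768
print(chips_needed(32, 32, 32 * MBIT))  # 27C322-sized parts: 4096
print(chips_needed(16, 16, 4 * MBIT))   # 8x8 > 16 table for comparison: 1
```

Under these assumptions the 4,096 figure corresponds to 32 Mbit parts, and 4 Mbit parts would need eight times as many. Either way a flat 16x16 table is impractical.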

free...@gmail.com

Jan 7, 2026, 9:43:36 AM
to RC2014-Z80
It's four 27C322 ROMs. They're cascaded.

Bill Shen

Jan 7, 2026, 11:30:53 AM
to RC2014-Z80
I took a look at the signed divide instruction of the Z280 and came to appreciate the complexity of the algorithm: the dividend is in DE/HL, and the divisor is either in a register or in memory.  The quotient is in HL and the remainder is in DE, but there are a couple of error conditions: divide-by-zero, and a quotient too large to fit in 16 bits.  The Z280 has exception flags to deal with these errors, but a Z80 will need even more instruction bloat to deal with these exceptions.
Bill.
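
The two error conditions are easy to see in a small model. The sketch below assumes a 32-bit signed dividend (the DE/HL pair), a 16-bit signed divisor and truncating division, and it signals errors with Python exceptions. It does not attempt to reproduce the Z280's actual flag or trap behaviour, and the function name is invented.

```python
def sdiv32_16(dividend, divisor):
    """Signed 32/16 divide; returns (quotient, remainder) as 16-bit words."""
    n = dividend & 0xFFFFFFFF
    if n & 0x80000000:
        n -= 1 << 32            # reinterpret as signed 32-bit
    d = divisor & 0xFFFF
    if d & 0x8000:
        d -= 1 << 16            # reinterpret as signed 16-bit
    if d == 0:
        raise ZeroDivisionError("divide by zero")
    q = abs(n) // abs(d)
    if (n < 0) != (d < 0):
        q = -q
    if not -0x8000 <= q <= 0x7FFF:
        raise OverflowError("quotient does not fit in 16 bits")
    r = n - q * d               # remainder takes the sign of the dividend
    return q & 0xFFFF, r & 0xFFFF

print(sdiv32_16(100000, 7))  # (14285, 5)
```

A plain 16/16 signed routine avoids most of the overflow cases, since only -32768 / -1 can produce a quotient that does not fit, but the divide-by-zero check is still needed.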