8086 32-bit multiply

Paul Edwards

Apr 23, 2021, 5:36:07 PM

Hi.

Since 1994 I have been working on a project to
create a public domain version of MS-DOS, called
PDOS. There is an 8086 version and an 80386
version which can be found here:

http://pdos.sourceforge.net/

I took some shortcuts along the way to get it to
work at all, and one of those has finally bitten me.

I'm getting incorrect results from this:

https://sourceforge.net/p/pdos/gitcode/ci/master/tree/pdpclib/dossupa.asm

; multiply cx:bx by dx:ax, result in dx:ax

public __I4M
__I4M:
public __U4M
__U4M:
public f_lxmul@
f_lxmul@ proc
push bp
mov bp,sp
push cx

push ax
mul cx
mov cx, ax
pop ax
mul bx
add dx, cx

pop cx
pop bp
ret
f_lxmul@ endp
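
For readers following along, here is a rough C model of what the routine above appears to compute, register by register. It is only a sketch under assumptions: the names `model_original` and `expected` are made up for illustration, and the operands are assumed to arrive as cx:bx and dx:ax, as the comment states. The model suggests the incoming DX is overwritten by the first MUL before it is ever read, so the bx * dx cross term is lost.

```c
#include <stdint.h>

/* Hypothetical model of the routine above: cx:bx times dx:ax.
   dx_in is accepted but unused, mirroring the assembly, where the
   first MUL overwrites DX before anything reads it. */
uint32_t model_original(uint16_t cx, uint16_t bx, uint16_t dx_in, uint16_t ax)
{
    (void)dx_in;
    uint16_t cross = (uint16_t)((uint32_t)ax * cx);  /* mul cx / mov cx,ax */
    uint32_t low   = (uint32_t)ax * bx;              /* mul bx: DX:AX      */
    uint16_t dx    = (uint16_t)((low >> 16) + cross);/* add dx,cx          */
    return ((uint32_t)dx << 16) | (uint16_t)low;
}

/* Reference: low 32 bits of the true product. */
uint32_t expected(uint16_t cx, uint16_t bx, uint16_t dx, uint16_t ax)
{
    uint32_t a = ((uint32_t)cx << 16) | bx;
    uint32_t b = ((uint32_t)dx << 16) | ax;
    return a * b;  /* unsigned arithmetic wraps mod 2^32 */
}
```

Under this model the two functions agree whenever the dx:ax operand fits in 16 bits, and disagree for inputs such as 3 times 0x10000 (cx:bx = 0:3, dx:ax = 1:0).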

Does anyone have some public domain (explicit notice)
8086 (not 80386) code they are willing to share to do
this? Not LGPL. Not BSD. Public domain. The entire
codebase of tens of thousands of lines of code is
public domain.

Also let me know if you wish to be acknowledged in
the source code and/or code check-in. Some people
prefer to remain anonymous.

There are other routines in there that may not work
properly either, but I haven't come across them yet.

Thanks. Paul.

DJ Delorie

Apr 23, 2021, 7:36:28 PM

Paul Edwards <muta...@nospicedham.gmail.com> writes:
> ; multiply cx:bx by dx:ax, result in dx:ax

That takes three multiplies and a few adds:

LSW = low 16 bits of (bx * ax); keep the upper 16 bits as XX
MSW = low 16 bits of (bx * dx + cx * ax + XX)
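
The scheme above can be sketched in C to check the arithmetic before committing it to assembly. This is an illustrative model, not code from the thread: the function name `mul32_three` is hypothetical, and the parameters simply mirror the register names used in the post.

```c
#include <stdint.h>

/* Sketch of the three-multiply scheme: (cx:bx) * (dx:ax) mod 2^32. */
uint32_t mul32_three(uint16_t cx, uint16_t bx, uint16_t dx, uint16_t ax)
{
    uint32_t lo  = (uint32_t)bx * ax;     /* full 32-bit product of the low words */
    uint16_t lsw = (uint16_t)lo;          /* LSW: lower 16 bits                   */
    uint16_t xx  = (uint16_t)(lo >> 16);  /* XX: upper 16 bits, carried into MSW  */

    /* Only the low 16 bits of each cross product can reach the result;
       the sum wraps harmlessly because it is truncated to 16 bits. */
    uint16_t msw = (uint16_t)((uint32_t)bx * dx + (uint32_t)cx * ax + xx);

    return ((uint32_t)msw << 16) | lsw;
}
```

The cx * dx product is never needed because it only contributes to bits 32 and above.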

wolfgang kern

Apr 23, 2021, 8:51:35 PM

On 23.04.2021 13:42, Paul Edwards wrote:

[x8086 only]

> ; multiply cx:bx by dx:ax, result in dx:ax

the result of 32*32 bit doesn't fit into 32 bit.
either go with the given limits (16*16 bit) or
build a cascade with intermediate variables aka
MUL-ADD chains.
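
To illustrate what such a cascade might look like when the full 64-bit result is wanted, here is a sketch in C that uses only 16x16 multiplies. The helper name `mul32_full` and its pointer-based interface are assumptions made for this example, not anything from the thread or from PDOS.

```c
#include <stdint.h>

/* Hypothetical helper: full 64-bit product of two 32-bit values,
   built from four 16x16->32 multiplies, as a MUL-ADD chain would be. */
void mul32_full(uint32_t a, uint32_t b, uint32_t *hi, uint32_t *lo)
{
    uint32_t al = a & 0xFFFFu, ah = a >> 16;
    uint32_t bl = b & 0xFFFFu, bh = b >> 16;

    uint32_t p0 = al * bl;   /* weight 2^0  */
    uint32_t p1 = al * bh;   /* weight 2^16 */
    uint32_t p2 = ah * bl;   /* weight 2^16 */
    uint32_t p3 = ah * bh;   /* weight 2^32 */

    /* Middle column: three 16-bit values, so the sum fits easily in 32 bits. */
    uint32_t mid = (p0 >> 16) + (p1 & 0xFFFFu) + (p2 & 0xFFFFu);

    *lo = (mid << 16) | (p0 & 0xFFFFu);
    *hi = p3 + (p1 >> 16) + (p2 >> 16) + (mid >> 16);  /* mid >> 16 is the carry */
}
```

If only the low 32 bits are kept, p3 and the upper halves of p1 and p2 drop out, which leaves three multiplies.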
__
wolfgang

Paul Edwards

Apr 23, 2021, 11:21:50 PM

Thanks for the algorithm! I thought I might be able to do that,
but my brain started to melt down. Here's what I came up with,
which causes a hang, but at least it happened after I got the
results of some calculations. I'll see if I can figure out what
is happening.

; multiply cx:bx by dx:ax, result in dx:ax

public __I4M
__I4M:
public __U4M
__U4M:
public f_lxmul@
f_lxmul@ proc
push bp
mov bp,sp

push bx
push cx
push si
push di

push ax
push bx

; I think this multiplies bx * ax and puts the upper 16 bits in ax
; and lower 16 bits in bx
mul bx

; Save upper 16 in si and lower 16 in di
mov si, ax
mov di, bx

; This does the equivalent of bx * dx
pop bx
mov ax, dx
mul bx
mov dx, ax

; Now we do cx * ax with upper 16 bits in ax and lower in cx
pop ax
mul cx

; Now we need to add the results of those two multiplies together
; lower 16 bits first, so we can get the carry
push bp ; ran out of registers!
mov bp, bx
mov bx, ax
mov ax, 1
add dx, cx
jc noone
mov ax, 1
noone:

push ax

; Now the other lower 16 bits we saved
mov ax, 1
add dx, di
jc noone2
mov ax, 1
noone2:

push ax

; Upper 16 bits
mov ax, bx
add bx, ax
pop ax
add bx, ax ; one carry
pop ax
add bx, ax ; the other carry
mov ax, bp
add bx, ax

; store in proper output register
mov dx, bx

pop bp

pop di
pop si
pop cx
pop bx

pop bp
ret
f_lxmul@ endp

BFN. Paul.

Paul Edwards

Apr 23, 2021, 11:21:52 PM

On Saturday, April 24, 2021 at 10:51:35 AM UTC+10, wolfgang kern wrote:

> [x8086 only]
> > ; multiply cx:bx by dx:ax, result in dx:ax
> the result of 32*32 bit doesn't fit into 32 bit.

Good point. I didn't think of that. I can't multiply
17 bits by 17 bits, one of the registers needs to
be 0. But I assume I need to at least overflow in
a predictable manner.

> either go with the given limits (16*16 bit) or
> build a cascade with intermediate variables aka
> MUL-ADD chains.

See my most recent post. :-)

BFN. Paul.

wolfgang kern

Apr 24, 2021, 4:52:31 AM

On 24.04.2021 05:17, Paul Edwards wrote:

>> [x8086 only]
>>> ; multiply cx:bx by dx:ax, result in dx:ax
>> the result of 32*32 bit doesn't fit into 32 bit.
>
> Good point. I didn't think of that. I can't multiply
> 17 bits by 17 bits, one of the registers needs to
> be 0. But I assume I need to at least overflow in
> a predictable manner.
>
>> either go with the given limits (16*16 bit) or
>> build a cascade with intermediate variables aka
>> MUL-ADD chains.
>
> See my most recent post. :-)

you create a stack frame but don't use a single variable in it,
and it may hang because your stack isn't balanced.
__
wolfgang

Terje Mathisen

Apr 24, 2021, 6:22:39 AM

As several have noted, the code above is missing at least one MUL!

Please test it, then feel free to use (with or without attribution) this
totally untested but reasonably efficient/short code:

mov si,ax
mov di,dx
mul cx ;; hi * lo
xchg ax,di ;; first product saved in DI, grab original DX
mul bx ;; lo * hi
add di,ax ;; top word of result

mov ax,si ;; retrieve original AX
mul bx ;; lo * lo
add dx,di

At this point DX:AX has the low 32 bits of the multiplication result.
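
One way to check a sequence like this without an 8086 at hand is to model the registers in C and compare against ordinary 32-bit multiplication. The sketch below does that step by step. The `regs16` struct and the names `mul16` and `model_lxmul` are hypothetical, introduced only for this illustration.

```c
#include <stdint.h>

/* Hypothetical register file for the model. */
typedef struct { uint16_t ax, bx, cx, dx, si, di; } regs16;

/* 8086 MUL r16: DX:AX = AX * r16 (unsigned). */
static void mul16(regs16 *r, uint16_t src)
{
    uint32_t p = (uint32_t)r->ax * src;
    r->ax = (uint16_t)p;
    r->dx = (uint16_t)(p >> 16);
}

/* Model of the posted sequence: cx:bx times dx:ax, result in dx:ax. */
uint32_t model_lxmul(uint16_t cx, uint16_t bx, uint16_t dx, uint16_t ax)
{
    regs16 r = { ax, bx, cx, dx, 0, 0 };
    uint16_t t;

    r.si = r.ax;                       /* mov si,ax  */
    r.di = r.dx;                       /* mov di,dx  */
    mul16(&r, r.cx);                   /* mul cx     */
    t = r.ax; r.ax = r.di; r.di = t;   /* xchg ax,di */
    mul16(&r, r.bx);                   /* mul bx     */
    r.di = (uint16_t)(r.di + r.ax);    /* add di,ax  */
    r.ax = r.si;                       /* mov ax,si  */
    mul16(&r, r.bx);                   /* mul bx     */
    r.dx = (uint16_t)(r.dx + r.di);    /* add dx,di  */

    return ((uint32_t)r.dx << 16) | r.ax;
}
```

The model also makes it visible that SI and DI are overwritten, so a caller that needs them preserved would presumably have to save and restore them around the sequence.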

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Anton Ertl

Apr 24, 2021, 10:08:00 AM

Paul Edwards <muta...@nospicedham.gmail.com> writes:
>On Saturday, April 24, 2021 at 10:51:35 AM UTC+10, wolfgang kern wrote:
>
>> [x8086 only]
>> > ; multiply cx:bx by dx:ax, result in dx:ax
>> the result of 32*32 bit doesn't fit into 32 bit.
>
>Good point. I didn't think of that. I can't multiply
>17 bits by 17 bits, one of the registers needs to
>be 0. But I assume I need to at least overflow in
>a predictable manner.

The usual way is to produce the lower 32 bits of the result, i.e.,
produce a*b mod 2^32. And thanks to the magic of two's-complement
arithmetic, the result is the same for unsigned multiplication and for
signed multiplication (the results for the high 32 bits would differ,
but you are not interested in that).
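
A small C sketch can make the point concrete. The function names `mul_lo_unsigned` and `mul_lo_signed` are made up for this example, and the signed version widens to 64 bits only to sidestep signed overflow, which is undefined behaviour in C.

```c
#include <stdint.h>

/* Low 32 bits of the product, computed on unsigned values. */
uint32_t mul_lo_unsigned(uint32_t a, uint32_t b)
{
    return a * b;  /* unsigned arithmetic wraps mod 2^32 */
}

/* Low 32 bits of the signed product, computed in 64 bits so that
   no signed overflow can occur. */
uint32_t mul_lo_signed(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a * (int64_t)b;
    return (uint32_t)(uint64_t)wide;
}
```

Feeding the same bit patterns to both functions, for example -3 and 5, gives the same low 32 bits, which would explain why a single routine can serve both the signed and unsigned entry points.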

- anton
--
M. Anton Ertl Some things have to be seen to be believed
an...@mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html

Paul Edwards

Apr 24, 2021, 5:12:12 PM

Thanks so much!!!

I have tested it and it works fine. I have committed the
change, with attribution:

https://sourceforge.net/p/pdos/gitcode/ci/master/tree/pdpclib/dossupa.asm

BFN. Paul.