# Rounding on integer division

33 views

### James Harris

Jun 18, 2021, 4:12:49 AMJun 18
to
I've been putting this off for years but I have to tackle it some time
and I've made a start. It's the seemingly innocent problem of integer
division.

In general, the result of an integer division will require rounding and
a choice must be made as to which way to round. (There is rounding in
floating point, too, but that's a separate topic.)

Some languages take the easy way out and say that integer division
yields whatever result is yielded by the hardware but that's not
portable so I am looking for a default rounding mode which is most
useful to programmers that can be part of the language specification
while also being suitable for fast implementation.

AISI for signed integers there are these potential rounding modes

round up (towards positive infinity)
round down (towards negative infinity)
round towards zero
round away from zero
round to the nearest integer

Furthermore, for the last of those, round nearest, there are values
which are exactly mid way between two integers so there needs to be a
tie breaker to say what will happen in such a case, leading to these
variants:

round nearest up
round nearest down
round nearest towards zero
round nearest away from zero
round nearest even
round nearest odd

meaning the rounding would be to the nearest integer except that if two
integers were exactly the same distance away then the direction would be
up, down, etc, as stated.

I should add that for /unsigned/ integers the situation is similar
except that rounding towards zero is the same as rounding down, etc.

In other words, I make it that there are ten potential rounding modes
for signed integers and six potential rounding modes for unsigned ints,
and the question is over which to choose (at least as default).

Anyone disagree so far?

That's the overview. I'll reply with some of the mathematics of the issue.

--
James Harris

### Dmitry A. Kazakov

Jun 18, 2021, 4:36:06 AMJun 18
to
I do. Integer division is defined mathematically. The following axioms
apply:

1. a = (a/b) * b + a rem b

The definition of full division

2. a > 0, b > 0 => a/b >= 0

The result is not negative when arguments are not.

3. a > 0, b > 0 -> a rem b >= 0

The remainder is not negative when arguments are not.

4. (-a)/b = -(a/b)
a/(-b) = -(a/b)

Multiplication association laws

These leave one valid implementation: truncation, e.g. rounding towards
zero.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

### James Harris

Jun 18, 2021, 4:59:41 AMJun 18
to
On 18/06/2021 09:12, James Harris wrote:

...

> In other words, I make it that there are ten potential rounding modes
> for signed integers and six potential rounding modes for unsigned ints,
> and the question is over which to choose (at least as default).

...

> That's the overview. I'll reply with some of the mathematics of the issue.

If we are dividing A by B there are two potentially useful results: the
quotient and the remainder, and that there are two types of remainder.
One is where the remainder is what's left over of A. The other is where
the remainder is what's left over of B.

For example,

-35 / 10

can lead to a remainder of 5 or -5, depending on how it's seen.
Following what seems to be the most common convention for MOD and REM
operators,

REM = -5 (i.e. what's left over of the -35).
MOD = 5 (i.e. what's left over of the 10).

In other words, REM has the sign of the dividend, MOD has the sign of
the divisor. I think that, also, REM will always be between 0 and the
dividend whereas MOD will always be between 0 and the divisor.

Tedious, isn't it! Now you know why I've been putting this off.....!

Where it gets slightly interesting is that I think that with the
appropriate rounding mode for division the other operations may be
natural consequences. Taking the above division,

-35 / 10

could be seen as either

-4 remainder 5
-3 remainder -5

so that in all cases if you add the remainder to (the divisor times the
quotient) then you get the original dividend. That's the Ada definition
of rem as set out in point 5 of

But that's enough of that for now. I may come back to that particular
point but to save this post getting even more turgid let me ask you what
rounding mode or modes you think a programming language ought to support
for integer division. And which do you think should be default?

I think my own personal preference for integers is "round down" but I
see that Intel's signed divide operation uses "round towards zero" which
seems to me all kinds of wrong. :-(

Also, as a programmer, what integer rounding mode do you prefer a
language to have as default and would you see value in a language
providing other modes or would you consider that an unwanted complexity,
e.g. when reading someone else's code?

--
James Harris

### Dmitry A. Kazakov

Jun 18, 2021, 5:30:07 AMJun 18
to
On 2021-06-18 10:59, James Harris wrote:

> Following what seems to be the most common convention for MOD and REM
> operators,
>
>   REM = -5 (i.e. what's left over of the -35).
>   MOD =  5 (i.e. what's left over of the 10).

That is remainder proper vs. modulo ring complement.

mod is the number to add to the result of division to get back at the
ring's zero. The ring modulo is the dividend. The sign of the complement
is same as of the modulo = the sign of the dividend. The complement must
be from 0 to modulo - 1.

This gives you Ada definitions you refer to.

### David Brown

Jun 18, 2021, 5:44:50 AMJun 18
to
That is one way to define division and remainder of integers. But it is
not the only way. You can also view "a / b" to mean "add or subtract
multiples of "b" to "a", until the remainder is in the range 0 up to
abs(b) - 1. That gives you round down (I think :-) ).

The round towards zero is nicer mathematically, IMHO, but it does mean
you can divide by a positive number and end up with a negative remainder
- that is not always something you want. And rounding down can be more
efficient than rounding towards zero.

As an example, C90 left it to the implementation to decide between round
to zero or round down. C99 then fixed it as round towards zero for
consistency and predictability.

My vote is - like yours - on round towards zero. But it is not quite as
simple a decision as you imply.

(For unsigned types, the result is the same - and there is no question.)

### Andy Walker

Jun 18, 2021, 7:26:45 AMJun 18
to
On 18/06/2021 10:44, David Brown wrote:
> The round towards zero is nicer mathematically, IMHO, [...].

We can all bandy around axioms [esp Dmitry]! But some seem

(a) 23 cakes shared among 5 children is 4 each with 3 left over.

It would be absurd to claim they get 5 each with -2 left over.
So for positive integers, we always have rounding down with
remainder [if any] positive.

(b) If 5 more cakes appear, then they each get 1 more.

Consequently, if 5n more cakes appear, then each get n more,
and for consistency this has to apply equally if n is negative
[else taking away 5 cakes and then giving them back could
change the result]. So for negative numerators, we again have
to round down with remainder positive. This works better with
bank balances than with cakes.

(c) -23 cakes shared among -5 children is the same as (a).

Signs cancel in the same way here as throughout the rest of
mathematics. Negative children don't work very well -- there
are reasons why for thousands of years zero and negative
numbers were not "proper" numbers! -- but the maths does.

Everything else follows; and anything different is for
computational efficiency rather than mathematical niceness.

FWIW, any other rounding regime is easily defined in terms
of the Chosen One, so it's not really going to be a problem for
anyone whose special requirement differs from that.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Litolff

### David Brown

Jun 18, 2021, 8:13:42 AMJun 18
to
On 18/06/2021 13:26, Andy Walker wrote:
> On 18/06/2021 10:44, David Brown wrote:
>> The round towards zero is nicer mathematically, IMHO, [...].
>
>     We can all bandy around axioms [esp Dmitry]!  But some seem
>
> (a)  23 cakes shared among 5 children is 4 each with 3 left over.
>
>      It would be absurd to claim they get 5 each with -2 left over.
>      So for positive integers, we always have rounding down with
>      remainder [if any] positive.

Agreed. (And note that rounding down and rounding towards zero are the
same for non-negative values.)

>
> (b)  If 5 more cakes appear, then they each get 1 more.
>
>      Consequently, if 5n more cakes appear, then each get n more,
>      and for consistency this has to apply equally if n is negative
>      [else taking away 5 cakes and then giving them back could
>      change the result].  So for negative numerators, we again have
>      to round down with remainder positive.  This works better with
>      bank balances than with cakes.

That would also be rounding down, but now for negative numerator.

>
> (c)  -23 cakes shared among -5 children is the same as (a).
>
>      Signs cancel in the same way here as throughout the rest of
>      mathematics.  Negative children don't work very well -- there
>      are reasons why for thousands of years zero and negative
>      numbers were not "proper" numbers! -- but the maths does.
>
>     Everything else follows;  and anything different is for
> computational efficiency rather than mathematical niceness.
>
>     FWIW, any other rounding regime is easily defined in terms
> of the Chosen One, so it's not really going to be a problem for
> anyone whose special requirement differs from that.
>

All this shows is that there are different ways to define division over
signed integers, which each give results that can be viewed as
reasonable, useful and intuitive from different viewpoints.

In reality, there is no good intuitive definition - because we don't
have an intuitive feel for -23 children or -5 cakes. There is no single
mathematical definition that will do the job, partly because "division"
involves two results, not just one, and because you can't make a
division function on integers that gives the key result you really want,
that (a / b) * b == a. And then we throw into the mix the fact that
numbers in a programming language rarely represent mathematical integers
exactly - they are usually bounded, and have odd effects at their
boundaries.

In programming languages, round down and round towards zero are the two
common strategies used. Each can be justified "intuitively" or
"mathematically".

And for those that think signed integers should be defined as wrapping,
because they think being "more defined" automatically means "better",
things get more complicated. For example, if we work with 16-bit values
(keeping the numbers smaller), then -5 / 3 should be -21847 remainder 0,
since -21847 * 3 equals -5 in wrapping 16-bit signed arithmetic.

### Bart

Jun 18, 2021, 9:09:19 AMJun 18
to
On 18/06/2021 09:59, James Harris wrote:
> On 18/06/2021 09:12, James Harris wrote:
>
> ...
>
>> In other words, I make it that there are ten potential rounding modes
>> for signed integers and six potential rounding modes for unsigned
>> ints, and the question is over which to choose (at least as default).
>
> ...
>
>> That's the overview. I'll reply with some of the mathematics of the
>> issue.
>
> If we are dividing A by B there are two potentially useful results: the
> quotient and the remainder, and that there are two types of remainder.
> One is where the remainder is what's left over of A. The other is where
> the remainder is what's left over of B.

I've never heard of that one.

> For example,
>
>   -35 / 10
>
> can lead to a remainder of 5 or -5, depending on how it's seen.
> Following what seems to be the most common convention for MOD and REM
> operators,
>
>   REM = -5 (i.e. what's left over of the -35).
>   MOD =  5 (i.e. what's left over of the 10).

But will the 5 always be 5?

Perhaps you should have chosen values where the remainder wasn't exactly
half of B.

### Dmitry A. Kazakov

Jun 18, 2021, 9:22:28 AMJun 18
to
On 2021-06-18 14:13, David Brown wrote:

> In reality, there is no good intuitive definition - because we don't
> have an intuitive feel for -23 children or -5 cakes. There is no single
> mathematical definition that will do the job, partly because "division"
> involves two results, not just one, and because you can't make a
> division function on integers that gives the key result you really want,
> that (a / b) * b == a.

Actually there is. The concept is called continuation or extension. For
example, you take a system like natural numbers and then notice that
some elements are missing, e.g. a - b when a < b. This way negative
numbers can be defined as an extension of the set of natural numbers
fulfilled with negative ideals -a.

Doing so you keep as much properties intact as possible, all, if you
can. That way you can formally prove that there is really only one way
to produce negative numbers. Similarly, there is only one way to define
division of negative numbers. Because you want to keep fundamental
properties like (-a)/b = -(a/b). You have that for multiplication and
want to have it for its inverse as well.

Likewise, there is only one way to extend integers to reals. And only
one way to extend that to complex numbers. And there is no "good" way to
go any further without losing too much (e.g. quaternions).

### Dmitry A. Kazakov

Jun 18, 2021, 9:36:20 AMJun 18
to
On 2021-06-18 15:09, Bart wrote:

> But will the 5 always be 5?
>
> Perhaps you should have chosen values where the remainder wasn't exactly
> half of B.

The absolute value of the complement equals the absolute value of the
remainder if the divisor and the dividend have same signs.

|mod| = |rem|

If signs differ then, in absolute values the relation is this:

zero remains zero
everything else is

|mod| = |divisor| - |rem| (in absolute values)

E.g. -35 rem 3 = -2
-35 mod 3 = 1 (positive modulo 3)

2 + 1 = 3

### Bart

Jun 18, 2021, 9:40:54 AMJun 18
to
On 18/06/2021 09:12, James Harris wrote:
Disagree with what, that all this is far too complicated?

This is the approach I try use (here A and B are both positive):

* There is no rounding, not to nearest value anyway. The result of A/B
is the number of times you can subtract B from A while B>=A.

* With remainder, it is the final value of A in that last step

* When one operand is negative, the result should be negative. So (-A)/B
or A/(-B) is the same -(A/B), and ideally the same for remainder (but
see below).

* When both are negative, then they cancel out.

So that is a simple model, where the operations are only defined for
positive values, and any negative values are dealt with on top.

As to how it works in practice:

* x64 DIV hardware deals with A/B according to that model, including signs

* x64 REM hardware (remainder from DIV) almost does the same, but can
give a different sign for the result for A/-B or -A/-B. (Not sure what
what they mean anyway. Perhaps always apply abs() to the result.)

* My bigint library, which uses signed magnitude, gives the same results
as DIV and REM above. This was a surprise, as REM is done in software.
(I haven't looked at why yet; perhaps it was simply to emulate x64 REM.)

There is another issue which I've discovered. While the above rounds
towards zero, if I try and optimise division by a constant (positive)
power of two by turning it into a right shift, then for negative A, it
will round away from zero!

(I've seen the effects of this in a jpeg decoder where in a line like this:

p^ := clamp((bb*57/2048)+luminance, 0,255)

the /2048 is turned into a shift. Then lsb of the final pixel may be
different when bb is negative. Fortunately for this app, that doesn't
matter.)

### David Brown

Jun 18, 2021, 9:49:43 AMJun 18
to
Sure, you can do extensions in various ways (and you can, at least
sometimes, make choices about what you consider "fundamental properties"
- resulting in different extensions). And you can pick other algebraic
structures like modular rings for your integers.

But the prime purpose of integer types and operations in a programming
language are to be useful to the programmer - /not/ to satisfy
particular mathematical properties of mathematical integers, or other
number types. You only have a fixed finite size (for fixed size integer
types), and you will of necessity give up some mathematical properties
in creating your types and operations. It is always a compromise, and
it is a bad idea to declare certain properties as more "fundamental"
than others until you have looked at the options and decided on the
compromise that would work best for the language.

### Dmitry A. Kazakov

Jun 18, 2021, 9:54:40 AMJun 18
to
On 2021-06-18 14:13, David Brown wrote:

> In reality, there is no good intuitive definition - because we don't
> have an intuitive feel for -23 children or -5 cakes.

Let you borrowed \$100 from a bank. You own \$-100, the bank owns \$100.
Say pay off the loan in 3 years. The amount of your money per year is
\$-100/3 = \$-33 or \$33 of the bank's money. The remainder after 3 years
is \$-1 of your money or \$1 of the bank's money.

Calculations on your and bank's sides are same.

### Dmitry A. Kazakov

Jun 18, 2021, 10:09:22 AMJun 18
to
On 2021-06-18 15:49, David Brown wrote:

> Sure, you can do extensions in various ways (and you can, at least
> sometimes, make choices about what you consider "fundamental properties"
> - resulting in different extensions). And you can pick other algebraic
> structures like modular rings for your integers.

But we are talking about integers here.

> But the prime purpose of integer types and operations in a programming
> language are to be useful to the programmer - /not/ to satisfy
> particular mathematical properties of mathematical integers, or other
> number types.

No. The purpose of *all* data types is to model entities of real world.
Programming language as well as programmers using them have no any
purpose outside the real world.

Integer is such an external entity existing in mathematics.

If an entity is modeled improperly, that constitutes a bug. Bugs must be
fixed.

> You only have a fixed finite size (for fixed size integer
> types), and you will of necessity give up some mathematical properties
> in creating your types and operations.

Sure. Computational models are never accurate. E.g. floating-point
operations might be not exactly commutative.

But in the case of division there is no reason to introduce systematic
inaccuracy. Likewise, you could not argue for 2+2=5 on the ground that
integers are incomputable. They are not, but you still can implement 2+2
properly.

### Bart

Jun 18, 2021, 11:38:57 AMJun 18
to
You can argue that 2+2=0 where you only have 2 bits of precision. This
program demonstrates that (bitfield here are unsigned):

record rec=
byte dummy: (x:2, y:2, z:2)
end

rec a
a.x := 2
a.y := 2
a.z := a.x+a.y

println a.x
println a.y
println a.z

Output is 2 2 0. However 'print a.x+a.y' displays 4, since internal
calculations are 64 bits wide.

Then you can argue instead that 9223372036854775808 +
9223372036854775808 = 0.

### Dmitry A. Kazakov

Jun 18, 2021, 12:15:38 PMJun 18
to
No, we had that discussion before.

The result must be either mathematically correct or else an ideal = dome
non-integer value. That could be propagation of an exception or +Inf
etc, never 0.

### Charles Lindsey

Jun 18, 2021, 12:51:50 PMJun 18
to
On 18/06/2021 09:59, James Harris wrote:
> On 18/06/2021 09:12, James Harris wrote:
>
> ...
>
>> In other words, I make it that there are ten potential rounding modes for
>> signed integers and six potential rounding modes for unsigned ints, and the
>> question is over which to choose (at least as default).
>
> ...
>
>> That's the overview. I'll reply with some of the mathematics of the issue.
>
> If we are dividing A by B there are two potentially useful results: the quotient
> and the remainder, and that there are two types of remainder. One is where the
> remainder is what's left over of A. The other is where the remainder is what's
> left over of B.
>
> For example,
>
>   -35 / 10
>
> can lead to a remainder of 5 or -5, depending on how it's seen. Following what
> seems to be the most common convention for MOD and REM operators,
>
>   REM = -5 (i.e. what's left over of the -35).
>   MOD =  5 (i.e. what's left over of the 10).
>
Yes, though I would put it differently.

The 'modulo' function is a well-known mathematical construct, such that
a mod b
maps a into numbers of the form 0<=m<b if b>=0
and m>b<=0 if b<=0
so if you plot a mod b against a you get a sawtooth waveform which shows nothing
unusual in the vicinity of zero.

Then, if you use 'mod' as your remainder function, and you plot a/b against a,
you get a set of discrete points which all lie on a straight line, which is a
mathematically clean property.

So it follows that (a+n)/b == a/b +n*b
which is useful if, for example, you are trying to compute the coordinates of
some item in a matrix, and then want to shift the origin.

BUT if you choose the so-called 'rem' as your remainder function, and you then
plot a/b against a, you do NOT get a straight line, and if you want to shift the
origin in some coordinate system, then you get a complete mess (and likely some
bugs too).

So it is clear which version any mathematician would prefer.

Now when the first computers were built, division was not built into the order
code, but had to be performed by a subroutine (and in those days subroutines
were usually written by mathematicians.

BUT when division came to be incorporated into the hardware, the hardware was
designed by Engineers, and unless those engineers were closely supervised by
Mathematicians (which they usually were in Europe, though not in America) they
did it wrong. So when I was designing the Orion computer for Ferranti, I
consulted the people in Portland Place and did it right. IBM, in general, did it
wrong, and hence so did FORTRAN (AIUI).

I as pretty sure Algol 60 got it right, and I made sure that Algol 68 did
likewise, and so, it would seem, did Ada. C sat on the fence (they recognised
there was a problem) but then fell off the wrong side. C++ also sat on the fence
(and still sits there SFAIK); but it also defined the % operator to be either
'rem' or 'mod' so as to be consistent with '/'. Bjarne had a good mathematical
background, and knew his Algol 68.

> In other words, REM has the sign of the dividend, MOD has the sign of the
> divisor. I think that, also, REM will always be between 0 and the dividend
> whereas MOD will always be between 0 and the divisor.
>
> Tedious, isn't it! Now you know why I've been putting this off.....!
>
> Where it gets slightly interesting is that I think that with the appropriate
> rounding mode for division the other operations may be natural consequences.
> Taking the above division,
>
>   -35 / 10
>
> could be seen as either
>
>   -4 remainder 5
>   -3 remainder -5
>
> so that in all cases if you add the remainder to (the divisor times the
> quotient) then you get the original dividend. That's the Ada definition of rem
> as set out in point 5 of
>
>
>
>
>
> But that's enough of that for now. I may come back to that particular point but
> to save this post getting even more turgid let me ask you what rounding mode or
> modes you think a programming language ought to support for integer division.
> And which do you think should be default?
>
> I think my own personal preference for integers is "round down" but I see that
> Intel's signed divide operation uses "round towards zero" which seems to me all
> kinds of wrong. :-(

--
Charles H. Lindsey ---------At my New Home, still doing my own thing------
Tel: +44 161 488 1845 Web: https://www.clerew.man.ac.uk
Email: c...@clerew.man.ac.uk Snail-mail: Apt 40, SK8 5BF, U.K.
PGP: 2C15F1A9 Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5

### Bart

Jun 18, 2021, 2:23:26 PMJun 18
to
On 18/06/2021 17:51, Charles Lindsey wrote:
> On 18/06/2021 09:59, James Harris wrote:

>> If we are dividing A by B there are two potentially useful results:
>> the quotient and the remainder, and that there are two types of
>> remainder. One is where the remainder is what's left over of A. The
>> other is where the remainder is what's left over of B.
>>
>> For example,
>>
>>    -35 / 10
>>
>> can lead to a remainder of 5 or -5, depending on how it's seen.
>> Following what seems to be the most common convention for MOD and REM
>> operators,
>>
>>    REM = -5 (i.e. what's left over of the -35).
>>    MOD =  5 (i.e. what's left over of the 10).
>>
> Yes, though I would put it differently.
>
> The 'modulo' function is a well-known mathematical construct, such that
>    a mod b
> maps a into numbers of the form 0<=m<b if b>=0
> and                             m>b<=0 if b<=0
> so if you plot a mod b against a you get a sawtooth waveform which shows
> nothing unusual in the vicinity of zero.
>
> Then, if you use 'mod' as your remainder function, and you plot a/b
> against a, you get a set of discrete points which all lie on a straight
> line, which is a mathematically clean property.

If I use A68G's % (int divide) and MOD operators, then plotting A/3
against A gives me a staircase shape, increasing from left to right,
with an extra-wide step about zero.

The MOD operator (A vs. A MOD 3) gives me the horizontal sawtooth curve
you mentioned. I'm not sure where the straight line comes into it.

If I use my REM function (that I mentioned in another post, with its
funny +/- signs), I get a sawtooth curve that is reflected about both X
and Y axes (180 degree rotation I think).

But given D = A % B, and remainder R = A rem B, that gives me the
ability to recover A using:

D*B + R

So perhaps that funny +/- behaviour has a purpose after all.

### James Harris

Jun 19, 2021, 4:45:14 AMJun 19
to
Good. :-)

>
> Integer division is defined mathematically. The following axioms
> apply:
>
> 1. a = (a/b) * b + a rem b
>
> The definition of full division

I agree with that. It's a useful premise. Do you have any objection to
it being true for mod as well as for rem? Ada does not seem to define it
for mod and I can't see why not.

>
> 2. a > 0, b > 0 => a/b >= 0
>
> The result is not negative when arguments are not.
>
> 3. a > 0, b > 0 -> a rem b >= 0
>
> The remainder is not negative when arguments are not.
>
> 4. (-a)/b  = -(a/b)
>    a/(-b)  = -(a/b)
>
> Multiplication association laws
>
> These leave one valid implementation: truncation, e.g. rounding towards
> zero.
>

I am not sure your set of axioms deals with A and B both being negative
but AISI if the quotient is Q and the remainder is R then:

* Q will always have the 'xor' of the signs of A and B

* |R| will always be between 0 (inclusive) and |B| (exclusive)

* the sign of R, if R is nonzero, will depend on how Q was rounded

I don't see how you get to there being one valid implementation but it
seems increasingly likely that rounding a division towards zero gives
the Ada REM as a by-product whereas rounding towards minus infinity
would give the Ada MOD. That may be useful.

--
James Harris

### James Harris

Jun 19, 2021, 5:40:18 AMJun 19
to
On 18/06/2021 17:51, Charles Lindsey wrote:
> On 18/06/2021 09:59, James Harris wrote:

...

>
> BUT if you choose the so-called 'rem' as your remainder function, and
> you then plot a/b against a, you do NOT get a straight line, and if you
> want to shift the origin in some coordinate system, then you get a
> complete mess (and likely some bugs too).

I know what you mean. With rounding towards zero one gets irregular
results about x=zero rather than mathematical linearity. That's the main
reason I dislike it: it effectively treats below zero and above zero as
separate cases, which makes little or no sense to me.

>
> So it is clear which version any mathematician would prefer.

Curious to see that others prefer round towards zero. I think I'll have
to provide at least round down and round towards zero, and maybe others
as well.

...

>> I think my own personal preference for integers is "round down" but I
>> see that Intel's signed divide operation uses "round towards zero"
>> which seems to me all kinds of wrong. :-(
>
>

I may well do!

The thing is, on hardware which rounds division towards zero, such as
x86, how expensive is it to implement integer round down?

--
James Harris

### James Harris

Jun 19, 2021, 5:46:41 AMJun 19
to
On 18/06/2021 12:26, Andy Walker wrote:
> On 18/06/2021 10:44, David Brown wrote:
>> The round towards zero is nicer mathematically, IMHO, [...].
>
>     We can all bandy around axioms [esp Dmitry]!  But some seem
>
> (a)  23 cakes shared among 5 children is 4 each with 3 left over.
>
>      It would be absurd to claim they get 5 each with -2 left over.
>      So for positive integers, we always have rounding down with
>      remainder [if any] positive.
>
> (b)  If 5 more cakes appear, then they each get 1 more.
>
>      Consequently, if 5n more cakes appear, then each get n more,
>      and for consistency this has to apply equally if n is negative
>      [else taking away 5 cakes and then giving them back could
>      change the result].  So for negative numerators, we again have
>      to round down with remainder positive.  This works better with
>      bank balances than with cakes.

So you prefer rounding down and therefore (with a negative numerator)

-23 / 5 = -5 remainder 2

?

I don't necessarily disagree with that. Just checking whether that's
what you mean.

--
James Harris

### James Harris

Jun 19, 2021, 6:02:15 AMJun 19
to
On 18/06/2021 14:09, Bart wrote:
> On 18/06/2021 09:59, James Harris wrote:

...

>> For example,
>>
>>    -35 / 10
>>
>> can lead to a remainder of 5 or -5, depending on how it's seen.
>> Following what seems to be the most common convention for MOD and REM
>> operators,
>>
>>    REM = -5 (i.e. what's left over of the -35).
>>    MOD =  5 (i.e. what's left over of the 10).
>
> But will the 5 always be 5?
>
> Perhaps you should have chosen values where the remainder wasn't exactly
> half of B.

OK, here's an example of -32 / 10. The result can be either of

-4 remainder 8
-3 remainder -2

Whereas 32 / -10 can result in either of

-4 remainder -8
-3 remainder 2

For just changing where the sign is that's quite a contrast! :-(

In each case the lines are in the order

round down (and MOD)
round towards zero (and REM)

--
James Harris

### James Harris

Jun 19, 2021, 6:32:28 AMJun 19
to
Do you disagree with there being, at least in theory, those ten
potential rounding modes for signed and six for unsigned?

Also, would you support them all in a programming language? Or which
ones would you support? And, I have to ask, if you would support more
than one, how would you have a programmer express the rounding mode in
the syntax?

...

> (I've seen the effects of this in a jpeg decoder where in a line like this:
>
>     p^ := clamp((bb*57/2048)+luminance, 0,255)
>
> the /2048 is turned into a shift. Then lsb of the final pixel may be
> different when bb is negative. Fortunately for this app, that doesn't
> matter.)

Good point. What does two's complement right shift do to negative
numbers? A signed shift one place right applied to

2'1110_0001'
2'1110_0000'

would in both cases result in 2'1111_0000'.

I think (using an online calculator) that that means -31 and -32 both
become -16. Which I think means they both round DOWN.

If so, that's effectively another vote for Charles's suggestion to round
down by default, rather than towards zero.

--
James Harris

### Bart

Jun 19, 2021, 6:33:17 AMJun 19
to
On 19/06/2021 10:46, James Harris wrote:
So -1/1000 is -1, with remainder 999?

I can't say I find that intuitive. Apart from which, how much work is it
going to take to get those results given that most hardware works
differently?

If you want to get mathematical, then 1/1000 and -1/1000 will give the
float values 0.001 and -0.001, differing only in sign. Scale the values
by X, and the results are equally scaled.

If you convert such a float result to integer, then the result will not
match the proposal, unless you introduce equivalent rounding methods for
converting float to integer.

The usual behaviour is to round towards zero.

### James Harris

Jun 19, 2021, 7:14:05 AMJun 19
to
On 19/06/2021 11:33, Bart wrote:
> On 19/06/2021 10:46, James Harris wrote:

...

>> So you prefer rounding down and therefore (with a negative numerator)
>>
>>    -23 / 5 = -5 remainder 2
>
> So -1/1000 is -1, with remainder 999?

If, for that expression, one is using mod and mod is defined as
resulting in a number between 0 and 999 then yes. What other choice
would there be?

I think I'm going to have to offer the programmer a choice of at least
rounding integer division down and rounding towards zero, and maybe all
the others.

The questions, then, are which to make default and how to specify some
other rounding mode in the syntax.

>
> I can't say I find that intuitive.

AISI one can think of mod as providing a remainder in the range one
requires. For example, if one wants a remainder in the range 0 to 9 (say
for display of a decimal number) then one uses a denominator of 10. And
if one wants a remainder to be in the range 0 to -9 then one uses a
denominator of -10 - both with the mod operation (rounding down, i.e.
towards minus infinity).

Correspondingly, the case you mention looks like it is asking for a mod
in the range 0 to 999. And that's what it would get.

Essentially, one would use mod (rounding down) if one wanted a remainder
to be in a range determined by the /denominator/, whereas if one wanted
a remainder to be what's left /of the numerator/ then one would use rem
(rounding towards zero).

IOW, mod is controlled by the denominator and rem is controlled by the
numerator.

>
> Apart from which, how much work is it
> going to take to get those results given that most hardware works
> differently?

I don't know. Any suggestions? :-)

>
> If you want to get mathematical, then 1/1000 and -1/1000 will give the
> float values 0.001 and -0.001, differing only in sign. Scale the values
> by X, and the results are equally scaled.
>
> If you convert such a float result to integer, then the result will not
> match the proposal, unless you introduce equivalent rounding methods for
> converting float to integer.
>
> The usual behaviour is to round towards zero.

For some definition of 'usual'...!

--
James Harris

### James Harris

Jun 19, 2021, 7:55:13 AMJun 19
to
On 18/06/2021 19:23, Bart wrote:
> On 18/06/2021 17:51, Charles Lindsey wrote:

...

>> Then, if you use 'mod' as your remainder function, and you plot a/b
>> against a, you get a set of discrete points which all lie on a
>> straight line, which is a mathematically clean property.
>
> If I use A68G's % (int divide) and MOD operators, then plotting A/3
> against A gives me a staircase shape, increasing from left to right,
> with an extra-wide step about zero.

That's surprising. A wide step around zero seems to me to be
mathematically irregular and might be the result of rem but is not what
I would expect of a mod operator.

>
> The MOD operator (A vs. A MOD 3) gives me the horizontal sawtooth curve
> you mentioned. I'm not sure where the straight line comes into it.
>
> If I use my REM function (that I mentioned in another post, with its
> funny +/- signs), I get a sawtooth curve that is reflected about both X
> and Y axes (180 degree rotation I think).

I found a number of such graphs at

https://en.wikipedia.org/wiki/Modulo_operation#Variants_of_the_definition

Which of the graphs shown there match what you see?

--
James Harris

### Bart

Jun 19, 2021, 9:06:15 AMJun 19
to
The very first graph there shows an example of the wide step for divide,
in red.

The top left one matches the behaviour of my languages for integer
divide and rem. But none exactly match that of A68G, which is a
combination of the top red divide graph, and the left green sawtooth in
the second box.

### David Brown

Jun 19, 2021, 9:33:54 AMJun 19
to
I don't know Algol 68, but according to the table further down on the
page, it uses the "Euclidean definition" (in which the remainder is
always non-negative, regardless of the sign of the numerator and
denominator).

The page and the graphs are quite instructive, I think. They show that
there is always a balance of properties here - whichever algorithm a
language picks, there will be compromises and effects that don't seem
reasonable or that don't match useful mathematical identities.

The table for languages may be a useful guide for a new language. For
example, if you are making a new language that is mostly C-like in style
and appearance, picking the same division as C (truncation) would
minimise surprises.

### Dmitry A. Kazakov

Jun 19, 2021, 10:46:22 AMJun 19
to
Just apply the rules, mathematics is your friend:

(-a)/(-b) = [apply (-a)/b = -(a/b)]
= -(a/(-b)) = [apply a/(-b) = -(a/b)]
= -(-(a/b)) = [apply -(-a) = a]
= a/b

### Andy Walker

Jun 19, 2021, 11:32:09 AMJun 19
to
On 19/06/2021 10:46, James Harris wrote:
[I wrote:]
>> [...] So for negative numerators, we again have
>>       to round down with remainder positive.  This works better with
>>       bank balances than with cakes.
> So you prefer rounding down and therefore (with a negative numerator)
>   -23 / 5 = -5 remainder 2
> ?
> I don't necessarily disagree with that. Just checking whether that's
> what you mean.

Yes. In the terms of the Wiki page that you referred us to,
that's "floored" or "euclidean" division, and satisfies identities

(a + nb) % b == a % b + n and a % b * b + a %* b = a

for all integers a, b, n provided b > 0. IRL, b is rarely negative,
and I don't really care what happens in that case, but if pushed I
would vote for "a *% b" to be non-negative ["euclidean"].

As Bart said, A68G [correctly according to the RR] uses
"truncated" quotient and "euclidean" remainder, which seems to me
to be broken, even if Charles did insist on it on advice from
mathematicians. As a mathematician, I agree with Knuth, Boute and
Leijen, the only named people in the Wiki article, that "floored"
and "euclidean" are superior to "truncated". According to Boute
in that article, Pascal is broken as it fails the second identity
above; if so, then Algol [60 and 68] is broken in exactly the
same way.

Like Bart, I don't like the "wide step" [which follows from
not obeying the first identity above], and don't see how any real
mathematician could approve of it. If that's a consequence of
designing languages around what's in the hardware [which seems to
be the claim], then the hardware is also broken. How fast do you
want your incorrect results? The only mitigation is that [again,
IRL] the principal uses have positive numerators and denominators
[eg, in number theory], for which almost all definitions agree.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Chalon

### Bart

Jun 19, 2021, 12:14:20 PMJun 19
to
If you say so. I've never really looked into it in that detail!

> Also, would you support them all in a programming language?

Not in any of mine.

> Or which
> ones would you support? And, I have to ask, if you would support more
> than one, how would you have a programmer express the rounding mode in
> the syntax?

My atttitude is that I simply don't want those all complications, all
those possibilities to have to think about, in the language. It makes it
harder to read and write especially for those equally uninterested.

The language will provide one default way of operating (hopefully the
most useful and/or intuitive), but should provide means of achieving the
other results via other features.

For example, with the simpler operation of converting float to integer,
it will round down towards zero (so both 5.8 and 5.2 end up as 5, and
-5.8 and -5.2 as -5)

This I believe is the default behaviour of x64's cvttsd2si instruction
(double 2 int).

Anyway, for any other behaviour, I use a combinination of round() (to
nearest int, but return a float), floor() and ceil().

And the same with any special requirements for integer divide and rem.

Until this thread, I'd assumed mod and rem were the same! (What C calls
'mod', its % operator, gives the same results as my 'rem'.)

If you are creating languages for other people to use (or those like
me), they might know about any of this either.

>> the /2048 is turned into a shift. Then lsb of the final pixel may be
>> different when bb is negative. Fortunately for this app, that doesn't
>> matter.)
>
> Good point. What does two's complement right shift do to negative
> numbers? A signed shift one place right applied to
>
> 2'1110_0001'
> 2'1110_0000'
>
> would in both cases result in 2'1111_0000'.

> I think (using an online calculator) that that means -31 and -32 both
> become -16. Which I think means they both round DOWN.

Yes, -32 always becomes -16 with either divide or shift.

But -31 because -15 with divide or -16 with shift.

The discrepancy is the problem. I think that most divides will not be by
powers of two, so what divide does is more important.

A simple optimisation is to turn a divide by a constant power of two
into a shift. A more complex one is dividing by any constant value of
two, turned into multiplies and shifts.

But they ideally need to behave the same way as dividing by a variable.
(This is where my language can give a different result. I've put it down
as a quirk for now.)

### James Harris

Jun 19, 2021, 2:22:22 PMJun 19
to
On 19/06/2021 17:14, Bart wrote:
> On 19/06/2021 11:32, James Harris wrote:
>> On 18/06/2021 14:40, Bart wrote:

...

> My atttitude is that I simply don't want those all complications, all
> those possibilities to have to think about, in the language. It makes it
> harder to read and write especially for those equally uninterested.

Neither do I. But

(1) I want the computations of a program to be the same on any machine.
That means that the definition has to be part of the language so that
implementations are not free to produce different results.

(2) I suspect that some programmers will want such controls.

>
> The language will provide one default way of operating (hopefully the
> most useful and/or intuitive), but should provide means of achieving the
> other results via other features.
>
> For example, with the simpler operation of converting float to integer,
> it will round down towards zero (so both 5.8 and 5.2 end up as 5, and
> -5.8 and -5.2 as -5)
>
> This I believe is the default behaviour of x64's cvttsd2si instruction
> (double 2 int).

So I see, though truncation seems to be Intel's favourite (as it is of
some people here) so it's perhaps not completely surprising that they
have an instruction for it. The more general

cvtsd2si

rounds according to the rounding-control bits in the MXCSR. By default
that's 'round nearest even' but it can be changed to 'round down',
'round up' and 'round towards zero (aka truncate)'.

>
> Anyway, for any other behaviour, I use a combinination of round() (to
> nearest int, but return a float), floor() and ceil().

OK.

>
> And the same with any special requirements for integer divide and rem.
>
> Until this thread, I'd assumed mod and rem were the same! (What C calls
> 'mod', its % operator, gives the same results as my 'rem'.)
>
> If you are creating languages for other people to use (or those like
> me), they might know about any of this either.

Yes, that's a bit of a problem. I need to find some way to make the
default behaviour as intuitive as possible and not intimidating.

>
>>> the /2048 is turned into a shift. Then lsb of the final pixel may be
>>> different when bb is negative. Fortunately for this app, that doesn't
>>> matter.)
>>
>> Good point. What does two's complement right shift do to negative
>> numbers? A signed shift one place right applied to
>>
>> 2'1110_0001'
>> 2'1110_0000'
>>
>> would in both cases result in 2'1111_0000'.
>
>
>
>> I think (using an online calculator) that that means -31 and -32 both
>> become -16. Which I think means they both round DOWN.
>
> Yes, -32 always becomes -16 with either divide or shift.
>
> But -31 because -15 with divide or -16 with shift.
>
> The discrepancy is the problem. I think that most divides will not be by
> powers of two, so what divide does is more important.

I don't think the fact that it's a power of 2 is significant. For
example, say I pick -53 and -54. In 8-bit binary they are

2'11001011'
2'11001010'

shifting either of them right gives

2'11100101'

which is -27 which (as the true result for -53 / 2 is -26.5) is, again,
rounded down.

Notably, this is not CPU specific. It is pure 2's complement shifting
and should be the same on any CPU which uses 2's complement
representation. And both positive and negative numbers are thereby
rounded down.

Binary conversions courtesy of

https://www.omnicalculator.com/math/twos-complement

>
> A simple optimisation is to turn a divide by a constant power of two
> into a shift.

I agree. And if >> 1 rounds down then / 2 probably should do as well, at
least by default, as programmers are used to replacing one with the other.

>
> A more complex one is dividing by any constant value of
> two, turned into multiplies and shifts.
>
> But they ideally need to behave the same way as dividing by a variable.
> (This is where my language can give a different result. I've put it down
> as a quirk for now.)
>

Yes. You may agree that the code fragment

int d = 4, x, y ,z;

x = val >> 2;
y = val / 4;
z = val / d;

should assign the same value to x, y and z irrespective of what is in val.

--
James Harris

### Rod Pemberton

Jun 19, 2021, 2:33:32 PMJun 19
to
On Fri, 18 Jun 2021 09:12:45 +0100
James Harris <james.h...@gmail.com> wrote:

> In general, the result of an integer division will require rounding
> and a choice must be made as to which way to round. (There is
> rounding in floating point, too, but that's a separate topic.)
>

I see this as an issue of matching the hardware capabilities to the
demands of the environment.

hardware: central processing unit, microprocessor, floating-point unit

business environment: banking, accounting, brokerage, ...
academic environment: mathematics, physics, chemistry, engineering, ...
programming environment: C, Forth, Fortran, COBOL, ...

> Some languages take the easy way out and say that integer division
> yields whatever result is yielded by the hardware but that's not
> portable so I am looking for a default rounding mode which is most
> useful to programmers that can be part of the language specification
> while also being suitable for fast implementation.

Have you looked at the specifications for C and FORTH? Or, etc. like
COBOL and Fortran? I.e., I'd suspect this problem was "solved" decades
ago with the development of early programming languages.

E.g., a web page on some types of accounting rounding, which some lowly
programmer will have to figure out how to program:
https://www.syspro.com/blog/applying-and-operating-erp/rounding-for-accounting-accuracy/

--
What is hidden in the ground, when found, is hidden there again?

### Charles Lindsey

Jun 19, 2021, 5:27:50 PMJun 19
to
On 19/06/2021 12:55, James Harris wrote:
Yes, that page is very good. Clearly, the definition I gave for MOD was wrong
(it is a long time since I had dealing with the A68 Report). But I also not that
there is a pencilled correction in my copy of my Informal Introduction to the
explanation for MOD and I am suspicious of my description of OVER.

The plots given in that page are helpful, and show some nice linear one for the
"good" implementations.

I had not heard of the term "Euclidean division", but the paper by Boute that is

### Bart

Jun 19, 2021, 6:25:34 PMJun 19
to
On 19/06/2021 19:22, James Harris wrote:
> On 19/06/2021 17:14, Bart wrote:
>> On 19/06/2021 11:32, James Harris wrote:
>>> On 18/06/2021 14:40, Bart wrote:
>
> ...
>
>> My atttitude is that I simply don't want those all complications, all
>> those possibilities to have to think about, in the language. It makes
>> it harder to read and write especially for those equally uninterested.
>
> Neither do I. But
>
> (1) I want the computations of a program to be the same on any machine.
> That means that the definition has to be part of the language so that
> implementations are not free to produce different results.
>
> (2) I suspect that some programmers will want such controls.

I've looked in more detail at that page of graphs you linked to, which
gives a summary of support for MOD and REM across many languages (you'll
have to scroll past the graphs..).

There it seems that for integer operands, most languages support only
one version, and some support two.

Based on that, I may myself add support for an extra operator, once I
understand what it does, although it will be mainly for box-ticking.

Two alternatives is manageable I think, especially as so many well-known
languages offer them. But 10 (or was it 16) is too much.

>> This I believe is the default behaviour of x64's cvttsd2si instruction
>> (double 2 int).
>
> So I see, though truncation seems to be Intel's favourite (as it is of
> some people here) so it's perhaps not completely surprising that they
> have an instruction for it. The more general
>
>   cvtsd2si
>
> rounds according to the rounding-control bits in the MXCSR. By default
> that's 'round nearest even' but it can be changed to 'round down',
> 'round up' and 'round towards zero (aka truncate)'.

Yeah I remember similar flags for the x87. And I vaguely remember having
to switch from cvtsd2si to cvttsd2si likely for such issues. (I like how
these new opcodes trip off the tongue so easily.)

>> The discrepancy is the problem. I think that most divides will not be
>> by powers of two, so what divide does is more important.
>
> I don't think the fact that it's a power of 2 is significant. For
> example, say I pick -53 and -54. In 8-bit binary they are
>
>   2'11001011'
>   2'11001010'
>
> shifting either of them right gives
>
>   2'11100101'
>
> which is -27 which (as the true result for -53 / 2 is -26.5) is, again,
> rounded down.

Well, ANY shift will necessarily scale by a power of two.

Usually this is a purely logical operation on a collection of bits: bit
1 is shifted into bit 0. But when used to implement a fast divide, then
you need to look at how it rounds when shifting a number.

>> But they ideally need to behave the same way as dividing by a
>> variable. (This is where my language can give a different result. I've
>> put it down as a quirk for now.)
>>
>
> Yes. You may agree that the code fragment
>
>   int d = 4, x, y ,z;
>
>   x = val >> 2;
>   y = val / 4;
>   z = val / d;
>
> should assign the same value to x, y and z irrespective of what is in val.

My language does that, when val is positive or is unsigned.

However, even if it's fixed for /4 and /d (which can give different
values), I can't touch >>4; a right-shift is a right-shift with specific
behaviour.

(I can do this by generating a special op which is not DIV, neither SHR,
but an inbetween one. This does a runtime check for a positive value and
chooses whether to do DIV or SHR. A bit messy though, and it may lose
some of the benefits of doing shift.)

### Dmitry A. Kazakov

Jun 20, 2021, 4:09:46 AMJun 20
to
On 2021-06-20 00:25, Bart wrote:
> On 19/06/2021 19:22, James Harris wrote:

>> I don't think the fact that it's a power of 2 is significant. For
>> example, say I pick -53 and -54. In 8-bit binary they are
>>
>>    2'11001011'
>>    2'11001010'
>>
>> shifting either of them right gives
>>
>>    2'11100101'
>>
>> which is -27 which (as the true result for -53 / 2 is -26.5) is,
>> again, rounded down.

The morale: do not use shifts with numbers [*]

> Well, ANY shift will necessarily scale by a power of two.

Nope. It is the power of the radix, which can be 2 or 10, 16 whatever.
Right "arithmetic" shift pushes radix - 1 into to the leftmost unit.
E.g. say x is packed decimal (:-)) [**]

> Usually this is a purely logical operation on a collection of bits: bit
> 1 is shifted into bit 0. But when used to implement a fast divide, then
> you need to look at how it rounds when shifting a number.

Nope. Logical /= arithmetical. Logical shift /= arithmetical shift /=
circular shift /= dozens of other variants of shifts.

---------------
* Invest into optimization rather than mess with semantics. Modular x/2
is trivial to optimize to a shift on a 2's complement machine or maybe
it needs no optimization at all. Have you actually measured the times
x/2 and shift_right(x,2) take?

** IEEE 754 still deploys two messy decimal radix formats. Feel free to

### Bart

Jun 20, 2021, 6:42:45 AMJun 20
to
On 20/06/2021 09:09, Dmitry A. Kazakov wrote:
> On 2021-06-20 00:25, Bart wrote:
>> On 19/06/2021 19:22, James Harris wrote:
>
>>> I don't think the fact that it's a power of 2 is significant. For
>>> example, say I pick -53 and -54. In 8-bit binary they are
>>>
>>>    2'11001011'
>>>    2'11001010'
>>>
>>> shifting either of them right gives
>>>
>>>    2'11100101'
>>>
>>> which is -27 which (as the true result for -53 / 2 is -26.5) is,
>>> again, rounded down.
>
> The morale: do not use shifts with numbers [*]
>
>> Well, ANY shift will necessarily scale by a power of two.
>
> Nope. It is the power of the radix, which can be 2 or 10, 16 whatever.
> Right "arithmetic" shift pushes radix - 1 into to the leftmost unit.
> E.g. say x is packed decimal (:-)) [**]
>
>> Usually this is a purely logical operation on a collection of bits:
>> bit 1 is shifted into bit 0. But when used to implement a fast divide,
>> then you need to look at how it rounds when shifting a number.
>
> Nope. Logical /= arithmetical. Logical shift /= arithmetical shift /=
> circular shift /= dozens of other variants of shifts.

No. Shift is almost invariably meant as a binary shift.

I have a decimal number type which uses radix 1,000,000,000. Making A>>B
shift a billion at a time is not very useful!

When I had to implement someone's algorithm which involved shifts on big
numbers, I had to do them as binary shifts. Which here means that
multiplying or dividing by powers of two. Attempting anything else would
be meaningless.

Fortunately it didn't any other logical operations such as 'and' or 'or'
which would really require a pure binary type; emulation is too complex.

Shifting '-1 into the leftmost unit' is only relevant for two's
complement or the equivalent. My decimal type uses signed magnitude.

> ---------------
> * Invest into optimization rather than mess with semantics. Modular x/2
> is trivial to optimize to a shift on a 2's complement machine or maybe
> it needs no optimization at all. Have you actually measured the times
> x/2 and shift_right(x,2) take?

If I execute this loop:

to 100 million do
c:=a/b
c:=a/b
c:=a/b
c:=a/b
c:=a/b
c:=a/b
c:=a/b
c:=a/b
c:=a/b
c:=a/b
od

then when A is an integer variable, and B is a constant 256, it takes
about 0.5 seconds (using SAR). When B is either a constant 255 or a
variable containing 256, it takes 20 seconds (using CQO; IDIV).

If I compare dividing by constant 2**60 and constant 2**60-1, then it's
0.5 seconds vs 32 seconds.

On which machine is hardware divide faster than shift?

### Dmitry A. Kazakov

Jun 20, 2021, 7:52:33 AMJun 20
to
On 2021-06-20 12:42, Bart wrote:
> On 20/06/2021 09:09, Dmitry A. Kazakov wrote:
>> On 2021-06-20 00:25, Bart wrote:
>>> On 19/06/2021 19:22, James Harris wrote:
>>
>>>> I don't think the fact that it's a power of 2 is significant. For
>>>> example, say I pick -53 and -54. In 8-bit binary they are
>>>>
>>>>    2'11001011'
>>>>    2'11001010'
>>>>
>>>> shifting either of them right gives
>>>>
>>>>    2'11100101'
>>>>
>>>> which is -27 which (as the true result for -53 / 2 is -26.5) is,
>>>> again, rounded down.
>>
>> The morale: do not use shifts with numbers [*]
>>
>>> Well, ANY shift will necessarily scale by a power of two.
>>
>> Nope. It is the power of the radix, which can be 2 or 10, 16 whatever.
>> Right "arithmetic" shift pushes radix - 1 into to the leftmost unit.
>> E.g. say x is packed decimal (:-)) [**]
>>
>>> Usually this is a purely logical operation on a collection of bits:
>>> bit 1 is shifted into bit 0. But when used to implement a fast
>>> divide, then you need to look at how it rounds when shifting a number.
>>
>> Nope. Logical /= arithmetical. Logical shift /= arithmetical shift /=
>> circular shift /= dozens of other variants of shifts.
>
> No. Shift is almost invariably meant as a binary shift.

Even so, [binary] logical /= arithmetical /= circular.

> I have a decimal number type which uses radix 1,000,000,000. Making A>>B
> shift a billion at a time is not very useful!

Per definition decimal means radix = 10. Shifting billionary numbers is
no less useful than having them in first place.

> When I had to implement someone's algorithm which involved shifts on big
> numbers,

Do not use ill-defined algorithms.

>> ---------------
>> * Invest into optimization rather than mess with semantics. Modular
>> x/2 is trivial to optimize to a shift on a 2's complement machine or
>> maybe it needs no optimization at all. Have you actually measured the
>> times x/2 and shift_right(x,2) take?
>
> If I execute this loop:
>
>     to 100 million do
>         c:=a/b
>         c:=a/b
>         c:=a/b
>         c:=a/b
>         c:=a/b
>         c:=a/b
>         c:=a/b
>         c:=a/b
>         c:=a/b
>         c:=a/b
>     od
>
> then when A is an integer variable, and B is a constant 256, it takes
> about 0.5 seconds (using SAR). When B is either a constant 255 or a
> variable containing 256, it takes 20 seconds (using CQO; IDIV).
>
> If I compare dividing by constant 2**60 and constant 2**60-1, then it's
> 0.5 seconds vs 32 seconds.

So the compiler does not optimize expressions like

a/2**b

?

That was my point as a programmer to a compiler designer. Do not mess
with the semantics, it is not your job, it is mine. Fix that lazy
compiler first.

> On which machine is hardware divide faster than shift?

Arguably on every machine with an optimizing compiler.

Optimizing arithmetic into most effective machine instructions (e.g.
shifts if they are faster) is easier to do than doing so for a
hand-written prematurely "optimized" code. There are lots of properties
numeric operations have which the optimizing compiler may rely on. If
you break out of the model, none of that apply anymore. So the code
written in a sane way in accordance with laws of mathematics and common
sense will likely turn to be faster.

### Andy Walker

Jun 20, 2021, 8:53:10 AMJun 20
to
On 20/06/2021 12:52, Dmitry A. Kazakov wrote:
> On 2021-06-20 12:42, Bart wrote:
>> No. Shift is almost invariably meant as a binary shift.
> Even so, [binary] logical /= arithmetical /= circular.

Esp if, as per other parts of this and previous articles, you
[or the compiler] intend to use "shift" as an optimisation for division
by a power of two. Just another reason why you should do arithmetic on
integer types and logic on "bits" types, and not try to mix the two.

[...]
>> When I had to implement someone's algorithm which involved shifts on big numbers,
> Do not use ill-defined algorithms.

+1.

[...]
>> If I compare dividing by constant 2**60 and constant 2**60-1, then
>> it's 0.5 seconds vs 32 seconds.
> So the compiler does not optimize expressions like
>    a/2**b
> ?
> That was my point as a programmer to a compiler designer. Do not mess
> with the semantics, it is not your job, it is mine. Fix that lazy
> compiler first.

+1. Or perhaps +0.5, because before optimisation comes the
algorithm design and assessment of needs. Most of my programs run
in <1s elapsed time from pressing the button to getting the results
[inc compilation -- you can do a staggering amount of computation
in a second these days]; optimising that to 0.5s or even 0.01s
saves less time than it takes extra to type the flags to compile to
an optimised executable and run the result.

[...]
> Optimizing arithmetic into most effective machine instructions (e.g.
> shifts if they are faster) is easier to do than doing so for a
> hand-written prematurely "optimized" code. [...]

+1, again.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Nevin

### Bart

Jun 20, 2021, 9:04:53 AMJun 20
to
On 20/06/2021 12:52, Dmitry A. Kazakov wrote:
> On 2021-06-20 12:42, Bart wrote:
>

>> No. Shift is almost invariably meant as a binary shift.
>
> Even so, [binary] logical /= arithmetical /= circular.

No. As language designer then /I/ get to decide what >> and << mean, and
how they work on different types!

But largely that is in line with other languages do.

The de facto meaning of A<<B is A*(2**B), and of A>>B it is A/(2**B)
using integer divide.

>> I have a decimal number type which uses radix 1,000,000,000. Making
>> A>>B shift a billion at a time is not very useful!
>
> Per definition decimal means radix = 10.

From the user's point of view, it is decimal. It can represent 0.1
exactly for example, and printing large numbers is instant. I can switch
from base-1000000000 to base-10 by changing:

const digitwidth = 9
const digitbase = 1'000'000'000

to:

const digitwidth = 1
const digitbase = 10

It still works, but is at least a magnitude slower, so what's the point?

> Shifting billionary numbers is
> no less useful than having them in first place.
>
>> When I had to implement someone's algorithm which involved shifts on
>> big numbers,
>
> Do not use ill-defined algorithms.

It is not ill-defined once you know what they mean by << and >> (see above).

(This algorithm was for a fast version of the Ackermann function, but
I've lost the main reference and my code. However a version of it in Go
is here:
https://rosettacode.org/wiki/Ackermann_function#Expanded_version_with_arbitrary_precision),
which uses a 'Lsh' method for <<.)

>> If I compare dividing by constant 2**60 and constant 2**60-1, then
>> it's 0.5 seconds vs 32 seconds.
>
> So the compiler does not optimize expressions like
>
>    a/2**b

Huh? I said that in a/b, it will optimise the case where it KNOWS that b
is of the form 2**N. It can then generate code using SAR rather than
IDIV, with results which are 40 times faster in my test.

You were suggesting that hardware division might be faster than hardware
shift.

>> On which machine is hardware divide faster than shift?
>
> Arguably on every machine with an optimizing compiler.

Rubbish. If hardware divide is slower than hardware shift, no
optimisating compiler can make it faster unless:

(1) It turns divide into hardware shift
(2) It eliminates the divide op completely.

> Optimizing arithmetic into most effective machine instructions (e.g.
> shifts if they are faster) is easier to do than doing so for a
> hand-written prematurely "optimized" code. There are lots of properties
> numeric operations have which the optimizing compiler may rely on. If
> you break out of the model, none of that apply anymore. So the code
> written in a sane way in accordance with laws of mathematics and common
> sense will likely turn to be faster.

I think you misunderstood. I'm not manually turning A/B operations into
A>>N where N is some log2 of B. My compiler does that simple optimisation.

### Dmitry A. Kazakov

Jun 20, 2021, 9:46:10 AMJun 20
to
On 2021-06-20 15:04, Bart wrote:
> On 20/06/2021 12:52, Dmitry A. Kazakov wrote:
>> On 2021-06-20 12:42, Bart wrote:
>
>>> No. Shift is almost invariably meant as a binary shift.
>>
>> Even so, [binary] logical /= arithmetical /= circular.
>
> No. As language designer then /I/ get to decide what >> and << mean, and
> how they work on different types!

Nope. >> is a lexeme. The semantics of various shifts are defined across
the languages and hardware. You cannot decide. You can only introduce a
bug by using +,-,*,/ for something different.

> The de facto meaning of A<<B is A*(2**B), and of A>>B it is A/(2**B)
> using integer divide.

Nope, it is writing B into the stream A! (:-))

>>> I have a decimal number type which uses radix 1,000,000,000. Making
>>> A>>B shift a billion at a time is not very useful!
>>
>> Per definition decimal means radix = 10.
>
> From the user's point of view, it is decimal. It can represent 0.1
> exactly for example, and printing large numbers is instant.

I do not understand this. See decimal:

https://en.wikipedia.org/wiki/Decimal

> It still works,

What works? Are you trying to say that a number can be represented in
any radix. Yes it can. So what?

> but is at least a magnitude slower, so what's the point?

The point is that if you choose machine radix 1000000000 you must live
with the consequences for low-level instructions like machine
representation shifts. Ergo, do not expose low-level instructions in the
language.

>>> When I had to implement someone's algorithm which involved shifts on
>>> big numbers,
>>
>> Do not use ill-defined algorithms.
>
> It is not ill-defined once you know what they mean by << and >> (see
> above).

It is ill-defined because the reader must learn some non-standard
notation for well established things. Use 2**K and everybody will
understand you.

>>> If I compare dividing by constant 2**60 and constant 2**60-1, then
>>> it's 0.5 seconds vs 32 seconds.
>>
>> So the compiler does not optimize expressions like
>>
>>     a/2**b
>
> Huh? I said that in a/b, it will optimise the case where it KNOWS that b
> is of the form 2**N.

Right. How you could replace it with

right_shift (a, b)

otherwise?

>>> On which machine is hardware divide faster than shift?
>>
>> Arguably on every machine with an optimizing compiler.
>
> Rubbish. If hardware divide is slower than hardware shift, no
> optimisating compiler can make it faster unless:
>
> (1) It turns divide into hardware shift
> (2) It eliminates the divide op completely.

(3) The program is longer than single division.

> I think you misunderstood. I'm not manually turning A/B operations into
> A>>N where N is some log2 of B. My compiler does that simple optimisation.

Good, then there is no need to introduce broken numeric shifts anymore.
If the algorithm uses some special form of divisor, let the programmer
hint that in the program. Why do you believe that 2**N is the only such
thing? Why not 5**N or (3**N + 1) etc. Do you have operations for them?

a + 2**N
a * (b + 1)
...

The only reason people are obsessed with 2**N is because it is easy to
implement.

### Bart

Jun 20, 2021, 10:20:53 AMJun 20
to
On 20/06/2021 14:46, Dmitry A. Kazakov wrote:
> On 2021-06-20 15:04, Bart wrote:

>> From the user's point of view, it is decimal. It can represent 0.1
>> exactly for example, and printing large numbers is instant.
>
> I do not understand this. See decimal:
>
>    https://en.wikipedia.org/wiki/Decimal
>
> Numeral system base = radix.

The reality is that binary numbers are usually treated as groups of 8 or
32 or 64, and calculations are down with those groups.

You can argue that the effective digit size is based on 256 (or 2**64)
rather than 2.

My digits are similarly grouped, except that 1000000000 is implemented
on top of a 32-bit binary value, in the same way that ANY decimal system
has to be implemneted in a binary computer.

>> It still works,
>
> What works?

It still mostly behaves as a decimal representation with the
characteristics expected. Such as calculations that give the same
results as pencil and paper ones.

>> It is not ill-defined once you know what they mean by << and >> (see
>> above).
>
> It is ill-defined because the reader must learn some non-standard
> notation for well established things. Use 2**K and everybody will
> understand you.

** is a rather scary operation that may end up as a call to floating
point pow() in some languages.

It is also not as clear; most people understand what is meant by binary
shifts and they can express their intention better than arithmetic
multiplies and divides, especially if you also write 2**8 instead of 256.

>> (1) It turns divide into hardware shift
>> (2) It eliminates the divide op completely.
>
> (3) The program is longer than single division.

Quite a few programs are dominated by division, because it is so slow.

>> I think you misunderstood. I'm not manually turning A/B operations
>> into A>>N where N is some log2 of B. My compiler does that simple
>> optimisation.
>
> Good, then there is no need to introduce broken numeric shifts anymore.
> If the algorithm uses some special form of divisor, let the programmer
> hint that in the program. Why do you believe that 2**N is the only such
> thing? Why not 5**N or (3**N + 1) etc. Do you have operations for them?
>
>    a + 2**N
>    a * (b + 1)
>    ...
>
> The only reason people are obsessed with 2**N is because it is easy to
> implement.

That's a pretty good reason. BTW practically all computers are obsessed
with it too.

### Dmitry A. Kazakov

Jun 20, 2021, 1:42:29 PMJun 20
to
On 2021-06-20 16:20, Bart wrote:
> On 20/06/2021 14:46, Dmitry A. Kazakov wrote:
>> On 2021-06-20 15:04, Bart wrote:
>
>>> From the user's point of view, it is decimal. It can represent 0.1
>>> exactly for example, and printing large numbers is instant.
>>
>> I do not understand this. See decimal:
>>
>>     https://en.wikipedia.org/wiki/Decimal
>>
>> Numeral system base = radix.
>
> The reality is that binary numbers are usually treated as groups of 8 or
> 32 or 64, and calculations are down with those groups.

Regardless true or not this has nothing to do with decimal numbers. The
calculations are done according to the rules of mathematics, whatever
representation.

> You can argue that the effective digit size is based on 256 (or 2**64)
> rather than 2.

>>> It still works,
>>
>> What works?
>
> It still mostly behaves as a decimal representation with the
> characteristics expected.

Why should a representation be relevant here?

> Such as calculations that give the same
> results as pencil and paper ones.

Yep, shifting by units of 0..10**9-1 is computable with pencil and
paper. What would you expect?

>>> It is not ill-defined once you know what they mean by << and >> (see
>>> above).
>>
>> It is ill-defined because the reader must learn some non-standard
>> notation for well established things. Use 2**K and everybody will
>> understand you.
>
> ** is a rather scary operation that may end up as a call to floating
> point pow() in some languages.

Use a sane language. In Whitespace 2**K is a comment.

> It is also not as clear; most people understand what is meant by binary
> shifts and they can express their intention better than arithmetic
> multiplies and divides, especially if you also write 2**8 instead of 256.

I hope that most people learn something the elementary school...

>> Good, then there is no need to introduce broken numeric shifts
>> anymore. If the algorithm uses some special form of divisor, let the
>> programmer hint that in the program. Why do you believe that 2**N is
>> the only such thing? Why not 5**N or (3**N + 1) etc. Do you have
>> operations for them? What about
>>
>>     a + 2**N
>>     a * (b + 1)
>>     ...
>>
>> The only reason people are obsessed with 2**N is because it is easy to
>> implement.
>
> That's a pretty good reason. BTW practically all computers are obsessed
> with it too.

That reminds me a story about a drunk guy who searched for his lost
watch under the street lamp. When asked why there, he answered, that he
can see better under the lamp.

### James Harris

Jul 4, 2021, 12:54:21 PMJul 4
to
On 19/06/2021 20:34, Rod Pemberton wrote:
> On Fri, 18 Jun 2021 09:12:45 +0100
> James Harris <james.h...@gmail.com> wrote:
>
>> In general, the result of an integer division will require rounding
>> and a choice must be made as to which way to round. (There is
>> rounding in floating point, too, but that's a separate topic.)
>>
>
> I see this as an issue of matching the hardware capabilities to the
> demands of the environment.
>
> hardware: central processing unit, microprocessor, floating-point unit
>
> business environment: banking, accounting, brokerage, ...
> academic environment: mathematics, physics, chemistry, engineering, ...
> programming environment: C, Forth, Fortran, COBOL, ...

OK.

>
>> Some languages take the easy way out and say that integer division
>> yields whatever result is yielded by the hardware but that's not
>> portable so I am looking for a default rounding mode which is most
>> useful to programmers that can be part of the language specification
>> while also being suitable for fast implementation.
>
> Have you looked at the specifications for C and FORTH? Or, etc. like
> COBOL and Fortran? I.e., I'd suspect this problem was "solved" decades
> ago with the development of early programming languages.

Sort of. There's a great table of what different languages do at

https://en.wikipedia.org/wiki/Modulo_operation#In_programming_languages

but just look at how inconsistent the choices are!

>
> E.g., a web page on some types of accounting rounding, which some lowly
> programmer will have to figure out how to program:
> https://www.syspro.com/blog/applying-and-operating-erp/rounding-for-accounting-accuracy/
>

That was a really interesting read and I've given it a lot of thought. I
think the rounding methods it mentions are different from the integer
roundings which are the subject of this thread as the roundings in the
article are elective, rare and often unary. As such I think that where
they are needed they could best be effected by functions. For example,

rounded = argentina_rounding(original)

For integer rounding from division, however, it is an unavoidable
consequence of the operation itself, it's a reasonably frequent
operation, and it has two operands rather than one. So I've some up with
a suggestion (in a new thread started just now, qv) of how it might be
realised in a programming language.

--
James Harris