Detect Math Errors in Functions

Joseph Hesse

unread,

Sep 1, 2021, 1:32:16 AM9/1/21

to

The famous unproven Collatz math conjecture says that if you start with
any positive integer n and generate a sequence by repeatedly apply the
rule that if n is even the next term is n/2, otherwise the next term is
3*n + 1, then eventually the sequence will reach 1.

I wrote a C++ function that will generate the sequence from any positive
starting integer, see below, header files omitted. Since there is
integer arithmetic involved, there might be a math overflow. How does
one write code to detect math underflow or overflow errors in functions
that do math computations?

Thank you,
Joe

void GenerateSequence(unsigned long start)
{
unsigned long x = start;
try
{
while(x != 1)
{
x = (x%2 == 0 ? x/2 : 3*x+1);

// code to detect math overflow error,
// if error then do a throw string("Math Overflow");

cout << x << "\n";
}
}
catch(const string & msg)
{
cout << "Program terminated: " << msg << endl;
exit(1);
}
}

Barry Schwarz

unread,

Sep 1, 2021, 2:55:00 AM9/1/21

to

On Wed, 1 Sep 2021 07:31:53 +0200, Joseph Hesse <jo...@gmail.com>
wrote:

>The famous unproven Collatz math conjecture says that if you start with
>any positive integer n and generate a sequence by repeatedly apply the
>rule that if n is even the next term is n/2, otherwise the next term is
>3*n + 1, then eventually the sequence will reach 1.
>
>I wrote a C++ function that will generate the sequence from any positive
>starting integer, see below, header files omitted. Since there is
>integer arithmetic involved, there might be a math overflow. How does
>one write code to detect math underflow or overflow errors in functions
>that do math computations?
>
>Thank you,
>Joe
>
>void GenerateSequence(unsigned long start)
>{
> unsigned long x = start;
> try
> {
> while(x != 1)
> {
> x = (x%2 == 0 ? x/2 : 3*x+1);
>
> // code to detect math overflow error,
> // if error then do a throw string("Math Overflow");

In this particular case, it seems to me to be easier to prevent the
overflow before it occurs rather than try to detect it after it
occurs. You could replace the previous assignment with
if (x%2 == 0)
x /= 2;
else
{
if (x < ULONG_MAX/3)
x = 3*x+1;
else
// perform appropriate overflow processing

}

>
> cout << x << "\n";
> }
> }
> catch(const string & msg)
> {
> cout << "Program terminated: " << msg << endl;
> exit(1);
> }
>}

--
Remove del for email

Bonita Montero

unread,

Sep 1, 2021, 3:31:08 AM9/1/21

to

Better use numeric_limits<unsigned long>::max()

David Brown

unread,

Sep 1, 2021, 3:51:11 AM9/1/21

to

On 01/09/2021 08:54, Barry Schwarz wrote:
> On Wed, 1 Sep 2021 07:31:53 +0200, Joseph Hesse <jo...@gmail.com>
> wrote:
>
>> The famous unproven Collatz math conjecture says that if you start with
>> any positive integer n and generate a sequence by repeatedly apply the
>> rule that if n is even the next term is n/2, otherwise the next term is
>> 3*n + 1, then eventually the sequence will reach 1.
>>
>> I wrote a C++ function that will generate the sequence from any positive
>> starting integer, see below, header files omitted. Since there is
>> integer arithmetic involved, there might be a math overflow. How does
>> one write code to detect math underflow or overflow errors in functions
>> that do math computations?
>>

> In this particular case, it seems to me to be easier to prevent the
> overflow before it occurs rather than try to detect it after it
> occurs. You could replace the previous assignment with

I agree with that. I've always preferred the "look before you leap"
approach, rather than the "close your eyes, make the jump and let
someone else sweep up the mess" tactic.

(The rest is to the OP, rather than Barry.)

Another option is to use gcc's overflow-detecting builtins, or other
equivalent extensions for other compilers. If you are trying to do this
checking as fast as possible, you might be looking at SIMD support
through compiler vector extensions or processor-specific intrinsics.

For unsigned types, you can of course just do "y = 3 * x + 1;" and check
for "y < x", as the overflow is defined as wrapping.

You would probably also want to use "long long" (or uint64_t), rather
than "long" (which is 32-bit on some systems), as you want big numbers.
You might even want compiler-specific 128-bit types - after all, the
conjecture is known to be true up to at least 2 ^ 68.

The conjecture itself is rather interesting. Here are a couple of
interesting videos on it (there are many, but these are from reliable
and interesting sources, to save you from dry lectures or trisectors who
say they have solved it):

<https://www.youtube.com/watch?v=094y1Z2wpJg>
<https://www.youtube.com/watch?v=5mFpVDpKX70>

Barry Schwarz

unread,

Sep 1, 2021, 10:53:56 AM9/1/21

to

On Wed, 1 Sep 2021 09:30:53 +0200, Bonita Montero
<Bonita....@gmail.com> wrote:

>> if (x < ULONG_MAX/3)
>Better use numeric_limits<unsigned long>::max()

Just out of curiosity, in what way is it better? Given that it is
more characters to type, does it compile faster? Does the resulting
program occupy less memory? Will the program execute faster?

Mike Terry

unread,

Sep 1, 2021, 11:38:38 AM9/1/21

to

That kind of test works for pure addition, but not when multiplication
is involved.

For example, mod 16 we have if x = 8, then y = 3*8 + 1 = 9, and y >= x.
So your test would fail, but there was overflow! Barry's test is the
way I've gone in the past, or use some library class built for the
purpose of overflow detection.

Mike.

Bonita Montero

unread,

Sep 1, 2021, 11:50:30 AM9/1/21

to

One is C-style, one is C++-style and synatactically more explicit.
All other aspects are irrelevant.

James Kuyper

unread,

Sep 1, 2021, 12:28:35 PM9/1/21

to

The only advantage I can see doesn't apply when the type is named
explicitly. std::numeric_limits<T>::max would be a much better approach
to use if T is an arbitrary integer type.

Mike Terry

unread,

Sep 1, 2021, 12:38:47 PM9/1/21

to

Better because that's what BM prefers? :) Perhaps it saves the reader
needing to understand the ULONG_MAX macro, although the intent of
ULONG_MAX seems clear enough... New style C++ versus old style C, if
that's important to you? Familiarity for the programmers in the
maintenance team? (..could go either way..)

There's certainly a good case for

if (x < numeric_limits<decltype (x)>::max() / 3)

if the type of x were unclear, or could be subject to future change but
it's even more characters to type, if you're intent on saving typing energy.

Or even clearer (with no worries about clever handling of edge cases)
would be

if (x <= (numeric_limits<decltype (x)>::max() - 1) / 3)

as that matches /visibly/ with the calculation about to be performed.

Yes, the test will match the same x values, but this way someone reading
the code won't have to think "aha very clever - integral types in C++
must have an even number of bits [is that even correct??? better check
the standard!] and 2^n mod 3 = 1 when n is even, so the natural -1 that
seems to be missing won't make any difference. The test would be off by
one of n were odd, or if the numeric type were signed, but it's not so
we're good - cunning!". (And yes, even then, the value would only be
"off by one" in the safe direction, but why be off by one at all? Why
not be (visibly and clearly) spot on?)

Regards,
Mike.

David Brown

unread,

Sep 1, 2021, 1:25:43 PM9/1/21

to

You are right, of course - sorry for posting without thinking here.

Juha Nieminen

unread,

Sep 1, 2021, 2:22:20 PM9/1/21

to

Joseph Hesse <jo...@gmail.com> wrote:
> I wrote a C++ function that will generate the sequence from any positive
> starting integer, see below, header files omitted. Since there is
> integer arithmetic involved, there might be a math overflow. How does
> one write code to detect math underflow or overflow errors in functions
> that do math computations?

This doesn't answer the question you are asking, and while what you
are doing may be interesting for curiosity's sake, or to practice the
art of programming, it's ultimately relatively futile because quite
famously the conjecture has been tested for all starting numbers
up to about 2^68.

It might be a more interesting exercise to be able to calculate
the collatz sequence for arbitrarily large numbers. You can use
a multi-precision library such as GMP to do calculations with
such numbers (which also indirectly solves your problem because
the numbers won't overflow, but grow as needed instead).

Bonita Montero

unread,

Sep 1, 2021, 3:03:15 PM9/1/21

to

Am 01.09.2021 um 07:31 schrieb Joseph Hesse:
> The famous unproven Collatz math conjecture says that if you start with
> any positive integer n and generate a sequence by repeatedly apply the
> rule that if n is even the next term is n/2, otherwise the next term is
> 3*n + 1, then eventually the sequence will reach 1.
>
> I wrote a C++ function that will generate the sequence from any positive
> starting integer, see below, header files omitted. Since there is
> integer arithmetic involved, there might be a math overflow. How does
> one write code to detect math underflow or overflow errors in functions
> that do math computations?
>
> Thank you,
> Joe
>
> void GenerateSequence(unsigned long start)
> {
> unsigned long x = start;
> try
> {
>     while(x != 1)
>     {
>       x = (x%2 == 0 ? x/2 : 3*x+1);

Will x ever has a chance to become 1 ?

Bonita Montero

unread,

Sep 1, 2021, 3:10:24 PM9/1/21

to

I think the most portbale solution is this:

#include <iostream>
#include <stdexcept>

using namespace std;

void GenerateSequence( unsigned long start )

{
unsigned long x = start;
try
{
while( x != 1 )
{

if( !(x % 2) )
x /= 2;
else
if( 3 * x / x == x && 3 * x + 1 > 3 * x )
x = 3 * x + 1;
else
throw range_error( "GenerateSequence: out of range" );
cout << x << endl;
}
}
catch( range_error &re )
{
cout << re.what() << endl;
}
}

Bonita Montero

unread,

Sep 1, 2021, 11:31:28 PM9/1/21

to

Am 01.09.2021 um 21:10 schrieb Bonita Montero:
> I think the most portbale solution is this:
>
> #include <iostream>
> #include <stdexcept>
>
> using namespace std;
>
> void GenerateSequence( unsigned long start )
> {
>     unsigned long x = start;
>     try
>     {
>         while( x != 1 )
>         {
>             if( !(x % 2) )
>                 x /= 2;
>             else
>                 if( 3 * x / x == x && 3 * x + 1 > 3 * x )

if( 3 * x / 3 == x && 3 * x + 1 > 3 * x )

Bonita Montero

unread,

Sep 2, 2021, 1:51:23 AM9/2/21

to

Look at this fantastic optimization of clang 11:

bool f( unsigned x )
{
return 3 * x / 3 == x && 3 * x + 1 > 3 * x;
}

mov eax, ecx
mov ecx, 3
mul ecx
setno cl
cmp eax, -1
setne al
and al, cl
ret

David Brown

unread,

Sep 2, 2021, 5:53:52 AM9/2/21

to

On 01/09/2021 21:02, Bonita Montero wrote:
> Am 01.09.2021 um 07:31 schrieb Joseph Hesse:
>> The famous unproven Collatz math conjecture says that if you start
>> with any positive integer n and generate a sequence by repeatedly
>> apply the rule that if n is even the next term is n/2, otherwise the
>> next term is 3*n + 1, then eventually the sequence will reach 1.
>>
>> I wrote a C++ function that will generate the sequence from any
>> positive starting integer, see below, header files omitted. Since
>> there is integer arithmetic involved, there might be a math overflow.
>> How does one write code to detect math underflow or overflow errors in
>> functions that do math computations?
>>
>> Thank you,
>> Joe
>>
>> void GenerateSequence(unsigned long start)
>> {
>>    unsigned long x = start;
>>    try
>>    {
>>      while(x != 1)
>>      {
>>        x = (x%2 == 0 ? x/2 : 3*x+1);
>
> Will x ever has a chance to become 1 ?
>

Yes.

The conjecture is that it will always become 1, regardless of the
starting point. This has been confirmed up to x = 2 ^ 68, but of course
that is not a proof.

Cholo Lennon

unread,

Sep 5, 2021, 11:20:33 PM9/5/21

to

Coincidence? I watched the video last week, I didn't know about the
Collatz conjecture until I watched it.

> <https://www.youtube.com/watch?v=5mFpVDpKX70>
Nice video, thanks!

--
Cholo Lennon
Bs.As.
ARG

Alf P. Steinbach

unread,

Sep 6, 2021, 1:23:36 AM9/6/21

to

Andrew Koenig (secretary of the first C++ standard, colleague at Bell
labs with Bjarne Stroustrup, the Koenig of "Koenig lookup" etc.) used
the Collatz sequence as an example basis for what he intended to be a
series of articles in the late Dr. Dobbs Journal, where he started out
discussing recursive functions, and in the second article I think it
was, pointed out how a little rewrite with an extra parameter in a
helper function could make Collatz sequence function tail recursive.

As I understood it he intended to continue that by showing how
`std::vector` as function result type would still give O(n^2) behavior,
whereas a DIY linked list made that O(n).

However, I pointed out in a comment (I'm now ashamed to say with
needlessly harsh words :( ) that C++11 move semantics, applied properly,
reduced also the `std::vector` result case to O(n). And `std::vector` is
much more convenient and safe than a DIY linked list. `std::vector` FTW! :)

>> <https://www.youtube.com/watch?v=5mFpVDpKX70>
> Nice video, thanks!

- Alf

Hope Rouselle

unread,

Sep 6, 2021, 10:05:10 AM9/6/21

to

It's quite alright. If you had thought a bit more perhaps I wouldn't
now have the references to the short documentaries on the Collatz
conjecture. (I'm not working on the problem! I swear. Lol. But it's
quite fun to talk about it.)

Pavel

unread,

Sep 6, 2021, 8:34:17 PM9/6/21

to

No wonder, since my FORTRAN 66 days I remember that the only useful data
structure is an array.. -Pavel

Pavel

unread,

Sep 6, 2021, 8:47:26 PM9/6/21

to

IMHO, this is UB if x ever grows bigger than ULONG_MAX / 3, so the
portability point is moot. (I tend to think UB is non-portable by
definition).

-Pavel

red floyd

unread,

Sep 6, 2021, 11:24:33 PM9/6/21

to

I don't think so. I believe unsigned rollover is defined. If x was a
regular long, that's another matter.

Bonita Montero

unread,

Sep 7, 2021, 3:10:30 AM9/7/21

to

No, as red floyd said: that's not UB with unsigned.

Juha Nieminen

unread,

Sep 8, 2021, 6:21:03 AM9/8/21

to

red floyd <no.spa...@its.invalid> wrote:
> I don't think so. I believe unsigned rollover is defined. If x was a
> regular long, that's another matter.

Indeed. Unsigned integers use 2's complement arithmetic as per the standard.
(Well, pratically speaking 2's complement.)

It's only with signed integers that overflow and underflow are undefined.
(And for a reason, as there are architectures where this may cause a fault
signal to be thrown.)

David Brown

unread,

Sep 8, 2021, 7:17:46 AM9/8/21

to

On 08/09/2021 12:20, Juha Nieminen wrote:
> red floyd <no.spa...@its.invalid> wrote:
>> I don't think so. I believe unsigned rollover is defined. If x was a
>> regular long, that's another matter.
>
> Indeed. Unsigned integers use 2's complement arithmetic as per the standard.
> (Well, pratically speaking 2's complement.)
>

I think I know what you are trying to say, but it is a strange way to
say it. "Two's complement" only makes sense for signed integers - it
has no meaning for unsigned types.

Unsigned types in C and C++ are defined as modulo types.
Conventionally, we may say they "wrap on overflow" - but to be more
accurate, there is no overflow. The result of adding unsigned ints "a"
and "b" is the value of "(a + b) % 2^n" using standard mathematical
arithmetic, where "n" is the number of value bits in unsigned int.

> It's only with signed integers that overflow and underflow are undefined.

Yes.

> (And for a reason, as there are architectures where this may cause a fault
> signal to be thrown.)
>

That is perhaps one of several reasons, but it is not the only one. The
primary reason, as I see it, is that overflow of signed integer
arithmetic will give you the wrong answer. Picking a definition for the
result does not make it correct or useful - it merely hinders tools in
helping you to find the errors in your code.

Hope Rouselle

unread,

Sep 8, 2021, 9:25:47 AM9/8/21

to

David Brown <david...@hesbynett.no> writes:

> On 08/09/2021 12:20, Juha Nieminen wrote:
>> red floyd <no.spa...@its.invalid> wrote:
>>> I don't think so. I believe unsigned rollover is defined. If x was a
>>> regular long, that's another matter.
>>
>> Indeed. Unsigned integers use 2's complement arithmetic as per the standard.
>> (Well, pratically speaking 2's complement.)
>
> I think I know what you are trying to say, but it is a strange way to
> say it. "Two's complement" only makes sense for signed integers - it
> has no meaning for unsigned types.
>
> Unsigned types in C and C++ are defined as modulo types.
> Conventionally, we may say they "wrap on overflow" - but to be more
> accurate, there is no overflow. The result of adding unsigned ints "a"
> and "b" is the value of "(a + b) % 2^n" using standard mathematical
> arithmetic, where "n" is the number of value bits in unsigned int.

FWIW, here's the exact language of C89.

--8<---------------cut here---------------start------------->8---
A computation involving unsigned operands can never overflow, because
a result that cannot be represented by the resulting unsigned integer
type is reduced modulo the number that is one greater than the largest
value that can be represented by the resulting unsigned integer type.
-- C89, section 3.1.2.5
--8<---------------cut here---------------end--------------->8---