
recovering from std::bad_alloc in std::string reserve


Lynn McGuire

unread,
May 24, 2021, 9:46:26 PM5/24/21
to
I am getting std::bad_alloc from the following code when I try to
reserve a std::string of size 937,180,144:

std::string filename = getFormsMainOwner () -> getOutputFileName ();
FILE * pOutputFile = nullptr;
errno_t err = fopen_s_UTF8 ( & pOutputFile, filename.c_str (), "rt");
if (err == 0)
{
    std::string outputFileBuffer;
    // need to preallocate the space in case the output file is a gigabyte or more, PMR 6408
    fseek (pOutputFile, 0, SEEK_END);
    size_t outputFileLength = ftell (pOutputFile) + 42;  // give it some slop
    fseek (pOutputFile, 0, SEEK_SET);
    outputFileBuffer.reserve (outputFileLength);

Any thoughts here on how to handle the std::bad_alloc in std::string
reserve ?

Thanks,
Lynn


Lynn McGuire

unread,
May 24, 2021, 9:50:47 PM5/24/21
to
I am using Visual Studio C++ 2015 on a Windows 7 x86 PC with 16 GB of
ram. I am probably using a lot of ram already in my program.

Thanks,
Lynn


Lynn McGuire

unread,
May 24, 2021, 9:52:27 PM5/24/21
to
And I am building a Win32 program. Not a Win64 program.

Thanks,
Lynn


Lynn McGuire

unread,
May 24, 2021, 11:33:55 PM5/24/21
to
On 5/24/2021 8:46 PM, Lynn McGuire wrote:
I googled this problem and the predominating solution was to put the
std::string reserve into a try catch. I did so and it works like a
champ ! My first try catch.

try
{
    outputFileBuffer.reserve (outputFileLength);
    // now read the file into the preallocated buffer
    while ((num = fread (buffer, sizeof (char), sizeof (buffer) - 1, pOutputFile)) > 0)
    {
        // make sure that there is a trailing zero
        buffer [num] = '\0';
        outputFileBuffer += buffer;
        total += num;
    }
}
catch (...)
{
    // nothing to catch since any error causes this code to bypass
}
}

Thanks,
Lynn

Lynn McGuire

unread,
May 24, 2021, 11:35:29 PM5/24/21
to
Let's try this code instead:

Paavo Helde

unread,
May 25, 2021, 12:47:29 AM5/25/21
to
The usable memory space in a Windows 32-bit program is limited to 2GB.
It can be increased to 3GB, but this does not buy you much.

Catching std::bad_alloc as you have done in other responses is trivial,
but now what? Your program still does not work as expected.

If you are dealing with strings in GB range then you really should start
thinking of switching over to x64 compilation (this will involve some
64-bit bugfixing if this is your first time). Either that, or you need
to redesign your code to read and write files in smaller pieces, which
might be a lot of work.

Also, ftell() returns a signed 32-bit value in Windows, so what you have
written here will cease to work when your files grow larger than 2 GB.
I suggest always using 64-bit alternatives for handling file sizes and
positions, even in 32-bit programs. Unfortunately, these alternatives are
not portable, so one needs to take some extra care to support
different OSes.

It's also strange to base the program logic on the size of an *output*
file. But there are probably reasons.
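
To illustrate the non-portable 64-bit alternatives I mean (a sketch only;
_ftelli64/_fseeki64 are the MSVC CRT versions, ftello/fseeko the POSIX
ones, and the wrapper names are made up):

#include <stdio.h>
#ifndef _MSC_VER
#include <sys/types.h>   // off_t; on 32-bit POSIX also build with -D_FILE_OFFSET_BITS=64
#endif

#ifdef _MSC_VER
typedef __int64 file_off_t;
static file_off_t my_ftell(FILE* f) { return _ftelli64(f); }
static int my_fseek(FILE* f, file_off_t off, int origin) { return _fseeki64(f, off, origin); }
#else
typedef off_t file_off_t;
static file_off_t my_ftell(FILE* f) { return ftello(f); }
static int my_fseek(FILE* f, file_off_t off, int origin) { return fseeko(f, off, origin); }
#endif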

Juha Nieminen

unread,
May 25, 2021, 1:17:01 AM5/25/21
to
Lynn McGuire <lynnmc...@gmail.com> wrote:
> while ((num = fread (buffer, sizeof (char), sizeof (buffer) - 1,
> pOutputFile)) > 0)

The standard mandates that sizeof(char) is 1. It cannot have any other
value.

> catch (...)
> {
> // nothing to catch since any error causes this code to bypass
> }

If you are catching a memory allocation, why not catch it explicitly and
return an error code or print an informative error message or something
that indicates what happened, rather than silently failing and doing
nothing?

try
{
    // your code here
}
catch(const std::bad_alloc& e)
{
    // Could do, for example:
    std::cout << "Memory allocation failed: " << e.what() << "\n";
}
catch(...)
{
    std::cout << "Unknown exception thrown while trying to read file\n";
}

Bonita Montero

unread,
May 25, 2021, 6:56:53 AM5/25/21
to
You've got a 2GB address space and not a contiguous piece of
memory which fits your 900MB.

Bonita Montero

unread,
May 25, 2021, 7:00:57 AM5/25/21
to
> If you are catching a memory allocation, why not catch it explicitly and
> return an error code or print an informative error message or something
> that indicates what happened, rather than silently failing and doing
> nothing?

On the other hand:
if bad_alloc is the only exception to expect, you could stick with "..."

Scott Lurndal

unread,
May 25, 2021, 10:55:53 AM5/25/21
to
Bonita Montero <Bonita....@gmail.com> writes:
>You've got a 2GB address-space and not a contignous piece of
>memory which fits to your 900MB.

It doesn't need to be contiguous. But it needs to exist either
as real memory or as configured backing store (i.e. swap space).

Paavo Helde

unread,
May 25, 2021, 11:43:12 AM5/25/21
to
To fit a 900 MB std::string into memory one needs 900 MB of
contiguous address space, and Bonita is right in pointing out that this
might be problematic because of memory fragmentation.

The OP has 16 GB of RAM so there is plenty of physical memory, it's just
a problem with the limited address space in 32-bit programs.



MrSpoo...@d0v_5eh1bqgd.com

unread,
May 25, 2021, 12:11:00 PM5/25/21
to
On Tue, 25 May 2021 18:42:54 +0300
Paavo Helde <myfir...@osa.pri.ee> wrote:
>25.05.2021 17:55 Scott Lurndal kirjutas:
>> Bonita Montero <Bonita....@gmail.com> writes:
>>> You've got a 2GB address-space and not a contignous piece of
>>> memory which fits to your 900MB.
>>
>> It doesn't need to be contiguous. But it needs to exist either
>> as real memory or as configured backing store (i.e. swap space).
>
>For fitting a 900 MB std::string into a memory one needs 900 MB of
>contiguous address space and Bonita is right in pointing out this might

Why? Only the virtual memory address space needs to be contiguous, the
real memory pages storing the string could be all over the place.

Nikolaj Lazic

unread,
May 25, 2021, 12:22:00 PM5/25/21
to
Dana Tue, 25 May 2021 18:42:54 +0300, Paavo Helde <myfir...@osa.pri.ee> napis'o:
Win7 32bit cannot access 16G. Limit is 3.5G.
She needs 64bit Windows to use it.

Bonita Montero

unread,
May 25, 2021, 12:23:44 PM5/25/21
to
> Why? Only the virtual memory address space needs to be contiguous,
> the real memory pages storing the string could be all over the place.

OMG, what a stupid statement.

Bonita Montero

unread,
May 25, 2021, 12:23:56 PM5/25/21
to
> It doesn't need to be contiguous. ...

I don't know exactly since when, but the standard containers have
guaranteed contiguous memory for years.

Paavo Helde

unread,
May 25, 2021, 12:58:03 PM5/25/21
to
Exactly. And if the memory allocator cannot find a free range of
contiguous 900M addresses, guess what happens.

Paavo Helde

unread,
May 25, 2021, 1:04:46 PM5/25/21
to
The OP was a bit unclear on this point. But there would be no point in
having a 32-bit Windows 7 on a machine with 16 GB RAM, so I hope he has
got a 64-bit OS after all.


Lynn McGuire

unread,
May 25, 2021, 1:20:33 PM5/25/21
to
Thanks, that is a good idea to move to the 64 bit version of ftell.

My 450,000-line C++ program is so tied to the Win32 API that it is
not funny. My calculation engine is 850,000 lines of F77 and about
20,000 lines of C++, but it is still portable to the Unix boxen, probably
mainframes too if any engineers ran them anymore.

Lynn

Lynn McGuire

unread,
May 25, 2021, 1:46:51 PM5/25/21
to
Yes, I have Windows 7 x64 Pro. I cannot convert to win64 at this time.
When we do convert, it will be a steep hill as we started this code in
Win16 in 1987. The Win32 port was a very steep hill in 2000.

Lynn


Lynn McGuire

unread,
May 25, 2021, 1:49:44 PM5/25/21
to
Thanks, I did not know the code for catching the bad_alloc explicitly.

Lynn

Paavo Helde

unread,
May 25, 2021, 1:58:20 PM5/25/21
to
25.05.2021 20:46 Lynn McGuire kirjutas:
>
> Yes, I have Windows 7 x64 Pro.  I cannot convert to win64 at this time.
>  When we do convert, it will be a steep hill as we started this code in
> Win16 in 1987.  The Win32 port was a very steep hill in 2000.

Nowadays there are special compiler warnings for 64-bit porting issues.
When porting, it would make sense to enable them all, turn them into
errors instead of warnings, and fix them all.
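
For example, a 64-bit MSVC build with the warning level turned up flags
patterns like these (a sketch; the exact warning numbers vary between
compilers):

#include <cstring>

void port_bugs(const char* s, void* somePtr)
{
    std::size_t n = std::strlen(s);
    int len = n;                  // implicit size_t -> int narrowing; a 64-bit build warns here
    long addr = (long) somePtr;   // pointer truncation on Win64, where long is still 32 bits
    (void)len; (void)addr;
}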

Nikolaj Lazic

unread,
May 25, 2021, 2:23:50 PM5/25/21
to

Lynn McGuire

unread,
May 25, 2021, 2:49:11 PM5/25/21
to
Thanks !

Lynn


Lynn McGuire

unread,
May 25, 2021, 2:49:39 PM5/25/21
to
Thanks ! I was aware of that but I had not tried it yet.

Lynn


Bonita Montero

unread,
May 26, 2021, 12:16:59 AM5/26/21
to
Maybe it would be an idea to process your file in pieces ?
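
A rough sketch of what that could look like (not the OP's code; the chunk
size and the callback are made up for illustration):

#include <cstddef>
#include <cstdio>
#include <functional>
#include <vector>

// read the file in fixed-size pieces and hand each piece to a consumer
// (e.g. a streaming compressor), so no gigabyte-sized buffer is needed
bool process_in_chunks(std::FILE* f,
                       const std::function<void(const char*, std::size_t)>& consume)
{
    std::vector<char> chunk(1 << 20);   // 1 MB at a time
    std::size_t n;
    while ((n = std::fread(chunk.data(), 1, chunk.size(), f)) > 0)
        consume(chunk.data(), n);
    return std::ferror(f) == 0;
}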

Christian Gollwitzer

unread,
May 26, 2021, 2:36:19 AM5/26/21
to
Am 25.05.21 um 19:20 schrieb Lynn McGuire:
> My 450,000 lines of C++ program is so tied to the Win32 API that it is
> not funny.  My calculation engine is 850,000 lines of F77 and about
> 20,000 lines of C++ but it is still portable to the Unix boxen, probably
> mainframes too if any engineers ran them anymore.

Have you actually tried to recompile in 64bit? The Win32-API is the same
AFAIUI. Are you casting pointers to ints/longs? I haven't ported
million-LOC programs to 64bit, but in smaller projects there was
surprisingly little to do to make it work. I have no idea what goes
wrong when you link to F77, though.

Christian

MrSpook_...@okc9_pd48oig5.info

unread,
May 26, 2021, 3:16:01 AM5/26/21
to
On Tue, 25 May 2021 19:57:47 +0300
Things would have to be pretty badly FUBARed for the VM to run out of virtual
memory address space on a 64 bit system given the 16 exabyte max size!

Bonita Montero

unread,
May 26, 2021, 3:23:06 AM5/26/21
to
> Things would have to be pretty badly FUBARed for the VM to run out of virtual
> memory address space on a 64 bit system given the 16 exabyte max size!

Actually AMD64 supports "only" 48-bit page-tables where the lower 47
bits fall into user-space. Intel 64 has a recent change to 56-bit
page-tables, but I think that's rather for systems with large file-mappings.

Paavo Helde

unread,
May 26, 2021, 3:54:43 AM5/26/21
to
It looks like you have overlooked the small fact that the OP has a
32-bit program and does not want to upgrade to 64-bit at this moment.

MrSpook_...@kak0_42x.edu

unread,
May 26, 2021, 4:30:15 AM5/26/21
to
Fair enough. In which case running out of address space would be pretty
easy given modern application sizes.

Alf P. Steinbach

unread,
May 26, 2021, 12:34:05 PM5/26/21
to
On 2021-05-25 03:46, Lynn McGuire wrote:
> I am getting std::bad_alloc from the following code when I try to
> reserve a std::string of size 937,180,144:
>
> std::string filename = getFormsMainOwner () -> getOutputFileName ();
> FILE * pOutputFile = nullptr;
> errno_t err = fopen_s_UTF8 ( & pOutputFile, filename.c_str (), "rt");
> if (err == 0)
> {
>     std::string outputFileBuffer;
>         //  need to preallocate the space in case the output file is a
> gigabyte or more, PMR 6408
>     fseek (pOutputFile, 0, SEEK_END);
>     size_t outputFileLength = ftell (pOutputFile) + 42;  // give it
> some slop
>     fseek (pOutputFile, 0, SEEK_SET);
>     outputFileBuffer.reserve (outputFileLength);

[snip]

In the above code `ftell` will fail in Windows if the file is 2GB or
more, because in Windows, even in 64-bit Windows, the `ftell` return
type `long` is just 32 bits.

However, the C++ level iostreams can report the file size correctly:


----------------------------------------------------------------------------
#include <stdio.h> // fopen, fseek, ftell, fclose
#include <stdlib.h> // EXIT_...

#include <iostream>
#include <fstream>
#include <stdexcept> // runtime_error
using namespace std;

auto hopefully( const bool e ) -> bool { return e; }
auto fail( const char* s ) -> bool { throw runtime_error( s ); }

struct Is_zero {};
auto operator>>( int x, Is_zero ) -> bool { return x == 0; }

const auto& filename = "large_file";

void c_level_check()
{
    struct C_file
    {
        FILE* handle;
        ~C_file() { if( handle != 0 ) { fclose( handle ); } }
    };

    auto const f = C_file{ fopen( ::filename, "rb" ) };
    hopefully( !!f.handle )
        or fail( "fopen failed" );
    fseek( f.handle, 0, SEEK_END )
        >> Is_zero()
        or fail( "fseek failed, probably rather biggus filus" );
    const long pos = ftell( f.handle );
    hopefully( pos >= 0 )
        or fail( "ftell failed" );
    cout << "`ftell` says the file is " << pos << " byte(s)." << endl;
}

void cpp_level_check()
{
    auto f = ifstream( ::filename, ios::in | ios::binary );
    f.seekg( 0, ios::end );
    const ifstream::pos_type pos = f.tellg();
    hopefully( pos != -1 )
        or fail( "ifstream::tellg failed" );
    cout << "`ifstream::tellg` says the file is " << pos << " bytes."
        << endl;
}

void cpp_main()
{
    try {
        c_level_check();
    } catch( const exception& x ) {
        cerr << "!" << x.what() << endl;
        cpp_level_check();
    }
}

auto main() -> int
{
    try {
        cpp_main();
        return EXIT_SUCCESS;
    } catch( const exception& x ) {
        cerr << "!" << x.what() << endl;
    }
    return EXIT_FAILURE;
}
-------------------------------------------------------------------------------

When I tested this with `large_file` as a copy of the roughly 4GB
"Bad.Boys.for.Life.2020.1080p.WEB-DL.DD5.1.H264-FGT.mkv", I got


[c:\root\dev\explore\filesize]
> b
!ftell failed
`ifstream::tellg` says the file is 4542682554 bytes.


- Alf

Lynn McGuire

unread,
May 26, 2021, 2:29:10 PM5/26/21
to
We have casts all over the place that are killing us now. One of my
programmers is currently converting us from ASCII to UNICODE and is
having all kinds of problems due to the casts. This program originated
in 1987 with Windows 2.0 and C coding. The Win16 to Win32 port was a
freaking disaster and took three of us 18 months to complete.
Of course, a portion of our software was Smalltalk which was converted
to C++ in that port.

The calculation engine still runs as a separate program so the F77 code
does not matter.

Lynn

Lynn McGuire

unread,
May 26, 2021, 2:30:31 PM5/26/21
to
On 5/25/2021 11:16 PM, Bonita Montero wrote:
> Maybe it would be an idea to process your file in pieces ?

I have thought about that. Not today. I am thinking about trying the
large address space switch though.

Lynn

Lynn McGuire

unread,
May 26, 2021, 2:33:05 PM5/26/21
to
I have already replaced the ftell code with _ftelli64.

// get the size of the output file
fseek (pOutputFile, 0, SEEK_END);
__int64 outputFileLength = _ftelli64 (pOutputFile) + 42;  // give it some slop
int outputFileLengthInt = (int) outputFileLength;
fseek (pOutputFile, 0, SEEK_SET);

Thanks,
Lynn

Bonita Montero

unread,
May 26, 2021, 2:38:02 PM5/26/21
to
>> Maybe it would be an idea to process your file in pieces ?

> I have thought about that.  Not today.
> I am thinking about trying the large address space switch though.

If you're accessing the file linearly, consider file-mapping.
File-mapping is slower for random accesses since the pages
have to be mapped on demand, but with linear accesses the
prefetching of your drive and the operating system takes effect.
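
A bare-bones sketch of the Win32 calls involved (error handling trimmed,
the file name is just an example, and in a 32-bit process you would map
smaller windows with MapViewOfFile offsets rather than the whole file):

#include <windows.h>

HANDLE hFile = CreateFileW(L"output.txt", GENERIC_READ, FILE_SHARE_READ,
                           nullptr, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
HANDLE hMap  = CreateFileMappingW(hFile, nullptr, PAGE_READONLY, 0, 0, nullptr);
const char* base = static_cast<const char*>(
    MapViewOfFile(hMap, FILE_MAP_READ, 0, 0, 0));   // 0,0,0 = map the whole file
// ... walk base[] sequentially ...
UnmapViewOfFile(base);
CloseHandle(hMap);
CloseHandle(hFile);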

Lynn McGuire

unread,
May 26, 2021, 3:09:29 PM5/26/21
to
I store a compressed copy of the output file in our binary file so that
when the user sends it to us we get a copy of exactly what happened.
It is not a crisis if it is not there. It is a crisis if the file
processing / storage causes our program to crash.

Thanks,
Lynn

Lynn McGuire

unread,
May 26, 2021, 10:07:34 PM5/26/21
to
I tried the /LARGEADDRESSAWARE linker option but it did not help. I
suspect that the memory is fragmented.

We will need to move to Win64 to fix this problem long term.

Thanks,
Lynn


Christian Gollwitzer

unread,
May 27, 2021, 1:50:53 AM5/27/21
to
Am 26.05.21 um 20:32 schrieb Lynn McGuire: - Alf
>
> I have already replaced the fell code with _ftelli64.
>
> //  get the size of the output file
> fseek (pOutputFile, 0, SEEK_END);
> __int64 outputFileLength = _ftelli64 (pOutputFile) + 42;  // give it
> some slop
> int outputFileLengthInt = (int) outputFileLength;

...and here you restrict it to 2GB again, or worse, retrieve a negative
file size for sizes between 2GB and 4GB.


To prepare for a 64-bit move, you should replace all size variables with
size_t for unsigned or ptrdiff_t for signed. Those will correspond to
32-bit integers in a 32-bit build and 64-bit integers in a 64-bit build.

Christian

Lynn McGuire

unread,
May 27, 2021, 3:08:40 PM5/27/21
to
Done. With checking against SIZE_MAX before casting the variable to size_t.

Yeah, if 1 GB is having trouble in Win32, 2+ GB will be much worse. The
code is now ok to fail without crashing the program. Porting to Win64
is needed in the near future. So many things to do, so little time. I
will be 61 in a couple of weeks, kinda hoping to retire before 75.

Thanks,
Lynn

Scott Lurndal

unread,
May 27, 2021, 5:59:38 PM5/27/21
to
Lynn McGuire <lynnmc...@gmail.com> writes:
>On 5/27/2021 12:50 AM, Christian Gollwitzer wrote:
>> Am 26.05.21 um 20:32 schrieb Lynn McGuire: - Alf
>>>
>>> I have already replaced the fell code with _ftelli64.
>>>
>>> //  get the size of the output file
>>> fseek (pOutputFile, 0, SEEK_END);
>>> __int64 outputFileLength = _ftelli64 (pOutputFile) + 42;  // give it
>>> some slop
>>> int outputFileLengthInt = (int) outputFileLength;
>>
>> ...and here you restrict it to 2GB again, or worse, retrieve a negative
>> file size for sizes between 2GB and 4GB.
>>
>>
>> To prepare for a 64bit move, you should replace all size variables with
>> size_t for unsigned or ptrdiff_t for signed. That will correspond to a
>> 32bit integer in 32 bit and a 64 bit integer in 64 bit.
>>
>>     Christian
>
>Done. With checking against SIZE_MAX before casting the variable to size_t.

Why? size_t is guaranteed to hold the size of any object, which implies that
it must be large enough to accommodate an object the size of the virtual address
space. Generally its minimum size in bits is the same as long.

Keith Thompson

unread,
May 27, 2021, 6:34:05 PM5/27/21
to
sc...@slp53.sl.home (Scott Lurndal) writes:
> Lynn McGuire <lynnmc...@gmail.com> writes:
>>On 5/27/2021 12:50 AM, Christian Gollwitzer wrote:
>>> Am 26.05.21 um 20:32 schrieb Lynn McGuire: - Alf
>>>>
>>>> I have already replaced the fell code with _ftelli64.
>>>>
>>>> //  get the size of the output file
>>>> fseek (pOutputFile, 0, SEEK_END);
>>>> __int64 outputFileLength = _ftelli64 (pOutputFile) + 42;  // give it
>>>> some slop
>>>> int outputFileLengthInt = (int) outputFileLength;
>>>
>>> ...and here you restrict it to 2GB again, or worse, retrieve a negative
>>> file size for sizes between 2GB and 4GB.
>>>
>>>
>>> To prepare for a 64bit move, you should replace all size variables with
>>> size_t for unsigned or ptrdiff_t for signed. That will correspond to a
>>> 32bit integer in 32 bit and a 64 bit integer in 64 bit.
>>>
>>>     Christian
>>
>>Done. With checking against SIZE_MAX before casting the variable to size_t.
>
> Why? size_t is guaranteed to hold the size of any object, which implies that
> it must be large enough to accomodate an object the size of the virtual address
> space. Generally it's minimum size in bits is the same as long.

That's likely to be true, but it's not absolutely guaranteed.

size_t is intended to hold the size of any single object, but it may
not be able to hold the sum of sizes of all objects or the size of
the virtual address space. An implementation might restrict the
size of any single object to something smaller than the size of
the entire virtual address space. (Think segments.)

Also, I haven't found anything in the standard that says you
can't at least try to create an object bigger than SIZE_MAX bytes.
calloc(SIZE_MAX, 2) attempts to allocate such an object, and I don't
see a requirement that it must fail. If an implementation lets you
define a named object bigger than SIZE_MAX bytes, then presumably
applying sizeof to it would result in an overflow, and therefore
undefined behavior.

Any reasonable implementation will simply make size_t big enough
to hold the size of any object it can create, but I don't see a
requirement for it.

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */

Lynn McGuire

unread,
May 27, 2021, 6:45:23 PM5/27/21
to
size_t outputFileLengthSizeT = 0;
if (outputFileLength < SIZE_MAX)
    outputFileLengthSizeT = (size_t) outputFileLength;
fseek (pOutputFile, 0, SEEK_SET);
// need to preallocate the space in case the output file is a gigabyte or more, PMR 6408
// if the try fails then just don't store the output file in the flowsheet
if (outputFileLengthSizeT > 0)
{
    try
    {
        // PMR 6408 will cause this to fail
        outputFileBuffer.reserve (outputFileLengthSizeT);

Thanks,
Lynn


Keith Thompson

unread,
May 27, 2021, 6:59:01 PM5/27/21
to
Keith Thompson <Keith.S.T...@gmail.com> writes:
> sc...@slp53.sl.home (Scott Lurndal) writes:
>> Lynn McGuire <lynnmc...@gmail.com> writes:
>>>On 5/27/2021 12:50 AM, Christian Gollwitzer wrote:
>>>> Am 26.05.21 um 20:32 schrieb Lynn McGuire: - Alf
>>>>>
>>>>> I have already replaced the fell code with _ftelli64.
>>>>>
>>>>> //  get the size of the output file
>>>>> fseek (pOutputFile, 0, SEEK_END);
>>>>> __int64 outputFileLength = _ftelli64 (pOutputFile) + 42;  // give it
>>>>> some slop
>>>>> int outputFileLengthInt = (int) outputFileLength;
>>>>
>>>> ...and here you restrict it to 2GB again, or worse, retrieve a negative
>>>> file size for sizes between 2GB and 4GB.
>>>>
>>>>
>>>> To prepare for a 64bit move, you should replace all size variables with
>>>> size_t for unsigned or ptrdiff_t for signed. That will correspond to a
>>>> 32bit integer in 32 bit and a 64 bit integer in 64 bit.
>>>>
>>>>     Christian
>>>
>>>Done. With checking against SIZE_MAX before casting the variable to size_t.
>>
>> Why? size_t is guaranteed to hold the size of any object, which implies that
>> it must be large enough to accomodate an object the size of the virtual address
>> space. Generally it's minimum size in bits is the same as long.
>
> That's likely to be true, but it's not absolutely guaranteed.

My apologies, I was wrong.

> size_t is intended to hold the size of any single object, but it may
> not be able to hold the sum of sizes of all objects or the size of
> the virtual address space. An implementation might restrict the
> size of any single object to something smaller than the size of
> the entire virtual address space. (Think segments.)

I believe this is still correct.

> Also, I haven't found anything in the standard that says you
> can't at least try to create an object bigger than SIZE_MAX bytes.
> calloc(SIZE_MAX, 2) attempts to allocate such an object, and I don't
> see a requirement that it must fail. If an implementation lets you
> define a named object bigger than SIZE_MAX bytes, then presumably
> applying sizeof to it would result in an overflow, and therefore
> undefined behavior.
>
> Any reasonable implementation will simply make size_t big enough
> to hold the size of any object it can create, but I don't see a
> requirement for it.

The above is correct in C, but not in C++, which makes an additional
guarantee that C doesn't. C++17 21.2.4 [support.types.layout] says:

The type size_t is an implementation-defined unsigned integer type
that is large enough to contain the size in bytes of any object
(8.3.3).

Lynn McGuire

unread,
May 27, 2021, 7:17:55 PM5/27/21
to
Except the actual size of a FILE * object in the filesystem. Size_t can
hold the size of the FILE * structure but if the actual file size is
greater than 4 GB in a Win32 program, size_t will be wrong.

Now if the program is Win64, size_t can hold the actual size of any file
in the filesystem.

#ifdef _WIN64
typedef unsigned __int64 size_t;
typedef __int64 ptrdiff_t;
typedef __int64 intptr_t;
#else
typedef unsigned int size_t;
typedef int ptrdiff_t;
typedef int intptr_t;
#endif

Lynn


Lynn McGuire

unread,
May 27, 2021, 7:19:21 PM5/27/21
to
On 5/27/2021 5:58 PM, Keith Thompson wrote:
Here is the definition for SIZE_MAX:

#ifndef SIZE_MAX
#ifdef _WIN64
#define SIZE_MAX _UI64_MAX
#else
#define SIZE_MAX UINT_MAX
#endif
#endif

Lynn


Keith Thompson

unread,
May 27, 2021, 7:58:18 PM5/27/21
to
A FILE* object is a pointer, likely 4 or 8 bytes. A FILE object is
probably a struct, 216 bytes on my system.

> Now if the program is Win64, size_t can hold the actual size of any
> file in the filesystem.
>
> #ifdef _WIN64
> typedef unsigned __int64 size_t;
> typedef __int64 ptrdiff_t;
> typedef __int64 intptr_t;
> #else
> typedef unsigned int size_t;
> typedef int ptrdiff_t;
> typedef int intptr_t;
> #endif

size_t isn't for representing sizes of files, which are not objects.
In fact C doesn't define a type for representing file sizes.
(It might be nice if it did.) ftell() returns a long, which is not
big enough on systems with 32-bit long (and reasonable file systems).
fgetpos gives you an fpos_t, which isn't necessarily an integer type.

You can use long long or unsigned long long, guaranteed to be at least
64 bits (or [u]int64_t, or [u]intmax_t).

Öö Tiib

unread,
May 27, 2021, 9:36:49 PM5/27/21
to
On Friday, 28 May 2021 at 02:58:18 UTC+3, Keith Thompson wrote:
> In fact C doesn't define a type for representing file sizes.
> (It might be nice if it did.)

Or maybe it would be nice if it took it back from C++, where
std::filesystem::file_size() returns std::uintmax_t.
Also, the standard should end the woo of allowing implementations to
have integers that are not integers. It is awful.
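
For example (a C++17 sketch; the file name is just for illustration):

#include <cstdint>
#include <filesystem>
#include <iostream>
#include <system_error>

int main()
{
    std::error_code ec;
    const std::uintmax_t n = std::filesystem::file_size("large_file", ec);
    if (ec)
        std::cerr << "file_size failed: " << ec.message() << '\n';
    else
        std::cout << "file is " << n << " bytes\n";
}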

Bonita Montero

unread,
May 27, 2021, 9:47:39 PM5/27/21
to
> size_t is intended to hold the size of any single object, but it may
> not be able to hold the sum of sizes of all objects or the size of
> the virtual address space. ...

You are an absolute nutcase. size_t is the same size as a pointer on
all systems with flat memory, so you can use it to represent the size of
any object.

> the virtual address space. An implementation might restrict the
> size of any single object to something smaller than the size of
> the entire virtual address space. (Think segments.)

There is no such implementation with flat memory and there won't be
such a system in the future because there's no reason to design a
platform in that way.

Lynn McGuire

unread,
May 27, 2021, 10:39:03 PM5/27/21
to
For some reason, Microsoft is using __int64. Not my favorite variable
type. Even "long long" is better (and the same) imho.

__int64 _ftelli64 (FILE *stream);

https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/ftell-ftelli64?view=msvc-160

Apparently no one else is using _ftelli64. This sucks.
https://www.vogons.org/viewtopic.php?t=65070

Lynn



Keith Thompson

unread,
May 27, 2021, 10:55:11 PM5/27/21
to
Öö Tiib <oot...@hot.ee> writes:
> On Friday, 28 May 2021 at 02:58:18 UTC+3, Keith Thompson wrote:
>> In fact C doesn't define a type for representing file sizes.
>> (It might be nice if it did.)
>
> Or may be it might be nice if it did take it back from C++ where
> std::filesystem::file_size() is of type std::uintmax_t.

Quite right. I post in comp.lang.c more than in comp.lang.c++ and I
forgot where I was. Apologies for any confusion.

> Also standard should end the woo allowing implementations to
> have integers that are not integers. It is awful.

Not sure what you're referring to here.

James Kuyper

unread,
May 27, 2021, 11:16:30 PM5/27/21
to
On 5/27/21 7:17 PM, Lynn McGuire wrote:
> On 5/27/2021 5:58 PM, Keith Thompson wrote:
...
>> The above is correct in C, but not in C++, which makes an additional
>> guarantee that C doesn't. C++17 21.2.4 [support.types.layout] says:
>>
>> The type size_t is an implementation-defined unsigned integer type
>> that is large enough to contain the size in bytes of any object
>> (8.3.3).
>
> Except the actual size of a FILE * object in the filesystem.

I think you mean the actual size of the file associated with the FILE*
object. A FILE* object is simply a pointer. A FILE object is typically a
struct object whose members contain the information needed by <cstdio>
functions to manage access to that file. In particular, it generally
contains a pointer to a dynamically allocated buffer which is used to
store parts of the file as they are being read from or written to the
file. It does not normally store the entire contents of the file.

Öö Tiib

unread,
May 28, 2021, 12:02:39 AM5/28/21
to
On Friday, 28 May 2021 at 05:55:11 UTC+3, Keith Thompson wrote:
> Öö Tiib <oot...@hot.ee> writes:
> > On Friday, 28 May 2021 at 02:58:18 UTC+3, Keith Thompson wrote:
> >> In fact C doesn't define a type for representing file sizes.
> >> (It might be nice if it did.)
> >
> > Or may be it might be nice if it did take it back from C++ where
> > std::filesystem::file_size() is of type std::uintmax_t.
>
> Quite right. I post in comp.lang.c more than in comp.lang.c++ and I
> forgot where I was. Apologies for any confusion.

No problems, all that pandemic and growing pile of communication
channels blinking make me also often confused where I am.

> > Also standard should end the woo allowing implementations to
> > have integers that are not integers. It is awful.
> Not sure what you're referring to here.

I meant "In addition to the standard integer types, the C99 standard
allows implementation-defined extended integer types, both signed
and unsigned. For example, a compiler might be provide signed
and unsigned 128-bit integer types."

That is good. But it misses the opportunity to say that the standard forbids
implementations from making custom integer types that are not
extended integer types. And the lack of that turns the above into useless
woo, as it says nothing as a result. Not a thing. Waste of space.

Paavo Helde

unread,
May 28, 2021, 1:47:18 AM5/28/21
to
It's true that in C++ there cannot be any in-memory contiguous object
with a size larger than what fits in size_t. But that's not relevant
here: the file is not in memory, but on disk.

A file can be 300 GB or whatever; an int64 ought to be able to express
the size of any file for the foreseeable future. OTOH, size_t in a
32-bit program can only express up to 4 GB. So it means one indeed needs
to check the number before casting to size_t.
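
Something along these lines (a sketch, not the OP's exact code):

#include <cstddef>   // std::size_t
#include <cstdint>   // std::int64_t, std::uint64_t, SIZE_MAX

// returns 0 when the 64-bit size is negative or does not fit in size_t,
// so the caller can simply skip the reserve instead of overflowing
inline std::size_t checked_size(std::int64_t n)
{
    if (n < 0 || static_cast<std::uint64_t>(n) > SIZE_MAX)
        return 0;
    return static_cast<std::size_t>(n);
}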




Paavo Helde

unread,
May 28, 2021, 1:53:20 AM5/28/21
to
28.05.2021 05:38 Lynn McGuire kirjutas:
> On 5/27/2021 8:36 PM, Öö Tiib wrote:
>> On Friday, 28 May 2021 at 02:58:18 UTC+3, Keith Thompson wrote:
>>> In fact C doesn't define a type for representing file sizes.
>>> (It might be nice if it did.)
>>
>> Or may be it might be nice if it did take it back from C++ where
>> std::filesystem::file_size() is of type std::uintmax_t.
>> Also standard should end the woo allowing implementations to
>> have integers that are not integers. It is awful.
>
> For some reason, Microsoft is using __int64.  Not my favorite variable
> type.  Even "long long" is better (and the same) imho.   __int64
> _ftelli64 (FILE *stream);
>
> https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/ftell-ftelli64?view=msvc-160

In C++ we have std::int64_t.

>
> Apparently no one else is using _ftelli64.  This sucks.

I'm using _filelengthi64() in some places. Less hassle than with seek/tell.
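
Something like this, for example (a sketch, MSVC CRT only; _filelengthi64
comes from <io.h>, and the function and file names are made up):

#include <cstdio>
#include <io.h>      // _filelengthi64, _fileno (MSVC CRT)

__int64 output_file_length(const char* name)
{
    std::FILE* fp = nullptr;
    if (fopen_s(&fp, name, "rb") != 0 || !fp)
        return -1;
    const __int64 len = _filelengthi64(_fileno(fp));   // -1 on failure
    std::fclose(fp);
    return len;
}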

Bo Persson

unread,
May 28, 2021, 3:47:45 AM5/28/21
to
On 2021-05-28 at 04:38, Lynn McGuire wrote:
> On 5/27/2021 8:36 PM, Öö Tiib wrote:
>> On Friday, 28 May 2021 at 02:58:18 UTC+3, Keith Thompson wrote:
>>> In fact C doesn't define a type for representing file sizes.
>>> (It might be nice if it did.)
>>
>> Or may be it might be nice if it did take it back from C++ where
>> std::filesystem::file_size() is of type std::uintmax_t.
>> Also standard should end the woo allowing implementations to
>> have integers that are not integers. It is awful.
>
> For some reason, Microsoft is using __int64.  Not my favorite variable
> type.

The reason being that they invented this type at a time when there was
no long long.

>  Even "long long" is better (and the same) imho.

Yes, *now* it is. :-)

Scott Lurndal

unread,
May 28, 2021, 10:34:53 AM5/28/21
to
The size of a file is defined by off_t, not size_t.

Scott Lurndal

unread,
May 28, 2021, 10:39:13 AM5/28/21
to
Lynn McGuire <lynnmc...@gmail.com> writes:
>On 5/27/2021 4:59 PM, Scott Lurndal wrote:
>> Lynn McGuire <lynnmc...@gmail.com> writes:
>>> On 5/27/2021 12:50 AM, Christian Gollwitzer wrote:
>>>> Am 26.05.21 um 20:32 schrieb Lynn McGuire: - Alf
>>>>>
>>>>> I have already replaced the fell code with _ftelli64.
>>>>>
>>>>> //  get the size of the output file
>>>>> fseek (pOutputFile, 0, SEEK_END);
>>>>> __int64 outputFileLength = _ftelli64 (pOutputFile) + 42;  // give it
>>>>> some slop
>>>>> int outputFileLengthInt = (int) outputFileLength;
>>>>
>>>> ...and here you restrict it to 2GB again, or worse, retrieve a negative
>>>> file size for sizes between 2GB and 4GB.
>>>>
>>>>
>>>> To prepare for a 64bit move, you should replace all size variables with
>>>> size_t for unsigned or ptrdiff_t for signed. That will correspond to a
>>>> 32bit integer in 32 bit and a 64 bit integer in 64 bit.
>>>>
>>>>     Christian
>>>
>>> Done. With checking against SIZE_MAX before casting the variable to size_t.
>>
>> Why? size_t is guaranteed to hold the size of any object, which implies that
>> it must be large enough to accomodate an object the size of the virtual address
>> space. Generally it's minimum size in bits is the same as long.
>
> // get the size of the output file

You have a fundamental misunderstanding. A file isn't an object
from the C standard perspective.

struct stat {
dev_t st_dev; /* ID of device containing file */
ino_t st_ino; /* Inode number */
mode_t st_mode; /* File type and mode */
nlink_t st_nlink; /* Number of hard links */
uid_t st_uid; /* User ID of owner */
gid_t st_gid; /* Group ID of owner */
dev_t st_rdev; /* Device ID (if special file) */
off_t st_size; /* Total size, in bytes */
blksize_t st_blksize; /* Block size for filesystem I/O */
blkcnt_t st_blocks; /* Number of 512B blocks allocated */
struct timespec st_atim; /* Time of last access */
struct timespec st_mtim; /* Time of last modification */
struct timespec st_ctim; /* Time of last status change */

#define st_atime st_atim.tv_sec /* Backward compatibility */
#define st_mtime st_mtim.tv_sec
#define st_ctime st_ctim.tv_sec
};

File sizes use the 'off_t' type.

If Windows did not define an off_t equivalent, then the Windows API
is insufficient.

From the compiler perspective, size_t applies only to in-memory objects.

Bonita Montero

unread,
May 28, 2021, 10:42:25 AM5/28/21
to
> If windows did not define an off_t equivelent, then the windows API
> is insufficient.

https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getfilesize

A file-size under Windows isn't a special type but just a 64-bit number.
And this isn't insufficient.

Bonita Montero

unread,
May 28, 2021, 10:45:17 AM5/28/21
to
Am 28.05.2021 um 16:42 schrieb Bonita Montero:
>> If windows did not define an off_t equivelent, then the windows API
>> is insufficient.
>
> https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getfilesize

This is more convenient:
https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getfilesizeex

Paavo Helde

unread,
May 28, 2021, 11:22:24 AM5/28/21
to
28.05.2021 17:38 Scott Lurndal kirjutas:
> File sizes use the 'off_t' type.
>
> If windows did not define an off_t equivelent, then the windows API
> is insufficient.

The size of off_t is implementation and macro dependent. So it is not
quite clear what you mean by "off_t equivalent", and if it would be a
good idea to have such equivalent.

Windows API aims to use fixed sized types, for better binary
compatibility. This means that as there exist files over 4GB, the chosen
fixed size type in the Win32 API for the file sizes and offsets must be
64-bit. And indeed, this is how the relevant functions in Windows API
are defined:

DWORD GetFileSize(HANDLE hFile, LPDWORD lpFileSizeHigh);

This returns the file size in two 32-bit pieces. Ugly as hell, but at
least the result is always 64-bit. There is a newer function with a
better interface:

BOOL GetFileSizeEx(HANDLE hFile, PLARGE_INTEGER lpFileSize);

Here, LARGE_INTEGER is defined as strictly 64-bit.

One can see from the names that MS has tried to avoid binding them to
specific bit sizes, but this has failed and nowadays these types are
documented as having fixed size. As part of that failure, DWORD is
nowadays usually half a word, not double word.
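
A sketch of the GetFileSizeEx route (minimal error handling; the helper
name is mine):

#include <windows.h>

LONGLONG file_size_win32(const wchar_t* path)
{
    HANDLE h = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, nullptr,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (h == INVALID_HANDLE_VALUE)
        return -1;
    LARGE_INTEGER size;
    const BOOL ok = GetFileSizeEx(h, &size);
    CloseHandle(h);
    return ok ? size.QuadPart : -1;
}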



Keith Thompson

unread,
May 28, 2021, 3:10:07 PM5/28/21
to
Öö Tiib <oot...@hot.ee> writes:
> On Friday, 28 May 2021 at 05:55:11 UTC+3, Keith Thompson wrote:
>> Öö Tiib <oot...@hot.ee> writes:
>> > On Friday, 28 May 2021 at 02:58:18 UTC+3, Keith Thompson wrote:
>> >> In fact C doesn't define a type for representing file sizes.
>> >> (It might be nice if it did.)
>> >
>> > Or may be it might be nice if it did take it back from C++ where
>> > std::filesystem::file_size() is of type std::uintmax_t.
>>
>> Quite right. I post in comp.lang.c more than in comp.lang.c++ and I
>> forgot where I was. Apologies for any confusion.
>
> No problems, all that pandemic and growing pile of communication
> channels blinking make me also often confused where I am.
>
>> > Also standard should end the woo allowing implementations to
>> > have integers that are not integers. It is awful.
>> Not sure what you're referring to here.
>
> I meant "In addition to the standard integer types, the C99 standard
> allows implementation-defined extended integer types, both signed
> and unsigned. For example, a compiler might be provide signed
> and unsigned 128-bit integer types."

Right. Extended integer types are a nice idea, but I've never seen a
compiler that actually implements them.

> That is good. But it misses the opportunity to tell that standard forbids
> implementations from making their custom integer types that are not
> extended integer types. And lack of that turns the above into useless
> woo, as it says nothing as result. Not a thing. Waste of space.

gcc has __int128 *as an extension* (not supported for 32-bit targets).
The standard permits extensions:

A conforming implementation may have extensions (including
additional library functions), provided they do not alter the
behavior of any strictly conforming program.

I'm not sure how (or why!) you'd forbid extensions that happen to act
almost like integer types.

gcc doesn't support 128-bit integer constants. Also, making __int128 an
extended integer type would require intmax_t to be 128 bits, which would
cause serious problems with ABIs (there are standard library functions
that take arguments of type intmax_t). The alternative would have been
not to support 128-bit integers at all.

I'd like to see full support for 128-bit integers, but gcc's __int128 is
IMHO better than nothing (though to be honest I've never used it except
in small test programs).

Scott Lurndal

unread,
May 28, 2021, 4:26:59 PM5/28/21
to
Keith Thompson <Keith.S.T...@gmail.com> writes:

>gcc has __int128 *as an extension* (not supported for 32-bit targets).
>The standard permits extensions:

>
>I'd like to see full support for 128-bit integers, but gcc's __int128 is
>IMHO better than nothing (though to be honest I've never used it except
>in small test programs).

We use it when modeling 128-bit bus structures. We don't use
arithmetic operations on 128-bit data but do
use masking and shifting.

Öö Tiib

unread,
May 28, 2021, 8:10:59 PM5/28/21
to
On Friday, 28 May 2021 at 22:10:07 UTC+3, Keith Thompson wrote:
> gcc has __int128 *as an extension* (not supported for 32-bit targets).
> The standard permits extensions:
>
> A conforming implementation may have extensions (including
> additional library functions), provided they do not alter the
> behavior of any strictly conforming program.
>
> I'm not sure how (or why!) you'd forbid extensions that happen to act
> almost like integer types.

I hoped I managed to express it. Technically the integer types
are causing some of the trouble, as the rules of promotion and implicit
conversion, especially with implementation-defined features in the mix,
seem not to be intuitive to many programmers. Therefore the desire to
regulate it is welcome, and the appearance of regulating something
without actually regulating anything is doubly unwelcome.

>
> gcc doesn't support 128-bit integer constants. Also, making __int128 an
> extended integer type would require intmax_t to be 128 bits, which would
> cause serious problems with ABIs (there are standard library functions
> that take arguments of type intmax_t). The alternative would have been
> not to support 128-bit integers at all.

I think of it as nonsense. The monsters are just thinking they are clever
and fooling each other, and so there is an illusion of consensus. The actual
desire is to have support for 128-bit (or perhaps arbitrary-width)
integers in their Golangs, Swifts, Javas or C#s before C and C++ and so
hopefully others. But their proprietary language infrastructures are
mostly written in C or C++, so I see no point in pretending that we don't
see through it.

David Brown

unread,
May 29, 2021, 5:47:50 AM5/29/21
to
On 28/05/2021 21:09, Keith Thompson wrote:

> gcc has __int128 *as an extension* (not supported for 32-bit targets).
> The standard permits extensions:
>
> A conforming implementation may have extensions (including
> additional library functions), provided they do not alter the
> behavior of any strictly conforming program.
>
> I'm not sure how (or why!) you'd forbid extensions that happen to act
> almost like integer types.
>
> gcc doesn't support 128-bit integer constants. Also, making __int128 an
> extended integer type would require intmax_t to be 128 bits, which would
> cause serious problems with ABIs (there are standard library functions
> that take arguments of type intmax_t). The alternative would have been
> not to support 128-bit integers at all.

The definition of intmax_t is a problem - it is a limitation for integer
types in C and C++. I'd have preferred to see functions like "abs" be
type-generic macros in C and template functions in C++. From C90 there
was "abs" and "labs" - C99 could have skipped "llabs" and "imaxabs", and
similar functions. The "div" functions wouldn't need extended for
bigger types - they are a hangover from an era of weaker compilers. No
doubt there would be complications with some other functions that today
use intmax_t types - no doubt there would be alternative ways of
handling them, given a bit of thought.

But of course it is too late to change all that now. The gcc solution
of __int128 covers most purposes without affecting backwards compatibility.

>
> I'd like to see full support for 128-bit integers, but gcc's __int128 is
> IMHO better than nothing (though to be honest I've never used it except
> in small test programs).
>

There is nothing stopping the C++ standard library from introducing
std::int<N> and std::uint<N> types, where implementations can choose
which sizes of N they support (but requiring support for any N for which
std::intN_t exists). These would work just like integer types for most
purposes, but not be /called/ integer types. So in gcc, std::int<128>
would be the same as __int128_t.

Constants would be handled by user-defined literals.

(I also don't see much need for 128-bit or bigger types - until you get
to cryptography-sized integers - but I guess some people do.)

daniel...@gmail.com

unread,
May 30, 2021, 6:15:41 PM5/30/21
to
On Saturday, May 29, 2021 at 5:47:50 AM UTC-4, David Brown wrote:
> The definition of intmax_t is a problem - it is a limitation for integer
> types in C and C++. Hopefully eventually deprecate intmax_t.

One proposal is to make intmax_t mean int64_t, and leave it at that.
Have no requirement that integer types can't be larger. No more ABI
problem.

> I'd have preferred to see functions like "abs" be
> type-generic macros in C and template functions in C++. From C90 there
> was "abs" and "labs" - C99 could have skipped "llabs" and "imaxabs", and
> similar functions.

Yes, of course, and to_integer<T> and from_integer<T>, and others. Many libraries
have to reinvent their own version of these things.
>
> The gcc solution of __int128 covers most purposes without affecting
> backwards compatibility.
> >
Hardly "most purposes", far from it. Without compiling with "-std=gnu++11",
you don't even have std::numeric_limits<__int128>. The absence of
standard support for int128_t makes genericity much harder. While other
languages such as rust with better type support see rapid growth
of open source libraries that cover all manner of data interchange
standards, C++ is comparatively stagnant.

Daniel



David Brown

unread,
May 31, 2021, 2:33:52 AM5/31/21
to
On 31/05/2021 00:15, daniel...@gmail.com wrote:
> On Saturday, May 29, 2021 at 5:47:50 AM UTC-4, David Brown wrote:
>> The definition of intmax_t is a problem - it is a limitation for integer
>> types in C and C++. Hopefully eventually deprecate intmax_t.
>
> One proposal is to make intmax_t mean int64_t, and leave it at that.
> Have no requirement that integer types can't be larger. No more ABI
> problem.
>

It might make more sense to tie it to "long long int" rather than
"int64_t", but someone would first have to check if it affected any real
implementations before making such a change. But yes, that might be a
way out and a way forward.

>> I'd have preferred to see functions like "abs" be
>> type-generic macros in C and template functions in C++. From C90 there
>> was "abs" and "labs" - C99 could have skipped "llabs" and "imaxabs", and
>> similar functions.
>
> Yes, of course, and to_integer<T> and from_integer<T>, and others. Many libraries
> have to reinvent their own version of these things.
>>
>> The gcc solution of __int128 covers most purposes without affecting
>> backwards compatibility.
>>>
> Hardly "most purposes", far from it. Without compiling with "-std=gnu++11",
> you don't even have std::numeric_limits<__int128>.

Well, it /is/ a gcc extension - choosing to enable it on the command
line makes sense to me. But I was thinking of the core language, rather
than the library, which can be somewhat independent of the compiler itself.

> The absence of
> standard support for int128_t makes genericity much harder. While other
> languages such as rust with better type support see rapid growth
> of open source libraries that cover all manner of data interchange
> standards, C++ is comparatively stagnant.
>

Those relatively few programs that have need of int128_t can simply do a
typedef. It won't magically allow literals of the type, but it will
cover most cases.

daniel...@gmail.com

unread,
May 31, 2021, 11:21:18 AM5/31/21
to
A typedef? You've lost me.

Daniel

David Brown

unread,
May 31, 2021, 12:13:30 PM5/31/21
to
typedef signed __int128 int128_t;
typedef unsigned __int128 uint128_t;

You can wrap them in #ifdef's to check for gcc and support for the 128
bit almost integer types (and perhaps to check that your target doesn't
support standard int128_t types, as it will if "long long" is 128 bits).
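
For instance (a sketch; __SIZEOF_INT128__ is the predefined macro gcc and
clang set when __int128 is available on the target):

#if defined(__SIZEOF_INT128__)
typedef signed __int128 int128_t;
typedef unsigned __int128 uint128_t;
#else
#error "no 128-bit integer type available on this target"
#endif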

Keith Thompson

unread,
May 31, 2021, 5:43:23 PM5/31/21
to
David Brown <david...@hesbynett.no> writes:
> On 31/05/2021 00:15, daniel...@gmail.com wrote:
>> On Saturday, May 29, 2021 at 5:47:50 AM UTC-4, David Brown wrote:
>>> The definition of intmax_t is a problem - it is a limitation for integer
>>> types in C and C++. Hopefully eventually deprecate intmax_t.
>>
>> One proposal is to make intmax_t mean int64_t, and leave it at that.
>> Have no requirement that integer types can't be larger. No more ABI
>> problem.
>
> It might make more sense to tie it to "long long int" rather than
> "int64_t", but someone would first have to check if it affected any real
> implementations before making such a change. But yes, that might be a
> way out and a way forward.

[...]

That would allow intmax_t to be 128 bits on implementations with
128-bit long long (are there any?), which seems like a good idea.

I think the point of both these proposals is purely for backward
compatibility, avoiding breaking code that already uses [u]intmax_t.
Both of them destroy the point of intmax_t, providing a type that's
guaranteed to be the longest integer type. Should intmax_t be
deprecated?

Perhaps some future version of C might have enough capabilities to
allow defining a longest integer type without causing ABI issues
the way intmax_t did.

And since, as far as I've been able to tell, no implementation
supports extended integer types, I wonder if they should be
reconsidered.

Keith Thompson

unread,
May 31, 2021, 5:45:04 PM5/31/21
to
Keith Thompson <Keith.S.T...@gmail.com> writes:
[...]
> Perhaps some future version of C might have enough capabilities to
> allow defining a longest integer type without causing ABI issues
> the way intmax_t did.

And I did it again. s/C/C++/, or s/comp.lang.c++/comp.lang.c/.

[...]

daniel...@gmail.com

unread,
May 31, 2021, 6:20:28 PM5/31/21
to
On Monday, May 31, 2021 at 5:43:23 PM UTC-4, Keith Thompson wrote:
> David Brown <david...@hesbynett.no> writes:
> > On 31/05/2021 00:15, daniel...@gmail.com wrote:
> >> On Saturday, May 29, 2021 at 5:47:50 AM UTC-4, David Brown wrote:
> >>> The definition of intmax_t is a problem - it is a limitation for integer
> >>> types in C and C++. Hopefully eventually deprecate intmax_t.
> >>
> >> One proposal is to make intmax_t mean int64_t, and leave it at that.
> >> Have no requirement that integer types can't be larger. No more ABI
> >> problem.
> >
> > It might make more sense to tie it to "long long int" rather than
> > "int64_t", but someone would first have to check if it affected any real
> > implementations before making such a change. But yes, that might be a
> > way out and a way forward.
> [...]
>
> That would allow intmax_t to be 128 bits on implementations with
> 128-bit long long (are there any?), which seems like a good idea.
>
> I think the point of both these proposals is purely for backward
> compatibility, avoiding breaking code that already uses [u]intmax_t.
> Both of them destroy the point of intmax_t, providing a type that's
> guaranteed to be the longest integer type. Should intmax_t be
> deprecated?
>
Yes. "Give me the biggest integer type there is" is not a reasonable
thing to ask for, in any code that is intended to be portable across platforms
or over time on the same platform. You may as well have intwhatever_t.

Daniel

David Brown

unread,
Jun 1, 2021, 2:30:04 AM6/1/21
to
On 31/05/2021 23:43, Keith Thompson wrote:
> David Brown <david...@hesbynett.no> writes:
>> On 31/05/2021 00:15, daniel...@gmail.com wrote:
>>> On Saturday, May 29, 2021 at 5:47:50 AM UTC-4, David Brown wrote:
>>>> The definition of intmax_t is a problem - it is a limitation for integer
>>>> types in C and C++. Hopefully eventually deprecate intmax_t.
>>>
>>> One proposal is to make intmax_t mean int64_t, and leave it at that.
>>> Have no requirement that integer types can't be larger. No more ABI
>>> problem.
>>
>> It might make more sense to tie it to "long long int" rather than
>> "int64_t", but someone would first have to check if it affected any real
>> implementations before making such a change. But yes, that might be a
>> way out and a way forward.
>
> [...]
>
> That would allow intmax_t to be 128 bits on implementations with
> 128-bit long long (are there any?), which seems like a good idea.
>
> I think the point of both these proposals is purely for backward
> compatibility, avoiding breaking code that already uses [u]intmax_t.
> Both of them destroy the point of intmax_t, providing a type that's
> guaranteed to be the longest integer type. Should intmax_t be
> deprecated?

In my opinion, yes - it should be deprecated. But of course you'd want
to check with people who actually use it, to see why the use it and
whether there are better alternatives.

>
> Perhaps some future version of C might have enough capabilities to
> allow defining a longest integer type without causing ABI issues
> the way intmax_t did.
>
> And since, as far as I've been able to tell, no implementation
> supports extended integer types, I wonder if they should be
> reconsidered.
>

Maybe it would be worth reconsidering exactly what the definition of
"integer type" should be in the C and C++ standards (keeping both
languages in sync here is, I think, important). I'd like to see
intmax_t removed and the definition of "integer type" modified such that
gcc's __int128 /is/ an extended integer type. After all, people use it
as though it were, and assume it is.

Bo Persson

unread,
Jun 1, 2021, 7:59:37 AM6/1/21
to
The problem is that we in general don't know what "whatever" is. At the
time when intmax_t was introduced, at least in C there were
implementations with 36 bit ints and 72-bit longs. So just using int64_t
would not be portable.


Bo Persson

unread,
Jun 1, 2021, 8:05:29 AM6/1/21
to
Nowadays it is very likely long long, for "reasonably wide integer
type". But that hasn't always been available.


>
>>
>> Perhaps some future version of C might have enough capabilities to
>> allow defining a longest integer type without causing ABI issues
>> the way intmax_t did.
>>
>> And since, as far as I've been able to tell, no implementation
>> supports extended integer types, I wonder if they should be
>> reconsidered.
>>
>
> Maybe it would be worth reconsidering exactly what the definition of
> "integer type" should be in the C and C++ standards (keeping both
> languages in sync here is, I think, important). I'd like to see
> intmax_t removed and the definition of "integer type" modified such that
> gcc's __int128 /is/ an extended integer type. After all, people use it
> as though it were, and assume it is.
>

And it really *is*, except for the documentation saying "integer type
extension" (and not "extended integer type"), only to avoid the intmax_t
problem.

David Brown

unread,
Jun 1, 2021, 9:55:38 AM6/1/21
to
Have there ever been C99 compilers for 36-bit int machines? Were there
even conforming C90 compilers?

AFAIK (and I fully admit my knowledge may be lacking), the only systems
that made it past the 1980's which did not have two's complement signed
integers with 8-bit bytes and power-of-two sized integer types are some
DSPs and other niche embedded devices (for which no one would use an
integer type without knowing /exactly/ how big it is), and
Burroughs/Unisys systems for legacy compatibility.

My suggestion would be to lock intmax_t to "long long", which would keep
compatibility here (including for systems that have 128-bit long long,
if there are any other than a hypothetical RISC-V version).



David Brown

unread,
Jun 1, 2021, 9:58:19 AM6/1/21
to
There is also no way to make constant literals of __int128, nor is there
support for printf and a wide variety of the builtin functions, and
standard library functions, and other bits and pieces. It's fine for
basic usage, but missing many features of int64_t and other sized
integer types. (I'm not complaining, just noting.)
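
One workaround for the missing literals (a sketch, assuming gcc/clang
__int128 and plain decimal digits with no digit separators; the suffix
name is made up) is a raw user-defined literal that rebuilds the value at
compile time:

constexpr unsigned __int128 operator""_u128(const char* s)
{
    unsigned __int128 v = 0;
    for (; *s != '\0'; ++s)
        v = v * 10 + static_cast<unsigned>(*s - '0');
    return v;
}

constexpr auto big = 170141183460469231731687303715884105727_u128;  // 2^127 - 1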

James Kuyper

unread,
Jun 1, 2021, 11:33:55 AM6/1/21
to
On 6/1/21 9:58 AM, David Brown wrote:
> On 01/06/2021 14:05, Bo Persson wrote:
>> On 2021-06-01 at 08:29, David Brown wrote:
...
>>> Maybe it would be worth reconsidering exactly what the definition of
>>> "integer type" should be in the C and C++ standards (keeping both
>>> languages in sync here is, I think, important).  I'd like to see
>>> intmax_t removed and the definition of "integer type" modified such that
>>> gcc's __int128 /is/ an extended integer type.  After all, people use it
>>> as though it were, and assume it is.
>>>
>>
>> And it really *is*, except for the documentation saying "integer type
>> extension" (and not "extended integer type"), only to avoid the intmax_t
>> problem.
>
> There is also no way to make constant literals of __int128, nor is there
> support for printf and a wide variety of the builtin functions, and
> standard library functions, and other bits and pieces. It's fine for
> basic usage, but missing many features of int64_t and other sized
> integer types. (I'm not complaining, just noting.)
>

If they changed their documentation to identify __int128_t as an
extended integer type, then they would be required to support int128_t,
along with all of the corresponding features of <cinttypes> and
<cstdint>, which would address that issue.

Manfred

unread,
Jun 1, 2021, 12:00:41 PM6/1/21
to
I think this would be problematic as well, or at least useless and
confusing.
The idea for intmax_t is to give a standard name to the widest integer
type that is available, which is by definition implementation defined.
Having intmax_t an alias for "long long" would change its meaning to the
widest /standard/ integer type defined by the standard itself - a
useless repetition (we have "long long" for that), and confusing too,
given its change in meaning.

As far as I understand the purpose for intmax_t is to allow for (sort
of) 'standard' prototypes of functions like imaxabs, imaxdiv, strtoimax,
etc that are supposed to operate on larger integer types - these are the
only functions that take this kind of arguments.
The confusing part is that all of these facilities are implementation
dependent, so even if they are part of the standard they are /not/
portable, meaning that the programmer is supposed (the way I see it) to
use them under the guard of appropriate preprocessor directives.

The use for them is to allow the programmer to use some optimized
routines for larger types, if available.

For example, in case operations like ldiv are needed on 128 bit
integers, then IF the implementation supports 128 bit intmax_t then
imaxdiv can be a better choice rather than implementing your own routine
- note that /IF/ is the keyword here, that should map directly to
appropriate #if directives.

It's most probably somewhat a niche field of use (possibly growing due
to the diffuse demand for cryptography), or meant for applications that
are supposed to be run on hardware that is known to support the
appropriate types, so that the #if directives can be as simple as
denying compilation for implementations that don't have a wide enough
intmax_t.

My 2c.

David Brown

unread,
Jun 1, 2021, 1:15:24 PM6/1/21
to
Yes, that is all true. The point is not to find another useful purpose
for intmax_t - the point is to get rid of it, marking it as deprecated,
but to do so in a way that won't break existing code.

I am at a loss to understand why anyone would have a use for intmax_t in
the first place. When would you want an integer type whose sole
characteristic is "big" ? To me, it is logical to want a type that is
at least N bits, or exactly N bits. These requirements are covered by
the normal "short", "int", "long" and "long long" types, or - better for
my use, but not necessarily other people's - the <stdint.h> fixed size
types. "intmax_t" gives you absolutely /nothing/ that "long long" does not.

Given that there are, as far as we know, no implementations where
intmax_t does not correspond directly to "long long", I would like to
see "intmax_t" be dropped to the maximum extent allowable by backwards
compatibility.

>
> As far as I understand the purpose for intmax_t is to allow for (sort
> of) 'standard' prototypes of functions like imaxabs, imaxdiv, strtoimax,
> etc that are supposed to operate on larger integer types - these are the
> only functions that take this kind of arguments.
> The confusing part is that all of these facilities are implementation
> dependent, so even if they are part of the standard they are /not/
> portable, meaning that the programmer is supposed (the way I see it) to
> use them under the guard of appropriate preprocessor directives.
>
> The use for them is to allow the programmer to use some optimized
> routines for larger types, if available.

But they don't allow that. If you are trying to make optimised routines
for larger types, you need to know your sizes - you either use
implementation extensions (like __int128), or fixed size types, or if
you need maximal portability, you use "int_fast64_t".

>
> For example, in case operations like ldiv are needed on 128 bit
> integers, then IF the implementation supports 128 bit intmax_t then
> imaxdiv can be a better choice rather than implementing your own routine
> - note that /IF/ is the keyword here, that should map directly to
> appropriate #if directives.
>

The situation we have now is that on a compiler like gcc you can get
128-bit division using __int128, but /not/ using intmax_t. It is a
useless type.

Keith Thompson

unread,
Jun 1, 2021, 1:53:12 PM6/1/21
to
*And* they'd have to make intmax_t 128 bits, which would cause more
problems.

Keith Thompson

unread,
Jun 1, 2021, 2:17:16 PM6/1/21
to
David Brown <david...@hesbynett.no> writes:
> On 01/06/2021 18:00, Manfred wrote:
[...]
>> I think this would be problematic as well, or at least useless and
>> confusing.
>> The idea for intmax_t is to give a standard name to the widest integer
>> type that is available, which is by definition implementation defined.
>> Having intmax_t an alias for "long long" would change its meaning to the
>> widest /standard/ integer type defined by the standard itself - a
>> useless repetition (we have "long long" for that), and confusing too,
>> given its change in meaning.
>
> Yes, that is all true. The point is not to find another useful purpose
> for intmax_t - the point is to get rid of it, marking it as deprecated,
> but to do so in a way that won't break existing code.
>
> I am at a loss to understand why anyone would have a use for intmax_t in
> the first place. When would you want an integer type whose sole
> characteristic is "big" ? To me, it is logical to want a type that is
> at least N bits, or exactly N bits. These requirements are covered by
> the normal "short", "int", "long" and "long long" types, or - better for
> my use, but not necessarily other people's - the <stdint.h> fixed size
> types. "intmax_t" gives you absolutely /nothing/ that "long long" does not.

Suppose you have a type foo_t, and all you know is that it's a signed
integer type. You can print it using printf("%jd", (intmax_t)n).
(Or in C++ you can use std::cout << n, but intmax_t is inherited from
C's standard library.)

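A minimal sketch of that idiom (foo_t here is just a hypothetical typedef
standing in for "some signed integer type"):

#include <inttypes.h>
#include <stdio.h>

typedef long foo_t;   /* hypothetical; all we rely on is "signed integer type" */

static void print_foo(foo_t n)
{
    /* widen to the widest signed type so one format specifier fits any foo_t */
    printf("%jd\n", (intmax_t)n);
}

int main(void)
{
    print_foo((foo_t)-42);
    return 0;
}
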
Or suppose you're implementing an arbitrary-width integer type as
an array of fixed-width integers. You might want each element to
be as wide as possible, because arithmetic on a 2*N-bit integer is
likely to be faster than synthesizing it from two N-bit integers.
(That's assuming that, for example, 128-bit integer arithmetic
isn't drastically less efficient than 64-bit integer arithmetic.)

> Given that there are, as far as we know, no implementations where
> intmax_t does not correspond directly to "long long", I would like to
> see "intmax_t" be dropped to the maximum extent allowable by backwards
> compatibility.

I'd *like* to keep intmax_t and make it work the way it was intended,
but ABI issues made that impractical. (I'm guessing those issues
weren't anticipated when intmax_t was added in C99.) Deprecating it
might be the least bad solution.

>> As far as I understand the purpose for intmax_t is to allow for (sort
>> of) 'standard' prototypes of functions like imaxabs, imaxdiv, strtoimax,
>> etc that are supposed to operate on larger integer types - these are the
>> only functions that take this kind of arguments.
>> The confusing part is that all of these facilities are implementation
>> dependent, so even if they are part of the standard they are /not/
>> portable, meaning that the programmer is supposed (the way I see it) to
>> use them under the guard of appropriate preprocessor directives.
>>
>> The use for them is to allow the programmer to use some optimized
>> routines for larger types, if available.
>
> But they don't allow that. If you are trying to make optimised routines
> for larger types, you need to know your sizes - you either use
> implementation extensions (like __int128), or fixed size types, or if
> you need maximal portability, you use "int_fast64_t".

And my impression (which could be mistaken) is that __int128 is not an
extended integer type partly *because* it would require making intmax_t
128 bits, which would cause ABI problems. That, and __int128 is
unfinished (some features are missing), but if it could have been made
an extended integer type perhaps more effort would have been spent
making it fully functional.

>> For example, in case operations like ldiv are needed on 128 bit
>> integers, then IF the implementation supports 128 bit intmax_t then
>> imaxdiv can be a better choice rather than implementing your own routine
>> - note that /IF/ is the keyword here, that should map directly to
>> appropriate #if directives.
>
> The situation we have now is that on a compiler like gcc you can get
> 128-bit division using __int128, but /not/ using intmax_t. It is a
> useless type.

I suspect it's useful enough for some purposes. There's probably code
out there that uses __int128 that doesn't need 128-bit division.

[...]

Manfred

unread,
Jun 1, 2021, 2:22:05 PM6/1/21
to
The use I see is with imaxdiv and friends as I wrote below, at least
this is my understanding.

When would you want an integer type whose sole
> characteristic is "big" ? To me, it is logical to want a type that is
> at least N bits, or exactly N bits. These requirements are covered by
> the normal "short", "int", "long" and "long long" types, or - better for
> my use, but not necessarily other people's - the <stdint.h> fixed size
> types. "intmax_t" gives you absolutely /nothing/ that "long long" does not.
>
> Given that there are, as far as we know, no implementations where
> intmax_t does not correspond directly to "long long", I would like to
> see "intmax_t" be dropped to the maximum extent allowable by backwards
> compatibility.
>
>>
>> As far as I understand the purpose for intmax_t is to allow for (sort
>> of) 'standard' prototypes of functions like imaxabs, imaxdiv, strtoimax,
>> etc that are supposed to operate on larger integer types - these are the
>> only functions that take this kind of arguments.
>> The confusing part is that all of these facilities are implementation
>> dependent, so even if they are part of the standard they are /not/
>> portable, meaning that the programmer is supposed (the way I see it) to
>> use them under the guard of appropriate preprocessor directives.
>>
>> The use for them is to allow the programmer to use some optimized
>> routines for larger types, if available.
>
> But they don't allow that. If you are trying to make optimised routines
> for larger types, you need to know your sizes - you either use
> implementation extensions (like __int128), or fixed size types, or if
> you need maximal portability, you use "int_fast64_t".

When I wrote "use" I didn't mean "make". I meant imaxdiv may be a 128-bit
division routine provided by the implementation, which the programmer can
use without needing to write their own, possibly less efficient, one.

>
>>
>> For example, in case operations like ldiv are needed on 128 bit
>> integers, then IF the implementation supports 128 bit intmax_t then
>> imaxdiv can be a better choice rather than implementing your own routine
>> - note that /IF/ is the keyword here, that should map directly to
>> appropriate #if directives.
>>
>
> The situation we have now is that on a compiler like gcc you can get
> 128-bit division using __int128, but /not/ using intmax_t. It is a
> useless type.
>

The way I see it this is a problem with gcc, not with the standard.
Unless the committee managed to produce some wording that is too
problematic for __int128 to fit as extended integer type.

Note that I am not talking about plain integer division (the '/'
operator) I am talking about 128 bit ldiv.
Now we have div, ldiv and lldiv too (supposedly for 64 bit), but instead
of going on with llldiv, and then llllllllldiv, they decided to stop
with imaxdiv. It makes sense.

Manfred

unread,
Jun 1, 2021, 2:25:53 PM6/1/21
to
Can you be more specific about which ABI issues?

>
>>> As far as I understand the purpose for intmax_t is to allow for (sort
>>> of) 'standard' prototypes of functions like imaxabs, imaxdiv, strtoimax,
>>> etc that are supposed to operate on larger integer types - these are the
>>> only functions that take this kind of arguments.
>>> The confusing part is that all of these facilities are implementation
>>> dependent, so even if they are part of the standard they are /not/
>>> portable, meaning that the programmer is supposed (the way I see it) to
>>> use them under the guard of appropriate preprocessor directives.
>>>
>>> The use for them is to allow the programmer to use some optimized
>>> routines for larger types, if available.
>>
>> But they don't allow that. If you are trying to make optimised routines
>> for larger types, you need to know your sizes - you either use
>> implementation extensions (like __int128), or fixed size types, or if
>> you need maximal portability, you use "int_fast64_t".
>
> And my impression (which could be mistaken) is that __int128 is not an
> extended integer type partly *because* it would require making intmax_t
> 128 bits, which would cause ABI problems.

That sounds like a possible explanation, but again I'd need to know more
about such ABI issues.

Lynn McGuire

unread,
Jun 1, 2021, 2:26:49 PM6/1/21
to
Don't forget the 60-bit int / 120-bit long CDC 7600 machines. The only
36-bit machine that I knew of was the Univac 1108, which the IRS reputedly
used until 2010 or so.

Lynn



Keith Thompson

unread,
Jun 1, 2021, 7:55:56 PM6/1/21
to
Manfred <non...@add.invalid> writes:
> On 6/1/2021 8:17 PM, Keith Thompson wrote:
[...]
>> I'd *like* to keep intmax_t and make it work the way it was intended,
>> but ABI issues made that impractical. (I'm guessing those issues
>> weren't anticipated when intmax_t was added in C99.) Deprecating it
>> might be the least bad solution.
>
> Can you be more specific about which ABI issues?

It's discussed, and a solution proposed, here:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2465.pdf

Quoting the problem description section:

The interaction between the definition of extended integer types and
[u]intmax_t has resulted in a lack of extensibility for existing
ABI. Platforms that have anchored their specifications for the basic
integer types and for [u]intmax_t cannot add an extended integer type
that is wider than their current [u]intmax_t to their specification. As
the current text of the C standard stands, such an addition would
force a redefinition of [u]intmax_t to the wider types. This would
have the following consequences
- The parts of the C library that use [u]intmax_t (specific
functions but also printf and related functions) must be
rewritten or recompiled with the new ABI and become binary
incompatible with existing programs.
- Programs compiled with the new ABI would be binary incompatible
on platforms that have not been upgraded.
- The preprocessor of the implementation must be re-engineered
to comply with the standard. In particular, there would be
severe specification problems for preprocessor numbers and
their evaluation. E.g., the value of ULLONG_MAX+1 is not
expressible as a literal in the language proper but would
be for the preprocessor. The expression ULLONG_MAX+1 would
evaluate to true in a preprocessor conditional but to 0
(false) in later compilation phases.

See also http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2425.pdf
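
To make the ULLONG_MAX+1 point concrete, here is a small sketch. (Purely
illustrative: the two results would only differ on a hypothetical
implementation whose [u]intmax_t is wider than unsigned long long; on
today's implementations both sums wrap to zero.)

#include <limits.h>
#include <stdio.h>

int main(void)
{
#if ULLONG_MAX + 1
    /* preprocessor arithmetic uses [u]intmax_t, so here the sum would not wrap */
    puts("preprocessor: ULLONG_MAX + 1 is nonzero");
#else
    puts("preprocessor: ULLONG_MAX + 1 is zero");
#endif
    /* later phases use unsigned long long arithmetic, so the sum wraps to 0 */
    printf("translation:  ULLONG_MAX + 1 == %llu\n", ULLONG_MAX + 1);
    return 0;
}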

Chris M. Thomasson

unread,
Jun 1, 2021, 8:35:30 PM6/1/21
to
On 5/24/2021 6:46 PM, Lynn McGuire wrote:
> I am getting std::bad_alloc from the following code when I try to
> reserve a std::string of size 937,180,144:
>
> std::string filename = getFormsMainOwner () -> getOutputFileName ();
> FILE * pOutputFile = nullptr;
> errno_t err = fopen_s_UTF8 ( & pOutputFile, filename.c_str (), "rt");
> if (err == 0)
> {
>     std::string outputFileBuffer;
>         //  need to preallocate the space in case the output file is a
> gigabyte or more, PMR 6408
>     fseek (pOutputFile, 0, SEEK_END);
>     size_t outputFileLength = ftell (pOutputFile) + 42;  // give it
> some slop
>     fseek (pOutputFile, 0, SEEK_SET);
>     outputFileBuffer.reserve (outputFileLength);
>
> Any thoughts here on how to handle the std::bad_alloc in std::string
> reserve ?

Fwiw, way back when I was working with C server code using WinNT and
IOCP, if a malloc failed, I would put the server into a so-called
"panic" mode that would trigger the event loops to dump resources. A
dumped resource would be, for example, a connection that had not responded
and was already in the timeout detection logic. It would do other things
like free buffers in connections, etc. When a malloc failed, well, it was
not as bad as non-paged pool memory crapping out!

The failed malloc would be deferred and tried again later.
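
A rough sketch of that "defer and retry" idea in plain C. The names here
(enter_panic_mode and friends) are hypothetical placeholders for the
server's real reclaim logic, not how it was actually written:

#include <stdio.h>
#include <stdlib.h>

/* hypothetical placeholder: in the real server this nudged the event loops
   into dropping unresponsive connections and freeing per-connection buffers */
static void enter_panic_mode(void)
{
    fputs("panic mode: reclaiming resources\n", stderr);
}

/* try to allocate, reclaiming and retrying a few times before giving up */
static void *alloc_with_retry(size_t n, int max_attempts)
{
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        void *p = malloc(n);
        if (p != NULL)
            return p;
        enter_panic_mode();
    }
    return NULL;   /* caller still has to cope with total failure */
}

int main(void)
{
    char *buf = alloc_with_retry(1024, 3);
    if (buf == NULL)
        return EXIT_FAILURE;
    free(buf);
    return 0;
}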

David Brown

unread,
Jun 2, 2021, 3:06:55 AM6/2/21
to
On 01/06/2021 20:17, Keith Thompson wrote:
> David Brown <david...@hesbynett.no> writes:
>> On 01/06/2021 18:00, Manfred wrote:
> [...]
>>> I think this would be problematic as well, or at least useless and
>>> confusing.
>>> The idea for intmax_t is to give a standard name to the widest integer
>>> type that is available, which is by definition implementation defined.
>>> Having intmax_t an alias for "long long" would change its meaning to the
>>> widest /standard/ integer type defined by the standard itself - a
>>> useless repetition (we have "long long" for that), and confusing too,
>>> given its change in meaning.
>>
>> Yes, that is all true. The point is not to find another useful purpose
>> for intmax_t - the point is to get rid of it, marking it as deprecated,
>> but to do so in a way that won't break existing code.
>>
>> I am at a loss to understand why anyone would have a use for intmax_t in
>> the first place. When would you want an integer type whose sole
>> characteristic is "big" ? To me, it is logical to want a type that is
>> at least N bits, or exactly N bits. These requirements are covered by
>> the normal "short", "int", "long" and "long long" types, or - better for
>> my use, but not necessarily other people's - the <stdint.h> fixed size
>> types. "intmax_t" gives you absolutely /nothing/ that "long long" does not.
>
> Suppose you have a type foo_t, and all you know is that it's a signed
> integer type. You can print it using printf("%jd", (intmax_t)n).

And if you know "long long" is the largest standard type, you can print
it with "%llu". The reality is that it is currently exactly the same.
intmax_t doesn't let you print anything that you can't print with "long
long" in any existing implementations, and is seems unlikely to do so in
the future.

If implementations regularly had integer types that were bigger than
"long long" then "intmax_t" could have been useful in such cases. It
could be seen as a good reason for introducing intmax_t in the first
place. The reality, however, is different.

I suspect an issue here is that compilers and standard libraries are
often developed somewhat independently - but the introduction of a
larger integer type in a compiler would require changes to the libraries
used, possibly also the ABI for the platform. It's one thing to accept
such chicken-and-egg challenges for a major new C revision, such as the
introduction of "long long" in C99, but quite another to have it for
compiler-specific extensions along the way.

> (Or in C++ you can use std::cout << n, but intmax_t is inherited from
> C's standard library.)
>

C++ does not have the problems or excuses of C here. printf is a pain
because it is a variadic function with types unknown at declaration and
compile time (of the printf implementation). For C++, an implementation
can easily add an overload for << with whatever types the compiler supports.

> Or suppose you're implementing an arbitrary-width integer type as
> an array of fixed-width integers. You might want each element to
> be as wide as possible, because arithmetic on a 2*N-bit integer is
> likely to be faster than synthesizing it from two N-bit integers.
> (That's assuming that, for example, 128-bit integer arithmetic
> isn't drastically less efficient than 64-bit integer arithmetic.)
>

You need to know the sizes of your integers here. intmax_t, specified
merely as "at least as big as long long", is useless. For such tasks,
you use the <stdint.h> types in general, and possibly compiler-specific
extensions like __int128. You don't use intmax_t. (At least, /I/ can't
see how it would be helpful here.)
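
A sketch of what "knowing your sizes" looks like in practice. The names
are made up, but the point is that the limb width is pinned down
explicitly rather than left to whatever intmax_t happens to be:

#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;                  /* each limb is exactly 64 bits */
#define LIMB_BITS 64

typedef struct {
    size_t  nlimbs;
    limb_t *limbs;                        /* little-endian array of limbs */
} bignum_t;

/* add a single limb to a bignum, propagating the carry (assumes enough limbs) */
static void bignum_add_limb(bignum_t *x, limb_t v)
{
    for (size_t i = 0; i < x->nlimbs && v != 0; ++i) {
        limb_t sum = x->limbs[i] + v;
        v = (sum < v);                    /* carry out of this limb */
        x->limbs[i] = sum;
    }
}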

>> Given that there are, as far as we know, no implementations where
>> intmax_t does not correspond directly to "long long", I would like to
>> see "intmax_t" be dropped to the maximum extent allowable by backwards
>> compatibility.
>
> I'd *like* to keep intmax_t and make it work the way it was intended,
> but ABI issues made that impractical. (I'm guessing those issues
> weren't anticipated when intmax_t was added in C99.) Deprecating it
> might be the least bad solution.

Fair enough.

>
>>> As far as I understand the purpose for intmax_t is to allow for (sort
>>> of) 'standard' prototypes of functions like imaxabs, imaxdiv, strtoimax,
>>> etc that are supposed to operate on larger integer types - these are the
>>> only functions that take this kind of arguments.
>>> The confusing part is that all of these facilities are implementation
>>> dependent, so even if they are part of the standard they are /not/
>>> portable, meaning that the programmer is supposed (the way I see it) to
>>> use them under the guard of appropriate preprocessor directives.
>>>
>>> The use for them is to allow the programmer to use some optimized
>>> routines for larger types, if available.
>>
>> But they don't allow that. If you are trying to make optimised routines
>> for larger types, you need to know your sizes - you either use
>> implementation extensions (like __int128), or fixed size types, or if
>> you need maximal portability, you use "int_fast64_t".
>
> And my impression (which could be mistaken) is that __int128 is not an
> extended integer type partly *because* it would require making intmax_t
> 128 bits, which would cause ABI problems. That, and __int128 is
> unfinished (some features are missing), but if it could have been made
> an extended integer type perhaps more effort would have been spent
> making it fully functional.
>

Agreed.

>>> For example, in case operations like ldiv are needed on 128 bit
>>> integers, then IF the implementation supports 128 bit intmax_t then
>>> imaxdiv can be a better choice rather than implementing your own routine
>>> - note that /IF/ is the keyword here, that should map directly to
>>> appropriate #if directives.
>>
>> The situation we have now is that on a compiler like gcc you can get
>> 128-bit division using __int128, but /not/ using intmax_t. It is a
>> useless type.
>
> I suspect it's useful enough for some purposes. There's probably code
> out there that uses __int128 that doesn't need 128-bit division.
>

I might have been unclear there - I meant "intmax_t" is a useless type.
__int128 is useful as it is (albeit not often useful). Adding abs,
div, etc., support to __int128 would not make it any more useful.

David Brown

unread,
Jun 2, 2021, 3:29:27 AM6/2/21
to
On 01/06/2021 20:21, Manfred wrote:
> On 6/1/2021 7:15 PM, David Brown wrote:
>> On 01/06/2021 18:00, Manfred wrote:

>>> As far as I understand the purpose for intmax_t is to allow for (sort
>>> of) 'standard' prototypes of functions like imaxabs, imaxdiv, strtoimax,
>>> etc that are supposed to operate on larger integer types - these are the
>>> only functions that take this kind of arguments.
>>> The confusing part is that all of these facilities are implementation
>>> dependent, so even if they are part of the standard they are /not/
>>> portable, meaning that the programmer is supposed (the way I see it) to
>>> use them under the guard of appropriate preprocessor directives.
>>>
>>> The use for them is to allow the programmer to use some optimized
>>> routines for larger types, if available.
>>
>> But they don't allow that.  If you are trying to make optimised routines
>> for larger types, you need to know your sizes - you either use
>> implementation extensions (like __int128), or fixed size types, or if
>> you need maximal portability, you use "int_fast64_t".
>
> When I wrote "use" I didn't mean "make". I meant imaxdiv may be a 128-bit
> division routine provided by the implementation, which the programmer can
> use without needing to write their own, possibly less efficient, one.
>

On gcc, "imaxdiv" lets you divide 64-bit numbers. If x and y are type
__int128, then "x / y" lets you divide 128-bit numbers.

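A small sketch of that difference, assuming gcc's __int128 extension (the
values are arbitrary):

#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    intmax_t a = INTMAX_MAX;                  /* 64-bit on common ABIs */
    imaxdiv_t r = imaxdiv(a, 7);              /* the widest division <inttypes.h> offers */
    printf("imaxdiv: quot=%jd rem=%jd\n", r.quot, r.rem);

    __int128 x = (__int128)a * a;             /* roughly a 126-bit value */
    __int128 q = x / 12345;                   /* 128-bit division via the operator */
    printf("__int128 quotient, top half: %jd\n", (intmax_t)(q >> 64));
    return 0;
}
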
The "div" functions give you /nothing/ in a modern compiler. They are a
hangover from the bad old days where "x = a / b; y = a % b;" could not
be handled efficiently by the compiler.

The "imaxdiv" function must be the most useless function ever specified
- indeed, it is worse than useless because it interferes with changing
or removing intmax_t. (I appreciate why it was introduced - I'm writing
with hindsight that the C99 authors did not have.)

>>
>>>
>>> For example, in case operations like ldiv are needed on 128 bit
>>> integers, then IF the implementation supports 128 bit intmax_t then
>>> imaxdiv can be a better choice rather than implementing your own routine
>>> - note that /IF/ is the keyword here, that should map directly to
>>> appropriate #if directives.
>>>
>>
>> The situation we have now is that on a compiler like gcc you can get
>> 128-bit division using __int128, but /not/ using intmax_t.  It is a
>> useless type.
>>
>
> The way I see it this is a problem with gcc, not with the standard.
> Unless the committee managed to produce some wording that is too
> problematic for __int128 to fit as extended integer type.

Keith has given some replies here.

There is nothing in the standards that would have prevented gcc from making
__int128 an extended integer type when C99 was introduced. The
problem is that the definition of intmax_t makes it extremely difficult
to /change/ the type, and therefore to /introduce/ a new larger extended
integer type at a later date. Once the ABI for a platform has been
decided, intmax_t is fixed and no larger integer types can be introduced
without change and disruption that is well out of proportion for the gains.

>
> Note that I am not talking about plain integer division (the '/'
> operator) I am talking about 128 bit ldiv.
> Now we have div, ldiv and lldiv too (supposedly for 64 bit), but instead
> of going on with llldiv, and then llllllllldiv, they decided to stop
> with imaxdiv. It makes sense.
>

Perhaps I am missing something. What do the div functions give you that
the division operators do not (assuming an optimising compiler) ?

Manfred

unread,
Jun 2, 2021, 12:43:29 PM6/2/21
to
On 6/2/2021 9:29 AM, David Brown wrote:
> On 01/06/2021 20:21, Manfred wrote:
>> On 6/1/2021 7:15 PM, David Brown wrote:
[...]
>>>
>>> The situation we have now is that on a compiler like gcc you can get
>>> 128-bit division using __int128, but /not/ using intmax_t.  It is a
>>> useless type.
>>>
>>
>> The way I see it this is a problem with gcc, not with the standard.
>> Unless the committee managed to produce some wording that is too
>> problematic for __int128 to fit as extended integer type.
>
> Keith has given some replies here.
>
> There is nothing in the standards that would have prevented gcc from making
> __int128 an extended integer type when C99 was introduced. The
> problem is that the definition of intmax_t makes it extremely difficult
> to /change/ the type, and therefore to /introduce/ a new larger extended
> integer type at a later date. Once the ABI for a platform has been
> decided, intmax_t is fixed and no larger integer types can be introduced
> without change and disruption that is well out of proportion for the gains.
>

Technically, this is still a problem of the implementation, not of the
standard. Granted, implementations and the standard have a long history
of going along together, but still they are different things and they
work at different levels.
I believe you and Keith (his link is indeed instructive) when you say
that there are ABI problems with intmax_t, but I am not convinced that
they are absolutely objective - I rather suspect there is some weight from
the legacy of ABI definitions as they have been structured for decades.
After all, in C passing arguments of varying type is not a new issue -
structs have been part of the ABI since the beginning of time.

It seems more likely to me that the drive to solve this issue is not
strong enough because of the limited range of cases where this is really
needed. As I wrote earlier the real need is probably somewhat for a
niche area.

>>
>> Note that I am not talking about plain integer division (the '/'
>> operator) I am talking about 128 bit ldiv.
>> Now we have div, ldiv and lldiv too (supposedly for 64 bit), but instead
>> of going on with llldiv, and then llllllllldiv, they decided to stop
>> with imaxdiv. It makes sense.
>>
>
> Perhaps I am missing something. What do the div functions give you that
> the division operators do not (assuming an optimising compiler) ?
>

I assume you mean the division /and/ remainder operators.
Obviously the div functions give both formally in one operation, taking
advantage of the ASM instructions that do that.
I know that most optimizing compilers are able to combine a sequence of
'/' and '%' into a single instruction, but this is relying on
optimization, and thus not standardized.
I know we are probably going to disagree on this point, but to me it is
relevant that some feature, if it is important to the program, be
possible to express in source code with no need to assume some
behind-the-scenes compiler behavior.

More importantly, in this last point I took the div functions as one
example, in fact there is a whole family of those, ranging from abs to
strtol, and even printf that are involved with intmax_t.

David Brown

unread,
Jun 2, 2021, 5:32:54 PM6/2/21
to
Legacy and ABI definitions are definitely the issues here, and
technically these are part of the implementation, rather than the
standard. The way the standard defines intmax_t makes it very
impractical (but not impossible) for implementations to provide larger
integer types. I see that as a problem or limitation in the standard,
rather than an implementation issue.

> After all, in C passing arguments of varying type is not a new issue -
> structs have been part of the ABI since the beginning of time.
>

Yes, but structs (and arrays) are defined in terms of existing scalar
types. ABIs generally do not specify how to pass larger integer types
- it is not covered by the specification for a struct composed of two
smaller types.

> It seems more likely to me that the drive to solve this issue is not
> strong enough because of the limited range of cases where this is really
> needed. As I wrote earlier the real need is probably somewhat for a
> niche area.
>

That seems reasonable.

>>>
>>> Note that I am not talking about plain integer division (the '/'
>>> operator) I am talking about 128 bit ldiv.
>>> Now we have div, ldiv and lldiv too (supposedly for 64 bit), but instead
>>> of going on with llldiv, and then llllllllldiv, they decided to stop
>>> with imaxdiv. It makes sense.
>>>
>>
>> Perhaps I am missing something.  What do the div functions give you that
>> the division operators do not (assuming an optimising compiler) ?
>>
>
> I assume you mean the division /and/ remainder operators.

Yes.

> Obviously the div functions give both formally in one operation, taking
> advantage of the ASM instructions that do that.

A compiler will usually do that too, given "x = a / b; y = a % b;". On
many processors, a single division instruction produces both results and
compilers will take advantage of that.

When I did a few tests on <https://godbolt.org>, compilers generated
calls to library "div" functions when these were given in the source
code, and direct cpu division instructions for the division and
remainder operators. I must admit it surprised me a little - I'd have
thought the "div" functions would be handled as builtins. But they are
not on the list of gcc "Other builtins" ("abs" is, as are a great many
other standard library functions). I guess the developers simply
haven't bothered - perhaps because the "div" functions are rarely used.
Certainly the operators give simpler and clearer source code, and
significantly smaller and faster object code in practice.

> I know that most optimizing compilers are able to combine a sequence of
> '/' and '%' into a single instruction, but this is relying on
> optimization, and thus not standardized.

The same could be said about calling "div" - the standard does not give
any indication that it is implemented in any particularly efficient way.
Most likely, it is done by :

div_t div(int a, int b) {
    div_t d;
    d.quot = a / b;
    d.rem = a % b;
    return d;
}


> I know we are probably going to disagree on this point, but to me it is
> relevant that some feature, if it is important to the program, be
> possible to express in source code with no need to assume some
> behind-the-scenes compiler behavior.
>

I don't disagree on that principle at all. But I /do/ disagree about
any assumptions you make about how "div" is implemented, and that it has
any required behaviour or guarantees that you don't get from the operators.

> More importantly, in this last point I took the div functions as one
> example, in fact there is a whole family of those, ranging from abs to
> strtol, and even printf that are involved with intmax_t.

The "abs" function family does not need an "intmax_t" specific version -
it could be handled by the <tgmath.h> "abs" generic macro. (That's for
C - for C++, you'd prefer a template.)

Some of the other functions taking or returning an "intmax_t" would add
complications, yes. That's why "intmax_t" would need to be deprecated
rather than just dropped.


Manfred

unread,
Jun 2, 2021, 6:50:17 PM6/2/21
to
On 6/2/2021 11:32 PM, David Brown wrote:
> On 02/06/2021 18:43, Manfred wrote:
>> On 6/2/2021 9:29 AM, David Brown wrote:
>>> On 01/06/2021 20:21, Manfred wrote:
>>>> On 6/1/2021 7:15 PM, David Brown wrote:
>> [...]
>>>>
>>>> Note that I am not talking about plain integer division (the '/'
>>>> operator) I am talking about 128 bit ldiv.
>>>> Now we have div, ldiv and lldiv too (supposedly for 64 bit), but instead
>>>> of going on with llldiv, and then llllllllldiv, they decided to stop
>>>> with imaxdiv. It makes sense.
>>>>
>>>
>>> Perhaps I am missing something.  What do the div functions give you that
>>> the division operators do not (assuming an optimising compiler) ?
>>>
>>
>> I assume you mean the division /and/ remainder operators.
>
> Yes.
>
>> Obviously the div functions give both formally in one operation, taking
>> advantage of the ASM instructions that do that.
>
> A compiler will usually do that too, given "x = a / b; y = a % b;". On
> many processors, a single division instruction produces both results and
> compilers will take advantage of that.
>
> When I did a few tests on <https://godbolt.org>, compilers generated
> calls to library "div" functions when these were given in the source
> code, and direct cpu division instructions for the division and
> remainder operators. I must admit it surprised me a little - I'd have
> thought the "div" functions would be handled as builtins. But they are
> not on the list of gcc "Other builtins" ("abs" is, as are a great many
> other standard library functions). I guess the developers simply
> haven't bothered - perhaps because the "div" functions are rarely used.

Interesting, that's surprising.

> Certainly the operators give simpler and clearer source code, and
> significantly smaller and faster object code in practice.
>

'Certainly' smaller and faster because you tested it. As per their
definition, there is no reason for which div should perform worse than
'/' and '%'.
In fact, the only motivation for the *div functions to exist is that
they perform better than, or at least as well as, the pair '/' and '%'.
To me it sounds like a matter of QoI.

>> I know that most optimizing compilers are able to combine a sequence of
>> '/' and '%' into a single instruction, but this is relying on
>> optimization, and thus not standardized.
>
> The same could be said about calling "div" - the standard does not give
> any indication that it is implemented in any particularly efficient way.
> Most likely, it is done by :
>
> div_t div(int a, int b) {
>     div_t d;
>     d.quot = a / b;
>     d.rem = a % b;
>     return d;
> }
>
>
>> I know we are probably going to disagree on this point, but to me it is
>> relevant that some feature, if it is important to the program, be
>> possible to express in source code with no need to assume some
>> behind-the-scenes compiler behavior.
>>
>
> I don't disagree on that principle at all. But I /do/ disagree about
> any assumptions you make about how "div" is implemented, and that it has
> any required behaviour or guarantees that you don't get from the operators.
>

Well, it's the /definition/ of "div" that it calculates the quotient and
the remainder in one go, not an assumption.
In this specific case, the same applies to performance: as I said it is
the only reason for it to exist.
If the implementation is sloppy then it's good to know, but it's also a
different matter.

David Brown

unread,
Jun 3, 2021, 3:08:20 AM6/3/21
to
Yes. While I don't expect the "div" functions to be much used, it seems
to me it should be a relatively easy optimisation.

(MSVC manages it, gcc, clang and icc do not.)

>>   Certainly the operators give simpler and clearer source code, and
>> significantly smaller and faster object code in practice.
>>
>
> 'Certainly' smaller and faster because you tested it.

The cleaner and simpler source code is the "certainly" part. I am sure
there are some older or more limited compilers for targets without
hardware division and for which calling the "div" function is more
efficient. (Testing gcc on targets like the AVR that don't have
division assembly instructions, basically the same code was generated
for "div" and /, % .)

> As per their
> definition, there is no reason for which div should perform worse than
> '/' and '%'.

The function call overhead here is going to dominate the cost -
shuffling around data into the right registers, calling the function in
a library (imagine if it is in a DLL/so), instruction cache misses,
stacking and restoring other data according to ABI volatile and
preserved register usage, reduced scope for optimisation with constant
propagation, inlining, pre-calculating results, etc. There are many
reasons why it should be worse.

> In fact, the only motivation for the *div functions to exist is that
> they perform better than, or at least as well as, the pair '/' and '%'.
> To me it sounds like a matter of QoI.
>

I am sure that in the early days of C, the div functions would - on some
systems at least - have performed better than the division operators
together. But not now - and not for a long time, on most targets.

>>> I know that most optimizing compilers are able to combine a sequence of
>>> '/' and '%' into a single instruction, but this is relying on
>>> optimization, and thus not standardized.
>>
>> The same could be said about calling "div" - the standard does not give
>> any indication that it is implemented in any particularly efficient way.
>>   Most likely, it is done by :
>>
>> div_t div(int a, int b) {
>>     div_t d;
>>     d.quot = a / b;
>>     d.rem = a % b;
>>     return d;
>> }
>>
>>
>>> I know we are probably going to disagree on this point, but to me it is
>>> relevant that some feature, if it is important to the program, be
>>> possible to express in source code with no need to assume some
>>> behind-the-scenes compiler behavior.
>>>
>>
>> I don't disagree on that principle at all.  But I /do/ disagree about
>> any assumptions you make about how "div" is implemented, and that it has
>> any required behaviour or guarantees that you don't get from the
>> operators.
>>
>
> Well, it's the /definition/ of "div" that it calculates the quotient and
> the remainder in one go, not an assumption.

The wording is "in a single operation". Since that concept is not
explicitly defined in the standard (AFAIK), and since there is no way
that the standard can insist that a particular implementation does the
operation as a single instruction (not all processors have division
instructions of any sort), that part of the description simply says you
get both results from one function call.

> In this specific case, the same applies to performance: as I said it is
> the only reason for it to exist.
> If the implementation is sloppy then it's good to know, but it's also a
> different matter.

On most modern processors, no implementation could possibly have a
library call here that is faster than doing the operations using / and %,
which compile to a single division instruction. It's not being sloppy - it is
impossible. You'd have to go out of your way to make an intentionally
poor quality compiler for a call to "div" to be faster than using the
operators. (Failing to replace the "div" call with inline code is a
missed optimisation opportunity, and therefore QoI.)

David Brown

unread,
Jun 3, 2021, 4:53:55 AM6/3/21
to
On 03/06/2021 09:08, David Brown wrote:
> On 03/06/2021 00:49, Manfred wrote:
>> On 6/2/2021 11:32 PM, David Brown wrote:
>>> On 02/06/2021 18:43, Manfred wrote:

>>>
>>>> Obviously the div functions give both formally in one operation, taking
>>>> advantage of the ASM instructions that do that.
>>>
>>> A compiler will usually do that too, given "x = a / b; y = a % b;".  On
>>> many processors, a single division instruction produces both results and
>>> compilers will take advantage of that.
>>>
>>> When I did a few tests on <https://godbolt.org>, compilers generated
>>> calls to library "div" functions when these were given in the source
>>> code, and direct cpu division instructions for the division and
>>> remainder operators.  I must admit it surprised me a little - I'd have
>>> thought the "div" functions would be handled as builtins.  But they are
>>> not on the list of gcc "Other builtins" ("abs" is, as are a great many
>>> other standard library functions).  I guess the developers simply
>>> haven't bothered - perhaps because the "div" functions are rarely used.
>>
>> Interesting, that's surprising.
>>
>
> Yes. While I don't expect the "div" functions to be much used, it seems
> to me it should be a relatively easy optimisation.
>

I reported the lack of "div" builtins in the gcc bugzilla, and it was
marked as a duplicate for an existing one that gave a good explanation
for why "div" is awkward for optimising. The problem is that the layout
of the "div_t" struct is not specified in the standard, and gcc can be
used with different standard libraries that might have "quot" and "rem"
in different orders. Thus an optimisation here would depend on the
source code having included <stdlib.h> (and the compiler knowing the
contents of it), or that the compiler can prove that the layout of the
struct doesn't matter. These would both require significant new
optimisation infrastructure in the compiler. So maybe it will happen
one day, but not yet - probably not until the gcc developers have a more
important use for similar infrastructure.
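
For illustration, both of these layouts conform to the standard (the member
names are mandated, their order is not; the type names here are made up):

/* two hypothetical, equally conforming layouts for div_t */
typedef struct { int quot; int rem; } div_t_layout_a;
typedef struct { int rem; int quot; } div_t_layout_b;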

For MSVC, the same supplier makes both the compiler and the library, and
therefore the compiler knows the structure of div_t and can optimise
appropriately.

Manfred

unread,
Jun 3, 2021, 10:20:37 AM6/3/21
to
On 6/3/2021 9:08 AM, David Brown wrote:
> On 03/06/2021 00:49, Manfred wrote:
>> On 6/2/2021 11:32 PM, David Brown wrote:
>>> On 02/06/2021 18:43, Manfred wrote:
>>>> On 6/2/2021 9:29 AM, David Brown wrote:
>>>>> On 01/06/2021 20:21, Manfred wrote:
>>>>>> On 6/1/2021 7:15 PM, David Brown wrote:
>>>> [...]

>> As per their
>> definition, there is no reason for which div should perform worse than
>> '/' and '%'.
>
> The function call overhead here is going to dominate the cost -
> shuffling around data into the right registers, calling the function in
> a library (imagine if it is in a DLL/so), instruction cache misses,
> stacking and restoring other data according to ABI volatile and
> preserved register usage, reduced scope for optimisation with constant
> propagation, inlining, pre-calculating results, etc. There are many
> reasons why it should be worse.
>

Yes, but having "div" as intrinsics is part of picture here.
Yes, but still the only reason for div to exist is to provide /some/
benefit over '/' and '%', which in the end is only a matter of
efficiency, so an implementation that manages to deliver a 'div' family
of functions that performs worse than the pair of operators still
qualifies as sloppy, at least in my book - this includes missing it as
inline or intrinsics, because, as you say, it makes any chance of
efficiency hopeless.

> I reported the lack of "div" builtins in the gcc bugzilla, and it was
> marked as a duplicate for an existing one that gave a good explanation
> for why "div" is awkward for optimising. The problem is that the layout
> of the "div_t" struct is not specified in the standard, and gcc can be
> used with different standard libraries that might have "quot" and "rem"
> in different orders. Thus an optimisation here would depend on the
> source code having included <stdlib.h> (and the compiler knowing the
> contents of it), or that the compiler can prove that the layout of the
> struct doesn't matter. These would both require significant new
> optimisation infrastructure in the compiler. So maybe it will happen
> one day, but not yet - probably not until the gcc developers have a more
> important use for similar infrastructure.
>
> For MSVC, the same supplier makes both the compiler and the library, and
> therefore the compiler knows the structure of div_t and can optimise
> appropriately.

Your other post (quoted above) reports a reasonable explanation of the
complications involved for gcc - I'd say that from the user's
perspective an implementation consists of the combination compiler +
library, so it simply means that the implementation is suboptimal in
this specific case.

Again, most probably the gcc folks and friends simply didn't bother too
much because this topic is low priority.

Manfred

unread,
Jun 3, 2021, 10:44:03 AM6/3/21
to
On 6/2/2021 11:32 PM, David Brown wrote:
As far as I know ABIs do specify how to pass structs (arrays in C are a
bit different in that they are passed by reference, unlike all other types).

The point is that in C there is some provision to pass arguments of
arbitrary size, both at the level of the standard and of actual
implementations.
The idea of having a scalar type of arbitrary size is problematic, but
it is also an opportunity, in my opinion.
In fact one of the few architectural changes in the last couple of
decades is the increase of word size. So it's not unreasonable that,
looking at the longer term, the committee wanted to introduce some support
for 'large' scalars - integers being the obvious choice.
We have seen the demand for such types grow from 8 to 128 bits,
processor words grow from 8 to 64 bits, and given the increasing demand
for e.g. cryptography, with the vital role of secure communications
already today, it makes some sense to envision that these numbers could
keep growing sooner or later.

James Kuyper

unread,
Jun 3, 2021, 11:20:38 AM6/3/21
to
On 6/3/21 3:08 AM, David Brown wrote:
> On 03/06/2021 00:49, Manfred wrote:
...
>> As per their
>> definition, there is no reason for which div should perform worse than
>> '/' and '%'.
>
> The function call overhead here is going to dominate the cost -
> shuffling around data into the right registers, calling the function in
> a library (imagine if it is in a DLL/so), instruction cache misses,
> stacking and restoring other data according to ABI volatile and
> preserved register usage, reduced scope for optimisation with constant
> propagation, inlining, pre-calculating results, etc. There are many
> reasons why it should be worse.

I wouldn't expect function call overhead to be relevant unless div() is
used in a context (usually involving function pointers) that prevents
div() from being inlined. When it is inlined, I would expect a call to
div() to be optimized to essentially the same code as would be generated
for separate / and % expressions.
Note: reality often fails to live up (or, in some cases, down) to my
expectations.

David Brown

unread,
Jun 3, 2021, 1:04:32 PM6/3/21
to
If it were inlined, I would expect it to be optimal in speed and size.
But standard library functions are often not inlined unless they are
completely replaced by built-ins that have the same semantics. I'm not
sure if the standard library functions can be declared as "inline" and
defined in headers like <stdlib.h> - it would be interesting to know.
But AFAIUI, glibc - for whatever reason - doesn't like to have inline
definitions in its standard C library headers.


Scott Lurndal

unread,
Jun 3, 2021, 6:24:23 PM6/3/21
to
As of GCC 4.8, 'div' wasn't ever inlined, even with -O3.

However, the compiler optimized this into a single idiv:

#include <stdlib.h>

int main(int argc, const char **argv)
{
    //div_t qr = div(1234235235, 10);
    long q, r;
    long a, b;
    a = strtol(argv[1], NULL, 0);
    b = strtol(argv[2], NULL, 0);

    q = a/b;
    r = a%b;

    return q*8 + r;
}

0000000000400440 <main>:
400440: 55 push %rbp
400441: 31 d2 xor %edx,%edx
400443: 48 89 f5 mov %rsi,%rbp
400446: 53 push %rbx
400447: 48 83 ec 08 sub $0x8,%rsp
40044b: 48 8b 7e 08 mov 0x8(%rsi),%rdi
40044f: 31 f6 xor %esi,%esi
400451: e8 da ff ff ff callq 400430 <strtoul@plt>
400456: 48 8b 7d 10 mov 0x10(%rbp),%rdi
40045a: 48 89 c3 mov %rax,%rbx
40045d: 31 d2 xor %edx,%edx
40045f: 31 f6 xor %esi,%esi
400461: e8 ca ff ff ff callq 400430 <strtoul@plt>
400466: 48 89 c1 mov %rax,%rcx
400469: 48 89 d8 mov %rbx,%rax
40046c: 48 83 c4 08 add $0x8,%rsp
400470: 48 99 cqto
400472: 48 f7 f9 idiv %rcx
400475: 5b pop %rbx
400476: 5d pop %rbp
400477: 8d 04 c2 lea (%rdx,%rax,8),%eax
40047a: c3 retq
40047b: 90 nop


Öö Tiib

unread,
Jun 3, 2021, 7:27:42 PM6/3/21
to
Maybe with -flto it does?


Scott Lurndal

unread,
Jun 3, 2021, 8:36:18 PM6/3/21
to
Perhaps, but gcc generally documents all the functions that will
be inlined in their texinfo documentation, and div is not included
in the list of math functions that automatically get builtin status,
at least in GCC 4.8. Since they're up to GCC 11 now, they may have
added it to the list.

However, if the compiler recognizes cases where both the quotient
and remainder are used and generates a single divide instruction,
there doesn't seem to be much need for div at all.