
int8_t and char


Ralf Goertz

Sep 18, 2018, 6:56:37 AM
Hi,

when I just tried to use uint8_t instead of int in a program (because of
memory issues), I ran into the following problem:

#include <cstdint>
#include <iostream>

int main() {
    uint8_t i;
    std::cin >> i;
    int j = i;
    std::cout << j << "\n";
}

Running this program and entering 4 results in the output of 52. I am
aware that this is because uint8_t is typedef'ed to unsigned char so my
input is actually the character '4' which has ASCII code 52. However, I
had the impression that those [u]intX_t types were there so that I can
do math with them, not to deal with characters.

So what is the canonical way to input/output integral types of varying
size (in my actual program I use templates)?

David Brown

Sep 18, 2018, 7:08:34 AM
For local variables, use "int" or "unsigned int" unless you need
specific sizes. You won't save space by using uint8_t or int8_t in
locals (unless you are working on an 8-bit microcontroller - in which
case cin and cout are your problem).

You are right to use int8_t and uint8_t rather than "char", "signed
char" or "unsigned char" for arithmetic - but that is because they are
more appropriate names for the task, not because they are actually
different types.

Ralf Goertz

Sep 18, 2018, 8:19:17 AM
On Tue, 18 Sep 2018 13:08:22 +0200,
David Brown <david...@hesbynett.no> wrote:

> On 18/09/18 12:56, Ralf Goertz wrote:
> > Hi,
> >
> > when I just tried to use uint8_t instead of int in a program
> > (because of memory issues), I ran into the following problem:
> >
> > #include <iostream>
> >
> > int main() {
> > uint8_t i;
> > std::cin>>i;
> > int j=i;
> > std::cout<<j<<"\n";
> > }
> >
> > Running this program and entering 4 results in the output of 52. I
> > am aware that this is because uint8_t is typedef'ed to unsigned char
> > so my input is actually the character '4' which has ASCII code 52.
> > However, I had the impression that those [u]intX_t types were there
> > so that I can do math with them, not to deal with characters.
> >
> > So what is the canonical way to input/output integral types of
> > varying size (in my actual program I use templates)?
> >
>
> For local variables, use "int" or "unsigned int" unless you need
> specific sizes. You won't save space by using uint8_t or int8_t in
> locals (unless you are working on an 8-bit microcontroller - in which
> case cin and cout are your problem).

I don't care about single variables. I deal with
set<vector<my_int_type>> where the values within the vector are usually
rather small. But they don't always fit in a byte. That's why I need
flexibility. And the values are to be read in from cin or similar.

> You are right to use int8_t and uint8_t rather than "char", "signed
> char" or "unsigned char" for arithmetic - but that is because they are
> more appropriate names for the task, not because they are actually
> different types.

I find it disturbing that these types behave differently when used with
cin/cout depending on the number in the type's name although the values
read in are within the range common to all of them. I'm curious, AFAIK
there are systems where char is more than 8 bit. So int8_t can't be
typedef'ed to char. What would my program print on those systems?


Sam

Sep 18, 2018, 8:36:13 AM
Ralf Goertz writes:

> Running this program and entering 4 results in the output of 52. I am
> aware that this is because uint8_t is typedef'ed to unsigned char so my
> input is actually the character '4' which has ASCII code 52. However, I
> had the impression that those [u]intX_t types were there so that I can
> do math with them, not to deal with characters.

Well, I have some good news for you. You can do all the math you want with
them. You can add them, subtract them, multiply and divide them, and do
anything with them that you can also do with their bigger cousins.

And you don't even have to formally use int8_t.

char two=2;
char four=two+two;

> So what is the canonical way to input/output integral types of varying
> size (in my actual program I use templates)?

That's the only thing. Formatted input and output operators interpret
anything declared as a char to be a character or a character string, and
treat it as such.

There are many ways to deal with this. There is no "canonical" way.

David Brown

Sep 18, 2018, 8:53:22 AM
Then read the values in as "int", check them properly, and store them in
your vector of appropriate type.
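
For example, a minimal sketch of that approach (the 0..255 range check
is illustrative):

#include <cstdint>
#include <iostream>

int main() {
    int tmp;
    std::uint8_t value = 0;
    if (std::cin >> tmp && tmp >= 0 && tmp <= 255)
        value = static_cast<std::uint8_t>(tmp); // safe to narrow after the check
    std::cout << static_cast<int>(value) << "\n"; // entering 4 prints 4, not 52
}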

>> You are right to use int8_t and uint8_t rather than "char", "signed
>> char" or "unsigned char" for arithmetic - but that is because they are
>> more appropriate names for the task, not because they are actually
>> different types.
>
> I find it disturbing that these types behave differently when used with
> cin/cout depending on the number in the type's name although the values
> read in are within the range common to all of them. I'm curious, AFAIK
> there are systems where char is more than 8 bit. So int8_t can't be
> typedef'ed to char. What would my program print in those systems?
>
>

On systems with char greater than 8 bit, there are no types int8_t and
uint8_t. C and C++ do not support any types (except bitfields, but they
are a bit odd) smaller than "char".

Ralf Goertz

Sep 18, 2018, 9:14:07 AM
On Tue, 18 Sep 2018 08:36:05 -0400,
Sam <s...@email-scan.com> wrote:

> Ralf Goertz writes:
>
> > Running this program and entering 4 results in the output of 52. I
> > am aware that this is because uint8_t is typedef'ed to unsigned char
> > so my input is actually the character '4' which has ASCII code 52.
> > However, I had the impression that those [u]intX_t types were there
> > so that I can do math with them, not to deal with characters.
>
> Well, I have some good news for you. You can do all the math you want
> with them. You can add them, subtract them, multiply and divide them,
> and do anything with them that you can also do with their bigger
> cousins.
>
> And you don't even have to formally use int8_t.
>
> char two=2;
> char four=two+two;

Why do I deserve your condescending tone? What made you think I wouldn't
know that?

> > So what is the canonical way to input/output integral types of
> > varying size (in my actual program I use templates)?
>
> That's the only thing. Formatted input and output operators
> interpret anything declared as a char to be a character or a
> character string, and treat it as such.

I also know that. What I didn't know (but figured out before asking my
question) was that uint8_t and unsigned char are actually the same type
(on my platform). This fact renders this type useless for my purpose.
Therefore, the question arose, how can I deal with integral types of
different sizes in a template-friendly way when they are also to be read
in using formatted input.

> There are many ways to deal with this. There is no "canonical" way.

And that's what is surprising to me. I understand that it is nice to
have typenames that explicitly tell you their size. But why stop there?
Why can't those types be arithmetic types which behave like their
"bigger cousins"? If you need character types you can always use
(unsigned) char. There is no point in using [u]int8_t for that. But why
do I need to bend over backwards to deal with the special case of 8 bit?

David Brown

Sep 18, 2018, 9:47:42 AM
They are likely to be the same type on any platform for which uint8_t
exists. In theory, uint8_t and int8_t could be "extended integer types"
- in practice you won't find that.

This is an oddity that C++ inherited from C, and unfortunately you are
stuck with it.

> This fact renders this type useless for my purpose.
> Therefore, the question arose, how can I deal with integral types of
> different sizes in a template-friendly way when they are also to be read
> in using formatted input.

As I said, use "int" for your input and check it - then assign it to
int8_t (or int16_t, or whatever).

Other than that, you can make your own types and use them.

>
>> There are many ways to deal with this. There is no "canonical" way.
>
> And that's what is surprising to me. I understand that it is nice to
> have typenames that explicitly tell you their size. But why stop there?
> Why can't those types be arithmetic types which behave like their
> "bigger cousins"? If you need character types you can always use
> (unsigned) char. There is no point in using [u]int8_t for that. But why
> do I need to bend over backwards to deal with the special case of 8 bit?
>

int16_t and uint16_t do not behave like their larger cousins either,
assuming 32-bit int. They are closer than int8_t and uint8_t, but not
identical.

Remember, these are /not/ new types - they are typedefs, or alternative
names for existing types. "uint8_t" is /exactly/ the same as "unsigned
char". "int16_t" is (on most implementations) /exactly/ the same as
"signed short".

You could happily make some classes that really are new types, and
really act like pure integers of different sizes - but you'd have to
make them.


Bo Persson

Sep 18, 2018, 10:04:25 AM
On 2018-09-18 15:13, Ralf Goertz wrote:
>
>>> So what is the canonical way to input/output integral types of
>>> varying size (in my actual program I use templates)?
>>
>> That's the only thing. Formatted input and output operators
>> interpret anything declared as a char to be a character or a
>> character string, and treat it as such.
>
> I also know that. What I didn't know (but figured out before asking my
> question) was that uint8_t and unsigned char are actually the same type
> (on my platform). This fact renders this type useless for my purpose.
> Therefore, the question arose, how can I deal with integral types of
> different sizes in a template-friendly way when they are also to be read
> in using formatted input.
>
>> There are many ways to deal with this. There is no "canonical" way.
>
> And that's what is surprising to me. I understand that it is nice to
> have typenames that explicitly tell you their size. But why stop there?
> Why can't those types be arithmetic types which behave like their
> "bigger cousins"? If you need character types you can always use
> (unsigned) char. There is no point in using [u]int8_t for that. But why
> do I need to bend over backwards to deal with the special case of 8 bit?
>

It's not really the types, but the streams that have special overloads
for char types.

If you output an int* you will get the address stored in the pointer,
but if you output a char* it will display the string pointed to. The
designers thought that would be the most common use case.

Similarly the 8-bit types are displayed as the corresponding character
and not the ASCII-value stored in the variable.

If you want a value x displayed as an integer, you can always do

cout << (int)x;
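
A related idiom promotes x with unary plus, which is terser than the
cast:

cout << +x;  // +x promotes x to int, so it prints numerically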


Bo Persson

Ralf Goertz

Sep 18, 2018, 10:34:57 AM
On Tue, 18 Sep 2018 15:47:29 +0200,
David Brown <david...@hesbynett.no> wrote:

> On 18/09/18 15:13, Ralf Goertz wrote:
> > What I didn't know (but figured out before asking my question) was
> > that uint8_t and unsigned char are actually the same type (on my
> > platform).
>
> They are likely to be the same type on any platform for which uint8_t
> exists. In theory, uint8_t and int8_t could be "extended integer
> types"
> - in practice you won't find that.

Okay, that's interesting. The draft of the standard (2015) and the
second column in <https://en.cppreference.com/w/cpp/types/integer>
always state "typedef". So they must be equal to an existing type like
char, must they not? But how can they then be extended?

> This is an oddity that C++ inherited from C, and unfortunately you are
> stuck with it.

But the C-types could also have been made extended, right?

> int16_t and uint16_t do not behave like their larger cousins either,
> assuming 32-bit int. They are closer than int8_t and uint8_t, but not
> identical.

In what way (other than range of course)?

> Remember, these are /not/ new types - they are typedefs, or
> alternative names for existing types. "uint8_t" is /exactly/ the
> same as "unsigned char". "int16_t" is (on most
> implementations) /exactly/ the same as "signed short".
>
> You could happily make some classes that really are new types, and
> really act like pure integers of different sizes - but you'd have to
> make them.

Yes. But I was under the impression that the C++11 standard had done
that for me. And now I wonder why it hadn't. Much ado about (almost)
nothing imho.

Sam

Sep 18, 2018, 10:57:20 AM
Ralf Goertz writes:

> > There are many ways to deal with this. There is no "canonical" way.
>
> And that's what is surprising to me. I understand that it is nice to
> have typenames that explicitly tell you their size. But why stop there?
> Why can't those types be arithmetic types which behave like their
> "bigger cousins"?

But they are. They, themselves, behave identically to their bigger cousins.
`<<` and `>>` were not arithmetic operations in that particular case. They
were overloaded operators defined explicitly for some particular class.

If you were to use the `<<` and `>>` operators directly on chars, they would
act exactly the same as they do for ints and longs (subject to their actual
bit size, of course).

They are integer types.
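
For instance, used with the built-in operators directly:

#include <iostream>

int main() {
    char c = 4;
    std::cout << (c << 2) << "\n"; // built-in shift: c promotes to int, prints 16
    std::cout << c + 0 << "\n";    // ordinary arithmetic: prints 4
}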

> If you need character types you can always use
> (unsigned) char.

There's no such thing as a "character type" in C++, as the term is used
here. There happens to be a type called "char", but it's an integer type. It
supports all the same operations as other integer types.

Furthermore, C++, like C, decided to support the concept of "characters" by
using a small (8-bit) integer type to represent a character value, and to
underscore the point, that integer type was named "char".

But it's still an integer type.

> There is no point in using [u]int8_t for that. But why
> do I need to bend over backwards to deal with the special case of 8 bit?

There are very few things that C++ hands over on a silver platter. Very
often things require quite a bit of work. I still remember how things used
to be before std::string came along.

And you have to do it because that's how C++ works.

Furthermore, both POSIX (int8_t) and the C++ standard (std::int8_t) require
it to be a typedef to an integer type. As such, no C++ compiler has a choice
in this matter. int8_t has to be a typedef to a natural integer type, and
C/C++ also defines the concept of characters as being represented by (8-bit)
integer types. It is what it is. That's how the world spins around, and
nothing can be done to change that, so I can only think of two possible
options here. The first one is, of course, to continue complaining about it
(if I was inclined to complain about it, for some reason, but I'm not). My
second option would be to spend a few minutes to code a wrapper type that
overloads arithmetic operations, including the appropriate overloads for
stream objects, which a modern C++ compiler will simply optimize down to
nothing, producing the same code as it would with a native integer type.

I'm wondering which option will be more productive, in general.

james...@alumni.caltech.edu

Sep 18, 2018, 11:51:37 AM
On Tuesday, September 18, 2018 at 10:34:57 AM UTC-4, Ralf Goertz wrote:
> On Tue, 18 Sep 2018 15:47:29 +0200,
> David Brown <david...@hesbynett.no> wrote:
>
> > On 18/09/18 15:13, Ralf Goertz wrote:
> > > What I didn't know (but figured out before asking my question) was
> > > that uint8_t and unsigned char are actually the same type (on my
> > > platform).
> >
> > They are likely to be the same type on any platform for which uint8_t
> > exists. In theory, uint8_t and int8_t could be "extended integer
> > types"
> > - in practice you won't find that.
>
> Okay, that's interesting. The draft of the standard (2015) and the
> second column in <https://en.cppreference.com/w/cpp/types/integer>
> always state "typedef". So they must be equal to an existing type like
> char, must they not? But how can they then be extended?

The standard explicitly allows for extended integer types (3.9.1p2), and
in any case where the standard refers to a category of types that includes
integer types, without specifying "standard", that category includes
extended integer types as well as standard integer types. 18.4.1
describes <cstdint>, and frequently refers to "integer types", but never
restricts its statements to "standard integer types".
In practice, this doesn't come up for the smallest exact-sized types.
18.4.1 requires that they match section 7.18, which presumably should
be section 7.20 of the current C standard.
"The typedef name intN_t designates a signed integer type with width N,
no padding bits, and a two’s complement representation." (7.20.1.1p1). Therefore, if uint8_t and int8_t are supported, CHAR_BIT must be 8.
Therefore, unsigned char, char, and signed char must all be 8 bit types,
too.

"The rank of any standard integer type shall be greater than the rank of
any extended integer type with the same width." (4.14p1) The C standard
says something important, and I can't find a corresponding statement in
the C++ standard, nor any cross-reference to this requirement:
"For any two integer types with the same signedness and different
integer conversion rank (see 6.3.1.1), the range of values of the type
with smaller integer conversion rank is a subrange of the values of the
other type." (C 6.2.5p8).

I believe that this is either a failure on my part to find the relevant
C++ text, or an oversight by the authors of the C++ standard - I'm sure
that the intent was to impose this same requirement in C++.
There isn't any room for int8_t to have a range that's a sub-range of
the range of signed char; int8_t already represents the maximum number
of different values that can be stored in an 8-bit byte, so if int8_t
exists and C 6.2.5p8 applies, it must represent the same range of values
as signed char. In principle, they could use different representations
for the same range of values, but there's not much reason for an
implementation to do that. Similarly, uint8_t, if supported, must
represent the same range of values as unsigned char.

> > This is an oddity that C++ inherited from C, and unfortunately you are
> > stuck with it.
>
> But the C-types could also have been made extended, right?

Yes, they can.

> > int16_t and uint16_t do not behave like their larger cousins either,
> > assuming 32-bit int. They are closer than int8_t and uint8_t, but not
> > identical.
>
> In what way (other than range of course)?

If UINT16_MAX is less than INT_MAX, the integral promotions convert
uint16_t to int, while unsigned int remains an unsigned type (4.5p1).
This can affect the type, and in some cases, the value, of other
expressions containing expressions with that type.
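
A small sketch of a value-affecting case (assuming 16-bit uint16_t and
32-bit int):

#include <cstdint>
#include <iostream>

int main() {
    std::uint16_t a = 0;
    unsigned int b = 0;
    // a - 1 is computed in int after promotion, so it is -1;
    // b - 1 stays unsigned and wraps around to UINT_MAX.
    std::cout << (a - 1 < 0) << "\n"; // prints 1
    std::cout << (b - 1 < 0) << "\n"; // prints 0 (compilers often warn here)
}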

Öö Tiib

Sep 18, 2018, 2:37:38 PM
On Tuesday, 18 September 2018 13:56:37 UTC+3, Ralf Goertz wrote:
> However, I
> had the impression that those [u]intX_t types were there so that I can
> do math with them, not to deal with characters.

That was an incorrect impression. Mathematical operations are not defined
for "short int" and shorter integral types in C++.

Note that operator>> with istream is not a mathematical but a text
input operation. A conforming implementation has to have it
defined like this:

template< class Traits >
basic_istream<char,Traits>& operator>>( basic_istream<char,Traits>& st
, unsigned char& ch );

It is required to behave like you described.

> So what is the canonical way to input/output integral types of varying
> size (in my actual program I use templates)?

The canonical way is to narrow down the choices. For example, use only
int_fast32_t, int_fast64_t, double and custom classes for
text input-output and calculations with numerical values.

The canonical way is to avoid using the likes of int16_t, int8_t or
bit-fields for math or text input-output. These are slower and
the possible overflows are harder to control. Use them as
optimizations of storage size and for binary input-output.
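
For the template case in the original question, one possible sketch
(read_values is an invented name; error handling is reduced to stopping
at the first out-of-range value):

#include <cstdint>
#include <iostream>
#include <limits>
#include <vector>

// Parse through a wide temporary so that T = uint8_t is read as a
// number rather than as a character. Assumes T is no wider than
// long long.
template <typename T>
std::vector<T> read_values(std::istream& in) {
    std::vector<T> out;
    long long tmp;
    while (in >> tmp) {
        if (tmp < std::numeric_limits<T>::min() ||
            tmp > std::numeric_limits<T>::max())
            break;                            // out of range for T
        out.push_back(static_cast<T>(tmp));
    }
    return out;
}

int main() {
    auto v = read_values<std::uint8_t>(std::cin);
    for (auto x : v)
        std::cout << static_cast<int>(x) << "\n"; // numeric output
}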

David Brown

Sep 18, 2018, 4:13:58 PM
On 18/09/18 16:34, Ralf Goertz wrote:
> On Tue, 18 Sep 2018 15:47:29 +0200,
> David Brown <david...@hesbynett.no> wrote:
>
>> On 18/09/18 15:13, Ralf Goertz wrote:
>>> What I didn't know (but figured out before asking my question) was
>>> that uint8_t and unsigned char are actually the same type (on my
>>> platform).
>>
>> They are likely to be the same type on any platform for which uint8_t
>> exists. In theory, uint8_t and int8_t could be "extended integer
>> types"
>> - in practice you won't find that.
>
> Okay, that's interesting. The draft of the standard (2015) and the
> second column in <https://en.cppreference.com/w/cpp/types/integer>
> always state "typedef". So they must be equal to an existing type like
> char, must they not? But how can they then be extended?

They can be a typedef to an extended integer type. C compilers are
allowed to have additional integer types beyond the main ones (char,
short, int, long, long long).

On at least one platform I know (gcc for the 8-bit AVR), these types are
typedef'ed to "int" with gcc extensions giving specific sizes. I don't
actually know what difference, if any, this makes.

(James has given a more detailed and complete answer to your questions.)

>
>> This is an oddity that C++ inherited from C, and unfortunately you are
>> stuck with it.
>
> But the C-types could also have been made extended, right?

Yes.


>
>> int16_t and uint16_t do not behave like their larger cousins either,
>> assuming 32-bit int. They are closer than int8_t and uint8_t, but not
>> identical.
>
> In what way (other than range of course)?

The 16-bit types promote to 32-bit "int" before doing calculations.
This can have odd effects. Amongst other things, it means that some
operations could have undefined behaviour because they overflow "int",
even though you are starting off with "uint16_t" types.
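
For instance (a sketch, assuming 32-bit int):

#include <cstdint>

std::uint32_t square(std::uint16_t x) {
    // x promotes to signed int before the multiply; for x = 65535,
    // 65535 * 65535 = 4294836225 exceeds INT_MAX, so this is signed
    // overflow - undefined behaviour despite the unsigned operands.
    return x * x;
}

// Converting one operand first keeps the arithmetic unsigned:
std::uint32_t square_safe(std::uint16_t x) {
    return static_cast<std::uint32_t>(x) * x;
}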

>
>> Remember, these are /not/ new types - they are typedefs, or
>> alternative names for existing types. "uint8_t" is /exactly/ the
>> same as "unsigned char". "int16_t" is (on most
>> implementations) /exactly/ the same as "signed short".
>>
>> You could happily make some classes that really are new types, and
>> really act like pure integers of different sizes - but you'd have to
>> make them.
>
> Yes. But I was under the impression that the C++11 standard had done
> that for me. And now I wonder why it hadn't. Much ado about (almost)
> nothing imho.
>

C++11 has a huge amount in it, and turned C++ into almost a different
language from C++03. But the sized integer types are not any different
from those in C (and I thought they were part of C++03 as well?).

The sized types in <stdint.h> are very, very useful in a range of
circumstances. But you have to understand what they are and how they work.


bitrex

Sep 18, 2018, 11:18:44 PM
On 09/18/2018 07:08 AM, David Brown wrote:
> On 18/09/18 12:56, Ralf Goertz wrote:
>> Hi,
>>
>> when I just tried to use uint8_t instead of int in a program (because of
>> memory issues), I ran into the following problem:
>>
>> #include <iostream>
>>
>> int main() {
>> uint8_t i;
>> std::cin>>i;
>> int j=i;
>> std::cout<<j<<"\n";
>> }
>>
>> Running this program and entering 4 results in the output of 52. I am
>> aware that this is because uint8_t is typedef'ed to unsigned char so my
>> input is actually the character '4' which has ASCII code 52. However, I
>> had the impression that those [u]intX_t types were there so that I can
>> do math with them, not to deal with characters.
>>
>> So what is the canonical way to input/output integral types of varying
>> size (in my actual program I use templates)?
>>
>
> For local variables, use "int" or "unsigned int" unless you need
> specific sizes. You won't save space by using uint8_t or int8_t in
> locals (unless you are working on an 8-bit microcontroller - in which
> case cin and cout are your problem).

Hey man, it's the 21st century! I use output streams on 8 bit. The
"terminal", such as it is, is sometimes a 4-segment alphanumeric LED
display. If you stream more than 4 characters it scrolls them like a
message board.

If you want a different "terminal" device then you inject the
appropriate behavior into the display calls
(program to the interface, not the implementation) via
templates/static polymorphism.

Juha Nieminen

Sep 19, 2018, 2:21:28 AM
Ralf Goertz <m...@myprovider.invalid> wrote:
> However, I
> had the impression that those [u]intX_t types were there so that I can
> do math with them, not to deal with characters.

You can do math with them. There is essentially no difference between the
char types and the other integral types (other than their size in bits).

It's just that std::istream and std::ostream have an overload for char
types specifically, which treats them differently. Whether that's a good
or a bad thing is a matter of opinion (and probably way too late to change),
but it's something limited to them.

If you want to output a char type with std::ostream, you have to do a cast.
With std::istream it becomes a bit more complicated.

Tim Rentsch

Sep 19, 2018, 9:13:49 AM
Sam <s...@email-scan.com> writes:

> Ralf Goertz writes:
>
>> [...] If you need character types you can always use
>> (unsigned) char.
>
> There's no such thing as a "character type" in C++, as the term
> is used here.

C++17 section 6.9.1 includes this sentence in paragraph 1:

Plain char, signed char, and unsigned char are three
distinct types, collectively called /narrow character
types/.

(The /'s indicate italics in the original, signifying a
definition of the italicized term.)

james...@alumni.caltech.edu

Sep 19, 2018, 10:01:15 AM
While your answer is pedantically correct (and I therefore strongly
approve of it), character types are primarily small integer types as far
as the language proper (section 6 of the standard) is concerned; the
main difference from other types is that an array of character type can
be initialized using a string literal. Most of the specifically
"character" semantics are implemented by standard library routines.

Vir Campestris

Sep 23, 2018, 6:09:50 PM
On 18/09/2018 13:36, Sam wrote:
> And you don't even have to formally use int8_t.
>
> char two=2;
> char four=two+two;

char may be int8_t or uint8_t - that's implementation dependent.

IIRC it doesn't even have to be 8 bits - but I don't know any
architecture where it isn't. <fx cue exceptions>

Andy

David Brown

Sep 24, 2018, 1:56:50 AM
On 24/09/18 00:09, Vir Campestris wrote:
> On 18/09/2018 13:36, Sam wrote:
>> And you don't even have to formally use int8_t.
>>
>> char two=2;
>> char four=two+two;
>
> char may be int8_t or uint8_t - that's implementation dependent.
>

The most common implementations of int8_t and uint8_t will be:

typedef signed char int8_t;
typedef unsigned char uint8_t;

In theory, one of int8_t or uint8_t could be a typedef for plain char,
but there is no advantage in it.

Plain char may be signed or unsigned in any given implementation.

> IIRC it doesn't even have to be 8 bits - but I don't know any
> architecture where it isn't. <fx cue exceptions>
>

"char" does not have to be 8-bit. In the modern world, there are some
DSP's that have 16-bit or even 32-bit char. On such implementations,
"uint8_t" and "int8_t" do not exist.


Tim Rentsch

Sep 26, 2018, 12:20:23 PM
james...@alumni.caltech.edu writes:

> On Wednesday, September 19, 2018 at 9:13:49 AM UTC-4, Tim Rentsch wrote:
>
>> Sam <s...@email-scan.com> writes:
>>
>>> Ralf Goertz writes:
>>>
>>>> [...] If you need character types you can always use
>>>> (unsigned) char.
>>>
>>> There's no such thing as a "character type" in C++, as the term
>>> is used here.
>>
>> C++17 section 6.9.1 includes this sentence in paragraph 1:
>>
>> Plain char, signed char, and unsigned char are three
>> distinct types, collectively called /narrow character
>> types/.
>>
>> (The /'s indicate italics in the original, signifying a
>> definition of the italicized term.)
>
> While your answer is pedantically correct

You're charging me with being pedantic? That's humorous.

> (and I therefore strongly approve of it),

This makes me think you don't know what the term means. Look it up.
"Pedantic" is a negative term that implies someone is showing off
book learning or trivia. If you want to characterize yourself
that way, be my guest, but that's not what I'm doing here.

> character types are primarily small integer types as far as the
> language proper (section 6 of the standard) is concerned; the
> main difference from other types is that an array of character
> type can be initialized using a string literal. Most of the
> specifically "character" semantics are implemented by standard
> library routines.

The defined term is used in more than a dozen other places in the
C++ standard. To say there is no such thing as a "character type"
in C++ is simply wrong, despite character types being covered by the
umbrella terms "integral types" or "integer types" in many other
passages of the C++ standard.

james...@alumni.caltech.edu

Sep 26, 2018, 1:53:08 PM
On Wednesday, September 26, 2018 at 12:20:23 PM UTC-4, Tim Rentsch wrote:
> james...@alumni.caltech.edu writes:
>
> > On Wednesday, September 19, 2018 at 9:13:49 AM UTC-4, Tim Rentsch wrote:
> >
> >> Sam <s...@email-scan.com> writes:
> >>
> >>> Ralf Goertz writes:
> >>>
> >>>> [...] If you need character types you can always use
> >>>> (unsigned) char.
> >>>
> >>> There's no such thing as a "character type" in C++, as the term
> >>> is used here.
> >>
> >> C++17 section 6.9.1 includes this sentence in paragraph 1:
> >>
> >> Plain char, signed char, and unsigned char are three
> >> distinct types, collectively called /narrow character
> >> types/.
> >>
> >> (The /'s indicate italics in the original, signifying a
> >> definition of the italicized term.)
> >
> > While your answer is pedantically correct
>
> You're charging me with being pedantic? That's humorous.
>
> > (and I therefore strongly approve of it),
>
> This makes me think you don't know the term means. Look it up.
> "Pedantic" is a negative term that implies someone is showing off
> book learning or trivia.

As used in this newsgroup, "pedantic" is usually an unintentional
compliment, paid by people who aren't careful about the meanings of
their words to those who are.

> If you want to characterize yourself
> that way, be my guest, but that's not what I'm doing here.
>
> > character types are primarily small integer types as far as the
> > language proper (section 6 of the standard) is concerned; the
> > main difference from other types is that an array of character
> > type can be initialized using a string literal. Most of the
> > specifically "character" semantics are implemented by standard
> > library routines.
>
> The defined term is used in more than a dozen other places in the
> C++ standard.

True, but none of what it says in those other places affects the semantics of those data types.

> To say there is no such thing as a "character type" ...

Which I did not.

Ralf Goertz

Sep 28, 2018, 3:48:58 AM
On Wed, 19 Sep 2018 06:13:40 -0700,
Tim Rentsch <t...@alumni.caltech.edu> wrote:
Thanks for pointing that out. It somehow proves my point. There are
character types whose primary task is to serve the purpose of character
and string handling. So operator overloads to facilitate formatted in-
and output are very desirable. However, why can't [u]int8_t be separate
types and not typedef'ed to [un]signed char thereby »inheriting« those
operator overloads? What is gained by these typedefs? If there is no 8
bit char type on a certain platform then there is also no [u]int8_t as
David pointed out. I can't think of any task done with [u]int8_t that
can't also be done as easily without it.

The situation is different for wide characters. The standard says:
»Types char16_t and char32_t denote distinct types with the same size,
signedness, and alignment as uint_least16_t and uint_least32_t,
respectively, in <cstdint>, called the underlying types.« And yet
std::is_same<uint_least32_t,char32_t>::value is false. I know this is
the other way around char32_t being »defined« in terms of
uint_least32_t. But at least here is a clear distinction between a
character type and a corresponding integer type. Why is that not the
case for 8 bit char?

Bo Persson

Sep 28, 2018, 4:33:36 AM
The 8-bit integer types were designed by the C committee. As C doesn't
have operator overloading there was no problem with using typedefs.

C++ just followed the C lead here, for compatibility reasons.

char16_t and char32_t are newer types, and could therefore be done properly.



Bo Persson

Paavo Helde

Sep 28, 2018, 5:46:08 AM
So this means we are lacking the char8_t type. Oh, other people have
discovered this as well:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0482r1.html

Alas, this wouldn't fix the ostream << uint8_t problem.

Ralf Goertz

Sep 28, 2018, 5:47:05 AM
On Fri, 28 Sep 2018 10:33:27 +0200,
Bo Persson <b...@gmb.dk> wrote:

> On 2018-09-28 09:48, Ralf Goertz wrote:

> > I can't think of any task done with [u]int8_t that can't also be
> > done as easily without it.

> The 8-bit integer types were designed by the C committee. As C doesn't
> have operator overloading there was no problem with using typedefs.
>
> C++ just followed the C lead here, for compatibility reasons.

Fair enough. But the question above remains also in C.

Öö Tiib

Sep 28, 2018, 6:16:35 AM
On Friday, 28 September 2018 10:48:58 UTC+3, Ralf Goertz wrote:
> On Wed, 19 Sep 2018 06:13:40 -0700,
> Tim Rentsch <t...@alumni.caltech.edu> wrote:
>
> > Sam <s...@email-scan.com> writes:
> >
> > > Ralf Goertz writes:
> > >
> > >> [...] If you need character types you can always use
> > >> (unsigned) char.
> > >
> > > There's no such thing as a "character type" in C++, as the term
> > > is used here.
> >
> > C++17 section 6.9.1 includes this sentence in paragraph 1:
> >
> > Plain char, signed char, and unsigned char are three
> > distinct types, collectively called /narrow character
> > types/.
> >
> > (The /'s indicate italics in the original, signifying a
> > definition of the italicized term.)
>
> Thanks for pointing that out. It somehow proves my point. There are
> character types whose primary task is to serve the purpose of character
> and string handling. So operator overloads to facilitate formatted in-
> and output are very desirable. However, why can't [u]int8_t be separate
> types and not typedef'ed to [un]signed char thereby »inheriting« those
> operator overloads? What is gained by these typedef? If there is no 8
> bit char type on a certain platform then there is also no [u]int8_t as
> David pointed out. I can't think of any task done with [u]int8_t that
> can't also be done as easily without it.

Manipulating individual bytes in a row is the most portable form
of data processing, and so C had a whopping three distinct types
of bytes. Note that the difference between bytes, narrow character
types and the tiniest integers is dim to nonexistent in C. C++ did not
want to be booed down for dropping compatibility with the most portable
form of data processing and so followed it.

> The situation is different for wide characters. The standard says:
> »Types char16_t and char32_t denote distinct types with the same size,
> signedness, and alignment as uint_least16_t and uint_least32_t,
> respectively, in <cstdint>, called the underlying types.« And yet
> std::is_same<uint_least32_t,char32_t>::value is false. I know this is
> the other way around char32_t being »defined« in terms of
> uint_least32_t. But at least here is a clear distinction between a
> character type and a corresponding integer type. Why is that not the
> case for 8 bit char?

But from C++17 we *have* also a byte in C++, std::byte. That is not a
typedef of some other integral type and also does not promote or
convert into those out of the blue, because it is required to be an
enum class:

enum class byte : unsigned char {};

That is somewhat safer for representing raw memory. It has a number of
bitwise operators defined for it instead of promoting into int silently
like the chars do. Also it has to be explicitly converted to int
with static_cast or with to_integer<int>(b) when that is needed.
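
A minimal illustration (C++17):

#include <cstddef>
#include <iostream>

int main() {
    std::byte b{0x0F};
    b <<= 4;                          // bitwise operators are provided
    b |= std::byte{0x05};
    // int i = b;                     // error: no implicit conversion
    int i = std::to_integer<int>(b);  // explicit conversion required
    std::cout << i << "\n";           // prints 245
}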

Ralf Goertz

Sep 28, 2018, 11:14:03 AM
On Fri, 28 Sep 2018 03:16:25 -0700 (PDT),
Öö Tiib <oot...@hot.ee> wrote:

> But from C++17 we *have* also byte in C++, std::byte. That is not
> typedef of some other integral type and also does not promote or
> convert into those out of blue because it is required to be enum
> class: enum class byte : unsigned char {};

But that doesn't help either if you are interested in an 8 bit integral
type with formatted input. One could probably overload the appropriate
operators oneself, but std::byte doesn't even have arithmetics or did I
miss something?

By the way, on Tuesday 11:37:27 -0700 (PDT) you said in this thread:

> Mathematical operations are not defined for "short int" and shorter
> integral types in C++.

What do you mean by that? I can certainly add to short ints.

#include <iostream>

int main() {
    short int i1(7), i2(8);
    i1+=i2;
    std::cout<<i1<<std::endl;
}

compiles fine and prints 15


james...@alumni.caltech.edu

Sep 28, 2018, 11:36:52 AM
On Friday, September 28, 2018 at 11:14:03 AM UTC-4, Ralf Goertz wrote:
> On Fri, 28 Sep 2018 03:16:25 -0700 (PDT),
> Öö Tiib <oot...@hot.ee> wrote:
...
> By the way, on Tuesday 11:37:27 -0700 (PDT) you said in this thread:
>
> > Mathematical operations are not defined for "short int" and shorter
> > integral types in C++.
>
> What do you mean by that? I can certainly add to short ints.
>
> #include <iostream>
>
> int main() {
> short int i1(7), i2(8);
> i1+=i2;

According to the standard, that code does not add two short ints. Both
i1 and i2 get promoted to int, then they are added, producing a result
of type 'int', then that result is converted back to short int. If it
makes no difference for the observable behavior, an implementation for a
platform that has hardware support for adding short ints will probably
optimize the code to use that support - but that's not how the C
standard defines the behavior.

Öö Tiib

Sep 28, 2018, 12:28:40 PM
On Friday, 28 September 2018 18:14:03 UTC+3, Ralf Goertz wrote:
> On Fri, 28 Sep 2018 03:16:25 -0700 (PDT),
> Öö Tiib <oot...@hot.ee> wrote:
>
> > But from C++17 we *have* also byte in C++, std::byte. That is not
> > typedef of some other integral type and also does not promote or
> > convert into those out of blue because it is required to be enum
> > class: enum class byte : unsigned char {};
>
> But that doesn't help either if you are interested in an 8 bit integral
> type with formatted input. One could probably overload the appropriate
> operators oneself, but std::byte doesn't even have arithmetics or did I
> miss something?

The std::byte has all bitwise arithmetic operators defined. What do
you mean by "formatted input"? Sure, when our use case needs something
extra or different then that can be written.

The std::byte is a good precedent. Since the standard library took to
using enum class for it, they now have to patch up all the defects in
those, like that one ... https://wg21.cmeerw.net/cwg/issue2338 ... and
so it is potentially becoming a good tool for making whatever type-safe
integral types you need in C++.

> By the way, on Tuesday 11:37:27 -0700 (PDT) you said in this thread:
>
> > Mathematical operations are not defined for "short int" and shorter
> > integral types in C++.
>
> What do you mean by that? I can certainly add to short ints.

James Kuyper gave a perfect answer to that. Note that exactly because
of such magical silent promotions and conversions, some code will
sometimes give unexpected diagnostics or results.

Ralf Goertz

Sep 29, 2018, 3:01:35 AM
On Fri, 28 Sep 2018 09:28:30 -0700 (PDT),
Öö Tiib <oot...@hot.ee> wrote:

> On Friday, 28 September 2018 18:14:03 UTC+3, Ralf Goertz wrote:
> > On Fri, 28 Sep 2018 03:16:25 -0700 (PDT),
> > Öö Tiib <oot...@hot.ee> wrote:
> >
> > > But from C++17 we *have* also byte in C++, std::byte. That is not
> > > typedef of some other integral type and also does not promote or
> > > convert into those out of blue because it is required to be enum
> > > class: enum class byte : unsigned char {};
> >
> > But that doesn't help either if you are interested in an 8 bit
> > integral type with formatted input. One could probably overload the
> > appropriate operators oneself, but std::byte doesn't even have
> > arithmetics or did I miss something?
>
> The std::byte has all bitwise arithmetic operators defined. What you
> mean by "formatted input"? Sure, when our use case needs something
> extra or different then that can be written.

That was the whole point why I started this thread. I need
set<vector<some_integral_type>> e.g. for counting certain permutations.
Sometimes the range of values exceeds what fits in a byte, then I have
to use 16 or 32 bit types. In some cases 8 bit is enough but there are
so many vectors in that set that I get into memory trouble when using 16
bit. That's why I templated my program. Then I fell into the operator
overload trap that cin>>i means something completely different
depending on whether i is int or char. I never had to use int8_t before
but as I said I was under the impression that it would be an integral
and not a char type. Finding out that it is nothing but a typedef to
char made me curious as to what good this typedef does.

std::byte seemed promising to me, but as far as I can tell there is
neither an overload of e.g. operator>>

error: no match for ‘operator>>’ (operand types are ‘std::istream’ {aka
‘std::basic_istream<char>’} and ‘std::byte’)

(which is what I mean by formatted input) nor can I do something like
b%=42 when b is a std::byte. That's what I mean by arithmetics.

> > By the way, on Tuesday 11:37:27 -0700 (PDT) you said in this thread:
> >
> > > Mathematical operations are not defined for "short int" and
> > > shorter integral types in C++.
> >
> > What do you mean by that? I can certainly add to short ints.
>
> James Kuyper gave perfect answer to that. Note that exactly because
> of such magical silent promotions and conversions sometimes some code
> will give unexpected diagnostics or results.

Thanks, James. But what difference does that make? If the short operands
get promoted to int and the result gets demoted to short, can that
really be different from doing it directly? The program


#include <iostream>
#include <limits>

int main() {
    typedef int Int;
    Int i1(std::numeric_limits<short int>::max());
    Int i2(i1/2);
    Int i=i1+i2;
    std::cout<<i1<<" "<<i2<<" "<<static_cast<short int>(i)<<std::endl;
}

prints

32767 16383 -16386

here regardless of whether Int is int or short int. Can the observable
behaviour really be different (you said unexpected results)? I get that
the platform may or may not support the addition of shorts directly. But
isn't that merely an implementation detail?

David Brown

Sep 29, 2018, 8:06:31 AM
On 28/09/18 18:28, Öö Tiib wrote:
> On Friday, 28 September 2018 18:14:03 UTC+3, Ralf Goertz wrote:
>> On Fri, 28 Sep 2018 03:16:25 -0700 (PDT),
>> Öö Tiib <oot...@hot.ee> wrote:
>>
>>> But from C++17 we *have* also byte in C++, std::byte. That is not
>>> typedef of some other integral type and also does not promote or
>>> convert into those out of blue because it is required to be enum
>>> class: enum class byte : unsigned char {};
>>
>> But that doesn't help either if you are interested in an 8 bit integral
>> type with formatted input. One could probably overload the appropriate
>> operators oneself, but std::byte doesn't even have arithmetics or did I
>> miss something?
>
> The std::byte has all bitwise arithmetic operators defined. What you
> mean by "formatted input"? Sure, when our use case needs something
> extra or different then that can be written.

I have always thought that was a bad idea. std::byte should have been
pure "raw memory" - a type you can use to read or write anything,
accessing any data for reading and writing (replacing the "char" types
from C for things like memcpy), and being a "pointer to memory". It is
/almost/ such a type, but bitwise operators have no place in such a type.

Öö Tiib

Sep 29, 2018, 8:11:42 AM
These are not there because most sane software does not need them.
Take a few steps back; I will try once more to explain the big picture
that may help you out of that puzzle (or not). ;)

Rules of thumb:
1) For text input/output and for arithmetic processing use only
(u)int32_t, (u)int64_t, double and/or custom fast arithmetic
classes.

2) For binary input/output and with large storage use the packing
level that is optimal. Optimal may be not to pack at all, or
to go down to exact bit positions or even further, depending
on the bang needed and the buck available.

The explanations of the above rules:
1) When not constrained by I/O, text formats like
JSON are most convenient. But narrow data types are still bad
for processing. Modern processors do not usually have
instructions for narrow data types, so values have to be
unpacked from narrow types and later packed back; therefore running
complex computations on narrow types is slower than with
already unpacked fast types. Keeping data packed while it is
under heavy processing cripples performance.

2) Processors on modern platforms are fast, so the bandwidth,
throughput and latency of communications with anything
(including RAM) often can't feed them enough data. That is
the common bottleneck. If that is the case then it can be
profitable to go down to bits or even farther than that
with packing.
However, do not reinvent wheels there; use existing formats.
For example most multimedia is compressed by a factor of about 50,
so an average byte of it is packed into about 0.16 bits. For another
example, unpacking with the Lempel–Ziv–Welch (LZW) algorithm (patent
expired on June 20, 2003) can be faster than memcpy() from an
unpacked buffer.

The conclusions from those points above:
An attempt to read a std::byte from text input breaks all logic!
There are a whopping 4 bytes (up to 3 digits and a separator) used in
slow communication that are then turned into a format that is packed
and inefficient for processing (a byte). If text is used because of the
nature of the channel then use Base64 for transferring bytes.

>
> > > By the way, on Tuesday 11:37:27 -0700 (PDT) you said in this thread:
> > >
> > > > Mathematical operations are not defined for "short int" and
> > > > shorter integral types in C++.
> > >
> > > What do you mean by that? I can certainly add to short ints.
> >
> > James Kuyper gave perfect answer to that. Note that exactly because
> > of such magical silent promotions and conversions sometimes some code
> > will give unexpected diagnostics or results.
>
> Thanks, James. But what difference does that make? If the short operands
> get promoted to int and the result gets demoted to short, can that
> really be different from doing it directly?

There is a poster, supercat, in comp.lang.c who loves to post examples
about that. IIRC he has only a few of those, like this one:

unsigned mul(unsigned short a, unsigned short b) {return a*b;}

(With 32-bit int the operands promote to signed int, so for large
arguments the multiplication can overflow int - undefined behaviour,
even though everything in sight looks unsigned.)

The examples of supercat are of interest to me only as stuff that
smells, for estimating the amounts of potential improvements or
simplifications that can be made and the likely impact of those.

If we follow the "rules of thumb" that I posted above then we
actually do not need these narrow types for anything
but the cases where we are constrained by storage or communications.
And then we use them only for converting into larger types
for processing and back for storing. Only sometimes, when we
really pack or convert on the bit level or further, will we need
the bitwise operations that std::byte implements for us already.

Öö Tiib

Sep 29, 2018, 8:25:13 AM
I see a point in bitwise operations on bytes only for bit-wise
altering the data to fit some binary layout. That may mean
things like an array of 12-bit-wide ints packed into a block of bytes
without any padding, or the results of other such packing algorithms.
Subtracting, adding, multiplying and dividing, however, feel totally
useless.

Chris Vine

Sep 29, 2018, 6:13:58 PM
On Sat, 29 Sep 2018 09:01:25 +0200
Ralf Goertz <m...@myprovider.invalid> wrote:
[snip]
> That was the whole point why I started this thread. I need
> set<vector<some_integral_type>> e.g. for counting certain permutations.
> Sometimes the range of values exceeds what fits in a byte, then I have
> to use 16 or 32 bit types. In some cases 8 bit is enough but there are
> so many vectors in that set that I get into memory trouble when using 16
> bit. That's why I templated my program. Then I fell into the operator
> overload trap that cin>>i means something completely different
> dependeing on whether i is int or char. I never had to use int8_t before
> but as I said I was under the impression, that it would be an integral
> and not a char type.

A char type is an integral type. (As are signed char, short int, int,
long int, long long int and their unsigned equivalents and bool,
wchar_t, char16_t and char32_t, together with the fixed/minimum sized
integer types, if they exist in the implementation in question):
according to the standard "bool, char, char16_t, char32_t, wchar_t, and
the signed and unsigned integer types are collectively called integral
types". char can be signed or unsigned but it is also a distinct type
from signed char or unsigned char. It has no special status beyond
that.

If all you are concerned about is printing char types with C++ streams'
operator << so they print in the same way that ints print, then cast
them to int in your call to operator << for the stream. Given the
length of this thread though I suspect you must be concerned about
something else.

James Kuyper

Sep 30, 2018, 9:23:08 AM
On 09/29/2018 03:01 AM, Ralf Goertz wrote:
> On Fri, 28 Sep 2018 09:28:30 -0700 (PDT),
> Öö Tiib <oot...@hot.ee> wrote:
...
>> James Kuyper gave perfect answer to that. Note that exactly because
>> of such magical silent promotions and conversions sometimes some code
>> will give unexpected diagnostics or results.
>
> Thanks, James. But what difference does that make? If the short operands
> get promoted to int and the result gets demoted to short, can that
> really be different from doing it directly? The program

Yes, it can. First of all, expressions that would overflow when using
the smaller type can be perfectly safe when evaluated using the larger
type. Overflow is undefined behavior, which is generally a bad thing,
but it is quite common on modern machines for signed integer types to
use 2's complement notation, and for overflow to be handled accordingly
without any other negative consequence, which means that expressions
which would overflow in the smaller type might produce exactly the same
result as if that type had been used. However, C still allows
implementations to use sign-magnitude and one's complement notation, for
which that equivalence would not hold.

Non 2's complement signed types are rare. However, being aware of the
integer promotions is important for understanding the behavior of other
expressions which are more problematic. Unsigned types with a maximum
representable value that is <= INT_MAX promote to int. As a result,
expressions that you might otherwise have expected to be resolved using
unsigned math may end up using signed math instead.

Tim Rentsch

Sep 30, 2018, 1:23:38 PM
Chris Vine <chris@cvine--nospam--.freeserve.co.uk> writes:

>> [...]
>
> A char type is an integral type. [...] char can be signed or
> unsigned but it is also a distinct type from signed char or
> unsigned char. It has no special status beyond that.

Actually it does have one: it is one of only three types (the
other two being unsigned char and std::byte) that are exempt from
type access rules given in section 6.10.

(It's interesting that signed char is not on that list, though in
C it is.)

Tim Rentsch

Sep 30, 2018, 1:58:21 PM
Ralf Goertz <m...@myprovider.invalid> writes:

> Am Wed, 19 Sep 2018 06:13:40 -0700
> schrieb Tim Rentsch <t...@alumni.caltech.edu>:
>
>> Sam <s...@email-scan.com> writes:
>>
>>> Ralf Goertz writes:
>>>
>>>> [...] If you need character types you can always use
>>>> (unsigned) char.
>>>
>>> There's no such thing as a "character type" in C++, as the term
>>> is used here.
>>
>> C++17 section 6.9.1 includes this sentence in paragraph 1:
>>
>> Plain char, signed char, and unsigned char are three
>> distinct types, collectively called /narrow character
>> types/.
>>
>> (The /'s indicate italics in the original, signifying a
>> definition of the italicized term.)
>
> [...] Why can't [u]int8_t be separate types and not typedef'ed to
> [un]signed char [...]?

They can be. They don't have to be, but they can be. (I should
add, certainly in C, and I am pretty sure for C++.)

> What is gained by these typedef?

They make things easy for lazy implementors.

To be fair I should add that allowing the [u]intN_t types to be
typedefs gives a degree of freedom to implementations, which
could be passed on to developers with a compiler option, if that
were thought to be important. Personally I would favor having
such an option, although it isn't one I would put high on a
priority list.

> [Assuming CHAR_BIT == 8] I can't think of any task done with
> [u]int8_t that can't also be done as easily without it.

Advantages of unsigned char (the type, not necessarily using that
name) over uint8_t:

(1) It's guaranteed to exist;

(2) It's guaranteed to allow access to any type of
object; and

(3) No #include is needed to use it.

Advantages of signed char (the type, not necessarily using that
name) over int8_t:

(1) It's guaranteed to exist;

(2) (In C, not C++) It's guaranteed to allow access to any
type of object; and

(3) No #include is needed to use it.

The minimum value for signed char may be -127, whereas the
minimum value for int8_t (if it exists) must be -128. But if
that property is desired, it can be checked statically using the
value of SCHAR_MIN, and if int8_t exists then signed char will
also have a minimum value of -128.
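
For example, such a check could look like:

#include <climits>

static_assert(SCHAR_MIN == -128,
              "this code assumes the full two's complement range");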

Considering the above I see no reason to ever use the [u]int8_t
types, except in cases where it's important to conform to an
existing interface that uses them.

Tim Rentsch

Sep 30, 2018, 2:11:51 PM
Ralf Goertz <m...@myprovider.invalid> writes:

> On Fri, 28 Sep 2018 09:28:30 -0700 (PDT) Tiib <oot...@hot.ee>:
>> On Friday, 28 September 2018 18:14:03 UTC+3, Ralf Goertz wrote:
>>> On Fri, 28 Sep 2018 03:16:25 -0700 (PDT) Tiib <oot...@hot.ee>:
>>>
>>>> But from C++17 we *have* also byte in C++, std::byte. That is
>>>> not typedef of some other integral type and also does not promote
>>>> or convert into those out of blue because it is required to be
>>>> enum class: enum class byte : unsigned char {};
>>>
>>> But that doesn't help either if you are interested in an 8 bit
>>> integral type with formatted input. One could probably overload
>>> the appropriate operators oneself, but std::byte doesn't even have
>>> arithmetics or did I miss something?
>>
>> The std::byte has all bitwise arithmetic operators defined. What
>> you mean by "formatted input"? Sure, when our use case needs
>> something extra or different then that can be written.
>
> That was the whole point why I started this thread. I need
> set<vector<some_integral_type>> e.g. for counting certain
> permutations. Sometimes the range of values exceeds what fits in
> a byte, then I have to use 16 or 32 bit types. In some cases 8
> bit is enough but there are so many vectors in that set that I get
> into memory trouble when using 16 bit. [...]

If you really need an 8-bit integer type, you might consider
making one yourself as a class (or maybe one of the newfangled
enum's, but I haven't used those much). Here is a sketch:

#include <iostream>

class U8 {
    unsigned char v;
public:
    U8() : v( 0 ) {}
    U8( unsigned uc ) : v( uc ) {}
    operator unsigned(){ return v; }
    U8 &operator=( unsigned uc ){ v = uc; return *this; }
    unsigned value(){ return v; }
};

std::istream &
operator >>( std::istream &in, U8 &u8 ){
    unsigned u;
    in >> u;
    u8 = u;
    return in;
}

int
main(){
    U8 u8;
    std::cout << "Hello, world\n";
    std::cin >> u8;
    std::cout << "Value of u8 is " << u8 << "\n";
    std::cout << "sizeof u8 = " << sizeof u8 << "\n";
    return 0;
}


Disclaimer: the code compiles and runs, but I'm not sure I've
made good choices about the conversions or really anything else.
The point is it should be possible to define a type with the
properties that you want, provided of course the compiler being
used is good enough to make the class be a single byte in size.
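
(With C++11 the size expectation can also be pinned down at compile
time with a one-liner:

static_assert( sizeof( U8 ) == 1, "U8 is not a single byte" );

which fires if the compiler pads the class.)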

Richard Damon

unread,
Sep 30, 2018, 2:26:44 PM9/30/18
to
The main reason I use uint8_t/int8_t is stylistic: to indicate that what
it holds is treated as a small number and not something 'character'
related. It also signals that I may be making an implicit assumption
that the machine is 'normal' (twos complement, 8 bit byte) rather than
adding a static test of preprocessor symbols. To me, char is only used
to actually hold text, and unsigned char for text where all values need
to be positive (for text processing function calls) or for accessing
'raw' memory.

Chris Vine

unread,
Sep 30, 2018, 3:01:09 PM9/30/18
to
OK, well if one of three counts as having special status, then another
related one is assigning or otherwise evaluating integrals, pointers and
some other objects which have not been initialized and so have
indeterminate value. In most cases this is undefined behaviour, but in
the case of char, unsigned char and std::byte it just gives you another
indeterminate value.

int i; // indeterminate value
int j = i; // undefined behaviour

char c; // indeterminate value
char d = c; // indeterminate value

There may be other special cases of that kind. But these all relate to
the fact that these types may be used for low level byte access in C++
rather than that they are "characters".

Chris Vine

unread,
Sep 30, 2018, 3:03:13 PM9/30/18
to
One correction - this only works with char (as opposed to unsigned char
or std::byte) if, in the implementation in question, char is unsigned.

Ralf Goertz

unread,
Oct 2, 2018, 5:45:28 AM10/2/18
to
Am Sun, 30 Sep 2018 10:58:08 -0700
schrieb Tim Rentsch <t...@alumni.caltech.edu>:

> Ralf Goertz <m...@myprovider.invalid> writes:
>
> > Am Wed, 19 Sep 2018 06:13:40 -0700
> > schrieb Tim Rentsch <t...@alumni.caltech.edu>:
> >
> >> Sam <s...@email-scan.com> writes:
> >>
> >>> Ralf Goertz writes:
> >>>
> >>>> [...] If you need character types you can always use
> >>>> (unsigned) char.
> >>>
> >>> There's no such thing as a "character type" in C++, as the term
> >>> is used here.
> >>
> >> C++17 section 6.9.1 includes this sentence in paragraph 1:
> >>
> >> Plain char, signed char, and unsigned char are three
> >> distinct types, collectively called /narrow character
> >> types/.
> >>
> >> (The /'s indicate italics in the original, signifying a
> >> definition of the italicized term.)
> >
> > [...] Why can't [u]int8_t be separate types and not typedef'ed to
> > [un]signed char [...]?
>
> They can be. They don't have to be, but they can be. (I should
> add, certainly in C, and I am pretty sure for C++.)

But if they are separate types they need to overload the stream
operators "<<" and ">>" in the same way the char types do, right?
Otherwise a program using them would behave differently, depending on
these types being separate or typedefs. Which would make the advantage
of them being separate types moot.

James Kuyper

unread,
Oct 2, 2018, 8:05:04 AM10/2/18
to
On 10/02/2018 05:45 AM, Ralf Goertz wrote:
> Am Sun, 30 Sep 2018 10:58:08 -0700
> schrieb Tim Rentsch <t...@alumni.caltech.edu>:
>
>> Ralf Goertz <m...@myprovider.invalid> writes:
...
>>> [...] Why can't [u]int8_t be separate types and not typedef'ed to
>>> [un]signed char [...]?
>>
>> They can be. They don't have to be, but they can be. (I should
>> add, certainly in C, and I am pretty sure for C++.)
>
> But if they are separate types they need to overload the stream
> operators "<<" and ">>" in the same way the char types do, right?

What makes you think so? The relevant extractors for char, unsigned
char, and signed char are described in 27.7.2.2.3p11, while character
inserters get their own numbered section, 27.7.2.3.4. No mention is made
in either place of any extended integer type or size-named type.

As far as I can see, there's no requirement that any extractors or
inserters be provided for any extended integer type; 27.7.2.2.2 and
27.7.2.3.2 only describe arithmetic extractors and inserters for
standard types. Using the PRI/SCN macros defined in <cinttypes> or
<inttypes.h> with the formatted I/O routines declared in <cstdio> or
<stdio.h> are the only ways provided by C++ to directly input or output
any size-named types that are typedefed to extended integer types.
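
For illustration, a minimal sketch of that route (PRId8/SCNd8 are only
defined where the implementation provides int8_t and, for SCNd8, a
suitable scanf length modifier):

#include <cinttypes>
#include <cstdint>
#include <cstdio>

int main(){
    std::int8_t i = 0;
    // The macros expand to the right length modifier and conversion
    // letter for whatever type int8_t aliases on this implementation.
    if ( std::scanf( "%" SCNd8, &i ) == 1 )
        std::printf( "read %" PRId8 "\n", i );
}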

Ralf Goertz

unread,
Oct 2, 2018, 11:47:22 AM10/2/18
to
Am Tue, 2 Oct 2018 08:04:50 -0400
schrieb James Kuyper <james...@alumni.caltech.edu>:

> On 10/02/2018 05:45 AM, Ralf Goertz wrote:
> > Am Sun, 30 Sep 2018 10:58:08 -0700
> > schrieb Tim Rentsch <t...@alumni.caltech.edu>:
> >
> >> Ralf Goertz <m...@myprovider.invalid> writes:
> ...
> >>> [...] Why can't [u]int8_t be separate types and not typedef'ed to
> >>> [un]signed char [...]?
> >>
> >> They can be. They don't have to be, but they can be. (I should
> >> add, certainly in C, and I am pretty sure for C++.)
> >
> > But if they are separate types they need to overload the stream
> > operators "<<" and ">>" in the same way the char types do, right?
>
> What makes you think so? The relevant extractors for char, unsigned
> char, and signed char are described in 27.7.2.2.3p11, while character
> inserters get their own numbered section, 27.7.2.3.4. No mention is
> made in either place of any extended integer type or size-named type.

Well, what about this program:

#include <iostream>
#include <cstdint>

int main() {
    uint8_t i(42);
    std::cout << i << std::endl;
}

If uint8_t were not typedefed to unsigned char but were a separate type,
then it would not even compile?

> As far as I can see, there's no requirement that any extractors or
> inserters be provided for any extended integer type; 27.7.2.2.2 and
> 27.7.2.3.2 only describe arithmetic extractors and inserters for
> standard types. Using the PRI/SCN macros defined in <cinttypes> or
> <inttypes.h> with the formatted I/O routines declared in <cstdio> or
> <stdio.h> are the only ways provided by C++ to directly input or
> output any size-named types that are typedefed to extended integer
> types.

Seems to be the case then.

Öö Tiib

unread,
Oct 2, 2018, 1:15:08 PM10/2/18
to
Yes, it would be ill-formed because of ambiguity. Extended integer types
can be converted to standard integer types, but the compiler can't figure
out which standard integer type you meant.
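
The effect can be simulated with a class type that, like a hypothetical
extended integer type, converts to more than one standard integer type
(a sketch of the ambiguity, not real extended-integer behaviour):

#include <iostream>

struct Ext8 {
    unsigned char v;
    operator int() const { return v; }
    operator unsigned() const { return v; }
};

int main(){
    Ext8 e{ 42 };
    std::cout << e << '\n';  // error: ambiguous overload for operator<<
}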

Tim Rentsch

unread,
Oct 3, 2018, 12:41:31 PM10/3/18
to
Chris Vine <chris@cvine--nospam--.freeserve.co.uk> writes:

> On Sun, 30 Sep 2018 10:23:24 -0700
> Tim Rentsch <t...@alumni.caltech.edu> wrote:
>
>> Chris Vine <chris@cvine--nospam--.freeserve.co.uk> writes:
>>
>>>> [...]
>>>
>>> A char type is an integral type. [...] char can be signed or
>>> unsigned but it is also a distinct type from signed char or
>>> unsigned char. It has no special status beyond that.
>>
>> Actually it does have one: it is one of only three types (the
>> other two being unsigned char and std::byte) that are exempt from
>> type access rules given in section 6.10.
>>
>> (It's interesting that signed char is not on that list, which
>> in C it is.)
>
> OK, well if one of three counts as having special status, then
> another related one is [status when an object of that type has not
> been initialized. ...]

Yes, I didn't mean to imply that there is only one, just that
there is at least one. Also I didn't mean it as a gotcha, just
a property you may have forgotten or overlooked.

> There may be other special cases of that kind. But these all
> relate to the fact that these types may be used for low level byte
> access in C++ rather than that they are "characters".

The property I mentioned relates more to the type 'char' being a
character type than it does to the type 'char' being an integer
type: it's a carryover from C, where all character types, and
only character types, have the property in question. Or one might
say it relates to the type(s) being the smallest object type(s)
(other than bitfields), which also relates to characters. In any
case my comment was not about "characterness" or "integerness",
only that the type 'char' does have a status beyond that of
integer types generally. The type 'char' also has a status beyond
that of *character* types generally (even considering just narrow
character types). What matters is the distinction in status, not
what labels might be used to describe it.

Tim Rentsch

unread,
Oct 3, 2018, 3:48:24 PM10/3/18
to
Richard Damon <Ric...@Damon-Family.org> writes:

> On 9/30/18 1:58 PM, Tim Rentsch wrote:
>
>> [.. differences between [u]int8_t and [un]signed char ..]
>>
>> Considering the above I see no reason to ever use the [u]int8_t
>> types, except in cases where it's important to conform to an
>> existing interface that uses them.
>
> The main reason I use uint8_t/int8_t is stylistic: to indicate that
> what it holds is treated as a small number and not something
> 'character' related.

I think it's a good idea to use a different name for a type that
is meant to hold a small integer (or actually two names, one for
signed, one for unsigned) as opposed to the special properties of
signed char and unsigned char. In most cases though [u]int8_t
are bad choices for that purpose, due to ambiguity and cultural
baggage. (continues below)

> It also signals that I may be making an implicit assumption
> that the machine is 'normal' (twos complement, 8 bit byte)
> rather than adding a static test of preprocessor symbols. To
> me, char is only used to actually hold text, and unsigned char
> for text where all values need to be positive (for text
> processing function calls) or for accessing 'raw' memory.

To get one case out of the way - no sensible person uses 'char'
for anything other than character or text processing. In
particular using 'char' for holding an integer value is nutty
(not counting cases where the values come from character
constants or things like that).

As a matter of style, I think it's good practice to use names
other than 'signed char' or 'unsigned char' when the objects in
question are meant only to hold small integer values, to avoid
confusion with other uses of character types, and unsigned char in
particular. The big problem with the [u]int8_t types is the names
say both too much and too little. For example, if I see a
declaration 'int8_t x;', is it important that the type has a two's
complement representation, or not? Similarly, for 'uint8_t u;',
is it important that its partner type use two's complement? In
most cases I suspect it isn't. Another aspect is "characterness":
some people treat [u]int8_t as being synonyms for character types,
whereas others (which IIUC includes you) treat [u]int8_t as
completely separate from the "character" aspects of the [un]signed
char types. If all I want is a small and unsigned type, I don't
want to use uint8_t, because it has other connotational baggage
that I really don't want to convey. Similarly for a small and
signed type. And besides the [u]int8_t names being read as
something other than what is meant, how they actually are read is
ambiguous - different readers will infer different connotations.
To me that's the bottom line: a name should say what it means,
and not more, and not less. The types [u]int8_t hardly ever do
that.
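
(As a low-tech sketch of that bottom line, with made-up alias names
that each state just the intended property:

using small_count = unsigned char;  // small and unsigned, nothing more
using small_value = signed char;    // small and signed, nothing more

whether such names should be aliases or genuinely distinct types is a
separate question; the naming is the point here.)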

Tim Rentsch

unread,
Oct 3, 2018, 4:15:49 PM10/3/18
to
Ralf Goertz <m...@myprovider.invalid> writes:

> Am Sun, 30 Sep 2018 10:58:08 -0700
> schrieb Tim Rentsch <t...@alumni.caltech.edu>:
>
>> Ralf Goertz <m...@myprovider.invalid> writes:
>>
>>> Am Wed, 19 Sep 2018 06:13:40 -0700
>>> schrieb Tim Rentsch <t...@alumni.caltech.edu>:
>>>
>>>> Sam <s...@email-scan.com> writes:
>>>>
>>>>> Ralf Goertz writes:
>>>>>
>>>>>> [...] If you need character types you can always use
>>>>>> (unsigned) char.
>>>>>
>>>>> There's no such thing as a "character type" in C++, as the term
>>>>> is used here.
>>>>
>>>> C++17 section 6.9.1 includes this sentence in paragraph 1:
>>>>
>>>> Plain char, signed char, and unsigned char are three
>>>> distinct types, collectively called /narrow character
>>>> types/.
>>>>
>>>> (The /'s indicate italics in the original, signifying a
>>>> definition of the italicized term.)
>>>
>>> [...] Why can't [u]int8_t be separate types and not typedef'ed to
>>> [un]signed char [...]?
>>
>> They can be. They don't have to be, but they can be. (I should
>> add, certainly in C, and I am pretty sure for C++.)
>
> But if they are separate types they need to overload the stream
> operators "<<" and ">>" in the same way the char types do, right?

If you mean, does the Standard require they be overloaded, I'm
pretty sure it doesn't.

If you mean, would it make sense to provide additional overloads
for these distinct types, I agree that it would.

If you mean, should the semantics of the additional overloads
match the semantics for character types, I'm sure there are
(at least) two schools of thought on that question. AFAICT
the Standard doesn't impose a requirement either way.

> Otherwise a program using them would behave differently,
> depending on these types being separate or typedefs.

It could behave differently. It wouldn't necessarily behave
differently. The choice is up to each implementation.

> Which would make the advantage of them being separate types
> moot.

If the types are separate then the implementation has an
additional degree of freedom, which it could in turn pass along
in the form of a compiler option. To me that enhances the
advantage, not eliminates it.

Tim Rentsch

unread,
Oct 3, 2018, 4:24:52 PM10/3/18
to
I'm not an expert on C++ overloading, but I believe it could
compile if the implementation wanted it to (and presumably it
would if the types were defined as distinct). This result
occurs because the C++ standard allows additional overloads
for non-virtual member functions in library classes:


20.5.5.5 Member functions [member.functions]

1 [...]

2 For a non-virtual member function described in the C++
standard library, an implementation may declare a different
set of member function signatures, provided that any call to
the member function that would select an overload from the
set of declarations described in this International Standard
behaves as if that overload were selected. [ Note: For
instance, an implementation may add parameters with default
values, or replace a member function with default arguments
with two or more member functions with equivalent behavior,
or add additional signatures for a member function name.
--end note ]

Paavo Helde

unread,
Oct 3, 2018, 6:06:53 PM10/3/18
to
On 3.10.2018 23:15, Tim Rentsch wrote:
> Ralf Goertz <m...@myprovider.invalid> writes:
>
>> Am Sun, 30 Sep 2018 10:58:08 -0700
>> schrieb Tim Rentsch <t...@alumni.caltech.edu>:
>>
>>> Ralf Goertz <m...@myprovider.invalid> writes:
>>>> [...] Why can't [u]int8_t be separate types and not typedef'ed to
>>>> [un]signed char [...]?
>>>
>>> They can be. They don't have to be, but they can be. (I should
>>> add, certainly in C, and I am pretty sure for C++.)
>>
>> But if they are separate types they need to overload the stream
>> operators "<<" and ">>" in the same way the char types do, right?
>
> If you mean, does the Standard require they be overloaded, I'm
> pretty sure it doesn't.
>
> If you mean, would it make sense to provide additional overloads
> for these distinct types, I agree that it would.
>
> If you mean, should the semantics of the additional overloads
> match the semantics for character types, I'm sure there are
> (at least) two schools of thought on that question. AFAICT
> the Standard doesn't impose a requirement either way.

Wow, after following this thread for quite some time I believe I now
have finally found a case where C is qualitatively better than C++ ;-)

char x = 65;
printf("%c", x);
printf("%d", x);

Here, the value and its interpretation are separated, which makes sense
because there are different interpretations. This makes the code more
modular and more flexible than is possible with C++ streams.




James Kuyper

unread,
Oct 3, 2018, 7:46:10 PM10/3/18
to
On 10/03/2018 06:06 PM, Paavo Helde wrote:
...
> Wow, after following this thread for quite some time I believe I now
> have finally found a case where C is qualitatively better than C++ ;-)
>
> char x = 65;
> printf("%c", x);
> printf("%d", x);

Keep in mind that you can do exactly the same thing in C++, using the
exact same code. It would be more appropriate to compare <cstdio> and
<iostream>.

Paavo Helde

unread,
Oct 4, 2018, 1:47:24 AM10/4/18
to
printf() is not typesafe so cannot be really advocated for C++ usage.
What one can do is to create a typesafe wrapper around printf, I'm using
a home-grown one which allows me to write e.g.

char x = 65;
std::cout << Sprintf("%c %d\n")(x)(x);

OUTPUT: A 65

There are other typesafe formatting libraries, but nothing in the
standard AFAIK. Regretfully I see that Boost.Format gets the %d+char
output wrong, at least in the version I have.

There is a standards proposal
http://open-std.org/JTC1/SC22/WG21/docs/papers/2013/n3716.html
which says for %d "as if it is formatted by snprintf" so hopefully it
would get it right.
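
For a flavour of how such a wrapper can work, here is a minimal sketch
in the style of the well-known variadic-template printf exercise
(safe_printf is a made-up name, not the Sprintf above; it ignores the
conversion letter and lets each argument's type pick the formatting):

#include <iostream>
#include <stdexcept>

// Base case: no arguments left; a stray specifier is an error.
void safe_printf( const char *fmt ){
    while ( *fmt ) {
        if ( *fmt == '%' && *++fmt != '%' )
            throw std::runtime_error( "missing argument" );
        std::cout << *fmt++;
    }
}

// Peel off one argument per conversion specifier.
template< typename T, typename... Rest >
void safe_printf( const char *fmt, T value, Rest... rest ){
    while ( *fmt ) {
        if ( *fmt == '%' && *++fmt != '%' ) {
            std::cout << value;              // the type decides the output
            safe_printf( fmt + 1, rest... ); // skip the conversion letter
            return;
        }
        std::cout << *fmt++;
    }
    throw std::runtime_error( "extra arguments" );
}

int main(){
    char x = 65;
    safe_printf( "%c %d\n", x, int( x ) );   // prints: A 65
}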

Ralf Goertz

unread,
Oct 4, 2018, 4:57:19 AM10/4/18
to
Am Wed, 03 Oct 2018 12:48:11 -0700
schrieb Tim Rentsch <t...@alumni.caltech.edu>:
+1

David Brown

unread,
Oct 4, 2018, 7:47:22 AM10/4/18
to
I agree entirely.

>
> As a matter of style, I think it's good practice to use names
> other than 'signed char' or 'unsigned char' when the objects in
> question are meant only to hold small integer values, to avoid
> confusion with other uses of character types, and unsigned char in
> particular.

Agreed.

> The big problem with the [u]int8_t types is the names
> say both too much and too little. For example, if I see a
> declaration 'int8_t x;', is it important that the type has a two's
> complement representation, or not? Similarly, for 'uint8_t u;',
> is it important that its partner type use two's complement? In
> most cases I suspect it isn't.

I agree that in many cases, the fact that a type is two's complement is
not important. But baring /really/ odd machines - so odd that for
almost everyone, the possibility can be ignored - your integer types
/will/ be two's complement. And though the C standards say "int8_t"
must be two's complement, the name itself does not. I think generally
the range of the type - knowing it is from -128 to 127 - is more
important than the representation.

To me, "int8_t" says "8-bit signed integer". It is as simple as that.
So when I want an 8-bit signed integer, "int8_t" is as good a name as
you can get.

It might be that people want just a "small integer", in which case 8-bit
might be more specific than they need. And for that purpose,
int_least8_t is not a typename that rolls of the tongue.

> Another aspect is "characterness":
> some people treat [u]int8_t as being synonyms for character types,
> whereas others (which IIUC includes you) treat [u]int8_t as
> completely separate from the "character" aspects of the [un]signed
> char types.

I think it is a disadvantage that the types are - in all but
hypothetical implementations - synonyms for character types. It mixes
different things. I'd prefer a clearer separation between types for
holding characters (which might have different sizes for different
encoding systems, but for which "signed" and "unsigned" make no sense to
me), types for dealing with raw memory (bypassing the "strict aliasing"
and rules, and preferably available in different sizes), and integer types.

> If all I want is a small and unsigned type, I don't
> want to use uint8_t, because it has other connotational baggage
> that I really don't want to convey. Similarly for a small and
> signed type.

To me, the only "baggage" is that the size is specific at 8 bits. But I
usually see that as an advantage - I like to know exactly what ranges I
have and what space is taken up. That may be because of the kind of
programming I do, and may not be the same for other people.

> And besides the [u]int8_t names being read as
> something other than what is meant, how they actually are read is
> ambiguous - different readers will infer different connotations.
> To me that's the bottom line: a name should say what it means,
> and not more, and not less. The types [u]int8_t hardly ever do
> that.
>

It seems that these type names say something a little different to you
and to me, and that we have slightly different needs from them. So I am
happy with the names int8_t and uint8_t, but I can understand your
points for why you don't find them ideal.

However, you haven't addressed the elephant in the room here - what
would /you/ suggest as names for types here, and what characteristics
would you prefer them to have?



David Brown

unread,
Oct 4, 2018, 7:54:17 AM10/4/18
to
On 04/10/18 07:47, Paavo Helde wrote:
> On 4.10.2018 2:45, James Kuyper wrote:
>> On 10/03/2018 06:06 PM, Paavo Helde wrote:
>> ...
>>> Wow, after following this thread for quite some time I believe I now
>>> have finally found a case where C is qualitatively better than C++ ;-)
>>>
>>> char x = 65;
>>> printf("%c", x);
>>> printf("%d", x);
>>
>> Keep in mind that you can do exactly the same thing in C++, using the
>> exact same code. It would be more appropriate to compare <cstdio> and
>> <iostream>.
>
> printf() is not typesafe so cannot be really advocated for C++ usage.

Of course printf is fine to use in C++. It has exactly the same
advantages and disadvantages as it has in C. std::cout has many good
features in C++, but it also has at least three glaring problems in
comparison to printf - the complexity of translating strings, the
statefulness of things like outputting hex format, and the one you have
mentioned here.


For some compilers, as long as the format string of printf is fixed at
compile time the compiler can check the number and types of the
parameters. It is not as good as being type-safe, but it is a good help
for avoiding bugs.
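
For example, gcc and clang extend that checking to user-written
printf-style wrappers via a (non-standard) function attribute; a sketch:

#include <cstdarg>
#include <cstdio>

// Argument 1 is the format string; checking of the variadic
// arguments starts at argument 2.
__attribute__(( format( printf, 1, 2 ) ))
void log_msg( const char *fmt, ... ){
    std::va_list ap;
    va_start( ap, fmt );
    std::vfprintf( stderr, fmt, ap );
    va_end( ap );
}

int main(){
    log_msg( "%d\n", 42 );      // fine
    // log_msg( "%d\n", 3.14 ); // diagnosed by -Wformat (part of -Wall)
}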

Paavo Helde

unread,
Oct 4, 2018, 8:14:34 AM10/4/18
to
On 3.10.2018 22:48, Tim Rentsch wrote:
> To get one case out of the way - no sensible person uses 'char'
> for anything other than character or text processing. In
> particular using 'char' for holding an integer value is nutty
> (not counting cases where the values come from character
> constants or things like that).

Ouch, that hurts! I have recently spent a lot of time on converting and
storing numeric arrays in the smallest possible datatype, including
8-bit, in order to reduce the memory footprint and enhance the
performance with large arrays. It's sad to hear I'm not sensible ;-)

On a related note, lots of image formats support 8-bit pixels, both in
grayscale and as components of RGB colors. Whenever you see yet another
cat picture on your screen, a heavy amount of numeric processing of
8-bit data has taken place.

Scott Lurndal

unread,
Oct 4, 2018, 8:46:36 AM10/4/18
to
Paavo Helde <myfir...@osa.pri.ee> writes:
>On 4.10.2018 2:45, James Kuyper wrote:
>> On 10/03/2018 06:06 PM, Paavo Helde wrote:
>> ...
>>> Wow, after following this thread for quite some time I believe I now
>>> have finally found a case where C is qualitatively better than C++ ;-)
>>>
>>> char x = 65;
>>> printf("%c", x);
>>> printf("%d", x);
>>
>> Keep in mind that you can do exactly the same thing in C++, using the
>> exact same code. It would be more appropriate to compare <cstdio> and
>> <iostream>.
>
>printf() is not typesafe so cannot be really advocated for C++ usage.

That's a fallacy. The former doesn't imply the latter.

David Brown

unread,
Oct 4, 2018, 9:15:25 AM10/4/18
to
There is nothing "nutty" about using small numerical data - but there
/is/ something nutty about using types called "char", "signed char" or
"unsigned char" for it. (This is my own opinion - I am not trying to
speak for Tim. But it looks like we agree here.) Use types "int8_t" or
"uint8_t" instead - or, if you prefer, given them a different typedef'ed
name that matches the usage.

Paavo Helde

unread,
Oct 4, 2018, 9:27:31 AM10/4/18
to
On 4.10.2018 14:54, David Brown wrote:
> On 04/10/18 07:47, Paavo Helde wrote:
>>
>> printf() is not typesafe so cannot be really advocated for C++ usage.
>
> Of course printf is fine to use in C++.

Yes, I agree it's fine to use. I just said it cannot be advocated;
trying to do that, e.g. in this group, would bring a huge backlash, I'm
sure.

I have tried to avoid printf() myself, as I sometimes like to refactor
code massively, which is dangerous in the presence of printf and such.
However, I have now checked and found that my primary compiler detects
printf format string mismatches and can also turn the warnings into
errors via '#pragma warning (error: 4477)', so maybe I should rethink my
position.


Paavo Helde

unread,
Oct 4, 2018, 10:00:12 AM10/4/18
to
Yes, maybe I misunderstood Tim and he just talked about using the "char"
*name*. Agreed this should not be used for non-strings. However, the
problem is that std::int8_t and std::uint8_t are really just typedefs,
not real types, which makes them just cosmetics and makes it impossible
to fix the std stream behavior, for example.

Anyway, in my code the 8-bit types typically appear in this context as T
in template<typename T>, so the naming issue is moot.



James Kuyper

unread,
Oct 4, 2018, 10:07:51 AM10/4/18
to
On 10/04/2018 08:14 AM, Paavo Helde wrote:
> On 3.10.2018 22:48, Tim Rentsch wrote:
>> To get one case out of the way - no sensible person uses 'char'
>> for anything other than character or text processing. In
>> particular using 'char' for holding an integer value is nutty
>> (not counting cases where the values come from character
>> constants or things like that).
>
> Ouch, that hurts! I have recently spent a lot of time on converting and
> storing numeric arrays in the smallest possible datatype, including
> 8-bit, in order to reduce the memory footprint and enhance the
> performance with large arrays. It's sad to hear I'm not sensible ;-)

Using the smallest possible datatype to store numeric arrays in is
perfectly sensible - but there are many different names that refer to
types which have that same minimum size: char, signed char, unsigned
char, int_least8_t, uint_least8_t, and if supported, int8_t and uint8_t.
Of all of those different names, "char" is by far the least appropriate
one to use for that purpose, because it's the only one where you cannot
be portably certain whether it's a signed or unsigned type. When using
them to store numbers, that's a critically important thing you need to
know. That's what's not sensible about your approach.
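
For what it's worth, a one-line check of which flavour you have (a
trivial sketch):

#include <climits>
#include <iostream>

int main(){
    // CHAR_MIN is 0 where plain char is unsigned, negative where signed.
    std::cout << "plain char is "
              << ( CHAR_MIN < 0 ? "signed" : "unsigned" ) << "\n";
}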

James Kuyper

unread,
Oct 4, 2018, 10:11:29 AM10/4/18
to
On 10/04/2018 01:47 AM, Paavo Helde wrote:
> On 4.10.2018 2:45, James Kuyper wrote:
>> On 10/03/2018 06:06 PM, Paavo Helde wrote:
>> ...
>>> Wow, after following this thread for quite some time I believe I now
>>> have finally found a case where C is qualitatively better than C++ ;-)
>>>
>>> char x = 65;
>>> printf("%c", x);
>>> printf("%d", x);
>>
>> Keep in mind that you can do exactly the same thing in C++, using the
>> exact same code. It would be more appropriate to compare <cstdio> and
>> <iostream>.
>
> printf() is not typesafe so cannot be really advocated for C++ usage.
> What one can do is to create a typesafe wrapper around printf, I'm using
> a home-grown one which allows me to write e.g.

I was responding to a claim that, for this purpose, C is better than
C++. That judgement was made despite the fact that C's printf() is not
typesafe. Is the C++ version of printf() any less typesafe than the C one?

James Kuyper

unread,
Oct 4, 2018, 10:15:21 AM10/4/18
to
On 10/04/2018 09:27 AM, Paavo Helde wrote:
> On 4.10.2018 14:54, David Brown wrote:
>> On 04/10/18 07:47, Paavo Helde wrote:
>>>
>>> printf() is not typesafe so cannot be really advocated for C++ usage.
>>
>> Of course printf is fine to use in C++.
>
> Yes, I agree it's fine to use. I just said it cannot be advocated,

It can't? I don't see what's preventing it from being advocated. The
advantage you originally described is perfectly real. It's not
necessarily important enough to justify using printf(), but I wouldn't
see anything wrong with someone deciding otherwise and expressing that
opinion here.

Paavo Helde

unread,
Oct 4, 2018, 10:45:59 AM10/4/18
to
One can argue that it is, because the expectations about the language
and compiler are different. For example, if I am changing a type I
expect the C++ compiler to diagnose all the places in the code which are
now broken because of this change. In C I know I cannot really expect
that, and making changes becomes much more careful, tedious work.

Chris Vine

unread,
Oct 4, 2018, 12:15:48 PM10/4/18
to
I don't seem to feel the angst others do with how the standard streams
print the value of the fixed or minimum sized integer types provided by
stdint.h. If your integer type is uint8_t and you want it to print out
with std::printf as an integer and not a character, you use the "%d" or
"%u" format specifier (it won't make any difference which) and not the
"%c" format specifier. If you want to print it out with std::ostream's
operator << as an integer and not a character, you cast it to int or
unsigned int when invoking operator <<.[1] It seems somewhat extreme
to break congruity with C and make all the fixed and minimum width
types in stdint.h their own distinct types just to please operator <<
and >>.

Likewise, if you want to print out the address held in a pointer, you
cast to void*. I don't see the problem. After all, by allowing casts
(for good reason) C++ is not soundly typed to begin with.

Chris

[1] In theory on some weird platform uint8_t might not be a typedef to
unsigned char, so maybe if you want to print it as a character you
should cast it to char although that seems somewhat pedantic.
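
A short sketch of the casts in question (both halves assume an ordinary
implementation where uint8_t is unsigned char):

#include <cstdint>
#include <cstdio>
#include <iostream>

int main(){
    std::uint8_t u = 65;
    std::printf( "%u\n", u );                        // promoted: prints 65
    std::cout << u << "\n";                          // character type: prints A
    std::cout << static_cast<unsigned>( u ) << "\n"; // prints 65

    const char *s = "hi";
    std::cout << s << "\n";                              // prints hi
    std::cout << static_cast<const void *>( s ) << "\n"; // prints the address
}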

james...@alumni.caltech.edu

unread,
Oct 4, 2018, 12:48:36 PM10/4/18
to
Correct. I was talking about the single object inserters and extractors.
Since values of type uint8_t get promoted to 'int', the inserter for int
should work with uint8_t objects. However, the extractor for int uses an
int&, and therefore won't work for uint8_t, and the same is true for
both the inserters and extractors that work with pointers to character
types, since integer promotions don't help with references and pointers.

Ralf Goertz

unread,
Oct 5, 2018, 2:57:02 AM10/5/18
to
Am Thu, 4 Oct 2018 17:15:29 +0100
schrieb Chris Vine <chris@cvine--nospam--.freeserve.co.uk>:

> I don't seem to feel the angst others do with how the standard streams
> print the value of the fixed or minimum sized integer types provided
> by stdint.h.

[OT] As a German it is always a bit strange to see or hear the word
angst in a conversation in english. According to dict.leo.org it
literally translates to its counterpart Angst with the modifiers
"Lebensangst" (which describes the mood of hopelessness and fear of the
future and which is rearely used AFAICT) or panisch (panic). Both those
meanings I have a hard time recognising as appropriate here. This is
probably another case of differences of dictionary definition and usage.
One other expample being "idiosyncratic" for which the primary
dictionary meaning apart from the medical term seems to be something
like "unbearably disgusting" which rarely fits in the contexts I read or
hear this word.

> If your integer type is uint8_t and you want it to print out with
> std::printf as an integer and not a character, you use the "%d" or
> "%u" format specifier (it won't make any difference which) and not the
> "%c" format specifier. If you want to print it out with
> std::ostream's operator << as an integer and not a character, you cast
> it to int or unsigned int when invoking operator <<.[1] It seems
> somewhat extreme to break congruity with C and make all the fixed and
> minimum width types in stdint.h their own distinct types just to
> please operator << and >>.

But you wouldn't need to break congruity, would you? I don't know that
much C but it doesn't have the *stream* operators << and >>. So it would
have been possible to make [u]int8_t a separate type which behaves like
[un]signed char in all aspects shared between C and C++ and still define
those operators to behave the way you would expect from integer types.


Chris Vine

unread,
Oct 5, 2018, 7:40:14 AM10/5/18
to
On Fri, 5 Oct 2018 08:56:54 +0200
Ralf Goertz <m...@myprovider.invalid> wrote:
> Am Thu, 4 Oct 2018 17:15:29 +0100
> schrieb Chris Vine <chris@cvine--nospam--.freeserve.co.uk>:
[snip]
> > If your integer type is uint8_t and you want it to print out with
> > std::printf as an integer and not a character, you use the "%d" or
> > "%u" format specifier (it won't make any difference which) and not the
> > "%c" format specifier. If you want to print it out with
> > std::ostream's operator << as an integer and not a character, you cast
> > it to int or unsigned int when invoking operator <<.[1] It seems
> > somewhat extreme to break congruity with C and make all the fixed and
> > minimum width types in stdint.h their own distinct types just to
> > please operator << and >>.
>
> But you wouldn't need to break congruity, would you? I don't know that
> much C but it doesn't have the *stream* operators << and >>. So it would
> have been possible to make [u]int8_t a separate type which behaves like
> [un]signed char in all aspects shared between C and C++ and still define
> those operators to behave the way you would expect from integer types.

I do not think that it is possible to "make [u]int8_t a separate type
which behaves like [un]signed char in all aspects shared between C and
C++ and still define [operators << and >>] to behave the way you would
expect from integer types". To provide overloading on operator <<
and >>, uint8_t would have to be a distinct type, rather as
wchar_t is (and presumably you would apply this to each fixed or
minimum width type). But if you did that then uint8_t and the other
fixed and minimum width integers _would_ behave differently between C
and C++. In particular, code which does not break the strict aliasing
rule in C might break it in C++.
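
A sketch of the kind of code at stake - well-defined today because
uint8_t in practice aliases unsigned char, but a strict aliasing
violation if uint8_t were made a distinct extended type:

#include <cstddef>
#include <cstdint>

unsigned sum_bytes( const int &x ){
    // Inspect the object representation byte by byte through uint8_t*.
    const std::uint8_t *p = reinterpret_cast<const std::uint8_t *>( &x );
    unsigned s = 0;
    for ( std::size_t i = 0; i < sizeof x; ++i )
        s += p[i];
    return s;
}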

Tim Rentsch

unread,
Oct 6, 2018, 8:46:10 AM10/6/18
to
Paavo Helde <myfir...@osa.pri.ee> writes:

> On 3.10.2018 22:48, Tim Rentsch wrote:
>
>> To get one case out of the way - no sensible person uses 'char'
>> for anything other than character or text processing. In
>> particular using 'char' for holding an integer value is nutty
>> (not counting cases where the values come from character
>> constants or things like that).
>
> Ouch, that hurts! I have recently spent a lot of time on converting
> and storing numeric arrays in the smallest possible datatype,
> including 8-bit, in order to reduce the memory footprint and enhance
> the performance with large arrays. It's sad to hear I'm not sensible
> ;-)

Did you use signed/unsigned char, or just plain char? My comment
was only about char, not signed char or unsigned char.

Tim Rentsch

unread,
Oct 6, 2018, 2:36:33 PM10/6/18
to
My comment was about the 'char' type, not the name. I suppose
there could be cases where using the 'char' type but with a
different name could be a good choice for some small integer
type, although I haven't yet thought of one.

> However, the problem is that std::int8_t and std::uint8_t
> are really just typedefs, not real types,

They could be typedefs, or they could be distinct types. AFAICT
both the C standard and C++ standard allow this. (I believe it
is also the case that the types in C++ must match the types in C,
but I haven't tried to verify that.)

> which makes them just
> cosmetics and makes it impossible to fix the std stream behavior, for
> example.

Personally I would prefer the [u]int8_t types in particular(*) to
be distinct from the standard character types, to accommodate
better behavior with respect to streams among other reasons.
Stronger type checking is another reason.

(*) Also other <stdint.h> types that could be character types.

Alf P. Steinbach

unread,
Oct 6, 2018, 3:13:57 PM10/6/18
to
You can do that with `enum` types.
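
For a flavour, a minimal sketch of the `enum` route (`u8` is a made-up
name; the list-initialization from an integer relies on a C++17 rule):

#include <iostream>

enum class u8 : unsigned char {};   // a distinct one-byte type

std::ostream &operator <<( std::ostream &os, u8 v ){
    return os << static_cast<unsigned>( v );     // print as a number
}

std::istream &operator >>( std::istream &is, u8 &v ){
    unsigned tmp;
    if ( is >> tmp ) v = static_cast<u8>( tmp ); // range is the caller's problem
    return is;
}

int main(){
    u8 x{ 65 };
    std::cout << x << ", sizeof = " << sizeof x << "\n";   // 65, sizeof = 1
}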

I discovered that there's a difference between doing it with `enum` type
and doing it with a `struct` type when I tried to define an UTF-8
encoding unit type to use instead of `char`. Namely, that the `enum` is
compatible with the common small buffer optimization of `std::string`,
which typically uses a `union`, while the `struct` type isn't. Or
wasn't: the rules may have changed a little, C++ has generally gotten a
little more permissive.

That permissiveness includes that there's now a Defect Report for the
wording in the standard that it's Undefined Behavior to cast a value to
an `enum` when it's outside the range of values defined for the `enum`,
even when it's in range for the underlying type. That UB, known to
everybody else, was discovered by the mighty as a consequence of
scrutinizing the definition of `std::byte`. So `std::byte` was good for
/something/, hallelujah :)


Cheers!,

- Alf

Chris Vine

unread,
Oct 6, 2018, 3:47:56 PM10/6/18
to
On Sat, 06 Oct 2018 11:36:18 -0700
Tim Rentsch <t...@alumni.caltech.edu> wrote:
> Paavo Helde <myfir...@osa.pri.ee> writes:
> > However, the problem is that std::int8_t and std::uint8_t
> > are really just typedefs, not real types,
>
> They could be typedefs, or they could be distinct types. AFAICT
> both the C standard and C++ standard allow this. (I believe it
> is also the case that the types in C++ must match the types in C,
> but I haven't tried to verify that.)

They must be type aliases in C, declared with typedef.

C++ requires the types declared in C's stdint.h to have the same
declaration in cstdint.

Tim Rentsch

unread,
Oct 6, 2018, 7:30:57 PM10/6/18
to
Chris Vine <chris@cvine--nospam--.freeserve.co.uk> writes:

> On Sat, 06 Oct 2018 11:36:18 -0700
> Tim Rentsch <t...@alumni.caltech.edu> wrote:
>
>> Paavo Helde <myfir...@osa.pri.ee> writes:
>>
>>> However, the problem is that std::int8_t and std::uint8_t
>>> are really just typedefs, not real types,
>>
>> They could be typedefs, or they could be distinct types. AFAICT
>> both the C standard and C++ standard allow this. (I believe it
>> is also the case that the types in C++ must match the types in C,
>> but I haven't tried to verify that.)
>
> They must be type aliases in C, declared with typedef.

Sorry, I am guilty of using sloppy language. The names do
need to be declared in <stdint.h> as if they were declared
with a typedef. What I meant is they do not need to be
aliases for any standard integer type: they could instead be
aliases for extended integer types, and thus distinct types
from any standard integer type (and in particular the standard
character types). C and C++ both allow extended integer
types.

> C++ requires the types declared in C's stdint.h to have the same
> declaration in cstdint.

I believe that is not right. C++ n4659 21.4 covers <cstdint>,
and 21.4.1 p2 says

The header defines all types and macros the same as the C
standard library header <stdint.h>.

The types have to match, but not the declarations. The synopsis
directly under 21.4.1 makes it clear that how the types are
declared in <cstdint> is different, or at least can be different,
than how they are declared in the C header <stdint.h>. Then in
section D.5, which discusses the C++ headers corresponding to C
headers, paragraph 3 says this:

Every other C header [including <stdint.h>], each of which
has a name of the form name.h, behaves as if each name placed
in the standard library namespace by the corresponding cname
header is placed within the global namespace scope, except
for the functions described in 29.9.5 [Mathematical special
functions], the declaration of std::byte (21.2.1), and the
functions and function templates described in 21.2.5 [byte
type operations]. It is unspecified whether these names are
first declared or defined within namespace scope (6.3.6) of
the namespace std and are then injected into the global
namespace scope by explicit using-declarations (10.3.3).

To me this looks like pretty convincing evidence that the types
have to be the same, but how they are declared does not.

Tim Rentsch

unread,
Oct 6, 2018, 7:32:10 PM10/6/18
to
It also could be done using extended integer types, which IMO
would be a better choice.

Alf P. Steinbach

unread,
Oct 6, 2018, 10:11:04 PM10/6/18
to
On 07.10.2018 01:31, Tim Rentsch wrote:
> "Alf P. Steinbach" <alf.p.stein...@gmail.com> writes:
>> On 06.10.2018 20:36, Tim Rentsch wrote:
>>> [snip]
>>> Personally I would prefer the [u]int8_t types in particular(*) to
>>> be distinct from the standard character types, to accommodate
>>> better behavior with respect to streams among other reasons.
>>> Stronger type checking is another reason.
>>>
>>> (*) Also other <stdint.h> types that could be character types.
>>
>> You can do that with `enum` types.
>
> It also could be done using extended integer types, which IMO
> would be a better choice.

A compiler vendor can offer extended integer types. For a compiler
vendor it would not be reasonable to offer such types as `enum` types.
So that was not what I was talking about.

But you cannot create extended raw integer types without making your
own version of the compiler. You have to choose between an `enum` and a
class type.

However, it's much easier to make your own compiler version now than it
used to be, especially with clang I believe. In the old days g++ was
implemented in pre-K&R C. And it was ugly. ;-)


Cheers & hth.,

- Alf

Chris M. Thomasson

unread,
Oct 7, 2018, 12:46:23 AM10/7/18
to
On 9/18/2018 3:56 AM, Ralf Goertz wrote:
> Hi,
>
> when I just tried to use uint8_t instead of int in a program (because of
> memory issues), I ran into the following problem:
>
> #include <iostream>
>
> int main() {
> uint8_t i;
> std::cin>>i;
> int j=i;
> std::cout<<j<<"\n";
> }
>
> Running this program and entering 4 results in the output of 52. I am
> aware that this because uint8_t is typedef'ed to unsigned char so my
> input is actually the character '4' which has ASCII code 52. However, I
> had the impression that those [u]intX_t types were there so that I can
> do math with them, not to deal with characters.
>
> So what is the canonical way to input/output integral types of varying
> size (in my actual program I use templates)?
>

You want signed 8 bits, use int8_t.

You want a signed char, use signed char. ;^)

signed char is at least 8 bits.

Chris Vine

unread,
Oct 7, 2018, 6:17:06 AM10/7/18
to
On Sat, 06 Oct 2018 16:30:42 -0700
Tim Rentsch <t...@alumni.caltech.edu> wrote:
> Chris Vine <chris@cvine--nospam--.freeserve.co.uk> writes:
>
> > On Sat, 06 Oct 2018 11:36:18 -0700
> > Tim Rentsch <t...@alumni.caltech.edu> wrote:
> >
> >> Paavo Helde <myfir...@osa.pri.ee> writes:
> >>
> >>> However, the problem is that std::int8_t and std::uint8_t
> >>> are really just typedefs, not real types,
> >>
> >> They could be typedefs, or they could be distinct types. AFAICT
> >> both the C standard and C++ standard allow this. (I believe it
> >> is also the case that the types in C++ must match the types in C,
> >> but I haven't tried to verify that.)
> >
> > They must be type aliases in C, declared with typedef.
>
> Sorry, I am guilty of using sloppy language. The names do
> need to be declared in <stdint.h> as if they were declared
> with a typedef. What I meant is they do not need to be
> aliases for any standard integer type: they could instead be
> aliases for extended integer types, and thus distinct types
> from any standard integer type (and in particular the standard
> character types). C and C++ both allow extended integer
> types.

You may be theoretically right, but it is entirely theoretical and is
never going to be done in practice for [u]int8_t. On any system
supporting [u]int8_t that you are going to come across in practice, char
will also be 8 bits in size. A C compiler is not going to provide
an extended integer type with the same size and other characteristics as
unsigned char and signed char just on the off chance that it might be
useful for C++'s type overloading on operator << when encountering
[u]int8_t, even if it could. I am not even sure that that would be
legal: "extended" implies something different to the standard integer
types, not something the same as a standard integer type.

> > C++ requires the types declared in C's stdint.h to have the same
> > declaration in cstdint.
>
> I believe that is not right. [snip]
> The types have to match, but not the declarations.

I think you are right about that. However, it doesn't change the
overarching point.

Chris Vine

unread,
Oct 7, 2018, 6:36:22 AM10/7/18
to
On Sun, 7 Oct 2018 11:16:44 +0100
Chris Vine <chris@cvine--nospam--.freeserve.co.uk> wrote:
> You may be theoretically right, but it is entirely theoretical and is
> never going to be done in practice for [u]int8_t. On any system
> supporting [u]int8_t that you are going to come across in practice, char
> will also be 8 bits in size. A C compiler is not going to provide
> an extended integer type with the same size and other characteristics as
> unsigned char and signed char just on the off chance that it might be
> useful for C++'s type overloading on operator << when encountering
> [u]int8_t, even if it could. I am not even sure that that would be
> legal: "extended" implies something different to the standard integer
> types, not something the same as a standard integer type.

The last point is also sort of implied by the wording for the fixed size
types in C99/11:

"These types are optional. However, if an implementation provides
integer types with widths of 8, 16, 32, or 64 bits, no padding bits,
and (for the signed types) that have a two's complement
representation, it shall define the corresponding typedef names."

You can only have the uint8_t typedef aliasing one integer type, and
likewise the int8_t typedef. Since on practically all systems supporting
[u]int8_t signed and unsigned char will be 8 bits without padding and
(for the signed one) with 2's complement representation, it follows that
you cannot on such a system have identical but distinct types under some
other name for the [u]int8_t typedefs.

James Kuyper

unread,
Oct 7, 2018, 12:49:50 PM10/7/18
to
On 10/07/2018 06:16 AM, Chris Vine wrote:
> On Sat, 06 Oct 2018 16:30:42 -0700
> Tim Rentsch <t...@alumni.caltech.edu> wrote:
...
>> Sorry, I am guilty of using sloppy language. The names do
>> need to be declared in <stdint.h> as if they were declared
>> with a typedef. What I meant is they do not need to be
>> aliases for any standard integer type: they could instead be
>> aliases for extended integer types, and thus distinct types
>> from any standard integer type (and in particular the standard
>> character types). C and C++ both allow extended integer
>> types.
>
> You may be theoretically right, but it is entirely theoretical and is
> never going to be done in practice for [u]int8_t. On any system
> supporting [u]int8_t that you are going to come across in practice, char
> will also be 8 bits in size.

That is absolutely mandatory. [u]int8_t is required to have exactly 8
bits; they're not allowed to have padding bits. CHAR_BIT is required to
be >= 8. Since every non-bitfield type is required to occupy an integral
number of bytes, the only way that works is if CHAR_BIT == 8.

[u]int8_t can be a typedef for an extended integer type, but it must be
the same size as the character types.

...
> [u]int8_t, even if it could. I am not even sure that that would be
> legal: "extended" implies something different to the standard integer
> types, not something the same as a standard integer type.

You may consider that implied, but there's not a single clause in the
standard that would be violated by supporting such an extended type.

Chris Vine

unread,
Oct 7, 2018, 2:28:12 PM10/7/18
to
I do consider it implied, so it follows that I disagree with you. See
also my other post on the issue.

In any event, no implementation that I know of does what you say, and I
do not think ever will.

David Brown

unread,
Oct 7, 2018, 2:36:24 PM10/7/18
to
In gcc implementations, these types are often defined with something like:

typedef __INT8_TYPE__ int8_t;
typedef __UINT8_TYPE__ uint8_t;

These types are internal symbols in the compiler - whether or not they
are "signed char" and "unsigned char" or extended integer types is not
particularly clear.

On the AVR port of gcc, they are defined as :

typedef int int8_t __attribute__((__mode__(__QI__)));
typedef unsigned int uint8_t __attribute__((__mode__(__QI__)));


That is, they are defined as "int" and "unsigned int" with some
gcc-specific size modifications.

I have no idea /why/ they are defined this way. (I must remember to ask
one of the port maintainers about that!) But they are not simple
typedefs to signed and unsigned char.

Chris Vine

unread,
Oct 7, 2018, 3:08:03 PM10/7/18
to
On Sun, 7 Oct 2018 20:36:12 +0200
David Brown <david...@hesbynett.no> wrote:
[snip]
> In gcc implementations, these types are often defined with something like:
>
> typedef __INT8_TYPE__ int8_t;
> typedef __UINT8_TYPE__ uint8_t;
>
> These types are internal symbols in the compiler - whether or not they
> are "signed char" and "unsigned char" or extended integer types is not
> particularly clear.
>
> On the AVR port of gcc, they are defined as :
>
> typedef int int8_t __attribute__((__mode__(__QI__)));
> typedef unsigned int uint8_t __attribute__((__mode__(__QI__)));
>
>
> That is, they are defined as "int" and "unsigned int" with some
> gcc-specific size modifications.
>
> I have no idea /why/ they are defined this way. (I must remember to ask
> one of the port maintainers about that!) But they are not simple
> typedef's to signed and unsigned char.

Is that an 8-bit processor, with chars and ints both of size 1? Or is
that implementation not strictly C conforming?

James Kuyper

unread,
Oct 7, 2018, 4:01:36 PM10/7/18
to
So, if there is a clause that it violates, you can cite that clause and
explain how such an implementation would be considered in violation of
that clause.

David Brown

unread,
Oct 7, 2018, 4:03:20 PM10/7/18
to
It is an 8-bit processor, chars are 8-bit, but ints are 16-bit. It is
the "mode" attribute that determines the size of the type.

Chris Vine

unread,
Oct 7, 2018, 4:24:08 PM10/7/18
to
On Sun, 7 Oct 2018 16:01:24 -0400
James Kuyper <james...@alumni.caltech.edu> wrote:

You seem super aggressive today.

I said it was implied, for reasons given in my earlier posts which you
are free to read at your leisure. That should make my position clear
enough I hope. The standard leaves many things implicit.

I am not asking you to agree with me. I am perfectly happy that you do
not.

James Kuyper

unread,
Oct 7, 2018, 5:05:44 PM10/7/18
to
The only reasons you gave aren't sufficient to make an implementation
non-conforming. Someone could say "I think the term 'null pointer
constant' implies that it must have a pointer type" - but the fact that
he feels that way wouldn't make an implementation that recognizes 0 as
an NPC non-conforming.

Chris Vine

unread,
Oct 7, 2018, 7:22:58 PM10/7/18
to
On Sun, 7 Oct 2018 17:05:32 -0400
James Kuyper <james...@alumni.caltech.edu> wrote:
[snip]
> The only reasons you gave aren't sufficient to make an implementation
> non-conforming. Someone could say "I think the term 'null pointer
> constant' implies that it must have a pointer type" - but the fact that
> he feels that way wouldn't make an implementation that recognizes 0 ans
> an NPC non=conforming.

That is a complete non-sequitur. The standard explicitly states that 0
is a null pointer constant, so implication plays no role: "A null
pointer constant is an integer literal with value zero or a prvalue of
type std::nullptr_t".

James Kuyper

unread,
Oct 8, 2018, 10:44:08 AM10/8/18
to
My point was that when the standard provides a definition for a term,
the only thing that matters to the meaning of that term is the
definition, not any of the implications you might derive from the
ordinary English meaning of the words that make up the term. Here's the
relevant definitions:

"There may also be implementation-defined _extended signed integer
types_." (3.9.1p2).

"for each of the extended signed integer types there exists a
corresponding _extended unsigned integer type_ with the same
amount of storage and alignment requirements.... the extended signed
integer types and _extended unsigned integer_ types are collectively
called the extended integer types." (3.9.1p3)

I've used underscores to bracket the phrases which were italicized, an
ISO convention indicating that a sentence in which an italicized phrase
occurs is considered to constitute the official definition of the term.

If you pay any attention to the ordinary English meaning of "extended"
to reach your conclusion, you're ignoring ISO's conventions that allow
standards to define a meaning for a phrase that need not be particularly
closely related to the ordinary English interpretation of that phrase.

Can you derive the requirement that an extended integer type must be
implemented differently from any standard integer type, without
referencing implications you've derived from the word "extended"?

Tim Rentsch

unread,
Oct 8, 2018, 11:08:17 AM10/8/18
to
David Brown <david...@hesbynett.no> writes:

> To me, "int8_t" says "8-bit signed integer".

The problem is that to other people it says something
different, and none of your arguments or biases is
ever going to change that.

Chris Vine

unread,
Oct 8, 2018, 12:11:09 PM10/8/18
to
On Mon, 8 Oct 2018 10:43:55 -0400
I don't think this one works much better than your last.

The words "there may also be implementation-defined extended signed
integer types" do not tell you what extended signed integer types are
or how they differ from the standard signed integer types (if they did
there would be no argument), just that they may exist.

David Brown

unread,
Oct 9, 2018, 4:12:16 AM10/9/18
to
That is an important point, yes.

I think few people will see it as significantly different, but I have to
accept that /some/ people will.

I can't give anything more than my own experience as a justification for
thinking that "int8_t means 8-bit signed integer" is going to be the
most common interpretation. Perhaps this is biased by my field of
programming - in my line, the size-specific types are very heavily used,
as are "home made" equivalents from before C99.

You didn't answer my question - what names would you personally prefer
or recommend for "small signed integer" and "8-bit -128 to +127 signed
integer"? You have said you feel "int8_t" is a poor name, as it "says
both too much and too little". Have you alternative suggestions?


james...@alumni.caltech.edu

unread,
Oct 9, 2018, 12:36:57 PM10/9/18
to
I left out a couple of other relevant clauses:
"... The standard and extended signed integer types are collectively
called signed integer types." (6.7.1p2)
"The standard and extended unsigned integer types are collectively
called unsigned integer types. ..." (6.7.1p3)
This is what tells you everything you need to know about what extended
integer types are, in order to cope with the possibility that your code
might use them indirectly, such as through a typedef or template
parameter. Everything that the standard says about signed integer types,
or unsigned integer types, or integer types in general, or about
arithmetic types - all of it applies to the extended integer types the
same way it applies to the standard integer types.

> or how they differ from the standard signed integer types (if they did
> there would be no argument), just that they may exist.

Here's some more relevant quotes that I left out:
"The rank of any standard integer type shall be greater than the rank of
any extended integer type with the same size.
...
The rank of any extended signed integer type relative to another
extended signed integer type with the same size is implementation-
defined, but still subject to the other rules for determining the
integer conversion rank." (6.7.4p1)

The definitions of the extended integer types explain the main way in
which they differ from the standard integer types: they are
implementation-defined rather than defined by the standard. This is the
other main thing you need to know in order to use them - where to go to
find out their names: the implementation's documentation. The two clauses
cited above are the only other way that they differ from standard types.

In one sense you're correct - if an extended integer type was otherwise
implemented exactly the same way as a standard integer type, it would
still have to have a lower integer conversion rank, which would probably
have some testable consequences. But it's entirely permissible for that
to be the only difference between them, just as it is permissible for
that to be the only difference between "short" and "int" (as used to be
commonplace), or between "int" and "long" (as is currently commonplace)
or between "short" and "long" (which, as far as I know, was unique to
Cray implementations), or between "long" and "long long" (which has also
occurred in many real-world implementations).
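
A minimal sketch of that last point: on a typical LP64 platform (e.g.
x86-64 Linux) long and long long have identical size and representation,
yet remain distinct types with distinct conversion ranks. The first
assertion below is a platform assumption, not a portable guarantee:

#include <type_traits>

static_assert(sizeof(long) == sizeof(long long),
              "assumption: an LP64-style ABI");            // same object size
static_assert(!std::is_same_v<long, long long>,
              "holds everywhere: the types are distinct"); // never the same type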

Chris Vine

unread,
Oct 9, 2018, 2:25:29 PM10/9/18
to
> parameter. ... [snip]

Of course it doesn't. Nor does the part of your posting concerning the
fact that integer types have conversion ranks (which I have snipped for
clarity's sake) lend anything at all to the point.

The standard actually says very little about what integer types are,
other than by specifying minimum ranges for them and requiring unsigned
integers to implement a 2^n binary representation for overflow purposes,
leaving the natural (implicit) meaning of "integer" and "integral" to
carry the weight. It also says nothing about how, say, an "extended"
signed integer differs from a "standard" signed integer, saying only
that the extended signed integer type is implementation defined, and
that it must have an unsigned analogue.

The fact that it is implementation defined does not mean that the
implementation can define anything it wants to as an "extended signed
integer type". Each word must be examined for its meaning. The word
"integer" carries the necessary implication that what is implementation
defined must be capable of holding whole numbers in an exact form (it
cannot be outwardly represented in floating point form). The word
"signed" means that that form must be capable of representing negative
numbers. Consideration of what implication (if any) the word "extended"
carries, which is an issue on which two reasonable people could differ,
has been followed by your dogmatic insistence, by reference to portions
of the standard that say nothing about the issue, that the standard
demands that "extended" must have no meaning at all. I disagree on
that.

james...@alumni.caltech.edu

unread,
Oct 9, 2018, 3:32:50 PM10/9/18
to
On Tuesday, October 9, 2018 at 2:25:29 PM UTC-4, Chris Vine wrote:
> On Tue, 9 Oct 2018 09:36:43 -0700 (PDT)
> james...@alumni.caltech.edu wrote:
....
> > "... The standard and extended signed integer types are collectively
> > called signed integer types." (6.7.1p2)
> > "The standard and extended unsigned integer types are collectively
> > called unsigned integer types. ..." (6.7.1p3)
> > This is what tells you everything you need to know about what extended
> > integer types are, in order to cope with the possibility that your code
> > might use them indirectly, such as through a typedef or template
> > parameter. ... [snip]
>
> Of course it doesn't. Nor does the part of your posting concerning the
> fact that integer types have conversion ranks (which I have snipped for
> clarity's sake) lend anything at all to the point.

You said that the definitions didn't explain the differences between
extended and standard integer types. I started out writing that they
fully explained the only difference between the two categories. Then I
remembered that "only" was incorrect. I added the bit about conversion
ranks so as to properly qualify my assertion that the definitions cover
the main difference between those two categories. If I had only said
"main difference" without explaining what the other differences were, it
would only have led to questions.

> The standard actually says very little about what integer types are,
> other than by specifying minimum ranges for them and requiring unsigned
> integers to implement a 2^n binary representation for overflow purposes,
> leaving the natural (implicit) meaning of "integer" and "integral" to
> carry the weight. It also says nothing about how, say, an "extended"
> signed integer differs from a "standard" signed integer, saying only
> that the extended signed integer type is implementation defined,

Every statement the standard makes about signed integer types, integer
types, or arithmetic types, is a statement that constrains the
implementation of extended signed integer types - and there's a lot of
statements of that kind scattered through the standard, especially
section 7. For example, 7.6.9p2 implies that relational operators must
be supported for extended integer types. You've touched on only a small
fraction of all the things it says about such types.
The standard doesn't impose any other constraints on the implementation
of signed integer types.

> ... and
> that it must have an unsigned analogue.

Which is NOT a difference from the standard signed integer types.

> ... Consideration of what implication (if any) the word "extended"
> carries, which is an issue on which two reasonable people could differ,
> has been followed by your dogmatic insistence, by reference to portions
> of the standard that say nothing about the issue, that the standard
> demands that "extended" must have no meaning at all. I disagree on
> that.

No, the word "extended" in "extended integer types" refers very
specifically to the fact that the types are defined by the
implementation as an extension to C++. A type with the same arithmetic
properties and representation as signed char, but with a different
conversion rank and without any corresponding operator overloads for
standard library functions that treat it as a character type, would be a
perfectly reasonable extension to C++ - I don't see how the concept of an
extension could be interpreted as prohibiting such a type.
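
Only an implementation can add a genuine extended integer type, so this
cannot be written portably in user code; but a wrapper class can sketch
the observable behaviour being described - the representation and
arithmetic feel of signed char without the character-style stream
treatment. The name i8 is hypothetical:

#include <iostream>

struct i8 {                        // hypothetical stand-in, not a real extended type
    signed char v;                 // same size and representation as signed char
};

// Streamed as a number, never as a character:
std::ostream& operator<<(std::ostream& os, i8 x) {
    return os << static_cast<int>(x.v);
}
std::istream& operator>>(std::istream& is, i8& x) {
    int tmp;
    is >> tmp;                     // reads "4" as the number 4, not the character '4'
    x.v = static_cast<signed char>(tmp);
    return is;
}

int main() {
    i8 x{};
    std::cin >> x;                 // entering 4 stores the value 4...
    std::cout << x << "\n";        // ...and prints 4, not 52
}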

Chris Vine

unread,
Oct 9, 2018, 8:05:37 PM10/9/18
to
On Tue, 9 Oct 2018 12:32:39 -0700 (PDT)
james...@alumni.caltech.edu wrote:
> On Tuesday, October 9, 2018 at 2:25:29 PM UTC-4, Chris Vine wrote:
> > On Tue, 9 Oct 2018 09:36:43 -0700 (PDT)
> > james...@alumni.caltech.edu wrote:
> ....
> > > "... The standard and extended signed integer types are collectively
> > > called signed integer types." (6.7.1p2)
> > > "The standard and extended unsigned integer types are collectively
> > > called unsigned integer types. ..." (6.7.1p3)
> > > This is what tells you everything you need to know about what extended
> > > integer types are, in order to cope with the possibility that your code
> > > might use them indirectly, such as through a typedef or template
> > > parameter. ... [snip]
> >
> > Of course it doesn't. Nor does the part of your posting concerning the
> > fact that integer types have conversion ranks (which I have snipped for
> > clarity's sake) lend anything at all to the point.
>
> You said that the definitions didn't explain the differences between
> extended and standard integer types. I started out writing that they
> fully explained the only difference between the two categories.

I know that, you are restating yourself. I disagree.

> Then I
> remembered that "only" was incorrect. I added the bit about conversion
> ranks so as to properly qualify my assertion that the definitions cover
> the main difference between those two categories. If I had only said
> "main difference" without explaining what the other differences were, it
> would only have led to questions.
>
> > The standard actually says very little about what integer types are,
> > other than by specifying minimum ranges for them and requiring unsigned
> > integers to implement a 2^n binary representation for overflow purposes,
> > leaving the natural (implicit) meaning of "integer" and "integral" to
> > carry the weight. It also says nothing about how, say, an "extended"
> > signed integer differs from a "standard" signed integer, saying only
> > that the extended signed integer type is implementation defined,
>
> Every statement the standard makes about signed integer types, integer
> types, or arithmetic types, is a statement that constrains the
> implementation of extended signed integer types

I never said they didn't.

> - and there's a lot of
> statements of that kind scattered through the standard, especially
> section 7. For example, 7.6.9p2 implies that relational operators must
> be supported for extended integer types. You've touched on only a small
> fraction of all the things it says about such types.
> The standard doesn't impose any other constraints on the implementation
> of signed integer types.

None of which are relevant to the issue which began this. I dealt with
the (irrelevant) provisions of the standard which you had successively
argued supported your assertions. I am not going to deal with the
irrelevant provisions you haven't previously argued do so. As an
aside, the most essential property of a signed integer type, that
within a given range it must be capable of holding whole numbers on the
number line in an exact form, is to the best of my knowledge not
explicitly stated in the standard. Nor need it be, as it is
sufficiently implicit.

> > ... and
> > that it must have an unsigned analogue.
>
> Which is NOT a difference from the standard signed integer types.

I never said it was.

> > ... Consideration of what implication (if any) the word "extended"
> > carries, which is an issue on which two reasonable people could differ,
> > has been followed by your dogmatic insistence, by reference to portions
> > of the standard that say nothing about the issue, that the standard
> > demands that "extended" must have no meaning at all. I disagree on
> > that.
>
> No, the word "extended" in "extended integer types" refers very
> specifically to the fact that the types are defined by the
> implementation as an extension to C++. A type with the same arithmetic
> properties and representation as signed char, but with a different
> conversion rank and without any corresponding operator overloads for
> standard library functions that treat it as a character type, would be a
> perfectly reasonable extension to C++ - I don't see how the concept of an
> extension could be interpreted as prohibiting such a type.

Conversion rank is a side issue: to say that any standard type can
become an extended type because extended types have a different
conversion rank from standard types of the same size is a circularity.

This issue has become both full of prolix irrelevance and pointless.
No implementation is going to do what you say it can do, and I don't
especially care if one did. Aside from that, I am not going to repeat
my views. They are there for the reading. I do not intend to respond
further unless you misrepresent my views, as I think you have become
close to doing.

Tim Rentsch

unread,
Oct 10, 2018, 7:44:16 AM10/10/18
to
"Alf P. Steinbach" <alf.p.stein...@gmail.com> writes:

> On 07.10.2018 01:31, Tim Rentsch wrote:
>
>> "Alf P. Steinbach" <alf.p.stein...@gmail.com> writes:
>>
>>> On 06.10.2018 20:36, Tim Rentsch wrote:
>>>
>>>> [snip]
>>>> Personally I would prefer the [u]int8_t types in particular(*) to
>>>> be distinct from the standard character types, to accommodate
>>>> better behavior with respect to streams among other reasons.
>>>> Stronger type checking is another reason.
>>>>
>>>> (*) Also other <stdint.h> types that could be character types.
>>>
>>> You can do that with `enum` types.
>>
>> It also could be done using extended integer types, which IMO
>> would be a better choice.
>
> A compiler vendor can offer extended integer types. For a
> compiler vendor it would not be reasonable to offer such types
> as `enum` types. So that was not what I was talking about.

I see, you were changing the subject. The discussion had been
about what implementations could do, and you wanted to talk
about what user programs could do.
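
For readers following the quoted `enum` suggestion, a minimal sketch of
that approach in user code. C++17's std::byte is defined in exactly this
way (enum class byte : unsigned char), though std::byte deliberately
omits arithmetic; the name u8 and the operators below are illustrative
additions:

#include <iostream>

enum class u8 : unsigned char {};  // a distinct type with char-sized representation

constexpr u8 operator+(u8 a, u8 b) {
    return static_cast<u8>(static_cast<unsigned>(a) + static_cast<unsigned>(b));
}

std::ostream& operator<<(std::ostream& os, u8 x) {
    return os << static_cast<unsigned>(x);  // numeric output by construction
}

int main() {
    u8 a{4}, b{5};                 // list-init from integers: C++17
    std::cout << a + b << "\n";    // prints 9, never a tab character
}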

Alf P. Steinbach

unread,
Oct 10, 2018, 8:08:02 AM10/10/18
to
Not sure where you get these ideas. But if you refrain from speculating
about motives, then you can save some time. For yourself and others.

I was using my time to /help/ you, because you expressed a wish to
have `uint8_t` types "distinct from the standard character types".

Then I get in return that I was changing the subject, as if I was not
comfortable with the current subject. That's not nice behavior. I want
you to improve in that regard.


Cheers!,

- Alf

James Kuyper

unread,
Oct 10, 2018, 1:29:43 PM10/10/18
to
On 10/09/2018 08:05 PM, Chris Vine wrote:
> On Tue, 9 Oct 2018 12:32:39 -0700 (PDT)
> james...@alumni.caltech.edu wrote:
>> On Tuesday, October 9, 2018 at 2:25:29 PM UTC-4, Chris Vine wrote:
>>> On Tue, 9 Oct 2018 09:36:43 -0700 (PDT)
>>> james...@alumni.caltech.edu wrote:
>> ....
>>>> "... The standard and extended signed integer types are collectively
>>>> called signed integer types." (6.7.1p2)
>>>> "The standard and extended unsigned integer types are collectively
>>>> called unsigned integer types. ..." (6.7.1p3)
>>>> This is what tells you everything you need to know about what extended
>>>> integer types are, in order to cope with the possibility that your code
>>>> might use them indirectly, such as through a typedef or template
>>>> parameter. ... [snip]
>>>
>>> Of course it doesn't. Nor does the part of your posting concerning the
>>> fact that integer types have conversion ranks (which I have snipped for
>>> clarity's sake) lend anything at all to the point.
>>
>> You said that the definitions didn't explain the differences between
>> extended and standard integer types. I started out writing that they
>> fully explained the only difference between the two categories.
>
> I know that,

How in the world could you know what I started out writing? Unless you
have me under surveillance, you only saw the version I posted after making
the changes I described later.

> ... you are restating yourself. I disagree.

No, I was not restating myself. I never published that version of the
statement, and I mentioned that version in this context only to explain
that I realized that that version would have been incorrect.

>>> The standard actually says very little about what integer types are,
>>> other than by specifying minimum ranges for them and requiring unsigned
>>> integers to implement a 2^n binary representation for overflow purposes,
>>> leaving the natural (implicit) meaning of "integer" and "integral" to
>>> carry the weight. It also says nothing about how, say, an "extended"
>>> signed integer differs from a "standard" signed integer, saying only
>>> that the extended signed integer type is implementation defined,
>>
>> Every statement the standard makes about signed integer types, integer
>> types, or arithmetic types, is a statement that constrains the
>> implementation of extended signed integer types
>
> I never said they didn't.

No, but your claim that the standard "says very little" about such
matters doesn't hold up when you consider everything it says in that
manner. We're talking about hundreds, possibly thousands of words of
relevant specification.

>> - and there's a lot of
>> statements of that kind scattered through the standard, especially
>> section 7. For example, 7.6.9p2 implies that relational operators must
>> be supported for extended integer types. You've touched on only a small
>> fraction of all the things it says about such types.
>> The standard doesn't impose any other constraints on the implementation
>> of signed integer types.
>
> None of which are relevant to the issue which began this. I dealt with
> the (irrelevant) provisions of the standard which you had successively
> argued supported your assertions.

The provisions of the standard that you specify as being irrelevant are,
in fact, irrelevant to your main point; but you haven't bothered
explaining or defending your main point, preferring argument by
assertion. As a result, very little of what I have said in response has
been addressed to your main point - most of it has involved correcting
misstatements on your part that you made while not addressing the main
point, such as claiming that the standard says little about extended
integer types, or implying that there was something unexplained about
the differences between extended and standard types.

> I am not going to deal with the
> irrelevant provisions you haven't previously argued do so. As an
> aside, the most essential property of a signed integer type, that
> within a given range it must be capable of holding whole numbers on the
> number line in an exact form, is to the best of my knowledge not
> explicitly stated in the standard. Nor need it be, as it is
> sufficiently implicit.

It's implied by what the standard says about how integer types may be
represented. The representation described can only represent integers,
and can represent every integer value between the smallest and largest
integer values that are representable by it.
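
A small illustration of that implication, using int8_t (which, where it
exists, is required to be an 8-bit two's-complement type with no padding
bits):

#include <cstdint>
#include <limits>

// The value set is every whole number in [-128, 127], each represented
// exactly, and nothing else.
static_assert(std::numeric_limits<std::int8_t>::min() == -128, "");
static_assert(std::numeric_limits<std::int8_t>::max() == 127, "");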

>>> ... Consideration of what implication (if any) the word "extended"
>>> carries, which is an issue on which two reasonable people could differ,
>>> has been followed by your dogmatic insistence, by reference to portions
>>> of the standard that say nothing about the issue, that the standard
>>> demands that "extended" must have no meaning at all. I disagree on
>>> that.
>>
>> No, the word "extended" in "extended integer types" refers very
>> specifically to the fact that the types are defined by the
>> implementation as an extension to C++. A type with the same arithmetic
>> properties and representation as signed char, but with a different
>> conversion rank and without any corresponding operator overloads for
>> standard library functions that treat it as a character type, would be a
>> perfectly reasonable extension to C++ - I don't see how the concept of an
>> extension could be interpreted as prohibiting such a type.
>
> Conversion rank is a side issue: ...

Conversion rank is not an issue at all; as I said earlier, I mentioned
it only to make it easier to correctly word my assertion that the
standard does describe the main difference between those type
categories. Conversion rank is the other, much more minor difference,
and irrelevant to my arguments (except insofar as it complicated them by
requiring me to explain "main difference" versus "only difference").

> .... to say that any standard type can
> become an extended type because extended types have a different
> conversion rank from standard types of the same size is a circularity.

I'm not sure what that phrase means to you, but I am quite sure that
phrase does not describe any assertion that I made. I was not talking
about a standard type becoming an extended type. I was talking about a
standard type and an extended type having the same representation and
mathematical properties, just like the fact that, on many systems, "int"
and "long" have the same representation and mathematical properties,
despite being different and incompatible types. And mentioning conversion
rank was something I added only because the standard requires that the
members of any such pair of types must have different conversion rank -
that fact had no other role to play in my argument that such a pair
would be permitted (the pair would definitely not be permitted if it
violated that requirement).

Tim Rentsch

unread,
Oct 11, 2018, 3:57:01 AM10/11/18
to
That's a strawman argument. There are several reasons a C
implementation might choose to keep [u]int8_t distinct from the
character types that have nothing to do with C++. Also the
people who did the C++ implementation could provide their own
C implementation, and choose to keep the [u]int8_t types
separate precisely because they want C++ to take advantage
of that.

> even if it could. I am
> not even sure that that would be legal: "extended" implies
> something different to the standard integer types, not something
> the same as a standard integer type.

I think you're grasping at straws here. Extended integer types
are called "extended" only because they are not standard integer
types. The set of standard integer types can have two or more
types that are "the same" but still are distinct types (and
indeed that is the case in most implementations, either int/long
or long/long long). The rules for integer conversion rank
tacitly acknowledge the possibility of extended integer types
that are "the same" but still are distinct types. Extended
integer types are always disjoint from standard integer types by
virtue of how they are defined. But being distinct doesn't imply
they can't be "the same": we have standard integer types that
are distinct but "the same", and extended integer types that are
distinct but "the same", so there is no reason to suppose that an
extended integer type, which is always distinct from every
standard integer type, can't be "the same" as a standard integer
type. Are you confusing the notions of being distinct and having
similar characteristics? The C and C++ standards clearly admit
the possibility of having two integer types that have the same
size, width, representation, alignment, and so forth, and yet
still are distinct types. I don't see any evidence that the rule
is any different for a standard integer type and an extended
integer type.
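
A sketch of "distinct but the same" from the user's side: even on an ABI
where long and long long are bit-for-bit identical, the type system still
tells them apart, for example in overload resolution:

#include <iostream>

void f(long)      { std::cout << "long\n"; }
void f(long long) { std::cout << "long long\n"; }

int main() {
    f(1L);    // exact match: chooses f(long)
    f(1LL);   // exact match: chooses f(long long)
    // Distinctness is a property of the type system, not of the
    // representation, so both overloads coexist on every platform.
}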

Chris Vine

unread,
Oct 11, 2018, 6:50:50 AM10/11/18
to
On Thu, 11 Oct 2018 00:56:43 -0700
I can't see how you can say it is a "strawman argument", which is a
ruse (that is, from Wikipedia: "an informal fallacy based on giving the
impression of refuting an opponent's argument, while actually refuting
an argument that was not presented by that opponent"). I was saying
that even if you were right, in a case where unsigned/signed char meet
the requirements for [u]int8_t you would not in practice come across an
"extended" type with the same size and other characteristics as that
unsigned/signed char for the [u]int8_t typedef. It was a prediction,
and a legitimate one even if you disagree with it. The refutation of
your argument, in which I indicated why you may not be right, was in my
text below ("I am not even sure that that would be legal ..."), which
you also disagree with.

Of course, the proof of the pudding as regards what happens "in
practice" is what compiler vendors do. I am content to be proved wrong
about that.
I think this one has been beaten to death. I note that you don't agree
with me.