
bit-fields of type unsigned long and unsigned long long


Philipp Klaus Krause

Jun 22, 2021, 10:49:55 AM
Since bit-fields of type unsigned long and unsigned long long are
useful, commonly supported by implementations, and commonly used, I'd
like to see the standard support them.

Here's a draft of a proposal for that:

http://www.colecovision.eu/stuff/proposal-unsigned-long-bit-field.html

David Brown

Jun 22, 2021, 11:18:16 AM
On 22/06/2021 16:49, Philipp Klaus Krause wrote:
> Since bit-fields of type unsigned long and unsigned long long are
> useful, commonly supported by implementations, and commonly used, I'd
> like to see the standard support them.

The standard /does/ support them - it simply does not mandate that an
implementation supports them. It's good to be accurate in your wording
for a proposal like that.

In my own code, I don't believe I have ever had need for types longer
than "unsigned int" in bitfields - though I have certainly used the type
"uint32_t" which happens to be "unsigned long" on several targets even
though "unsigned int" is the same size.

But I have a lot more use of smaller integer types. And I use
enumerated types in bitfields too, though these usually require a
minimum size matching "int" (unless you use compiler-specific extensions
to change that).
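
A sketch of the kind of compiler-specific extension alluded to here
(gcc syntax; the type and field names are invented for illustration):

    /* Without the (non-standard) 'packed' attribute, this enum is
       int-sized, and a bit-field of it typically forces an int-sized
       container. */
    enum __attribute__((packed)) colour { RED, GREEN, BLUE };

    struct pixel {
        enum colour c : 2;         /* enum bit-fields: a gcc/clang extension */
        unsigned char brightness : 6;
    };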


I would suggest that it is pointless to add to the list of types
mandated in the C standard here. Rather, it should simply say that a
bit-field shall have "an integer type". That covers all sizes, shorter
than int as well as longer, signed as well as unsigned. After all,
pretty much any serious modern C compiler will already support all
integer types here.

And I would also be even happier with "an integer type or enumerated type".

Keith Thompson

Jun 22, 2021, 2:17:45 PM
David Brown <david...@hesbynett.no> writes:
> On 22/06/2021 16:49, Philipp Klaus Krause wrote:
>> Since bit-fields of type unsigned long and unsigned long long are
>> useful, commonly supported by implementations, and commonly used, I'd
>> like to see the standard support them.
>
> The standard /does/ support them - it simply does not mandate that an
> implementation supports them. It's good to be accurate in your wording
> for a proposal like that.

In the proposal, I suggest:

This adds unsigned long and unsigned long long to the +required+
supported types for bit-fields.

> In my own code, I don't believe I have ever had need for types longer
> than "unsigned int" in bitfields - though I have certainly used the type
> "uint32_t" which happens to be "unsigned long" on several targets even
> though "unsigned int" is the same size.
>
> But I have a lot more use of smaller integer types. And I use
> enumerated types in bitfields too, though these usually require a
> minimum size matching "int" (unless you use compiler-specific extensions
> to change that).
>
> I would suggest that it is pointless to add to the list of types
> mandated in the C standard here. Rather, it should simply say that a
> bit-field shall have "an integer type". That covers all sizes, shorter
> than int as well as longer, signed as well as unsigned. After all,
> pretty much any serious modern C compiler will already support all
> integer types here.
>
> And I would also be even happier with "an integer type or enumerated type".

I agree. If you're going to mandate support for additional types, why
not go all the way?

Currently, plain int bit fields have implementation-defined signedness.
I'm not a big fan of that rule, but logically it should be extended to
bit fields of other integer types specified without a "signed" or
"unsigned" keyword -- except that a plain char bit field should IMHO
have the same signedness as plain char.
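
A minimal probe of that implementation-defined signedness (storing -1
in a plain "int" 1-bit field yields -1 if the implementation treats it
as signed, and 1 if unsigned):

    #include <stdio.h>

    struct s { int f : 1; };

    int main(void) {
        struct s x = { -1 };
        printf("plain int bit-field is %s\n",
               x.f < 0 ? "signed" : "unsigned");
        return 0;
    }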

Signed bit fields are questionably useful, but support for them is
already required, and I'm sure there's code that depends on them.

>> Here's a draft of a proposal for that:
>>
>> http://www.colecovision.eu/stuff/proposal-unsigned-long-bit-field.html

"support" is misspelled in the Justification section.

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */

Philipp Klaus Krause

Jun 23, 2021, 2:25:52 AM
On 22/06/21 at 20:17, Keith Thompson wrote:
>>
>> But I have a lot more use of smaller integer types. And I use
>> enumerated types in bitfields too, though these usually require a
>> minimum size matching "int" (unless you use compiler-specific extensions
>> to change that).
>>
>> I would suggest that it is pointless to add to the list of types
>> mandated in the C standard here. Rather, it should simply say that a
>> bit-field shall have "an integer type". That covers all sizes, shorter
>> than int as well as longer, signed as well as unsigned. After all,
>> pretty much any serious modern C compiler will already support all
>> integer types here.
>>
>> And I would also be even happier with "an integer type or enumerated type".
>
> I agree. If you're going to mandate support for additional types, why
> not go all the way?

I am worried that support for types smaller than int¹ and signed types²
could be controversial. So I'm trying to go for the two types that seem
most useful, most used, and, I hope, least controversial.

¹ People might disagree on questions of type and promotion.
² People might disagree on signed / unsigned for plain types and on
support for 1-bit bit-fields.

Philipp

P.S.: Even today, people disagree on some questions about the type of
bit-fields. Even implementations disagree (e.g. GCC vs. clang). And
some say that the standard is intentionally ambiguous, to make the type
essentially implementation-defined.

P.P.S.: Thanks to everyone for the comments on wording.

David Brown

Jun 23, 2021, 2:59:25 AM
I can appreciate your concerns about controversy. However, I suspect (I
have no statistics, surveys, reports or evidence beyond a few samples)
there are two groups of people who use bitfields:

1. Those that stick to standards-mandated and guaranteed portable types
(bool, signed int, unsigned int).

2. Those that use types supported by the single compiler they use, or by
most real compilers (whatever integer type they want).

Who is your proposal for? Who are you trying to help?

As it stands, it could help group 1 if they need longer unsigned types
in bitfields. But it won't do anything much for group 2, who already
use many different sizes (including smaller sizes). Group 2 programmers
outnumber group 1 programmers by orders of magnitude - relatively few C
programmers need to write fully portable C code restricted only to the
guarantees in the C standards, and those that do are unlikely to make
much use of bitfields because the specifications in the standards are so
weak.

If you follow my recommendation and mandate support for /all/ integer
types, then group 1 get significantly more flexibility and group 2 now
know that their code is more portable (though many details are still
implementation-dependent, at least they can expect the same results for
different compilers on the same target and ABI). It's not going to make
a big difference to anyone, but it would be nice to have the standard
reflect common usage.

I agree that your more limited proposal is more likely to get accepted
than a wider one. I simply don't think the limited version is worth the
effort - it needs to be wider to be of relevance in real code. So,
IMVHO (and I'm not the one making or submitting the proposal) it makes
sense to gamble for the big prize, and risk rejection for the more
controversial proposal.



Keith Thompson

Jun 23, 2021, 3:39:23 AM
Are there any implementations that support bit fields of types other
than the required ones that *don't* support them for all integer types?
I dislike arbitrary restrictions, especially if most implementations
won't impose those restrictions anyway. I wouldn't be surprised if all
implementations that support non-standard bit field types do so
consistently.

Philipp Klaus Krause

Jun 23, 2021, 7:12:50 AM
On 23/06/21 at 08:59, David Brown wrote:
>
> Who is your proposal for? Who are you trying to help?

Programmers that need bit-fields wider than 16 bits, and want them to be
(somewhat more) portable.

> If you follow my recommendation and mandate support for /all/ integer
> type

I think that goes too far. I don't want to mandate support for extended
integer types.

>
> I agree that your more limited proposal is more likely to get accepted
> than a wider one. I simply don't think the limited version is worth the
> effort - it needs to be wider to be of relevance in real code. So,
> IMVHO (and I'm not the one making or submitting the proposal) it makes
> sense to gamble for the big prize, and risk rejection for the more
> controversial proposal.
>

Some more far-reaching proposals on integers including bit-fields have
been rejected in the past.

I think I'll try to clean up my proposal for unsigned long and unsigned
long long, and add two questions at the end: "Does WG14 want to see a
proposal for further standard integer types along the lines of this
one?" and "Does WG14 want to see a proposal for bit-precise integer
types along the lines of this one?".

Philipp

Philipp Klaus Krause

Jun 23, 2021, 7:22:44 AM
On 22/06/21 at 16:49, Philipp Klaus Krause wrote:
I now realize that in some way, the type question is even harder for
types wider than int than for types narrower than it. We already have
divergence on the type between GCC and clang for unsigned int
bit-fields. But there it only matters when you use _Generic (i.e.
_Generic on an unsigned int bit-field will match unsigned int for
clang, but not for GCC).
For other uses, we have the integer promotions that work on anything
smaller than int. So x.b << z behaves similarly on GCC vs clang, since
x.b is promoted to int (there is still the difference of promoting to
int vs. promoting to unsigned int).
But for an unsigned long bit-field, x.b << z will often give a
different result on GCC vs clang: the topmost bits of the result might
be 1 for clang where they are 0 for GCC (as clang does the shift on an
unsigned long, while GCC does it on a type just as wide as the
bit-field).
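
A sketch of that divergence, assuming an implementation that accepts
unsigned long bit-fields and a 64-bit unsigned long (the struct and
values are invented for illustration):

    #include <stdio.h>

    struct s { unsigned long b : 40; };

    int main(void) {
        struct s x = { 0xFFFFFFFFFFul };   /* all 40 bits set */
        /* clang reportedly shifts a full unsigned long, so the result
           can carry bits above bit 39; GCC reportedly shifts a type
           just 40 bits wide, so those bits come out 0:
           0xFFFFFFFFF0 (GCC) vs. 0xFFFFFFFFFF0 (clang). */
        printf("%lx\n", (unsigned long)(x.b << 4));
        return 0;
    }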

David Brown

Jun 23, 2021, 8:14:39 AM
On 23/06/2021 13:12, Philipp Klaus Krause wrote:
> Am 23.06.21 um 08:59 schrieb David Brown:
>>
>> Who is your proposal for? Who are you trying to help?
>
> Programmers that need bit-fields wider than 16 bits, and want them to be
> (somewhat more) portable.
>
>> If you follow my recommendation and mandate support for /all/ integer
>> type
>
> I think that goes too far. I don't want to mandate support for extended
> integer types.
>

Given that no compiler (to my knowledge - and I'd be very interested in
counter-examples) actually has extended integer types, that is surely
not a problem.

>>
>> I agree that your more limited proposal is more likely to get accepted
>> than a wider one. I simply don't think the limited version is worth the
>> effort - it needs to be wider to be of relevance in real code. So,
>> IMVHO (and I'm not the one making or submitting the proposal) it makes
>> sense to gamble for the big prize, and risk rejection for the more
>> controversial proposal.
>>
>
> Some more far-reaching proposals on integers including bit-fields have
> been rejected in the past.
>
> I think I'll try to clean up my proposal for unsigned long and unsigned
> long long, and add two questions at the end: "Does WG14 want to see a
> proposal for further standard integer types along the lines of this
> one?" and "Does WG14 want to see a proposal for bit-precise integer
> types along the lines of this one?".
>

The bit-precise (<stdint.h>) integer types are all defined in terms of
integer types. (In theory, they could be extended integer types - in
practice, they are always standard integer types.) So your second
question here is superfluous.

The first question would be a reasonable compromise between your
original more limited proposal, and the wider one. I don't really know
how these proposals work, and how much they are all-or-nothing or for
questions and discussions, so you are in a better position to judge this
than I am.


Philipp Klaus Krause

Jun 23, 2021, 11:45:36 AM
> Given that no compiler (to my knowledge - and I'd be very interested in
> counter-examples) actually has extended integer types, that is surely
> not a problem.

I think they are mostly used in lesser-known compilers targeting
embedded systems; but AFAIK both GCC and clang have a 24-bit extended
integer type when targeting avr, also used as int24_t and uint24_t in
stdint.h.

>
> The bit-precise (<stdint.h>) integer types are all defined in terms of
> integer types. (In theory, they could be extended integer types - in
> practice, they are always standard integer types.) So your second
> question here is superfluous.

I meant the C23 bit-precise integer types from stdbitint.h, as per N2709.

>
> The first question would be a reasonable compromise between your
> original more limited proposal, and the wider one. I don't really know
> how these proposals work, and how much they are all-or-nothing or for
> questions and discussions, so you are in a better position to judge this
> than I am.
>

The proposal champion (usually an author, but can be someone else if no
author can attend the meeting) presents the proposal. Then there is a
discussion (with a time limit). Afterwards there usually are straw polls
on questions (usually suggested by the champion, but can be someone
else). Typical question styles are "Does the committee want <proposal>
for <next version of C standard>, as in Nxxxx?" (directly deciding) and
"Does the committee want <proposal> along the lines of Nxxxx?" (to see
which direction to proceed, typically when the feeling from the
discussion is that the proposal needs further polishing, and the author
wants to know if it is worth doing so).

If you want to see some examples, have a look at the minutes of a
meeting, e.g. N2691, though the words attributed to attendees there are
usually a summary of the stated position as perceived by the one writing
the minutes rather than the exact words (it would be too long otherwise -
especially examples, references and arguments given to support the
position rarely appear in the minutes). Straw poll question wording and
results are AFAIK always exact, though.

Philipp Klaus Krause

Jun 23, 2021, 12:28:01 PM
On 23/06/21 at 13:22, Philipp Klaus Krause wrote:

> For other uses, we have the integer promotions that work on anything
> smaller than int. So x.b << z behaves similarly on GCC vs clang, since x.b
> is promoted to int (there is still the difference of promoting to int
> vs. promoting to unsigned int).

My mistake here: they always promote to int, as per 6.3.1.1 of the
latest standard draft N2596.
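
A small illustration of the consequence (a hedged sketch; the width,
value, and shift count are arbitrary):

    #include <stdio.h>

    struct s { unsigned b : 8; };

    int main(void) {
        struct s x = { 255 };
        /* x.b promotes to (signed) int, so the arithmetic below happens
           in int, not unsigned int; a shift that reached the sign bit
           would overflow int (undefined behaviour), whereas on unsigned
           int it would be well defined. */
        printf("%d\n", x.b << 8);   /* 65280, computed in int */
        return 0;
    }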

Keith Thompson

Jun 23, 2021, 1:22:13 PM
Philipp Klaus Krause <p...@spth.de> writes:
>> Given that no compiler (to my knowledge - and I'd be very interested in
>> counter-examples) actually has extended integer types, that is surely
>> not a problem.
>
> I think they are mostly used in lesser-known compilers targeting
> embedded systems; but AFAIK both GCC and clang have a 24-bit extended
> integer type when targeting avr, also used as int24_t and uint24_t in
> stdint.h.

gcc's documentation says "GCC does not support any extended integer types.".
I don't see anything about 24-bit integers.

[...]

Philipp Klaus Krause

Jun 23, 2021, 1:52:40 PM
On 23/06/21 at 19:22, Keith Thompson wrote:
> Philipp Klaus Krause <p...@spth.de> writes:
>>> Given that no compiler (to my knowledge - and I'd be very interested in
>>> counter-examples) actually has extended integer types, that is surely
>>> not a problem.
>>
>> I think they are mostly used in lesser-known compilers targeting
>> embedded systems; but AFAIK both GCC and clang have a 24-bit extended
>> integer type when targeting avr, also used as int24_t and uint24_t in
>> stdint.h.
>
> gcc's documentation says "GCC does not support any extended integer types.".
> I don't see anything about 24-bit integers.
>
> [...]
>

Hmm, https://gcc.gnu.org/wiki/avr-gcc mentions 24-bit integers. Maybe
they are 24-bit integers that are not considered extended integer types
in the sense of the standard? Though I don't know why they'd do that;
AFAIK that only makes sense if you want something that behaves like an
integer bigger than intmax_t.

Tim Rentsch

Jun 23, 2021, 1:53:21 PM
Philipp Klaus Krause <p...@spth.de> writes:

Here is a possible (and simpler) alternative.

Type specifier for bit-fields is _Bool, int, signed, or unsigned
(with signed int and unsigned int being synonymous with signed
and unsigned, respectively).

Bit-fields of type _Bool act just like they do now.

Bit-fields of type int are either the same as an unsigned
bit-field or the same as a signed bit-field, with the choice
being implementation-defined, just like they are now.

Bit-fields of types signed and unsigned may use widths up to the
widths of intmax_t and uintmax_t, respectively.

The type of a signed bit-field is the first of signed, long,
long long, and intmax_t that is wide enough to hold all the
possible values of the bit-field.

The type of an unsigned bit-field is the first of unsigned,
unsigned long, unsigned long long, and uintmax_t that is wide
enough to hold all the possible values of the bit-field.

For compatibility with current integer promotion rules, an access
of an unsigned bit-field whose possible values are all
representable within the range of signed int yields a value that
has been promoted to type signed int, just like such things are
done now. In all other cases the type of a bit-field access
is the type of the bit-field, as stated above.

(Personally I like this approach better than allowing different
integer types as bit-field specifiers.)
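
A runnable restatement of the unsigned half of this rule - the widths
below are arbitrary examples, and the mapping assumes nothing beyond
the sizes the implementation reports:

    #include <stdio.h>
    #include <limits.h>

    /* First of unsigned, unsigned long, unsigned long long, uintmax_t
       that is wide enough for the declared width (widths beyond
       uintmax_t's width would be disallowed under the proposal). */
    static const char *proposed_type(unsigned width) {
        if (width <= CHAR_BIT * sizeof(unsigned))           return "unsigned";
        if (width <= CHAR_BIT * sizeof(unsigned long))      return "unsigned long";
        if (width <= CHAR_BIT * sizeof(unsigned long long)) return "unsigned long long";
        return "uintmax_t";
    }

    int main(void) {
        unsigned widths[] = { 1, 16, 32, 40, 64 };
        for (unsigned i = 0; i < sizeof widths / sizeof widths[0]; i++)
            printf("width %2u -> %s\n", widths[i], proposed_type(widths[i]));
        return 0;
    }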

Philipp Klaus Krause

Jun 23, 2021, 2:13:53 PM
On 23/06/21 at 19:53, Tim Rentsch wrote:
> Bit-fields of types signed and unsigned may use widths up to the
> widths of intmax_t and uintmax_t, respectively.
>

WG14 usually doesn't like to use the same syntax as C++ but with a
different meaning.
AFAIK, in C++, you can have an int bit-field wider than int (or whatever
else the type is), but the upper bits then are padding bits.

Keith Thompson

Jun 23, 2021, 7:12:14 PM
Interesting. It does mention __int24 and __uint24 under "Extensions",
so apparently they're language extensions (allowed under 4p6) rather
than extended integer types. Perhaps they don't meet all the
requirements of (extended) integer types. Since the compiler appears
not to be fully conforming anyway, I'm not sure how much that matters
(except that it would have given us an example of extended integer types
in the wild).

Similarly, gcc supports __int128 and unsigned __int128 (note the
different syntax) as a language extension on some target systems.

David Brown

Jun 24, 2021, 4:32:03 AM
On 23/06/2021 17:45, Philipp Klaus Krause wrote:
>> Given that no compiler (to my knowledge - and I'd be very interested in
>> counter-examples) actually has extended integer types, that is surely
>> not a problem.
>
> I think they are mostly used in lesser-known compilers targeting
> embedded systems; but AFAIK both GCC and clang have a 24-bit extended
> integer type when targeting avr, also used as int24_t and uint24_t in
> stdint.h.
>

I can't answer for clang. But gcc explicitly says it does not support
any extended integer types:
<https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html>

For the AVR (an 8-bit device, in case anyone reading is unfamiliar) it
/does/ support certain 24-bit types. In particular it supports 24-bit
pointers into a named address space "__memx", to handle generic pointers
as flash and ram are independent hardware address spaces on these
devices. However, "void *" is still 16-bit, and uintptr_t is still
16-bit. The 24-bit addresses are part of the named address extensions,
and not normal pointers.

And it supports a type "__int24" in the same manner as gcc on bigger
devices supports "__int128" - in most respects it works like an integer
type, but it specifically is /not/ an "extended integer type" (or an
"integer type") in C standards terms. And that means that there is /no/
int24_t or uint24_t in a standards compliant C library for the AVR with
gcc. You have to write "typedef __int24 int24_t;" yourself.

I would be surprised if clang is different from gcc here, but I have not
looked at it.

I don't know why gcc doesn't support 24-bit integers on the AVR as an
extended integer type - the prime reason it can't make __int128 into an
extended integer type on 64-bit systems is that it would need to change
the size of intmax_t, which would lead to all sorts of messes.

>>
>> The bit-precise (<stdint.h>) integer types are all defined in terms of
>> integer types. (In theory, they could be extended integer types - in
>> practice, they are always standard integer types.) So your second
>> question here is superfluous.
>
> I meant the C23 bit-precise integer types from stdbitint.h, as per N2709.
>

OK. These are new to me, and I think it would be worth expanding on
that a little (giving the reference as you have here). Now that I
understand it, I agree with that second question too.

David Brown

Jun 24, 2021, 4:41:04 AM
avr-gcc is very close to conforming (as freestanding rather than hosted)
these days, if you pick the right options (such as for 64-bit doubles).
It is obviously limited by the size of the target devices (it's not
easy to have 1023 members in a struct on a device that might have 256
bytes of ram!). And last I checked it was missing a few maths
functions, some wide character support, and a few other minor bits and
pieces. Still, full conformance is not particularly important to users
of such embedded tools.

David Brown

Jun 24, 2021, 5:02:44 AM
This would give a very different result from the way compilers implement
bitfields today. In particular, consider:

    sizeof(struct { int8_t a : 1; });

today, that is 1 on compilers that accept any integer type in bit
fields. With your proposal, it would be sizeof(int).

    sizeof(struct { int64_t a : 1; });

today, that is 8 (assuming 8-bit char), while with your proposal it
would be sizeof(int).


It would be conceivable that an implementation could make the types of
the bitfields act the way you describe, while keeping the sizes (and
alignments, padding, splitting, etc.) of the bitfields the same as
today. But it would be needlessly confusing, and lead to other
differences (such as when the bitfields are used in other expressions).
The only sane approach to bitfield types is that the type of the
bitfield is the type given by the programmer when they declared the
bitfield. (And here I agree with Keith that it is unfortunate that a
bitfield declared as type "int" might be signed or unsigned.)


I also note that your "simpler" alternative is significantly longer than
Philipp's original proposal or the even simpler, obvious, and currently
implemented change:

"A bit-field shall have a type that is a qualified or unqualified
version of an integer type".

That's what compilers do today, that's what programmers use today, and
it is shorter, simpler and clearer than any other version. And it is
consistent with C++ (which also allows enumeration types - which would
be another nice addition to the C standard).

Keith Thompson

Jun 24, 2021, 2:06:43 PM
How does that follow? The bit field occupies only 1 bit. Why should
its underlying type affect the size of the structure?

I know that with gcc and clang, these structures have different sizes:
    struct { _Bool bitfield:1; }
    struct { unsigned bitfield:1; }
but I don't see anything in the standard that suggests they should.
(I vaguely recall that it's an ABI requirement.)
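
The difference is easy to observe (the 1 and 4 below are the sizes
gcc and clang give on typical ABIs; other implementations may differ):

    #include <stdio.h>

    struct sb { _Bool bitfield : 1; };
    struct su { unsigned bitfield : 1; };

    int main(void) {
        printf("%zu\n", sizeof (struct sb));   /* typically 1 */
        printf("%zu\n", sizeof (struct su));   /* typically 4 */
        return 0;
    }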

> sizeof(struct { int64_t a : 1; });
>
> today, that is 8 (assuming 8-bit char), while with your proposal it
> would be sizeof(int).
>
> It would be conceivable that an implementation could make the types of
> the bitfields act the way you describe, while keeping the sizes (and
> alignments, padding, splitting, etc.) of the bitfields the same as
> today. But it would be needlessly confusing, and lead to other
> differences (such as when the bitfields are used in other expressions).
> The only sane approach to bitfield types is that the type of the
> bitfield is the type given by the programmer when they declared the
> bitfield. (And here I agree with Keith that it is unfortunate that a
> bitfield declared as type "int" might be signed or unsigned.)

The "type of a bit-field" seems like a tricky concept. Consider:

struct { unsigned int bf: 1; };

N1570 6.7.2.1p5 says:
A bit-field shall have a type that is a qualified or unqualified
version of _Bool, signed int, unsigned int, or some other
implementation-defined type.
which implies that bf is of type unsigned int. But p10 says:
A bit-field is interpreted as having a signed or unsigned integer
type consisting of the specified number of bits.
So the type of bf is an unsigned type consisting of 1 bit, and clearly
that's not unsigned int.

I think p5 is referring to the type used to define the bit-field, and
p10 is referring to the (anonymous) type that the bit-field actually
ends up having. IMHO it could be worded more clearly.
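
One way to poke at the question (what this prints - or whether it even
compiles - is implementation-dependent; per the discussion above, clang
reportedly selects the unsigned int association while GCC does not):

    #include <stdio.h>

    struct s { unsigned int bf : 1; };

    int main(void) {
        struct s x = { 0 };
        puts(_Generic(x.bf,
                      unsigned int: "matches unsigned int",
                      default:      "matches something else"));
        return 0;
    }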

> I also note that your "simpler" alternative is significantly longer than
> Philipp's original proposal or the even simpler, obvious, and currently
> implemented change:
>
> "A bit-field shall have a type that is a qualified or unqualified
> version of an integer type".
>
> That's what compilers do today, that's what programmers use today, and
> it is shorter, simpler and clearer than any other version. And it is
> consistent with C++ (which also allows enumeration types - which would
> be another nice addition to the C standard).

Any proposal should take C++ into account, both to steal ideas and to
avoid inconsistencies (like C++'s statement that if the specified width
is more than the width of the specified type, the extra bits are padding
bits).

Tim Rentsch

Jun 24, 2021, 2:48:33 PM
Philipp Klaus Krause <p...@spth.de> writes:

The obvious answer is to ask the C++ committee to change their
silly rule. And if they don't want to change it, just ignore it;
the C++ rule is already incompatible with C, now it will just be
incompatible in a different way. The gap between C and C++ has
increased to the point where they should be treated as separate
and independent languages.

Keith Thompson

Jun 24, 2021, 3:23:36 PM
It's no more incompatible than any other C++ feature that doesn't exist
in C. A program that defines a bit field with a length that exceeds the
width of the type is a constraint violation in C, and valid in C++.

On the other hand, I agree that the C++ rule *seems* silly. I wonder
what the rationale is. (The earliest reference to the rule that I found
is in the C++ 1998 standard.)

It would be nice to avoid *gratuitous* incompatibilities between C and
C++.

Philipp Klaus Krause

Jun 24, 2021, 4:14:19 PM
On 24/06/21 at 20:48, Tim Rentsch wrote:
>
> The obvious answer is to ask the C++ committee to change their
> silly rule. And if they don't want to change it, just ignore it;
> the C++ rule is already incompatible with C, now it will just be
> incompatible in a different way. The gap between C and C++ has
> increased to the point where they should be treated as separate
> and independent languages.
>

WG14 and WG21 try to keep the common subset compatible, especially if it
is something that might appear in headers (for other code this matters
less - after all non-header source is usually either C or C++, but
headers are commonly shared).
And bit-fields are in struct declarations which are often in headers.

David Brown

Jun 24, 2021, 4:40:37 PM
On 24/06/2021 21:23, Keith Thompson wrote:
> Tim Rentsch <tr.1...@z991.linuxsc.com> writes:
>> Philipp Klaus Krause <p...@spth.de> writes:
>>> Am 23.06.21 um 19:53 schrieb Tim Rentsch:
>>>> Bit-fields of types signed and unsigned may use widths up to the
>>>> widths of intmax_t and uintmax_t, respectively.
>>>
>>> WG14 usually doesn't like to use the same syntax as C++ but with a
>>> different meaning.
>>> AFAIK, in C++, you can have an int bit-field wider than int (or whatever
>>> else the type is), but the upper bits then are padding bits.
>>
>> The obvious answer is to ask the C++ committee to change their
>> silly rule. And if they don't want to change it, just ignore it;
>> the C++ rule is already incompatible with C, now it will just be
>> incompatible in a different way. The gap between C and C++ has
>> increased to the point where they should be treated as separate
>> and independent languages.
>
> It's no more incompatible than any other C++ feature that doesn't exist
> in C. A program that defines a bit field with a length that exceeds the
> width of the type is a constraint violation in C, and valid in C++.
>
> On the other hand, I agree that the C++ rule *seems* silly. I wonder
> what the rationale is. (The earliest reference to the rule that I found
> is in the C++ 1998 standard.)
>
> It would be nice to avoid *gratuitous* incompatibilities between C and
> C++.
>

I'd be entirely happy with a rule in C and C++ that limited the size of
a bit-field to the size of the type - I agree it seems very strange that
C++ allows them to be bigger. But the other key difference in C++
bit-fields is that any integer type or enumeration type is allowed, and
that should (IMHO) be copied by the C standards.

David Brown

Jun 24, 2021, 5:11:44 PM
The current practice is that the size (and alignment) of the struct or
its parts comes from the type used to declare the bit-field - just as
for any other field. As I said below, it would be conceivable for a
compiler to use the programmer's specified type to set the size of the
containing struct or addressable storage unit, while ignoring it for the
type of the field and how it is interpreted when used in an expression.
That would, however, seem arbitrary and counter-intuitive, as well as
being contrary to current practice and (if my interpretation is correct
- but I might be wrong here) to C++ standards.

>
> I know that with gcc and clang, these structures have different sizes:
> struct { _Bool bitfield:1; }
> struct { unsigned bitfield:1; }
> but I don't see anything in the standard that suggests they should.

I have used a few compilers that didn't support anything but "int"
(signed or unsigned) in bit-fields. (They were all C90 compilers, so no
_Bool.) I have never known a compiler that was happy with different
integer types for bit-fields and did anything other than have sizes that
come directly from the types specified for the bit-field. I can't claim
to have done extensive testing here, as it has not occurred to me that
they might do something different, but I regularly check that the sizes
of my structs are as expected. And for most of my work, there are
manufacturer-supplied headers defining structs and addresses for
accessing peripherals - they too regularly use bit-fields, and expect
the sizes to work rationally.

The standards, however, leave most such details implementation-dependent.
As you say, ABIs might be more specific.
Indeed it could be clearer. The real type of the bit-field arguably
does not matter much, since you cannot take a pointer to the bit-field.
The most important aspect of its type is how it is treated in an
expression, as described in p10. The only way the type can be checked
directly is, I think, with _Generic - and gcc and clang consider the
bit-field width to be part of the type. But that is pretty obscure
usage. (In C++, the type might be used in other cases such as overloads
- and there the standards are explicit that the bit-field width is not
part of the type of the field, which seems simpler and clearer.)

>
>> I also note that your "simpler" alternative is significantly longer than
>> Philipp's original proposal or the even simpler, obvious, and currently
>> implemented change:
>>
>> "A bit-field shall have a type that is a qualified or unqualified
>> version of an integer type".
>>
>> That's what compilers do today, that's what programmers use today, and
>> it is shorter, simpler and clearer than any other version. And it is
>> consistent with C++ (which also allows enumeration types - which would
>> be another nice addition to the C standard).
>
> Any proposal should take C++ into account, both to steal ideas and to
> avoid inconsistencies (like C++'s statement that if the specified width
> is more than the width of the specified type, the extra bits are padding
> bits).
>

Agreed.

Keith Thompson

Jun 24, 2021, 6:16:17 PM
David Brown <david...@hesbynett.no> writes:
> On 24/06/2021 20:06, Keith Thompson wrote:
>> David Brown <david...@hesbynett.no> writes:
[...]
>>> This would give a very different result from the way compilers implement
>>> bitfields today. In particular, consider :
>>>
>>> sizeof(struct { int8_t a : 1; });
>>>
>>> today, that is 1 on compilers that accept any integer type in bit
>>> fields. With your proposal, it would be sizeof(int).
>>
>> How does that follow? The bit field occupies only 1 bit. Why should
>> its underlying type affect the size of the structure?
>
> The current practice is that the size (and alignment) of the struct or
> its parts comes from the type used to declare the bit-field - just as
> for any other field. As I said below, it would be conceivable for a
> compiler to use the programmer's specified type to set the size of the
> containing struct or addressable storage unit, while ignoring it for the
> type of the field and how it is interpreted when used in an expression.
> That would, however, seem arbitrary and counter-intuitive, as well as
> being contrary to current practice and (if my interpretation is correct
> - but I might be wrong here) to C++ standards.

(For simplicity, assume CHAR_BIT==8 and sizeof(int)==4.)

I understand that it's existing practice, but it just doesn't make any
sense to me. If I define a struct with a bit-field defined as "unsigned
int:1", it's a single bit, not a 32-bit unsigned int object. I can pack
8 of them into a single byte. I just don't see why the declared type
of a bit field should affect the size of the containing structure when
it has no effect on the size of the bit field itself. The structure
doesn't *need* those 32 bits of storage. If I add a second unsigned:1
bit-field, the structure doesn't grow by another 32 bits.
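
Under the current scheme that packing is still expressible, but only by
leaning on the extension discussed in this thread (an 8-bit container
type for the bit-fields):

    #include <stdio.h>

    struct packed_flags {
        unsigned char b0 : 1, b1 : 1, b2 : 1, b3 : 1,
                      b4 : 1, b5 : 1, b6 : 1, b7 : 1;
    };

    int main(void) {
        /* 1 on gcc/clang for typical ABIs; not guaranteed by the
           standard, which doesn't require unsigned char bit-fields. */
        printf("%zu\n", sizeof (struct packed_flags));
        return 0;
    }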

As N1570 6.7.2.1p10 says, "A bit-field is interpreted as having a signed
or unsigned integer type consisting of the specified number of bits.",
so the bit-field object isn't really of type unsigned int.

For an implementation that doesn't support types other than the
required ones, any struct with a 2-bit bit-field would have to be
at least 32 bits, while a struct with an unsigned char member could
be as small as 8 bits. (Same for a 1-bit bit-field pre-C99.) That
seems to me like an arbitrary restriction.

I don't use bit-fields much, so maybe I'm missing some reason why this
behavior is reasonable and/or useful.

[...]

David Brown

Jun 25, 2021, 4:43:45 AM
As I see it, bit-fields are used for two main purposes.

One is for structures internal to a program, in order to reduce space.
There are two typical reasons for that. You might have a small embedded
system with extremely small ram size (though these are getting less
common, and since they are very rarely used with anything newer than
C99, changes to the standard matter little there). Or you might have a
bigger system with very large quantities of data, so that packing the
data is important for efficiency. Common for these is that you don't
really care about things like ordering, but you /do/ care that the sizes
are what you expect. In particular, for efficiency on big systems too
small is not better than too big - an array of 60 byte structs (with 4
byte alignment) is going to be a lot slower than an array of 64 byte
structs for many purposes.

The other main use for bitfields is for matching external structures.
These can be part of file formats, network packet formats, calls to
functions from other languages, or hardware and peripherals. For this
kind of use, precise specifications of all sizes, alignments, bit
ordering, etc., is essential. The alternative is to use masks and
shifting which is ugly and error-prone, but more portable.

The bit-field requirements in the C specifications today are too limited
and under-specified to be of much use for anything - they are not enough
to do a good job of any common use-cases. (They could be useful for
making compact structures for large arrays in the days before caches,
when precise sizes mattered less.)

The practical reality of bit-fields is that people use them based on the
specifications given by their compiler and/or target ABI, using
extensions provided by almost all good compilers (allowing any integer
type or enumeration type), or using undocumented assumptions about
sizes, alignments, ordering, etc., for their compiler and target.

A change to the C standards which does not take them nearer to
guaranteeing the practical use of today's tools and needs of today's
programmers, is useless.

A change to the C standards that goes against today's practice is worse
than useless.

A change that differs from the C++ standards (except as regards the
silly support for over-sized bit-fields) is worse than useless.

A change that involves long, complicated reasoning about sizes and types
but provides enough implementation-defined wiggle room to fit with
today's implementations, is useless. The last thing the C standards
need is more language that is confusing, open to interpretation, vague,
and unhelpful to either compiler implementers or C programmers.

If the C standard can't be changed to explicitly allow /all/ integer
types in bit-fields, then making any change at all here is a waste of
time and effort.

If it can't be changed to mandate that the size of the allocation units
matches the size of the type specified by the programmer as the
bit-field type, then it must be left as it is (implementation-defined).

If it can't be changed to mandate that the type of the bit-field matches
the type specified by the programmer, then it must be left as it is.

Keith Thompson

Jun 25, 2021, 5:02:38 AM
I don't see an answer to my question anywhere in there.

To be perhaps a bit clearer, when I compile this program with gcc or
clang:

#include <stdio.h>
#include <limits.h>
int main(void) {
struct has_bit_field {
unsigned bit_field : 1;
};
struct has_unsigned_char {
unsigned char uc;
};
printf("CHAR_BIT = %d\n", CHAR_BIT);
printf("sizeof (unsigned) = %zu\n", sizeof (unsigned));
printf("sizeof (struct has_bit_field) = %zu\n", sizeof (struct has_bit_field));
printf("sizeof (struct has_unsigned_char) = %zu\n", sizeof (struct has_unsigned_char));
}

the output on my system is:

CHAR_BIT = 8
sizeof (unsigned) = 4
sizeof (struct has_bit_field) = 4
sizeof (struct has_unsigned_char) = 1

What is the rationale for a bit-field forcing the structure to be
expanded to 4 bytes, when the bit-field itself is only 1 bit?
Why does a 1-bit bit-field force a bigger structure size than an 8-bit
unsigned char member?

I understand that it's implementation-defined, probably ABI-defined, and
that changing it would break existing code that depends on this
behavior. I'm trying to understand why it was defined this way in the
first place. I see nothing in the C standard that suggests this (though
of course it does allow it).

I would have assumed that both struct has_bit_field and struct
has_unsigned_char could sensibly have a size of 1 byte.

David Brown

Jun 25, 2021, 7:54:47 AM
On 25/06/2021 11:02, Keith Thompson wrote:

>
> I don't see an answer to my question anywhere in there.
>
> To be perhaps a bit clearer, when I compile this program with gcc or
> clang:
>
> #include <stdio.h>
> #include <limits.h>
> int main(void) {
> struct has_bit_field {
> unsigned bit_field : 1;
> };
> struct has_unsigned_char {
> unsigned char uc;
> };
> printf("CHAR_BIT = %d\n", CHAR_BIT);
> printf("sizeof (unsigned) = %zu\n", sizeof (unsigned));
> printf("sizeof (struct has_bit_field) = %zu\n", sizeof (struct has_bit_field));
> printf("sizeof (struct has_unsigned_char) = %zu\n", sizeof (struct has_unsigned_char));
> }
>
> the output on my system is:
>
> CHAR_BIT = 8
> sizeof (unsigned) = 4
> sizeof (struct has_bit_field) = 4
> sizeof (struct has_unsigned_char) = 1
>
> What is the rationale for a bit-field forcing the structure to be
> expanded to 4 bytes, when the bit-field itself is only 1 bit?
> Why does a 1-bit bit-field for a bigger structure size than an 8-bit
> unsigned char member?

It lets the /programmer/ make the choice.

When I write "uint16_t x : 3;", I mean "Make a container field of size
and type uint16_t, and allocate 3 bits of it to x. If there is already
a container field of this type and size with enough free space just
before x, add it to the same container".

That is the function I want bit-fields to have. That is what makes them
useful in a wide range of use-cases. There are some situations where
all you need is to say "x doesn't have to hold more than 3 bits of
information, and I'd like the surrounding structure to be reasonably
small". But often you want to have the control.

Another situation where you need precise control for hardware registers
is in the access sizes for volatile data (and hardware registers are
usually accessed as volatile). Access size typically makes a /huge/
difference in the hardware. If the bit-field "x" above is part of a
volatile structure, then all reads and writes to it must be done using
16-bit accesses. (gcc had a bug for a while in their ARM port that
meant a smaller access size would be used for smaller bit size of
fields, giving 8-bit accesses for a 3 bit bit-field regardless of the
declared type. This caused a lot of problems.)
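
An illustrative (entirely hypothetical) register layout of the kind
described here - the names and address are invented, and the 16-bit
access behaviour depends on the implementation/ABI honouring the
declared type:

    #include <stdint.h>

    typedef struct {
        uint16_t rx_ready : 1;
        uint16_t tx_empty : 1;
        uint16_t overrun  : 1;
        uint16_t          : 13;   /* pad the 16-bit container */
    } uart_status_t;

    #define UART0_STATUS (*(volatile uart_status_t *)0x40002000u)

    static inline int uart_data_available(void) {
        return UART0_STATUS.rx_ready;   /* done as a 16-bit volatile
                                           load on implementations
                                           following e.g. the ARM ABI */
    }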

All in all, I want the type I give in the bit-field declaration to be
used by the compiler. Otherwise, what's the point of having it at all?

>
> I understand that it's implementation-defined, probably ABI-defined, and
> that changing it would break existing code that depends on this
> behavior. I'm trying to understand why it was defined this way in the
> first place. I see nothing in the C standard that suggests this (though
> of course it does allow it).
>

The C standard doesn't say anything about such sizes - it wouldn't make
much sense to do so when so few options are mandated for bit-field
types in the standard at the moment. The standard basically abdicates
many of the important aspects of bit-fields to the implementation - it
gives the syntax but not the details.

> I would have assumed that both struct has_bit_field and struct
> has_unsigned_char could sensibly have a size of 1 byte.
>

I would not.

And therein, perhaps, is the biggest problem with bit-fields in C.
Because there is virtually no guidance in the standards in this respect,
people with different experiences, practices and needs for the language
can have widely different expectations or assumptions about them. (I
expect that in actual code you would not make such assumptions, since I
believe you often have to write more portable code that avoids relying
on implementation-defined behaviour.)

Ben Bacarisse

Jun 25, 2021, 8:07:26 AM
struct eg {
    unsigned x : 3;
    unsigned : 16-3;
};

seems just as clear to me. Both rely on implementation-defined
behaviour, but had gcc taken the other route, the programmer would still
have the choice.

It seems wrong for bit fields to have an explicit type at all. What
they need is only the signed/unsigned distinction, for which there are
already keywords.

--
Ben.

Philipp Klaus Krause

Jun 25, 2021, 8:43:25 AM
On 25/06/21 at 10:43, David Brown wrote:

>
> As I see it, bit-fields are used for two main purposes.
>
> One is for structures internal to a program, in order to reduce space.
> There are two typical reasons for that. You might have a small embedded
> system with extremely small ram size (though these are getting less
> common, and since they are very rarely used with anything newer than
> C99, changes to the standard matter little there).
If newer standards are rarely used for programming small embedded
systems, is it because the standard might have been diverging from their
requirements?
Is it because newer standards take so much effort to implement that
compilers other than GCC and clang lack the manpower to implement them
timely?
Is it because users follow a coding standard, such as MISRA, that takes
a while to update for new C standards?

David Brown

Jun 25, 2021, 9:17:15 AM
If that had been the route implementations of bit-fields had taken, then
it would have worked to some extent - and we would have lived with it,
using (as now) explicit masking and shifting for cases where bit-fields
can't give the needed guarantees (perhaps due to portability needs).

However, reality is different. You can see an example in this godbolt
link <https://godbolt.org/z/3nT4nx7Gb>.

struct ea {
    uint16_t x : 3;
    uint16_t : 13;
};

struct eb {
    uint16_t x : 3;
};

struct ec {
    uint8_t x : 3;
    uint16_t : 13;
};

struct eg {
    unsigned x : 3;
    unsigned : 16 - 3;
};


The standards say nothing about the required sizes, other than some
minimums. Real-world implementations make "struct eg" 32-bit (given
32-bit int), and the others all 16-bit.

If implementations were changed to make the size of "eg" the minimum
needed to support the bit-fields, it would break current code and
practice such as "struct eb" here. That alone means the idea of making
the containing struct have minimum size will never be put in the
standards. (And if there are current implementations unknown to me that
would make "struct eg" 16-bit here despite having 32-bit int, then that
would of course exclude the standards mandating the more common behaviour.)

Had "minimum size" been the commonly implemented behaviour, or required
by the standards, then it would have worked reasonably well. And as you
say, there would not be a need for a type for the bit-field, just
"signed" or "unsigned", or "_Bool".

It would not have allowed control of the access sizes for volatile
accesses. Though this is of course implementation-dependent, specifying
the type in the bit-field tells the compiler what size the programmer
wants. And it would not have allowed the use of enumeration types in
bit-field declarations. (Barring implementation-specific extensions,
enumeration types are generally indistinguishable from "int", but their
use can make code clearer and allow for better static checking.)


I don't disagree that the system you suggest could have worked, but
compiler implementers have gone a different route - one that I
personally think works better, based on the kind of coding I personally do.




David Brown

Jun 25, 2021, 9:33:52 AM
These are all valid reasons. I couldn't tell you the proportions here,
but I suspect you've covered the biggest points.

A lot of embedded developers are also very conservative - there is an
amazing amount still being written in C90.

And there is the question of what the newer C standards have given
embedded developers. C11 gave us atomics that don't work for embedded
systems, and threads that are useless even if embedded toolchain
developers bothered to make some kind of implementation for them.
_Generic is more complicated than most people want to see, and has
little use in practice. Static assertions are a wonderful idea - that's
why programmers in C90 and C99 already have them (using ugly macros
rather than a language feature, but they still work). Anonymous structs
and unions were already supported as extensions by many toolchains.
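
The sort of "ugly macro" meant here - a pre-C11 static assertion that
breaks compilation via a negative array size when the condition fails:

    #define STATIC_ASSERT(cond, name) \
        typedef char static_assert_##name[(cond) ? 1 : -1]

    STATIC_ASSERT(sizeof(int) >= 2, int_is_at_least_16_bits);   /* ok */
    /* STATIC_ASSERT(sizeof(int) == 1, this_would_fail_to_compile); */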

C17 gives nothing but bug fixes.

Some embedded programmers, such as myself, will use new standards when
they are supported by their tools and give some advantages.

But one of the most important points of C as a language is its
stability, consistency and backwards compatibility. It doesn't really
need changes, except perhaps to clarify some parts, simplify by removing
support for long-outdated systems, codify existing common practice, and
throw out old, dangerous and obsolete syntax.

Keith Thompson

Jun 25, 2021, 11:09:13 AM
I really don't understand that statement.

The way I *wish* bit-fields had been implemented is that a bit-field has
the specified size, and its declared type has no effect on the
structure's layout. *That* would let the programmer make the choice,
for example by defining an unnamed bit-field for padding if desired.
With that scheme, I can define an 8-bit structure containing two 4-bit
bit-fields. With the current common implementation, I can't (unless I
use a non-standard type for the bit-fields).

A 1-bit bit-field doesn't have an alignment requirement. Why should it
impose such a requirement on the containing structure?

> When I write "uint16_t x : 3;", I mean "Make a container field of size
> and type uint16_t, and allocate 3 bits of it to x. If there is already
> a container field of this type and size with enough free space just
> before x, add it to the same container".

Sure, that's how it works -- but nothing in the standard gives a clue
that that's what it should mean.

It just seems bizarre to me that the type of a bit-field has no effect
on the representation of the bit-field itself (other than signedness),
but it does affect the representation of the containing structure.

> That is the function I want bit-fields to have. That is what makes them
> useful in a wide range of use-cases. There are some situations where
> all you need is to say "x doesn't have to hold more than 3 bits of
> information, and I'd like the surrounding structure to be reasonably
> small". But often you want to have the control.

How exactly does the current scheme give the programmer more control?
If I want extra padding, I can add it myself. If I don't, I get it
anyway.

> Another situation where you need precise control for hardware registers
> is in the access sizes for volatile data (and hardware registers are
> usually accessed as volatile). Access size typically makes a /huge/
> difference in the hardware. If the bit-field "x" above is part of a
> volatile structure, then all reads and writes to it must be done using
> 16-bit accesses. (gcc had a bug for a while in their ARM port that
> meant a smaller access size would be used for smaller bit size of
> fields, giving 8-bit accesses for a 3 bit bit-field regardless of the
> declared type. This caused a lot of problems.)
>
> All in all, I want the type I give in the bit-field declaration to be
> used by the compiler. Otherwise, what's the point of having it at all?

But why should the type of a bit-field affect the layout of the
containing struct but not the representation of the bit-field itself?
Why impose an arbitrary restriction?

[snip]

> And therein, perhaps, is the biggest problem with bit-fields in C.
> Because there is virtually no guidance in the standards in this respect,
> people with different experiences, practices and needs for the language
> can have widely different expectations or assumptions about them. (I
> expect that in actual code you would not make such assumptions, since I
> believe you often have to write more portable code that avoids relying
> on implementation-defined behaviour.)

The way it seems to me is that, yes, bit-fields are underspecified, but
the common implementation imposes an arbitrary restriction that reduces
the programmer's control. There's nothing you can do with the existing
scheme that couldn't be done with the one I would have preferred.

Keith Thompson

Jun 25, 2021, 11:22:14 AM
David Brown <david...@hesbynett.no> writes:
> On 25/06/2021 14:07, Ben Bacarisse wrote:
[...]
> If implementations were changed to make the size of "eg" the minimum
> needed to support the bit-fields, it would break current code and
> practice such as "struct eb" here. That alone means the idea of making
> the containing struct have minimum size will never be put in the
> standards. (And if there are current implementations unknown to me that
> would make "struct eg" 16-bit here despite having 32-bit int, then that
> would of course exclude the standards mandating the more common behaviour.)

Of course changing behavior that existing code depends on is not practical.

> Had "minimum size" been the commonly implemented behaviour, or required
> by the standards, then it would have worked reasonably well. And as you
> say, there would not be a need for a type for the bit-field, just
> "signed" or "unsigned", or "_Bool".
>
> It would not have allowed control of the access sizes for volatile
> accesses. Though this is of course implementation-dependent, specifying
> the type in the bit-field tells the compiler what size the programmer
> wants.

I don't follow. Tells the compiler the size of what? Not of the
bit-field itself. Can you give an example that illustrates your point
about volatile accesses?

> And it would not have allowed the use of enumeration types in
> bit-field declarations. (Baring implementation-specific extensions,
> enumeration types are generally indistinguishable from "int", but their
> use can make code clearer and allow for better static checking.)

Each enumeration type is compatible with some implementation-defined
integer type, not necessarily int. (Enumeration constants are of type
int.) But how would it not allow the use of enumeration bit-fields?
The current scheme, where the type affects the size of the struct, does
make that difficult (though a language feature to specify the underlying
type of an enum would make that easier). But something like this:
    enum booool { no, maybe, probably, yes };
    struct foo {
        enum booool : 2;
        unsigned : 6;
    };
would make perfectly good sense under the scheme I would have preferred.

Unexpectedly, with gcc enum booool is 32 bits (it's compatible
with unsigned int) but struct foo is 8 bits.

> I don't disagree that the system you suggest could have worked, but
> compiler implementers have gone a different route - one that I
> personally think works better, based on the kind of coding I personally do.

Is there anything you can do with the current scheme that you couldn't
do with mine, perhaps with the addition of explicit padding?

David Brown

Jun 25, 2021, 12:22:45 PM
On 25/06/2021 17:22, Keith Thompson wrote:
> David Brown <david...@hesbynett.no> writes:
>> On 25/06/2021 14:07, Ben Bacarisse wrote:
> [...]
>> If implementations were changed to make the size of "eg" the minimum
>> needed to support the bit-fields, it would break current code and
>> practice such as "struct eb" here. That alone means the idea of making
>> the containing struct have minimum size will never be put in the
>> standards. (And if there are current implementations unknown to me that
>> would make "struct eg" 16-bit here despite having 32-bit int, then that
>> would of course exclude the standards mandating the more common behaviour.)
>
> Of course changing behavior that existing code depends on is not practical.
>
>> Had "minimum size" been the commonly implemented behaviour, or required
>> by the standards, then it would have worked reasonably well. And as you
>> say, there would not be a need for a type for the bit-field, just
>> "signed" or "unsigned", or "_Bool".
>>
>> It would not have allowed control of the access sizes for volatile
>> accesses. Though this is of course implementation-dependent, specifying
>> the type in the bit-field tells the compiler what size the programmer
>> wants.
>
> I don't follow. Tells the compiler the size of what? Not of the
> bit-field itself. Can you give an example that illustrates your point
> about volatile accesses?

I did, with the godbolt link that you snipped. (If you don't like
godbolt links, I can paste it into a post, but the link makes it easy to
see the code and the generated results for whatever processor you are
familiar with.)

Another relevant link would be
<https://github.com/ARM-software/abi-aa/blob/f52e1ad3f81254497a83578dc102f6aac89e52d0/aapcs32/aapcs32.rst#8175volatile-bit-fields--preserving-number-and-width-of-container-accesses>

This is part of the ABI for ARM, which describes how volatile bit-field
access should work on that target.

You might not be very familiar with low-level coding and hardware access
(most programmers are not). Hardware is usually memory-mapped these
days. So if you are making a driver for, say, a UART (an old-fashioned
serial port) there will be specific memory addresses for a "receive"
port, a "transmit" port, for control registers for setting the baud
rate, status registers for flags, etc. A common way to handle this is
to make a struct built up from uintN_t fixed-size types and bit-fields
for the control and status fields. Your code must be precise in how it
accesses these - reading and writing registers can trigger actions. For
example, a write to the transmit register might send out a character on
the bus. Reading from the receive register might change some flags and
collect the next character from an input buffer. So you must not access
neighbouring registers unintentionally. Equally, a register might
require a 16-bit or 32-bit access to work correctly, and reading or
writing by byte will fail. You must have full control of exactly what
sizes you are using for the accesses as well as the ordering of them.

This kind of thing is, of course, implementation-dependent and outside
the scope of the C standards. But people program such things in C, and
specifying the type for the fields is how you control the exact size of
the accesses.
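
To make that concrete, here is a minimal sketch of such a driver
fragment. All the register names, the layout and the base address are
invented for illustration, and it assumes an implementation (such as
gcc for ARM following the AAPCS rules linked above) that uses
container-sized accesses for volatile bit-fields:

#include <stdint.h>

struct uart {
    uint32_t rx;            /* reading pops the next character  */
    uint32_t tx;            /* writing starts a transmission    */
    uint32_t baud;          /* baud-rate divisor                */
    uint32_t tx_ready : 1;  /* status: transmitter idle         */
    uint32_t rx_full  : 1;  /* status: character available      */
    uint32_t          : 30; /* reserved                         */
};

#define UART0 ((volatile struct uart *)0x40001000u) /* made-up address */

static void uart_putc(char c)
{
    while (!UART0->tx_ready)
        ;                         /* each poll is one 32-bit read */
    UART0->tx = (unsigned char)c; /* one 32-bit write sends c     */
}

Declaring the bit-fields with uint32_t rather than plain unsigned is
what documents (and, on such implementations, guarantees) that the
status word is read as a single 32-bit access.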

>
>> And it would not have allowed the use of enumeration types in
>> bit-field declarations. (Barring implementation-specific extensions,
>> enumeration types are generally indistinguishable from "int", but their
>> use can make code clearer and allow for better static checking.)
>
> Each enumeration type is compatible with some implementation-defined
> integer type, not necessarily int. (Enumeration constants are of type
> int.) But how would it not allow the use of enumeration bit-fields?

Ben's suggestion (in the sense of what he thinks would have been a good
choice for the language, rather than how he would like the standards to
be changed) was to have only "signed" or "unsigned" as the type for
bit-fields. That rules out enumeration types.

> The current scheme, where the type affects the size of the struct, does
> make that difficult (though a language feature to specify the underlying
> type of an enum would make that easier). But something like this:
> enum booool { no, maybe, probably, yes };

(Reminds me of <https://thedailywtf.com/articles/What_Is_Truth_0x3f_>)


> struct foo {
> enum booool : 2;
> unsigned : 6;
> };
> would make perfectly good sense under the scheme I would have preferred.


Yes, that would be possible in your scheme (but see next comment).

>
> Unexpectedly, with gcc enum booool is 32 bits (it's compatible
> with unsigned int) but struct foo is 8 bits.

That is not /entirely/ unexpected. You have no named fields in the
struct, which I believe makes the behaviour undefined in C, since a
struct must contain at least one named member (it triggers a -Wpedantic
warning in gcc). The size of an empty struct is usually given as 1.

If you have "enum booool b : 2;" instead, the struct has size 32 bits.
That's not always what I would like, but it is consistent with the size
of the enumeration type (given by the ABI). And for those that want
tighter control and are happy with extensions, gcc's "packed" attribute
lets you make smaller enumeration types. (Or you can use C++'s scoped
enums, which can have their underlying type specified. It might be nice
for C to copy that feature.)
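
For example, a gcc-specific sketch (the packed attribute is an
extension, and I've made the padding field unsigned char so it doesn't
pull the struct size back up; the sizes are what I would expect from
current gcc on a typical target with 32-bit int):

enum booool { no, maybe, probably, yes } __attribute__((packed));

struct foo {
    enum booool   b   : 2;
    unsigned char pad : 6;
};

/* With the packed attribute, sizeof(enum booool) is 1, and both
   bit-fields share a single byte, so sizeof(struct foo) is 1.
   Without it, the enum is as wide as unsigned int and the struct
   rounds up to 4. */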

>
>> I don't disagree that the system you suggest could have worked, but
>> compiler implementers have gone a different route - one that I
>> personally think works better, based on the kind of coding I personally do.
>
> Is there anything you can do with the current scheme that you couldn't
> do with mine, perhaps with the addition of explicit padding?
>

Yes - I can't tell the compiler the size for volatile accesses.

Apart from that, your scheme could work. But to me, it would be a step
backwards in clarity of code. It's like suggesting C could be
simplified by removing the binary minus operator. After all, you can
write "a - b" as "a + -b". It's true, but it is not helpful to the
language.

If I want things to be of particular sizes and alignments, I use clear
fixed-size types. I don't have to figure out counts of padding bits and
add them in. The current scheme - as supported by compilers today -
lets you write code in a manner that gives a closer correspondence
between the text you write and results you get, for low level code.

Ben Bacarisse

unread,
Jun 25, 2021, 2:12:03 PM6/25/21
to
> ...

Yes, I said exactly that. Both routes offer what you gave as the
justification for one.

<cut>
> Had "minimum size" been the commonly implemented behaviour, or required
> by the standards, then it would have worked reasonably well. And as you
> say, there would not be a need for a type for the bit-field, just
> "signed" or "unsigned", or "_Bool".
>
> It would not have allowed control of the access sizes for volatile
> accesses.

I don't know what you mean here. I know what all the words mean, but I
don't know what sort of control you mean nor how one implementation
choice would prevent it.

--
Ben.

Keith Thompson

unread,
Jun 25, 2021, 4:28:26 PM6/25/21
to
My mistake. I saw the link followed by some code, and incorrectly
assumed the link just had the code you posted, so I didn't follow it.
I'll look at it later.
Hey, at least I made "no" equal to 0.

>> struct foo {
>> enum booool : 2;
>> unsigned : 6;
>> };
>> would make perfectly good sense under the scheme I would have preferred.
>
>
> Yes, that would be possible in your scheme (but see next comment).
>
>>
>> Unexpectedly, with gcc enum booool is 32 bits (it's compatible
>> with unsigned int) but struct foo is 8 bits.
>
> That is not /entirely/ unexpected. You have no named fields in the
> struct, which I believe makes the behaviour undefined in C, since a
> struct must contain at least one named member (it triggers a -Wpedantic
> warning in gcc). The size of an empty struct is usually given as 1.

The lack of named fields was just an oversight.

> If you have "enum booool b : 2;" instead, the struct has size 32 bits.
> That's not always what I would like, but it is consistent with the size
> of the enumeration type (given by the ABI). And for those that want
> tighter control and are happy with extensions, gcc's "packed" attribute
> lets you make smaller enumeration types. (Or you can use C++'s scoped
> enums, that can have their underlying type specified. It might be nice
> for C to copy that feature.)

Hmm. The idea that making a bit-field anonymous, without changing its
type or size, affects the layout of the structure seems weird and
arbitrary to me, even more so than what I've been ranting about
previously in this thread.

>>> I don't disagree that the system you suggest could have worked, but
>>> compiler implementers have gone a different route - one that I
>>> personally think works better, based on the kind of coding I personally do.
>>
>> Is there anything you can do with the current scheme that you couldn't
>> do with mine, perhaps with the addition of explicit padding?
>
> Yes - I can't tell the compiler the size for volatile accesses.
>
> Apart from that, your scheme could work. But to me, it would be a step
> backwards in clarity of code. It's like suggesting C could be
> simplified by removing the binary minus operator. After all, you can
> write "a - b" as "a + -b". It's true, but it is not helpful to the
> language.
>
> If I want things to be of particular sizes and alignments, I use clear
> fixed-size types. I don't have to figure out counts of padding bits and
> add them in. The current scheme - as supported by compilers today -
> lets you write code in a manner that gives a closer correspondence
> between the text you write and results you get, for low level code.

I'll look into the volatile stuff. But again, the idea that the type of
a bit-field *doesn't* affect the representation of the bit-field itself
but *does* affect the representation of the enclosing structure just
feels arbitrary and weird to me. And it's quite possible that I've
ranted about it enough.

...

After writing the above, I did a Google search for "struct bit-field
affects size". I found this post on Stack Overflow from 8 years ago:
https://stackoverflow.com/q/15980589/827263
The wording of the question seemed familiar. I figured out why when I
saw my own name at the bottom of the post. 8-)}

Following the links, there's some discussion of bit-fields in the
System V ABI for i386
http://www.sco.com/developers/devspecs/abi386-4.pdf

Summary: The idea that the declared type of a bit-field should affect
the size of the containing structure appears to be intuitively obvious
to some people (including, apparently, the authors of the i386 ABI), but
completely out of left field to others (at least to me). I have yet to
figure out why my intuition on this point differs so much from at least
some other people's. And I think I'm going to drop it unless someone
has some new insights to offer. Meanwhile, I rarely use bit-fields, and
when I do I'll be aware of how they work in practice.

Jakob Bohm

unread,
Jul 6, 2021, 5:18:06 PM7/6/21
to
Besides what Mr. Krause mentioned, a major reason is that many
embedded platforms are stuck with whichever actual compiler
implementation shipped with the first release of that platform.

Any new compilers tend to emulate that historic compiler extremely
closely in order to get identical or very closely equivalent
binaries from existing source code. And programmers who care
tend to read up on which language "enhancements" will make their
code incompatible with existing toolchains and smaller/cheaper/older
platform versions.

This essentially locks each platform family to that historic
compiler and its idiosyncrasies.

However, newly created embedded platforms, such as embedded RISC-V
cores or any new bit-sliced utility processor for FPGA projects, would
get frozen to whatever the commonly implemented language will be
when that platform is created. This makes it worthwhile to discuss
future improvements to the language for deeply embedded systems.

A language aspect that will be useful even for existing platforms is
a standard, easily backported, way to tell reusable code which language
implementation alternatives are in effect when compiling the portable
code with any old toolchain. Basically a supplement to <limits.h>
that covers all the stuff currently left only to "read the
implementation-defined features document hopefully provided by your
vendor".

Something that will allow a strictly conformant program or translation
unit to actively use whichever syntax that historic compiler uses for
things like "place this bit field at the most significant bit of the
second byte" or "allow +1 to set this bit field to 1" or what bit
encoding is used for signed ints (two's complement, one's complement,
sign+magnitude, etc.).

Every implementation-defined language aspect needs a standard define
indicating which alternative was chosen, with values for each common or
historic choice, and a (common) rule for how to add yet-to-be-invented
choices in a forward-compatible manner. But turning it into a formal
workable proposal is a major exercise.
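
A sketch of the shape such a supplement could take (every macro name
and value below is invented purely to illustrate the idea; nothing like
this exists in any standard or toolchain that I know of):

/* Hypothetical <stdimpl.h> */
#define __IMPL_SIGNED_ENCODING__  2   /* 1 = sign+magnitude,
                                         2 = two's complement,
                                         3 = one's complement   */
#define __IMPL_BITFIELD_ORDER__   1   /* 1 = MSB of unit first,
                                         2 = LSB of unit first  */

/* Portable code then adapts at compile time: */
#if __IMPL_BITFIELD_ORDER__ == 1
struct flags { unsigned char top : 1; unsigned char rest : 7; };
#else
struct flags { unsigned char rest : 7; unsigned char top : 1; };
#endif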


Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded

Tim Rentsch

unread,
Jul 10, 2021, 11:58:21 AM7/10/21
to
Ben Bacarisse <ben.u...@bsb.me.uk> writes:

[...]

> It seems wrong for bit fields to have an explicit type at all. What they
> need is only the signed/unsigned distinction for which there are already
> keywords.

I was about to say that I completely agree, but there is an
important exception, namely _Bool. A bitfield of type _Bool
has different semantics than a bitfield of type unsigned :1.
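
For example (standard C semantics: conversion to _Bool yields 1 for any
nonzero value, while storing into a 1-bit unsigned field reduces the
value modulo 2):

#include <stdio.h>

struct s {
    _Bool    b : 1;
    unsigned u : 1;
};

int main(void)
{
    struct s x;
    x.b = 2;  /* converts to _Bool: nonzero becomes 1 */
    x.u = 2;  /* reduced modulo 2: stored value is 0  */
    printf("%d %d\n", x.b, x.u);  /* prints "1 0"     */
    return 0;
}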

Tim Rentsch

unread,
Jul 10, 2021, 12:13:55 PM7/10/21
to
Keith Thompson <Keith.S.T...@gmail.com> writes:

> Tim Rentsch <tr.1...@z991.linuxsc.com> writes:
>
>> Philipp Klaus Krause <p...@spth.de> writes:
>>
>>> Am 23.06.21 um 19:53 schrieb Tim Rentsch:
>>>
>>>> Bit-fields of types signed and unsigned may use widths up to the
>>>> widths of intmax_t and uintmax_t, respectively.
>>>
>>> WG14 usually doesn't like to use the same syntax as C++ but with a
>>> different meaning.
>>> AFAIK, in C++, you can have an int bit-field wider than int (or whatever
>>> else the type is), but the upper bits then are padding bits.
>>
>> The obvious answer is to ask the C++ committee to change their
>> silly rule. And if they don't want to change it, just ignore it;
>> the C++ rule is already incompatible with C, now it will just be
>> incompatible in a different way. The gap between C and C++ has
>> increased to the point where they should be treated as separate
>> and independent languages.
>
> It's no more incompatible than any other C++ feature that doesn't exist
> in C. A program that defines a bit field with a length that exceeds the
> width of the type is a constraint violation in C, and valid in C++.

I don't know what point you're trying to make here. How it works
in C++ is different from how it works in C. That the C standard
deems it a constraint violation to me seems irrelevant. The
important point is that the behaviors in the two cases aren't
the same.
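
Concretely, a sketch (assuming 32-bit unsigned):

struct wide {
    unsigned bf : 64;  /* C: constraint violation, a diagnostic is
                          required (64 exceeds the width of
                          unsigned); C++: accepted, and the extra
                          32 bits are padding bits              */
};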

> On the other hand, I agree that the C++ rule *seems* silly. I wonder
> what the rationale is. (The earliest reference to the rule that I found
> is in the C++ 1998 standard.)
>
> It would be nice to avoid *gratuitous* incompatibilities between C and
> C++.

It would be even nicer in this particular case if C++ changed
its silly rule to the rule I suggested for C.

Tim Rentsch

unread,
Jul 10, 2021, 12:23:49 PM7/10/21
to
Philipp Klaus Krause <p...@spth.de> writes:

If WG21 is interested in keeping the common subset compatible, then
they should change their silly rule to something more sensible,
such as the rule I suggested. ISTM that the C++ folks always want
the C standard to change to accommodate them, but aren't all that
interested in reciprocating. It's time for that to change. And
if they aren't willing to change, screw 'em. You may quote me.