
sizeof(bitfield struct)


Rick C. Hodgin

Nov 4, 2018, 6:59:51 AM
I had something happen yesterday that surprised me. I was using a
bitfield struct:

#define u32 uint32_t
#define u16 uint16_t

struct STime
{
    u32 seconds2 : 5; // double-seconds 0-29
    u32 minute : 6;   // 0-59
    u32 hour : 5;     // 0-23
    // Total = 16 bits
};

And in another block of code I had this structure used in a union:

union
{
    u16 raw_time; // Time in bit-encoded form
    STime time;   // Time structure for member access
};

However, it was expanding the sizeof(STime) to 4 bytes, and was
making the union 4 bytes instead of 2 bytes. I was not expecting
that. I expected the sizeof(STime) to be the actual size of the
bits, and not what they expand to.

Is this normal in C++? It seems unnatural to expand the union
size to the derived type sizes, rather than the actual bit size.

--
Rick C. Hodgin

Pavel

Nov 4, 2018, 2:43:41 PM
A C++ object representation may include an implementation-specific number of
padding bits (see e.g. 6.7 in the latest standard draft, and 8.5.2.3
specifically for sizeof).

On my system, even your STime struct has sizeof of 4:
-------------------------
#include <cstdint>
#include <iostream>
using namespace std;

typedef uint32_t u32;

struct STime {
    u32 seconds2 : 5; // double-seconds 0-29
    u32 minute : 6;   // 0-59
    u32 hour : 5;     // 0-23
    // Total = 16 bits
};

int
main(int, char*[])
{
    cout << "sizeof(STime)=" << sizeof(STime) << endl;
    return 0;
}

------ result:
sizeof(STime)=4
--------

Compilers may provide pragmas or attributes to change padding; see e.g. the
docs for the gcc "packed" attribute.

HTH
-Pavel

bitrex

Nov 4, 2018, 2:50:36 PM
Why would it make sense for "sizeof" to return bitfield struct sizes
which aren't word-aligned, when AFAIK it's at best
implementation-dependent whether bitfield structs smaller than the native
word size can be packed and/or allocated across word boundaries? After
all, "sizeof" is usually used for calculating allocation sizes, not for
general informational purposes.

Rick C. Hodgin

Nov 4, 2018, 3:19:28 PM
Correct. I had to change the values from u32 to u16 to get it to
have a 2-byte representation within the union. That's the part I
wasn't expecting.

> Compilers may provide pragrmas attributes to change padding see e.g. docs for
> gcc "packed" attribute.

This was compiled in MSVC++, and I double-checked to make sure I
had the struct alignment set to bytes.

I'm really surprised that bit structs are expanded to the largest
member of their types, and not represented by the size of their
bit encoding. I actually consider it to be a flaw in C/C++ to do
it that way.

--
Rick C. Hodgin

Rick C. Hodgin

Nov 4, 2018, 3:26:45 PM
I could see asking for sizeof(STime.hour) and having it return 4
because its type is a 32-bit quantity.

But for the sizeof(STime) it's not 4. Each STime structure is
only two bytes. And changing the member values from u32 to u16
now made the structure be 2 bytes again.

Additionally, if you did it thusly:

STime* t = get_a_valid_t_array_of_at_least_10_elements();

for (int i = 0; i < 10; ++i)
    ++t;

It's going to increase by 2 bytes per iteration. You couldn't
change that code to something like this and have it work properly
if your STime members were u32. But, if you change them to u16,
then this code would work properly:

for (int i = 0; i < 10; ++i)
    t = (STime*)((char*)t + sizeof(STime));

The value for t would be skewed if STime's members were u32 instead
of u16.

--
Rick C. Hodgin

Pavel

Nov 4, 2018, 5:38:03 PM
This is also not so for gcc. When I add the "packed" attribute (for gcc), the
sizeof of your struct with u32 bit fields becomes 2:

$ cat sbf.cpp
#include <cstdint>
#include <iostream>
using namespace std;

typedef uint32_t u32;

struct STime {
    u32 seconds2 : 5 __attribute__ ((packed)); // double-seconds 0-29
    u32 minute : 6 __attribute__ ((packed));   // 0-59
    u32 hour : 5 __attribute__ ((packed));     // 0-23
    // Total = 16 bits
};

int
main(int, char*[])
{
    cout << "sizeof(STime)=" << sizeof(STime) << endl;
    return 0;
}
$ g++ -std=c++11 ./sbf.cpp
$ ./a.out
sizeof(STime)=2
$

>
>> Compilers may provide pragrmas attributes to change padding see e.g. docs for
>> gcc "packed" attribute.
>
> This was compiled in MSVC++, and I double-checked to make sure I
> had the struct alignment set to bytes.
>
> I'm really surprised that bit structs are expanded to the largest
> member of their types,
Note this seems just an implementation choice of MSVC as per the results of my
test above.
> and not represented by the size of their
> bit encoding. I actually consider it to be a flaw in C/C++ to do
> it that way.
C/C++ as such does not define whether or how many padding bits are added; it is
implementation-specific. The newer (2011+) standard, however, does provide some
more tools to control alignment, e.g. the "alignas" specifier, but it cannot
make alignment weaker.
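
As a rough illustration of the "cannot make alignment weaker" point, here is
a minimal sketch. The names STime16 and STimeAligned are made up for this
example, and it assumes the compiler accepts uint16_t as a bit-field type,
which is itself implementation-defined.

```cpp
#include <cstdint>

// Hypothetical variant of STime using a 16-bit declared type.
struct STime16 {
    std::uint16_t seconds2 : 5;  // double-seconds 0-29
    std::uint16_t minute   : 6;  // 0-59
    std::uint16_t hour     : 5;  // 0-23
};

// alignas may request a stricter alignment than the natural one...
struct alignas(4) STimeAligned {
    std::uint16_t seconds2 : 5;
    std::uint16_t minute   : 6;
    std::uint16_t hour     : 5;
};

// ...and the size then has to be a multiple of that alignment.
static_assert(alignof(STimeAligned) == 4, "alignas(4) was not honoured");
static_assert(sizeof(STimeAligned) % alignof(STimeAligned) == 0,
              "size must be a multiple of alignment");
static_assert(alignof(STimeAligned) >= alignof(STime16),
              "alignas can only strengthen alignment");

// Requesting less than the natural alignment, e.g. alignas(1) on a struct
// holding a uint16_t member, is ill-formed according to the standard.
```

So alignas is a tool for growing a struct, not for shrinking it back to its
bit count.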

To regain confidence in how your structs are aligned, you might consider using
static asserts. E.g. (adapted from http://www.catb.org/esr/structure-packing/)

static_assert(sizeof(struct STime) == 2, "Check your assumptions");
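
The same idea extends to the union from the original post. This is only a
sketch: it assumes the fields are declared with a 16-bit type (as Rick ended
up doing), and the union name TimeField is invented here because the
original union is anonymous.

```cpp
#include <cstdint>

struct STime {
    std::uint16_t seconds2 : 5;  // double-seconds 0-29
    std::uint16_t minute   : 6;  // 0-59
    std::uint16_t hour     : 5;  // 0-23
};

union TimeField {                // hypothetical name
    std::uint16_t raw_time;      // time in bit-encoded form
    STime         time;          // time structure for member access
};

// Fail the build, rather than misbehave at run time, if the layout
// assumptions do not hold on this compiler.
static_assert(sizeof(STime) == 2, "Check your assumptions");
static_assert(sizeof(TimeField) == sizeof(std::uint16_t),
              "union grew beyond raw_time");
static_assert(alignof(TimeField) == alignof(std::uint16_t),
              "union alignment differs from raw_time");
```

These assertions say nothing about which bits each field occupies; they
only pin down size and alignment.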

HTH
-Pavel

Rick C. Hodgin

Nov 4, 2018, 5:50:21 PM
On Sunday, November 4, 2018 at 5:38:03 PM UTC-5, Pavel wrote:
> To regain confidence in how your structs are aligned, you might consider using
> static asserts. E.g. (adopted from http://www.catb.org/esr/structure-packing/)
>
> static_assert(sizeof(struct STime) == 2, "Check your assumptions");
>
> HTH
> -Pavel

Definitely helps. Thank you, Pavel.

--
Rick C. Hodgin

David Brown

Nov 4, 2018, 6:07:05 PM
On 04/11/2018 12:59, Rick C. Hodgin wrote:
> I had something happen yesterday that surprised me. I was using a
> bitfield struct:
>
> #define u32 uint32_t
> #define u16 uint16_t
>
> struct STime
> {
> u32 seconds2 : 5; // double-seconds 0-29
> u32 minute : 6; // 0-59
> u32 hour : 5; // 0-23
> // Total = 16 bits
> };
>

When you write "u32 seconds2 : 5;", what you are saying is "make
seconds2 5 bits of a u32". So the struct STime has a u32, gives 5 bits
to seconds2, 6 bits to minute, and 5 bits to hour. The remaining 16
bits are unused - but they still take up space in the struct, and still
affect alignment.

If you want this all to be within a 16-bit struct, use u16 (or uint16_t).
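
A minimal sketch of that advice, with the two variants side by side. The
names STime32 and STime16 are hypothetical, and the sizes mentioned in the
comments are what mainstream desktop compilers commonly produce rather than
anything the standard guarantees.

```cpp
#include <cstdint>

// Same bit widths, 32-bit declared type: commonly 4 bytes.
struct STime32 {
    std::uint32_t seconds2 : 5;  // double-seconds 0-29
    std::uint32_t minute   : 6;  // 0-59
    std::uint32_t hour     : 5;  // 0-23
};

// Same bit widths, 16-bit declared type: commonly 2 bytes.
struct STime16 {
    std::uint16_t seconds2 : 5;
    std::uint16_t minute   : 6;
    std::uint16_t hour     : 5;
};

// Check the assumption instead of trusting it.
static_assert(sizeof(STime16) == sizeof(std::uint16_t),
              "STime16 is not 16 bits on this compiler");
static_assert(sizeof(STime16) <= sizeof(STime32),
              "narrower declared type should not make the struct larger");
```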

(Compilers may let you reduce this with extra features like "packed"
attributes or pragmas.)

Remember that quite a lot about bit-fields is
implementation-specific. That might be fine for you, but you will have
to check that they work as expected on the compilers you use. The rules
I am giving here are for C, rather than C++ - I expect them to be
roughly the same for C++, but I am not familiar enough to be sure.
(Hopefully someone will correct me if I'm wrong.)

First, the type should be _Bool (bool for C++), signed int, unsigned
int, or an implementation-defined type. This means that u16, which is
"unsigned short" on most platforms, may not be supported. In practice
most modern compilers /will/ support it, but as I say - check.

The order of bit-fields packing is up to the implementation. On most
little-endian systems, ordering is from least-significant-bit onwards
but that is not guaranteed. AFAIK on very old versions of MS C
compilers, the order was most-significant-bit first, and then they
changed it for the next version of the compiler. There was a lot of
wailing at the time, but it has been consistent since then.

If you have something like:

uint16_t field1 : 10;
uint16_t field2 : 10;

it is implementation-defined if part of field2 is included in the first
uint16_t, or if it is all put in the second uint16_t.
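
To make the straddling case concrete, here is a small sketch. The struct
name TwoFields is invented for this example; the only thing asserted is the
portable lower bound, since the actual size depends on the choice described
above.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

struct TwoFields {               // hypothetical name
    std::uint16_t field1 : 10;
    std::uint16_t field2 : 10;   // only 6 bits remain in a 16-bit unit
};

// Portable: 20 bits of payload have to be stored somewhere.
static_assert(sizeof(TwoFields) * CHAR_BIT >= 20,
              "struct too small to hold both fields");

// Not portable: the exact value depends on whether field2 starts a new
// unit or overlaps the first one, so it is recorded rather than asserted.
constexpr std::size_t two_fields_size = sizeof(TwoFields);
```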


Bit-fields can be very useful, but you have to take care, especially if
you want them to work across compilers or targets.


> And in another block of code I had this structure used in a union:
>
> union
> {
> u16 raw_time; // Time in bit-encoded form
> STime time; // Time structure for member access
> };
>
> However, it was expanding the sizeof(STime) to 4-bytes, and was
> making the union be 4-bytes instead of 2-bytes. I was not ex-
> pecting that. I expected the sizeof(STime) to be the actual
> size of the bits, and not what they expand to.
>
> Is this normal in C++? It seems unnatural to expand the union
> size to the derived type sizes, rather than the actual bit size.
>

The size (and alignment) comes from the size of the underlying type,
which is 32-bit in this case.

Mr Flibble

Nov 5, 2018, 1:56:51 PM
On 04/11/2018 20:19, Rick C. Hodgin wrote:
[snip]

> I'm really surprised that bit structs are expanded to the largest
> member of their types, and not represented by the size of their
> bit encoding. I actually consider it to be a flaw in C/C++ to do
> it that way.

And Satan invented fossils, yes?

/Flibble

--
“You won’t burn in hell. But be nice anyway.” – Ricky Gervais

“I see Atheists are fighting and killing each other again, over who
doesn’t believe in any God the most. Oh, no..wait.. that never happens.” –
Ricky Gervais

"Suppose it's all true, and you walk up to the pearly gates, and are
confronted by God," Bryne asked on his show The Meaning of Life. "What
will Stephen Fry say to him, her, or it?"
"I'd say, bone cancer in children? What's that about?" Fry replied.
"How dare you? How dare you create a world to which there is such misery
that is not our fault. It's not right, it's utterly, utterly evil."
"Why should I respect a capricious, mean-minded, stupid God who creates a
world that is so full of injustice and pain. That's what I would say."

Rick C. Hodgin

Nov 5, 2018, 2:55:44 PM
On Monday, November 5, 2018 at 1:56:51 PM UTC-5, Mr Flibble wrote:
> And Satan invented fossils, yes?

No. You're listening to someone else suggesting that. Not
the Bible, not me, not the truth. You're letting falseness
rule in your thinking, and not the light of truth.

--
Rick C. Hodgin

Mr Flibble

Nov 5, 2018, 2:57:34 PM
On 05/11/2018 19:55, Rick C. Hodgin wrote:
[snip]
> No. You're listening to someone else suggesting that. Not
> the Bible, not me, not the truth. You're letting falseness
> rule in your thinking, and not the light of truth.

And Satan invented fossils, yes?

Ian Collins

Nov 5, 2018, 3:59:31 PM
On 05/11/18 09:19, Rick C. Hodgin wrote:

> I'm really surprised that bit structs are expanded to the largest
> member of their types, and not represented by the size of their
> bit encoding. I actually consider it to be a flaw in C/C++ to do
> it that way.

Why? Bit fields are deliberately loosely specified (apart from the
size..) with much being implementation defined (such as the ordering)
because of differences in hardware support. If you want 16 bits, use
uint16_t. Conversely, if you only want to map a couple of bits in a 32
bit register, you'd use a uint32_t - without silly #defines!

--
Ian.

Rick C. Hodgin

Nov 5, 2018, 4:03:21 PM
That's a philosophical position, the one C/C++ took.

My view is if I define a pattern of 16 bits in a particular
order, the machine had better represent them as I indicate,
and when I step through some structure I'm expecting the ptr
to move by the size of the bits, and not their expressed size.

I truly view this as a fundamental failure of C/C++.

--
Rick C. Hodgin

Ian Collins

Nov 5, 2018, 4:20:39 PM
So if you only declare a 13 bit pattern?

If you wanted 13 bits, why did you use a 32 bit underlying type?

--
Ian.

Rick C. Hodgin

Nov 5, 2018, 4:27:13 PM
Increment by bit count b:

step_size = (b / 8) + ((b % 8) == 0 ? 0 : 1);
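
Written out as a compile-time function, the formula above looks roughly like
this. It is a sketch that assumes 8-bit bytes, as the formula itself does;
it describes the rounding being proposed, not what sizeof does in C or C++.

```cpp
// Bytes needed to hold b bits, rounding up to the next whole byte.
constexpr unsigned step_size(unsigned b) {
    return (b / 8) + ((b % 8) == 0 ? 0u : 1u);
}

static_assert(step_size(13) == 2, "13 bits round up to 2 bytes");
static_assert(step_size(16) == 2, "16 bits are exactly 2 bytes");
static_assert(step_size(17) == 3, "17 bits round up to 3 bytes");
```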

--
Rick C. Hodgin

Keith Thompson

Nov 5, 2018, 9:00:18 PM
David Brown <david...@hesbynett.no> writes:
[...]
> When you write "u32 seconds2 : 5;", what you are saying is "make
> seconds2 5 bits of a u32". So the struct STime has a u32, gives 5 bits
> to seconds2, 6 bits to minute, and 5 bits to hour. The remaining 16
> bits are unused - but they still take up space in the struct, and still
> affect alignment.
>
> If you want this all to be within a 16-bit struct, use u16 (or uint16_t).
[...]
> The size (and alignment) comes from the size of the underlying type,
> which is 32-bit in this case.

The standard says that allocation of bit-fields is
implementation-defined. I don't see anything in the standard that
implies either that there is, or that there isn't, any difference
between, for example,
int8_t bitfield : 5;
and
int32_t bitfield : 5;
though I think some ABIs impose more specific requirements.

--
Keith Thompson (The_Other_Keith) k...@mib.org <http://www.ghoti.net/~kst>
Will write code for food.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Rick C. Hodgin

Nov 5, 2018, 9:53:32 PM
On Monday, November 5, 2018 at 9:00:18 PM UTC-5, Keith Thompson wrote:
> The standard says that allocation of bit-fields is
> implementation-defined. I don't see anything in the standard that
> implies either that there is, or that there isn't, any difference
> between, for example,
> int8_t bitfield : 5;
> and
> int32_t bitfield : 5;

+1.

> though I think some ABIs impose more specific requirements.

They'd have no choice, but to compute the two examples above
for sizeof() to be int8_t = 1 and int32_t = 4 is just wrong
on every level.

--
Rick C. Hodgin

David Brown

Nov 6, 2018, 4:50:07 AM
On 06/11/18 03:00, Keith Thompson wrote:
> David Brown <david...@hesbynett.no> writes:
> [...]
>> When you write "u32 seconds2 : 5;", what you are saying is "make
>> seconds2 5 bits of a u32". So the struct STime has a u32, gives 5 bits
>> to seconds2, 6 bits to minute, and 5 bits to hour. The remaining 16
>> bits are unused - but they still take up space in the struct, and still
>> affect alignment.
>>
>> If you want this all to be within a 16-bit struct, use u16 (or uint16_t).
> [...]
>> The size (and alignment) comes from the size of the underlying type,
>> which is 32-bit in this case.
>
> The standard says that allocation of bit-fields is
> implementation-defined. I don't see anything in the standard that
> implies either that there is, or that there isn't, any difference
> between, for example,
> int8_t bitfield : 5;
> and
> int32_t bitfield : 5;
> though I think some ABIs impose more specific requirements.
>

Certainly it is common for ABIs (or compiler implementations) to use
the declared underlying type for allocation sizes and alignments. It is
also common in embedded systems (where bitfields and volatile are
common) to insist that the compiler respects the access size when
dealing with volatile structs with bitfields.

But as for the standards, I don't see it clearly expressed - so it does
not seem to be a requirement. It talks about "the addressable storage
unit holding the bit-field", and C11 6.7.2.1 describes how packing of
bits within these units is to be done (with plenty of room for
implementation defined behaviour). There does not appear to be anything
saying that the "addressable storage unit" here has to be of the size
and type of the bit-field's type - indeed, "An implementation may
allocate any addressable storage unit large enough to hold a bit-field".

However, the usefulness of bit-fields for "conforming to externally
imposed layouts" would be hindered if the implementation did not give
the programmer control over the sizes here.

Older C compilers sometimes had support for only "signed int" and
"unsigned int" as bit-field types - most that I have used in the past
couple of decades have supported a range of types. It is certainly
possible that compilers with only "int" and "unsigned int" as types
would have used smaller underlying storage units, while I think more
modern ones use the type you ask for. But it is, naturally,
implementation-dependent.

(I have looked through the C++ standards too, and didn't see any
difference here.)

David Brown

Nov 6, 2018, 4:57:41 AM
On 06/11/18 03:53, Rick C. Hodgin wrote:
> On Monday, November 5, 2018 at 9:00:18 PM UTC-5, Keith Thompson wrote:
>> The standard says that allocation of bit-fields is
>> implementation-defined. I don't see anything in the standard that
>> implies either that there is, or that there isn't, any difference
>> between, for example,
>> int8_t bitfield : 5;
>> and
>> int32_t bitfield : 5;
>
> +1.
>

It looks like the standards don't impose rules here, but ABIs and
implementations do.

>> though I think some ABIs impose more specific requirements.
>
> They'd have no choice, but to compute the two examples above
> for sizeof() to be int8_t = 1 and int32_t = 4 is just wrong
> on every level.
>

No, it is certainly not wrong according to the C (or C++) standards, nor
is it wrong according to the ABIs of many processors and C
implementations. And speaking as someone who uses bitfields regularly
in cases where exact layouts and accesses are important, it is often
/essential/ that the type of the bitfield be used for the size,
alignment and accesses.

It is extremely simple to get the layout you want - use an appropriately
sized underlying type. (Yes, support for that is
implementation-dependent - but so are most things regarding bit-fields.)



Öö Tiib

Nov 6, 2018, 6:09:07 AM
On Tuesday, 6 November 2018 11:57:41 UTC+2, David Brown wrote:
>
> It is extremely simple to get the layout you want - use an appropriately
> sized underlying type. (Yes, support for that is
> implementation-dependant - but so are most things regarding bit-fields.)

When it is "extremely simple" (just implementation dependent) then there
must be a standard-compliant way to static assert that we got the layout
that we wanted. Is there such a way?

David Brown

Nov 6, 2018, 6:17:45 AM
Lots of things in C and C++ sound extremely simple, and are extremely
simple to describe - yet turn out to be really hard with nothing but
standard compliant code!

Here I'd say a static assert on the sizeof the struct was the best you
can do - that would likely fail if you've made incorrect assumptions
about the layout. But it won't cover everything, and it certainly won't
cover the order of the bit-fields.
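
A run-time check can cover part of what a sizeof assertion misses. The
sketch below assumes the 16-bit variant of STime, copies its bytes into a
uint16_t with memcpy, and reports whether seconds2 landed in the low five
bits. The helper names raw_image and seconds2_is_low_bits are invented for
this example.

```cpp
#include <cstdint>
#include <cstring>

struct STime {
    std::uint16_t seconds2 : 5;  // double-seconds 0-29
    std::uint16_t minute   : 6;  // 0-59
    std::uint16_t hour     : 5;  // 0-23
};

// The size check still comes first; the copy below relies on it.
static_assert(sizeof(STime) == sizeof(std::uint16_t),
              "STime is not 16 bits on this compiler");

// Returns the raw 16-bit image of an STime by copying its bytes.
inline std::uint16_t raw_image(const STime& t) {
    std::uint16_t raw;
    std::memcpy(&raw, &t, sizeof raw);
    return raw;
}

// True if seconds2 occupies the least-significant five bits of the image.
inline bool seconds2_is_low_bits() {
    STime t{};          // all fields zero
    t.seconds2 = 0x1F;  // set all five bits of the first field
    return raw_image(t) == 0x001F;
}
```

A unit test or start-up check could call seconds2_is_low_bits() and refuse
to continue if the answer is not the one the rest of the code assumes.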


Rick C. Hodgin

Nov 6, 2018, 7:43:58 AM
Have you ever seen the movie [Ice Station Zebra]? In my best Russian
accent: "And now I must tell you how wrong you are."

The bit expression is a bit expression. And, if you truly need something
like Ian's 13-bit packed data structs, use interleaving structs
which begin at byte boundaries and ignore the previous portions
from prior structs. But the sizeof() a bit packed struct should
be rounded to the nearest larger byte.

The expressed type of those bits should not affect in any way the
sizeof() operation. It should always report its byte size occupancy
as rounded up to the nearest byte.

Only when I use sizeof(bitstruct.member) should it report the sizeof()
that type.

It's just flatly wrong... and now I have to support this failure
in philosophy in CAlive for compatibility.

To quote C3PO: "Oooh. How horrid."

--
Frederick Clarkson Hodgin, III

PS - Bonus points if you can figure out how they got "Rick" from that.

Juha Nieminen

Nov 6, 2018, 8:38:02 AM
Öö Tiib <oot...@hot.ee> wrote:
> When it is "extremely simple" (just implementation dependent) then there
> must be is a standard compliant way to static assert that we got the layout
> that we wanted. Is there such a way?

There aren't always ways to static_assert such things.

For example, there is no standard-compliant way to static_assert that
we are compiling for a little-endian system. (This might be on purpose.)

Öö Tiib

Nov 6, 2018, 9:01:42 AM
I really can't imagine the purpose. For example:

| C99 6.4.4.4p10: "The value of an integer character constant containing
| more than one character (e.g., 'ab'), or containing a character or escape
| sequence that does not map to a single-byte execution character, is
| implementation-defined."

Why? Who benefits from such a wasted feature? That could be defined
to follow order of bytes on platform.

David Brown

Nov 6, 2018, 9:02:38 AM
On 06/11/18 13:43, Rick C. Hodgin wrote:
> On Tuesday, November 6, 2018 at 4:57:41 AM UTC-5, David Brown wrote:
>> On 06/11/18 03:53, Rick C. Hodgin wrote:
>>> They'd have no choice, but to compute the two examples above
>>> for sizeof() to be int8_t = 1 and int32_t = 4 is just wrong
>>> on every level.
>>>
>>
>> No, it is certainly not wrong according to the C (or C++) standards, nor
>> is it wrong according to the ABI's of many processors and C
>> implementations. And speaking as someone who uses bitfields regularly
>> in cases where exact layouts and accesses are important, it is often
>> /essential/ that the type of the bitfield be used for the size,
>> alignment and accesses.
>>
>> It is extremely simple to get the layout you want - use an appropriately
>> sized underlying type. (Yes, support for that is
>> implementation-dependant - but so are most things regarding bit-fields.)
>
> Have you ever seen the movie [Ice Station Zebra]? In my best Russian
> accent: "And now I must tell you how wrong you are."
>
> The bit expression is a bit expression. And, if you truly need something
> like Ian's 13-bit packed data structs, use interleaving structs
> which begin at byte boundaries and ignore the previous portions
> from prior structs. But the sizeof() a bit packed struct should
> be rounded to the nearest larger byte.

You write that as though it was some sort of indisputable fact. It is
not fact, and it is not indisputable - it is nothing more than your idea
of what you think would be best. It is, of course, fine to have your
opinion - but realise that it is /your/ opinion, and it does not appear
to match the opinion of people making or using C compilers. In
particular, the idea that there is one fixed "right" way to do things
here simply has no anchoring in reality.

If you want to use bit-fields, you need to be very careful to check the
layout matches what you want. Don't assume anything, don't assume
consistency between systems. And test rigorously.

Some ABIs will define the bit-field layout exactly - some compilers may
not follow these ABIs exactly. Compilers can have options that affect
the layout, and can have extensions ("packed" attributes or pragmas)
that affect them.

From a bit of experimenting on godbolt.org, I only found one compiler
for which this struct was not the same size as "int" :

struct A {
    int x : 1;
    int y : 2;
};

That was the AVR, which is an 8-bit microcontroller, where you usually
want to save ram. It used one byte for A. (The AVR is also the only
compiler here where the alignment of "int" is less than its size. Being
8-bit, there are no benefits in having larger alignments for any type.)
The msp430 compiler, being a 16-bit device, used 2 bytes. All the rest
used 4.
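
For anyone wanting to repeat the experiment on their own compiler, a sketch
along these lines should do. The constant name a_is_int_sized is invented
here, and nothing is asserted because the result is exactly the
implementation-defined part.

```cpp
struct A {
    int x : 1;
    int y : 2;
};

// True where the compiler gives A one "int"-sized storage unit; the AVR
// compiler mentioned above is reported to give false (one byte for A).
constexpr bool a_is_int_sized = (sizeof(A) == sizeof(int));
```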

>
> The expressed type of those bits should not affect in any way the
> sizeof() operation. It should always report its byte size occupancy
> as rounded up to the nearest byte.
>

Again, that is an assertion without justification and without foundation
in reality. That is, quite simply, not the case in practice.

> Only when I use sizeof(bitstruct.member) should it report the sizeof()
> that type.

That is not allowed at all in C or C++. (It would not be unreasonable
to have some sort of "bitsizeof" operator here, but it does not exist in
C or C++.)

>
> It's just flatly wrong... and now I have to support this failure
> in philosophy in CAlive for compatibility.

I think you overrate your opinions. Does it never occur to you that
when other tools handle something differently, it is /you/ that is
wrong? Alternatively, perhaps it simply doesn't matter that much -
people manage to get what they need? We are not talking about a
complicated or unintuitive workaround here - you want everything to fit
in a 16-bit struct, so use 16-bit types instead of 32-bit types.

What you can be sure of, is that however you implement things in your
own language, it will be in conflict with the way bit-fields work with
some compilers.

Rick C. Hodgin

Nov 6, 2018, 9:08:39 AM
On Tuesday, November 6, 2018 at 9:02:38 AM UTC-5, David Brown wrote:
> On 06/11/18 13:43, Rick C. Hodgin wrote:
> > The bit expression is a bit expression ...
> > But the sizeof() a bit packed struct should
> > be rounded to the nearest larger byte.
>
> You write that as though it was some sort of indisputable fact.

It is.

> > The expressed type of those bits should not affect in any way the
> > sizeof() operation. It should always report its byte size occupancy
> > as rounded up to the nearest byte.
> >
>
> Again, that is an assertion...

Again, it is. And I'm amazed to find so many compiler authors
doing it wrong.

Bitfield structs require some special attention. The question
exists in how can I inquire at runtime the sizeof() the bits
in the bitfield, compared to the sizeof() the type it represents,
compared to the sizeof() the entire bitfield struct, compared to
the size of the entire structure expanded to its expressed types.

I think bitfield structs are not yet completely cooked in the
existing compilers. Just enough to get it close, but they didn't
bring it to a close.

--
Rick C. Hodgin

David Brown

Nov 6, 2018, 9:40:30 AM
On 06/11/18 15:08, Rick C. Hodgin wrote:
> On Tuesday, November 6, 2018 at 9:02:38 AM UTC-5, David Brown wrote:
>> On 06/11/18 13:43, Rick C. Hodgin wrote:
>>> The bit expression is a bit expression ...
>>> But the sizeof() a bit packed struct should
>>> be rounded to the nearest larger byte.
>>
>> You write that as though it was some sort of indisputable fact.
>
> It is.

This is like arguing with a wall.

Look, your opinion does not match that of compiler writers, ABI
designers, language designers or programmers. These folks have come to
different conclusions about the best way to implement bit-fields - not
always to the same conclusions, but often different from yours.

You are not smarter than these folks. You are not able to see something
that they are missing. You are not able to divine some sort of
"fundamental" issue to which they are blind. You are simply a developer
with an opinion (which is fine), and an over-inflated sense of the value
of that opinion (which is not fine).

>
>>> The expressed type of those bits should not affect in any way the
>>> sizeof() operation. It should always report its byte size occupancy
>>> as rounded up to the nearest byte.
>>>
>>
>> Again, that is an assertion...
>
> Again, it is. And I'm amazed to find so many compiler authors
> doing it wrong.
>
> Bitfield structs require some special attention. The question
> exists in how can I inquire at runtime the sizeof() the bits
> in the bitfield, compared to the sizeof() the type it represents,
> compared to the sizeof() the entire bitfield struct, compared to
> the size of the entire structure expanded to its expressed types.
>

You can't - not with standard C or C++, and not with any existing
extensions I am aware of. People have programmed in C for 40 years
without such features, so I don't think the need of them is particularly
pressing. But if you feel they would be useful, put them into your own
language. (I have often felt that a bit more introspection capabilities
in C and C++ would be nice, though I have never felt the need of it for
bit-fields.)

(There is a proposal for adding introspection to C++, but that is going
to be a much bigger feature.)

> I think bitfield structs are not yet completely cooked in the
> existing compilers. Just enough to get it close, but they didn't
> bring it to a close.
>

Bit-fields certainly have their limitations, and their loose
implementation-defined nature makes them difficult for portability.
With appropriate care, they have their place in programming, and many
people use them. But when designing a new language with different
requirements than C, it may make sense to create an alternative with
much stricter rules.


Rick C. Hodgin

Nov 6, 2018, 9:49:56 AM
On Tuesday, November 6, 2018 at 9:40:30 AM UTC-5, David Brown wrote:
> On 06/11/18 15:08, Rick C. Hodgin wrote:
> > Bitfield structs require some special attention. The question
> > exists in how can I inquire at runtime the sizeof() the bits
> > in the bitfield, compared to the sizeof() the type it represents,
> > compared to the sizeof() the entire bitfield struct, compared to
> > the size of the entire structure expanded to its expressed types.
>
> You can't - not with standard C or C++, and not with any existing
> extensions I am aware of. People have programmed in C for 40 years
> without such features...

An oversight by all to be sure.

They're not pressing, but you can't even obtain that information in
today's compilers. It is most definitely an oversight.

You say talking to me is like talking to a wall? I just happen to
know I'm right. And today I've added bounce structs to CAlive to
allow reflection mirrors of the bitfield types into their native
types, and back again. I'll also allow the sizeof() to return the
size in bytes, and the bitsizeof() to return the size in bits, and
that will be globally, not just on bitfield structs or members.

Good day, David.

--
Rick C. Hodgin

David Brown

Nov 6, 2018, 10:08:16 AM
Adding a "bitsizeof" operator for bit-fields seems reasonable enough. I
don't see a lot of potential use for it, but I know we have very
different opinions on that sort of thing (I want to know a feature is
very useful before adding it, you prefer to add early and let the user
decide if they want to use it).

Rick C. Hodgin

Nov 6, 2018, 10:32:51 AM
On Tuesday, November 6, 2018 at 10:08:16 AM UTC-5, David Brown wrote:
> Adding a "bitsizeof" operator for bit-fields seems reasonable enough. I
> don't see a lot of potential use for it, but I know we have very
> different opinions on that sort of thing (I want to know a feature is
> very useful before adding it, you prefer to add early and let the user
> decide if they want to use it).

Yes, because I'm not arrogant enough to think I have all the
answers to all of the coding scenarios. I want to empower each
developer to be able to reach into the vast toolbox and use the
tools they need.

People who trim trees drive similar trucks to those who install
power lines, cable and Internet, yet I bet each of those trucks
that look outwardly similar have vastly different tools in the
toolboxes.

For different types of application design, it's like that. Some
trim trees, some install power lines, and some work with the
electronic infrastructure and fiber optics to make TV and Internet
work.

It comes down to the individual at work doing their job, even
though many of the tools will be similar, the specific details
give aid and benefit where needed, and can be largely ignored
otherwise.

--
Rick C. Hodgin

Mr Flibble

Nov 6, 2018, 12:42:00 PM
On 06/11/2018 14:49, Rick C. Hodgin wrote:
[snip]
> I just happen to know I'm right.

And Satan invented fossils, yes?

Rick C. Hodgin

Nov 6, 2018, 12:51:39 PM
On Tuesday, November 6, 2018 at 12:42:00 PM UTC-5, Mr Flibble wrote:
> On 06/11/2018 14:49, Rick C. Hodgin wrote:
> [snip]
> > I just happen to know I'm right.
>
> And Satan invented fossils, yes?

No. He invented the lie that says he invented fossils, and
then spread that lie around under false pretenses, designed
entirely to deceive people who will not seek the truth under
any circumstance whatsoever.

It breaks my heart, Leigh. If I knew how to reach you in a
way which would reach you ... I'd do it.

--
Rick C. Hodgin

james...@alumni.caltech.edu

unread,
Nov 6, 2018, 1:28:54 PM11/6/18
to
On Tuesday, November 6, 2018 at 4:50:07 AM UTC-5, David Brown wrote:
...
> Certainly it is common for ABI's (or compiler implementations) to use
> the declared underlying type for allocation sizes and alignments. It is
> also common in embedded systems (where bitfields and volatile are
> common) to insist that the compiler respects the access size when
> dealing with volatile structs with bitfields.
>
> But as for the standards, I don't see it clearly expressed - so it does
> not seem to be a requirement. It talks about "the addressable storage
> unit holding the bit-field", and C11 6.7.2.1 describes how packing of
> bits within these units is to be done (with plenty of room for
> implementation defined behaviour). There does not appear to be anything
> saying that the "addressable storage unit" here has to be of the size
> and type of the bit-field's type - indeed, "An implementation may
> allocate any addressable storage unit large enough to hold a bit-field".

The only requirements on the "addressable storage unit" are: "If enough
space remains, a bit-field that immediately follows another bit-field in
a structure shall be packed into adjacent bits of the same unit. If
insufficient space remains, whether a bit-field that does not fit is put
into the next unit or overlaps adjacent units is implementation-defined.
The order of allocation of bit-fields within a unit (high-order to low-
order or low-order to high-order) is implementation-defined."
(6.7.2.1p11).
There's no requirement that the size of the unit be documented, nor that
it be the same for different struct types. There's certainly no
requirement that it be connected to the type used to declare a bit-
field.

> However, the usefulness of bit-fields for "conforming to externally
> imposed layouts" would be hindered if the implementation did not give
> the programmer control over the sizes here.

Strictly speaking, the requirements that the C standard fails to impose
on the layout of structs in particular, and on bit-fields in particular,
make C structs useless, in code intended to be portable, for describing
externally imposed layouts.
I don't approve of this (as shocking as that fact might seem to those
who claim that I worship the holy standard). But this state of affairs
exists because, at the time the first standard was being written,
different compilers implemented different rules for the layout of
structures, and no agreement could be reached to mandate stricter
requirements than the ones we currently have. That state of affairs
continues to the present, and requirements of backwards compatibility
make it difficult to impose any new restrictions in future versions of
the standard. The best we could hope for is some new way of defining
struct types (or perhaps, something struct-like but using different
syntax) that have layouts more tightly constrained by the standard than
ordinary struct types.
In the meantime, the closest you can get to portable code for
externally-imposed memory layouts is to use mask and shift operations on
the members of arrays of unsigned char.
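
A minimal sketch of that mask-and-shift approach, applied to the 16-bit time value from the start of this thread. The bit positions (seconds2 in bits 0-4, minute in bits 5-10, hour in bits 11-15) and the little-endian byte order are assumptions made for illustration, as are the names Time, encode_time and decode_time; none of them come from the original post.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical decoded form of the 16-bit time value.
struct Time {
    unsigned seconds2; // double-seconds, 0-29
    unsigned minute;   // 0-59
    unsigned hour;     // 0-23
};

// Assumed layout: seconds2 in bits 0-4, minute in bits 5-10,
// hour in bits 11-15, low byte stored first.
inline void encode_time(const Time& t, unsigned char out[2])
{
    assert(t.seconds2 < 30 && t.minute < 60 && t.hour < 24);
    const std::uint16_t raw = static_cast<std::uint16_t>(
        t.seconds2 | (t.minute << 5) | (t.hour << 11));
    out[0] = static_cast<unsigned char>(raw & 0xFFu);        // low byte
    out[1] = static_cast<unsigned char>((raw >> 8) & 0xFFu); // high byte
}

inline Time decode_time(const unsigned char in[2])
{
    // Reassemble the 16-bit value from bytes, independent of host endianness.
    const unsigned raw = static_cast<unsigned>(in[0]) |
                         (static_cast<unsigned>(in[1]) << 8);
    Time t;
    t.seconds2 = raw & 0x1Fu;         // 5 bits
    t.minute   = (raw >> 5) & 0x3Fu;  // 6 bits
    t.hour     = (raw >> 11) & 0x1Fu; // 5 bits
    return t;
}
```

Because the byte order and bit positions are spelled out in the code, the two-byte result does not depend on how a given compiler allocates bit-fields or pads structs.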

> Older C compilers sometimes had support for only "signed int" and
> "unsigned int" as bit-field types -

Those two are the only types that all versions of the standard have
mandated support for - since C99 it also mandates support for _Bool as a
bit-field's type. Implementations are explicitly allowed to support
other integer types, but code that needs to be portable can't take
advantage of that fact.

> ... most that I have used in the past
> couple of decades have supported a range of types. It is certainly
> possible that compilers with only "int" and "unsigned int" as types
> would have used smaller underlying storage units, while I think more
> modern ones use the type you ask for. But it is, naturally,
> implementation-dependent.

It didn't have to be - that's just the way C was defined.

Alf P. Steinbach

unread,
Nov 6, 2018, 3:41:58 PM11/6/18
to
Well, the g++ implementation of the now deprecated conversion to/from
UTF-8 in the standard library. Worth noting that Visual C++ screws up in
a different way. So it's perhaps good that it's now deprecated, but
AFAIK there's no replacement. :(


Cheers!,

- Alf



Mr Flibble

unread,
Nov 6, 2018, 5:38:18 PM11/6/18
to
On 06/11/2018 17:51, Rick C. Hodgin wrote:
[snip]
> No. He invented the lie that says he invented fossils,
[snip]

Satan doesn't exist mate.

Rick C. Hodgin

unread,
Nov 6, 2018, 9:24:59 PM11/6/18
to
On Tuesday, November 6, 2018 at 5:38:18 PM UTC-5, Mr Flibble wrote:
> On 06/11/2018 17:51, Rick C. Hodgin wrote:
> > No. [Satan] invented the lie that says he invented fossils,
>
> Satan doesn't exist mate.

He does. So does Jesus. Jesus leads you to eternal life. Satan leads
you to eternal death.

You can read the Bible to learn why. It has to do with pride, and a
focus on self.

--
Rick C. Hodgin

Keith Thompson

unread,
Nov 7, 2018, 6:51:50 PM11/7/18
to
David Brown <david...@hesbynett.no> writes:
[...]
> No, it is certainly not wrong according to the C (or C++) standards, nor
> is it wrong according to the ABI's of many processors and C
> implementations. And speaking as someone who uses bitfields regularly
> in cases where exact layouts and accesses are important, it is often
> /essential/ that the type of the bitfield be used for the size,
> alignment and accesses.
>
> It is extremely simple to get the layout you want - use an appropriately
> sized underlying type. (Yes, support for that is
> implementation-dependent - but so are most things regarding bit-fields.)

In C90, the only permitted underlying types for bit fields were
int, signed int, and unsigned int, so it wasn't possible to use
the underlying type to control the layout. (C99 added _Bool bit
fields, and many implementations permit other integer types.)
My understanding was always that defining
unsigned int bf:5;
simply meant that bf would occupy 5 bits and be able to store values
in the range 0..31. Permitting, say, unsigned long or unsigned
long long (as some C implementations do, and as C++ requires) means
that you can have bit fields wider than an int, but it would never
have occurred to me that changing the underlying type for the bit
field itself would affect the layout of the structure outside the
bit field.

I understand and accept that it's conventional to use the underlying
type that way, and I suppose I wouldn't object to a future C or C++
standard codifying that behavior, but I've always found it odd.

If I wanted to control the layout in that manner, I'd probably find
it clearer to add unused members after the bit field for padding.
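
A rough sketch of what explicit padding could look like. The struct name Padded, the member names unused and pad, and the helper store_bf are all made up for illustration, and the resulting sizeof is still implementation-defined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical struct: the padding is spelled out as members rather than
// implied by the declared type of the bit-field.
struct Padded {
    unsigned int bf     : 5;  // holds values 0..31
    unsigned int unused : 11; // explicit, named padding bits
    std::uint16_t pad;        // explicit padding member after the bit-fields
};

// Stores an in-range value into the 5-bit field and reads it back.
inline unsigned store_bf(Padded& p, unsigned value)
{
    assert(value < 32u); // out-of-range stores are deliberately not exercised here
    p.bf = value;
    return p.bf;
}
```

The point is only that a reader can see every bit accounted for in the declaration; how the compiler actually lays the members out still has to be checked per implementation.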

David Brown

unread,
Nov 8, 2018, 3:32:28 AM11/8/18
to
On 08/11/18 00:51, Keith Thompson wrote:
> David Brown <david...@hesbynett.no> writes:
> [...]
>> No, it is certainly not wrong according to the C (or C++) standards, nor
>> is it wrong according to the ABI's of many processors and C
>> implementations. And speaking as someone who uses bitfields regularly
>> in cases where exact layouts and accesses are important, it is often
>> /essential/ that the type of the bitfield be used for the size,
>> alignment and accesses.
>>
>> It is extremely simple to get the layout you want - use an appropriately
>> sized underlying type. (Yes, support for that is
>> implementation-dependent - but so are most things regarding bit-fields.)
>
> In C90, the only permitted underlying types for bit fields were
> int, signed int, and unsigned int, so it wasn't possible to use
> the underlying type to control the layout. (C99 added _Bool bit
> fields, and many implementations permit other integer types.)

Yes, I realised that when someone else posted it. Many of the compilers
I have used over the years have supported different bit-field types as
extensions even in C90 modes - even some that had little or no C99
support. In my line of work, high portability is not often important,
and use of compiler extensions is often required. (Plus my only copy
of C90 is a scanned PDF, so it is vastly easier to look things up in the
C99 or C11 standards and try to forget C90 ever existed...)

> My understanding was always that defining
> unsigned int bf:5;
> simply meant that bf would occupy 5 bits and be able to store values
> in the range 0..31. Permitting, say, unsigned long or unsigned
> long long (as some C implementations do, and as C++ requires) means
> that you can have bit fields wider than an int, but it would never
> have occurred to me that changing the underlying type for the bit
> field itself would affect the layout of the structure outside the
> bit field.
>
> I understand and accept that it's conventional to use the underlying
> type that way, and I suppose I wouldn't object to a future C or C++
> standard codifying that behavior, but I've always found it odd.

Fair enough - I can understand that. Equally, I find it odd if the
underlying type is /not/ used that way. I suspect this is a matter of
what we are used to, what the tools we most use do, and where we use
bit-fields. What is certain is that bit-fields are not particularly
portable. The semantics of them for storing data are of course portable,
but the exact layout details are not. When you need exact layout to be
portable, bit masks, shifts, and fixed-width integer types are the way to go.

As I see it, there are three main uses of bit-fields.

1. For use within a program, to pack data more tightly. The exact
layout details don't matter, and bit-fields work fine regardless of the
rules used by the implementation.

2. For matching pre-defined data, such as in network protocols or file
formats. Here you do need exact matching of layout, but access sizes
don't matter. So if you have:

struct X {
    uint32_t a : 5;
    uint32_t b : 7;
    uint32_t c : 4;
};

then you will need to know if X takes 2 bytes, 3 bytes or 4 bytes, and
you need to know the endianness. But you don't care if reading "x.a" is
done as an 8-bit read, a 16-bit read, or a 32-bit read (assuming there
is valid memory after the X). And you'd be equally happy with
"uint16_t" instead of "uint32_t" as the type.

3. For matching hardware registers, for device drivers or low-level
embedded code. Here you need more - you are usually using volatile
accesses, and if you say the bit-field type is "uint32_t" in X, then the
compiler must generate a 32-bit read and write instruction and not 8-bit
instructions.
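
As a rough illustration of the layout question in case 2, the sketch below declares the same three fields with two different underlying types. The names X32, X16, storage_bits and round_trips, and the field names a, b and c, are made up for illustration. Many compilers for common targets will report a smaller size for X16 than for X32, but that is implementation-defined, so the code only relies on what is guaranteed: each struct needs at least 16 bits of storage, and in-range values read back unchanged.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Same 5 + 7 + 4 = 16 bits of fields, declared with different underlying types.
struct X32 { std::uint32_t a : 5; std::uint32_t b : 7; std::uint32_t c : 4; };
struct X16 { std::uint16_t a : 5; std::uint16_t b : 7; std::uint16_t c : 4; };

// Number of bits of storage the implementation chose for T.
template <typename T>
constexpr std::size_t storage_bits() { return sizeof(T) * CHAR_BIT; }

// Checks that in-range values survive a store and load, whatever the layout is.
template <typename T>
bool round_trips(unsigned a, unsigned b, unsigned c)
{
    assert(a < 32u && b < 128u && c < 16u);
    T x{};
    x.a = a;
    x.b = b;
    x.c = c;
    return static_cast<unsigned>(x.a) == a &&
           static_cast<unsigned>(x.b) == b &&
           static_cast<unsigned>(x.c) == c;
}
```

Printing storage_bits<X32>() and storage_bits<X16>() on the compilers you care about is a quick way to see whether the declared type drives the allocation size there.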

>
> If I wanted to control the layout in that manner, I'd probably find
> it clearer to add unused members after the bit field for padding.
>

Agreed. (I do that in normal structs too, and use -Wpadded in gcc to
spot any missing padding. It is often not essential, but I like to have
the padding explicit rather than implicit.)

Scott Lurndal

unread,
Nov 8, 2018, 8:59:36 AM11/8/18
to
For this, we generally use the following pattern (autogenerated from YAML description):

#ifdef __cplusplus
union register_name_t
#else
typedef union
#endif
{
    uint64_t u;
    struct
    {
#if __BYTE_ORDER == __BIG_ENDIAN
        uint64_t reserved_44_63 : 20;
        uint64_t ns             : 1;
        uint64_t ptag           : 33;
        uint64_t reserved_1_9   : 9;
        uint64_t valid          : 1;
#else
        uint64_t valid          : 1;
        uint64_t reserved_1_9   : 9;
        uint64_t ptag           : 33;
        uint64_t ns             : 1;
        uint64_t reserved_44_63 : 20;
#endif
    } s;
#ifdef __cplusplus
    register_name_t() { u = 0; }
    register_name_t(uint64_t data) { u = data; }
    register_name_t &operator=(uint64_t data) { u = data; return *this; }
#endif
#ifdef __cplusplus
};
#else
} register_name_t;
#endif

=====

register_name_t reg;

reg.u = read64(REGISTER_ADDRESS); /* inline function, handles access semantics */

if (reg.s.ptag == xx) YY;
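
For comparison, a sketch of the same fields read with shifts and masks on the raw 64-bit value, which avoids depending on the compiler's bit-field allocation order. The bit positions are inferred from the field names and widths above (valid at bit 0, ptag at bits 10-42, ns at bit 43); the constant and function names are hypothetical and not part of the autogenerated code.

```cpp
#include <cassert>
#include <cstdint>

// Bit positions inferred from the field names and widths in the layout above.
constexpr unsigned kValidShift = 0;
constexpr unsigned kPtagShift  = 10;
constexpr unsigned kNsShift    = 43;
constexpr std::uint64_t kPtagMask = (std::uint64_t{1} << 33) - 1; // 33-bit field

constexpr bool reg_valid(std::uint64_t u) { return ((u >> kValidShift) & 1u) != 0; }
constexpr std::uint64_t reg_ptag(std::uint64_t u) { return (u >> kPtagShift) & kPtagMask; }
constexpr bool reg_ns(std::uint64_t u) { return ((u >> kNsShift) & 1u) != 0; }

// Builds a raw register value; the reserved fields are left as zero.
inline std::uint64_t reg_make(bool valid, std::uint64_t ptag, bool ns)
{
    assert(ptag <= kPtagMask);
    return (static_cast<std::uint64_t>(valid) << kValidShift) |
           (ptag << kPtagShift) |
           (static_cast<std::uint64_t>(ns) << kNsShift);
}
```

With accessors like these, the big-endian and little-endian branches collapse into one definition, at the cost of losing the member-access syntax.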

Rick C. Hodgin

unread,
Nov 8, 2018, 9:22:30 AM11/8/18
to
On Thursday, November 8, 2018 at 8:59:36 AM UTC-5, Scott Lurndal wrote:
> For this, we generally use the following pattern (autogenerated from YAML description):
>
> #ifdef __cplusplus
> union register_name_t
> #else
> typedef union
> #endif
> {
> uint64_t u;
> struct
> {
> #if __BYTE_ORDER == __BIG_ENDIAN
> uint64_t reserved_44_63 : 20;
> uint64_t ns : 1;
> uint64_t ptag : 33;
> uint64_t reserved_1_9 : 9;
> uint64_t valid : 1;
> #else
> uint64_t valid : 1;
> uint64_t reserved_1_9 : 9;
> uint64_t ptag : 33;
> uint64_t ns : 1;
> uint64_t reserved_44_63 : 20;
> #endif
> } s;
> #ifdef __cplusplus

It would be nice if a C/C++ extension was added to no longer
require that duplicate coding:

struct SWhatever
{
    u64 valid          : 1, 0;  // ", 0" indicates lsb
    u64 reserved_1_9   : 9;
    u64 ptag           : 33;
    u64 ns             : 1;
    u64 reserved_44_63 : 20;
};

It would then generate in the right order no matter the target
machine's endianness. If it didn't matter, then leave off
the ", 0" part and let the compiler do whatever it needs to.

CAlive provides this syntax, to allow the overall struct to
be big or little endian by design and not target machine
architecture, as well as individual members within:

// le=little endian, be=big endian
le struct SWhatever
{
    u64 valid          : 1, 0;  // ", 0" indicates lsb
    u64 reserved_1_9   : 9;
    be u64 ptag        : 33;
    u64 ns             : 1;
    u64 reserved_44_63 : 20;
};

--
Rick C. Hodgin

Rick C. Hodgin

unread,
Nov 8, 2018, 9:25:11 AM11/8/18
to
On Thursday, November 8, 2018 at 9:22:30 AM UTC-5, Rick C. Hodgin wrote:
> CAlive provides this syntax, to allow the overall struct to
> be big or little endian by design and not target machine
> architecture, as well as individual members within:
>
> // le=little endian, be=big endian
> le struct SWhatever
> {
> u64 valid : 1, 0; // ", 0" indicates lsb
> u64 reserved_1_9 : 9;
> be u64 ptag : 33;
> u64 ns : 1;
> u64 reserved_44_63 : 20;
> };

Should be:

// le=little endian, be=big endian
le struct SWhatever
{
u64 valid : 1;

David Brown

unread,
Nov 8, 2018, 10:16:46 AM11/8/18
to
Your second syntax looks better. And I agree that it would be nice to
have it in the language. I'd want something longer than "le" or "be"
for the identification, or at least a syntax that ties it to this usage.
I don't like short names being used for this kind of thing because it
restricts the use of identifiers in code, and it is harder to understand
abbreviations when you read the code.

I would not make it possible to change endianness within a bit-field
group like this - I think it is unlikely to be useful, and very likely
to be confusing if it /is/ used.

It is also possible to distinguish between endianness for byte order,
and bit order - they are not the same on all systems.

But you are on the right track. It makes sense to have this sort of
thing in the language rather than the pre-processor or conditional
assembly, when you are making a new language.

Mr Flibble

unread,
Nov 8, 2018, 12:09:53 PM11/8/18
to
On 08/11/2018 14:22, Rick C. Hodgin wrote:
[snip]
> CAlive provides this syntax, to allow...
[snip]

Nobody gives a fuck about "CAlive"; it has nothing to do with C++ so is
off topic and just adds noise to this newsgroup WHICH IS ABOUT C++.

Rick C. Hodgin

unread,
Nov 8, 2018, 1:53:02 PM11/8/18
to
On Thursday, November 8, 2018 at 12:09:53 PM UTC-5, Mr Flibble wrote:
> On 08/11/2018 14:22, Rick C. Hodgin wrote:
> [snip]
> > CAlive provides this syntax, to allow...
> [snip]
>
> Nobody gives .. about "CAlive"; it has nothing to do with C++ so is
> off topic and just adds noise to this newsgroup WHICH IS ABOUT C++.

I care about CAlive, so your statement is false. CAlive brings into
its existence several concepts from C++, and expands on others.
I offer these ideas to the C++ community so they can be incorporated
into future versions of C++ if someone finds merit in them.

It's not a good idea to hold ideas within. It's better to share
them with others so they can be moved upon. I heard an example of
this one time related to how information must flow in the
semiconductor industry: knowledge and inspiration have no value unless
they are in motion. A person might discover the cure for cancer,
but unless he/she tells others about it, it won't do anyone any
good. That knowledge must not be hoarded, but shared, to have any
real value.

--
Rick C. Hodgin