bitfield allocation strategy

Greg Comeau

Feb 24, 2002, 6:59:10 PM

W.r.t. bitfield allocation, can somebody detail the original rationale
for why MS C and C++ change to the next "allocation unit" just because
the type (or perhaps just the size?) of the next bit field changed?
IOW, it seems surprising that this:

struct xyz {
    int x : 1;
    int y : 1;
    int z : 1;
};

takes less space than:

struct abc {
    char x : 1;
    int y : 1;
    bool z : 1;
};

What is the basis that the int, and then the bool, finishes allocation
for the current word and then creates a new bitfield "unit" instead of
just putting them all into one long word? I fully realize that bitfield
allocation is severely implementation-defined by C and C++, but this
has come up for us a number of times and so any light on the rationale
that you might shed would be appreciated.
--
Greg Comeau GA BETA:4+ New Windows Backends PLUS 'export' beta#2 online!
Comeau C/C++ ONLINE ==> http://www.comeaucomputing.com/tryitout
World Class Compilers: Breathtaking C++, Amazing C99, Fabulous C90.
Comeau C/C++ with Dinkumware's Libraries... Have you tried it?
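
A minimal sketch for checking the size difference described in the question on whatever compiler is at hand. The two structs are the ones from the post; the constant names (`size_of_xyz`, `size_of_abc`, `mixed_types_cost_space`) are introduced here for illustration only. Bitfield layout is implementation-defined, so the sketch assumes no particular result.

```cpp
#include <cstddef>

// The two structs from the question above.
struct xyz {
    int x : 1;
    int y : 1;
    int z : 1;
};

struct abc {
    char x : 1;
    int  y : 1;
    bool z : 1;
};

// Hypothetical helper constants (not from the thread). Print them, or use
// them in a static_assert, to see what a particular compiler does.
constexpr std::size_t size_of_xyz = sizeof(xyz);
constexpr std::size_t size_of_abc = sizeof(abc);

// True on compilers that start a new unit when the declared type changes.
constexpr bool mixed_types_cost_space = size_of_abc > size_of_xyz;
```

A compiler that packs all three members into one unit reports equal sizes; one that behaves as described in the question reports a larger `abc`.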

Jonathan Caves [MSFT]

Feb 25, 2002, 11:51:47 AM

Greg: we have code in the compiler that checks the size of the type of the
current bitfield against the size of the type of the previous bitfield: if
they are different then the compiler starts a "new" bitfield. Hence the
difference you see.
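
The rule described above can be sketched as a small layout model. This is not the compiler's code, only an illustration under stated assumptions: a type's alignment equals its size, `sizeof(int) == 4`, and `sizeof(char) == sizeof(bool) == 1`. The names `Field`, `round_up` and `model_size` are hypothetical, and the sketch needs C++14 or later.

```cpp
#include <cstddef>

// One bit-field declaration: size in bytes of its declared type, and its width.
struct Field {
    std::size_t type_size;
    unsigned    bits;
};

constexpr std::size_t round_up(std::size_t n, std::size_t multiple) {
    return (n + multiple - 1) / multiple * multiple;
}

// Hypothetical model: a bit-field joins the open unit only if its type has
// the same size as the unit and there is room left; otherwise the open unit
// is closed and a new, aligned unit is started.
template <std::size_t N>
constexpr std::size_t model_size(const Field (&fields)[N]) {
    std::size_t offset = 0;     // bytes taken by closed units
    std::size_t unit_size = 0;  // size of the open unit, 0 if none yet
    std::size_t used_bits = 0;  // bits already used in the open unit
    std::size_t max_align = 1;  // largest unit seen, used for tail padding
    for (std::size_t i = 0; i < N; ++i) {
        const Field& f = fields[i];
        const bool fits = unit_size == f.type_size &&
                          used_bits + f.bits <= 8 * unit_size;
        if (!fits) {
            offset = round_up(offset + unit_size, f.type_size);
            unit_size = f.type_size;
            used_bits = 0;
            if (f.type_size > max_align) max_align = f.type_size;
        }
        used_bits += f.bits;
    }
    return round_up(offset + unit_size, max_align);
}

// The two structs from the question, expressed as (type size, width) pairs.
constexpr Field xyz_fields[] = {{4, 1}, {4, 1}, {4, 1}};
constexpr Field abc_fields[] = {{1, 1}, {4, 1}, {1, 1}};
```

Under this model the all-int struct comes out at 4 bytes and the char/int/bool struct at 12. Whether a real compiler matches those numbers depends on its actual rules.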

As to why this decision was made -- I don't know -- this is very old code
and I suspect that things have been this way since the earliest days of the
Microsoft C compiler.

--
Jonathan Caves
Microsoft Corporation

This posting is provided "AS IS" with no warranties, and confers no rights.

"Greg Comeau" <com...@panix.com> wrote in message
news:a5buoe$oud$1...@panix3.panix.com...

Greg Comeau

Feb 25, 2002, 12:29:44 PM

In article <O4e$7yhvBHA.2500@tkmsftngp03>,

Jonathan Caves [MSFT] <no_spam_...@microsoft.com> wrote:
>"Greg Comeau" <com...@panix.com> wrote in message
>news:a5buoe$oud$1...@panix3.panix.com...
>> W.r.t bitfield allocation, can somebody detail the original rationale
>> why MS C and C++ changes to the next "allocation unit" just because
>> the type (or perhaps just the size?) of the next bit field changed?
>> IOWs, it seems surprising that this:
>>
>> struct xyz {
>> int x : 1;
>> int y : 1;
>> int z : 1;
>> };
>>
>> takes less space than:
>>
>> struct abc {
>> char x : 1;
>> int y : 1;
>> bool z : 1;
>> };
>>
>> What is the basis that the int, and then the bool, finishes allocation
>> for the current word and then creates a new bitfield "unit" instead of
>> just putting them all into one long word? I fully realize that bitfield
>> allocation is severely implementation-defined by C and C++, but this
>> has come up for us a number of times and so any light into the rationale
>> that you might shed would be appreciated.
>

>Greg: we have code in the compiler that checks the size of the type of the
>current bitfield against the size of the type of the previous bitfield: if
>they are different then the compiler starts a "new" bitfield. Hence the
>difference you see.

Yes, and we have code which does the same thing for Comeau C++
under Windows.

>As to why this decision was made -- I don't know -- this is very old code
>and I suspect that things have been this way since the earliest days of the
>Microsoft C compiler.

Right, it's the "why?" that I'm asking. I've known about the behavior
for a while, but a customer's got me curious since it seems to be
something done on purpose? In fact, as far as I can tell, every vendor
except Borland also mimics this, and so that got me doubly curious
as to the rationale behind it.

Jason Shirk [MSFT]

Feb 25, 2002, 9:25:05 PM

"Greg Comeau" <com...@panix.com> wrote in message

news:a5dsa8$411$1...@panix3.panix.com...

I'll venture a guess, since nobody around here probably knows the real
reason.

It was probably easier to get alignment issues correct if no packing was
done. Just as padding is normally placed between members of different types
based on the current packing, the same idea is necessary for bitfields,
otherwise the load before the extract could fault (or have bad performance,
depending on the platform).

Unfortunately, this wouldn't explain why #pragma pack doesn't seem to affect
the packing of bitfields.

Jason Shirk
VC++ Compiler Team
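
A small sketch for testing the `#pragma pack` observation on a given compiler. The struct names and the `pack_changed_size` constant are made up for this example; `#pragma pack(push, 1)` and `#pragma pack(pop)` are the usual spellings accepted by VC++ and also by GCC and Clang. The sketch assumes nothing about the outcome.

```cpp
#include <cstddef>

// Default packing.
struct abc_default {
    char x : 1;
    int  y : 1;
    bool z : 1;
};

#pragma pack(push, 1)  // request byte packing for the next definition only
struct abc_packed {
    char x : 1;
    int  y : 1;
    bool z : 1;
};
#pragma pack(pop)

// True if the pragma made any difference to the size on this compiler.
constexpr bool pack_changed_size = sizeof(abc_packed) != sizeof(abc_default);
```

If the sizes differ but the packed struct is still larger than a single byte, the pragma has changed the alignment of the units without merging them.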

Markus Mauhart

Feb 26, 2002, 11:58:31 AM

So as no one knows a good reason for this behaviour, may I
ask the VC++ team to remove this restriction in the next
version of VC++ ?


Michael (michka) Kaplan

Feb 26, 2002, 12:18:24 PM

So they can cause alignment faults for the many people who have known about
and depended on the behavior?

Ick!

--
MichKa

Michael Kaplan
Trigeminal Software, Inc. -- http://www.trigeminal.com/

International VB? -- http://www.i18nWithVB.com/
C++? MSLU -- http://msdn.microsoft.com/msdnmag/issues/01/10/

"Markus Mauhart" <markus....@chello.at> wrote in message
news:8e6601c1bee6$d210b3b0$9ae62ecf@tkmsftngxa02...

Doug Harrison [MVP]

Feb 26, 2002, 12:39:18 PM

Michael (michka) Kaplan wrote:

>So they can cause alignment faults for the many people who have known about
>and depended on the behavior?

Not to mention those who unintentionally depend on it (e.g. people who
dump structs to disk and expect to read them back).

--
Doug Harrison [VC++ MVP]
Eluent Software, LLC
http://www.eluent.com
Tools for Visual C++ and Windows

Greg Comeau

Feb 26, 2002, 11:19:26 PM

In article <eBc0imuvBHA.2212@tkmsftngp04>,

Michael (michka) Kaplan <forme...@nospam.trigeminal.spamless.com> wrote:
>"Markus Mauhart" <markus....@chello.at> wrote in message
>news:8e6601c1bee6$d210b3b0$9ae62ecf@tkmsftngxa02...

>> >"Greg Comeau" <com...@panix.com> wrote in message
>> >news:a5dsa8$411$1...@panix3.panix.com...
>> >> In article <O4e$7yhvBHA.2500@tkmsftngp03>,
>> >> Jonathan Caves [MSFT] <no_spam_...@microsoft.com>
>> wrote:
>> >> >"Greg Comeau" <com...@panix.com> wrote in message
>> >> >news:a5buoe$oud$1...@panix3.panix.com...

>> >> >> ... it seems surprising that this:

>> >> >>
>> >> >> struct xyz {
>> >> >> int x : 1;
>> >> >> int y : 1;
>> >> >> int z : 1;
>> >> >> };
>> >> >>
>> >> >> takes less space than:
>> >> >>
>> >> >> struct abc {
>> >> >> char x : 1;
>> >> >> int y : 1;
>> >> >> bool z : 1;
>> >> >> };
>> >> >>
>> >> >> What is the basis that the int, and then the bool,
>> >> >> finishes allocation for the current word and then
>> >> >> creates a new bitfield "unit" instead of

>> >> >> just putting them all into one long word?....
>> >> .......

>>
>> >I'll venture a guess, since nobody around here probably
>> knows the real
>> >reason.
>> >
>> >It was probably easier to get alignment issues correct if
>> >no packing was done. Just as padding is normally placed
>> >b/w members of different types based on the current packing,
>> >the same idea is necessary for bitfields,
>> >otherwise the load before the extract could fault (or
>> >have bad performance, depending on the platform.)
>> >
>> >Unfortunately, this wouldn't explain why #pragma pack
>> >doesn't seem to affect
>> >the packing of bitfields.
>>

>> So as no one knows a good reason for this behaviour, may I
>> ask the VC++ team to remove this restriction in the next
>> version of VC++ ?

The problem with removing it now is that it will change
the sizeof() structs, therefore all code would need to be
recompiled. And I don't know if there is a lurking issue
in say passing something to a Windows API call if such
a change were made.

>So they can cause alignment faults

I must be having a bad day, but I really can't see the
case/s where a fault can occur. Can somebody detail an exact
scenario where this is possible?

>for the many people who have known about
>and depended on the behavior?

Is this speculation? I'm curious why somebody would purposely
depend upon this over say using a :0 bitfield specification
(T : 0 being about the only defined thing about bitfield!:} )
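
For reference, a minimal sketch of the `: 0` specification mentioned above. An unnamed zero-width bitfield asks that the next bitfield start at an allocation unit boundary. The struct names and the `zero_width_forced_new_unit` constant are illustrative only.

```cpp
#include <cstddef>

// Two 4-bit fields that an implementation is free to place in one unit.
struct together {
    unsigned a : 4;
    unsigned b : 4;
};

// Same fields, but the unnamed zero-width bitfield requests that b start
// at the next allocation unit boundary.
struct split {
    unsigned a : 4;
    unsigned   : 0;
    unsigned b : 4;
};

// True where the explicit break actually costs an extra unit.
constexpr bool zero_width_forced_new_unit = sizeof(split) > sizeof(together);
```

With this, a programmer who wants the break can ask for it explicitly instead of relying on a change of type to cause it.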

David Lowndes

Feb 27, 2002, 3:23:21 AM

>>for the many people who have known about
>>and depended on the behavior?
>
>Is this speculation? I'm curious why somebody would purposely
>depend upon this over say using a :0 bitfield specification
>(T : 0 being about the only defined thing about bitfield!:} )

I'd have thought it more likely to be:

... for the many people who inadvertently depend on this behaviour.

Dave
--
MVP VC++ FAQ: http://www.mvps.org/vcfaq
My address is altered to discourage junk mail.
Please post responses to the newsgroup thread,
there's no need for follow-up email copies.

Daniel James

Feb 27, 2002, 6:42:10 AM

In article <a5hmoe$1fj$1...@panix3.panix.com>, Greg Comeau wrote:
> (T : 0 being about the only defined thing about bitfield!:} )

As you say, "T : 0" is about the only thing about bitfields that is
specified - but that specification doesn't make any sense unless
one takes it to imply that in the absence of such a specification
bitfields should be packed together with as little padding as
possible.

The PPN (RS6000 Physical Page Number) example of bitfields given in
CPL3 (Appendix C, Sec. C.8.1) wouldn't pack as Bjarne describes if
the IBM RS6000 compiler inserted the unasked-for padding that the
MS compiler does. The intent is clearly that padding will not be
inserted unless it is asked for (or is needed to prevent a single
bitfield straddling a storage unit boundary - which is fair
enough).
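
To make the point concrete, here is a sketch in the spirit of that kind of example: fields of mixed declared types whose widths add up to exactly 32 bits. The field names and widths are assumptions chosen for illustration and are not quoted from the book. The sketch also assumes a 32-bit `unsigned int`.

```cpp
#include <cstddef>

// Hypothetical page-table-style entry: 22 + 3 + 3 + 1 + 1 + 1 + 1 = 32 bits.
struct page_entry {
    unsigned int frame  : 22;  // page frame number
    int                 : 3;   // unnamed: padding that was explicitly asked for
    unsigned int policy : 3;   // cache policy
    bool reachable : 1;
    bool dirty     : 1;
    bool valid     : 1;
    bool global    : 1;
};

// True only if the compiler packs across the unsigned/bool type change.
constexpr bool fits_in_one_word = sizeof(page_entry) == sizeof(unsigned int);
```

A compiler that starts a new unit when the declared type's size changes would report a larger struct here, which is the behaviour being complained about.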

It is sad that implementors treat bitfields so poorly - Microsoft
aren't the only ones to insert gratuitous padding between bitfields
of different types, Borland do much the same thing (to name but
one). I can only think that this laxity is prompted by a feeling
that it doesn't matter as nobody in their right minds would use
anything so non-portable anyway - a view I'm coming to subscribe to
myself.

I wish the C and C++ standards committees would stop
shilly-shallying around on this and either document the way
bitfields are to be packed (depending on the endianity of the CPU,
of course) or deprecate them.

Breaking existing code need not be a problem as any compiler that
needed to change its behaviour to meet the standard could support
a compatibility option (#pragma bitfield_packing( "barmy" ),
perhaps).

Cheers,
Daniel.

Markus Mauhart

Feb 27, 2002, 10:35:16 AM

"Doug Harrison [MVP]" <d...@mvps.org> wrote in message
news:j1in7uonhu4dgnbe1...@4ax.com...

> Michael (michka) Kaplan wrote:
>
> >So they can cause alignment faults for the many people who have known about
> >and depended on the behavior?
>
> Not to mention those who unintentionally depend on it (e.g. people who
> dump structs to disk and expect to read them back).

Nobody talked about changing default behaviour.
To turn on packing of bitfield members one would have to use either
a new compiler option or some kind of '#pragma packbitfield'.

Markus Mauhart

Feb 27, 2002, 10:29:19 AM

"Michael (michka) Kaplan" <forme...@nospam.trigeminal.spamless.com> wrote in message
news:eBc0imuvBHA.2212@tkmsftngp04...

> So they can cause alignment faults for the many people who have known about
> and depended on the behavior?

The bit-padding of the bitfield members has nothing to do with alignment faults.
The VC++ compiler team surely will have no problem generating correct
machine code for access to the bitfield members.

Jason Shirk

Feb 27, 2002, 1:36:40 PM

"Greg Comeau" <com...@panix.com> wrote in message

news:a5hmoe$1fj$1...@panix3.panix.com...

>
> >So they can cause alignment faults
>
> I must be having a bad day, but I really can't see the
> case/s where a fault can occur. Can somebody detail an exact
> scenario where this is possible?
>

Sorry, I wasn't sufficiently clear. The required compiler implementation to
avoid alignment faults is a bit harder if bitfields are allowed to be
packed. So I was suggesting that somebody might have taken the easiest
route, perhaps because of schedule pressure or other more pressing features,
and stuck in padding instead.

Consider:

struct S {
    char c1;
    char c2 : 1;
    int i : 31;
};

Assuming the layout of this class was:

byte 0: c1
byte 1: c2
byte 4: i

Loading i does not require unaligned code. But if it were:

byte 0: c1
byte 1: c2:1, i:0-6
byte 2: i:7-14
byte 3: i:15-22
byte 4: i:23-30

Then loading i does require extra code to avoid an unaligned fault (on the
appropriate platforms of course.)

If the struct were:

struct S {
    char c1;
    char c2 : 1;
    int i : 7;
};

It would certainly be possible for a compiler to not generate the unaligned
code and just load a byte when loading member i.

This analysis, while not terribly complicated for a compiler writer, is
certainly more work than just avoiding the problem altogether, which was my
guess as to the reason why we have inefficient packing today.

It's also possible that someone thought the code bloat from unaligned code
was worse than the data bloat from not packing the bitfields, assuming
people would do their own packing if it was important.
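
The "extra code" for the packed layout can be illustrated with a byte-wise extraction that never performs a misaligned word load. This is only a sketch of the idea, not what any compiler emits. `extract_bits` and `packed_image` are hypothetical names, bits are numbered from the least significant bit of byte 0 upwards (matching the layout table above), widths of 1 to 32 are assumed, and C++14 or later is required.

```cpp
#include <cstdint>

// Hypothetical helper: read `width` bits starting at absolute bit position
// `first_bit`, touching memory one byte at a time so that no misaligned
// multi-byte load is ever needed.
constexpr std::uint32_t extract_bits(const unsigned char* bytes,
                                     unsigned first_bit, unsigned width) {
    std::uint32_t result = 0;
    for (unsigned i = 0; i < width; ++i) {
        const unsigned bit = first_bit + i;
        const std::uint32_t b = (bytes[bit / 8] >> (bit % 8)) & 1u;
        result |= b << i;
    }
    return result;
}

// Sample image of the packed layout: byte 0 holds c1 = 'A' (0x41),
// bit 8 holds c2 = 1, and bits 9..39 hold i = 0x12345678.
constexpr unsigned char packed_image[5] = {0x41, 0xF1, 0xAC, 0x68, 0x24};
```

Reading `i` is then `extract_bits(packed_image, 9, 31)`. A real code generator would use a few shifts and masks on wider loads rather than a per-bit loop, but either way it is more code than the single aligned load that the padded layout allows.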

Greg Comeau

Feb 27, 2002, 9:33:08 PM

In article <VA.0000066...@nospam.demon.co.uk>,

Daniel James <inte...@nospam.demon.co.uk> wrote:
>In article <a5hmoe$1fj$1...@panix3.panix.com>, Greg Comeau wrote:
>> (T : 0 being about the only defined thing about bitfield!:} )
>
>As you say, "T : 0" is about the only thing about bitfields that is
>specified - but that specification doesn't make any sense unless
>one takes it to imply that in the absence of such a specification
>bitfields should be packed together with as little padding as
>possible.

That implication is there IMO, but, perhaps unfortunately,
so are others.

>The PPN (RS6000 Physical Page Number) example of bitfields given in
>CPL3 (Appendix C, Sec. C.8.1) wouldn't pack as Bjarne describes if
>the IBM RS6000 compiler inserted the unasked-for padding that the
>MS compiler does. The intent is clearly that padding will not be
>inserted unless it is asked for (or is needed to prevent a single
>bitfield straddling a storage unit boundary - which is fair
>enough).
>
>It is sad that implementors treat bitfields so poorly - Microsoft
>aren't the only ones to insert gratuitous padding between bitfields
>of different types, Borland do much the same thing (to name but
>one). I can only think that this laxity is prompted by a feeling
>that it doesn't matter as nobody in their right minds would use
>anything so non-portable anyway - a view I'm coming to subscribe to
>myself.

Many people have avoided bitfields since day 1 for exactly
this reason. The thing is that if you want to latch onto
a register or something, you can't control the allocation
with bitfields in all cases.

>I wish the C and C++ standards committees would stop
>shilly-shallying around on this and either document the way
>bitfields are to be packed (depending on the endianity of the CPU,
>of course) or deprecate them.

It's probably among the least used feature. I know many C programmers
who don't even know they exist and/or what to do with them. Anyway,
I think this is all different from the underlying rationale here.

>Breaking existing code need not be a problem as any compiler that
>needed to change its behaviour to meet the standard could support
>a compatibility option (#pragma bitfield_packing( "barmy" ),
>perhaps).

Sure.

Greg Comeau

Feb 27, 2002, 9:34:40 PM

In article <#MVoeR6vBHA.356@tkmsftngp04>,

Markus Mauhart <markus....@nospamm.chello.at> wrote:
>"Doug Harrison [MVP]" <d...@mvps.org> wrote in message
>news:j1in7uonhu4dgnbe1...@4ax.com...
>> Michael (michka) Kaplan wrote:
>>
>> >So they can cause alignment faults for the many people who have known about
>> >and depended on the behavior?
>>
>> Not to mention those who unintentionally depend on it (e.g. people who
>> dump structs to disk and expect to read them back).
>
>Nobody talked about changing default behaviour.

Right, you wouldn't change it at this point.
But additional flexibility for those who want it sounds reasonable.

>To turn on packing of bitfield members one would have to use either
>a new compiler option or some kind of '#pragma packbitfield'.

Right.

Greg Comeau

Feb 27, 2002, 9:45:48 PM

In article <3c7d...@news.microsoft.com>,

Jason Shirk <jas...@microsoft.com> wrote:
>"Greg Comeau" <com...@panix.com> wrote in message
>news:a5hmoe$1fj$1...@panix3.panix.com...
>> >So they can cause alignment faults
>>
>> I must be having a bad day, but I really can't see the
>> case/s where a fault can occur. Can somebody detail an exact
>> scenario where this is possible?
>
>Sorry, I wasn't sufficiently clear. The required compiler implementation
>to avoid alignment faults is a bit harder if bitfields are allowed to be
>packed.

Agreed.

>So I was suggesting that somebody might have taken the easiest
>route, perhaps because of schedule pressure or other more pressing
>features, and stuck in padding instead.

Speed clearly won over size, but why only the one way is curious....

>Consider:
>
> struct S {
> char c1;
> char c2 : 1;
> int i : 31;
> };
>
>Assuming the layout of this class was:
>
> byte 0: c1
> byte 1: c2
> byte 4: i
>
>Loading i does not require unaligned code. But if it were:
>
> byte 0: c1
> byte 1: c2:1, i:0-6
> byte 2: i:7-14
> byte 3: i:15-22
> byte 4: i:23-30
>
>Then loading i does require extra code to avoid an unaligned fault (on the
>appropriate platforms of course.)

Sure. But it would make sense to let the user decide this by
using int : 0 or whatever.

>If the struct were:
>
> struct S {
> char c1;
> char c2 : 1;
> int i : 7;
> };
>
>It would certainly be possible for a compiler to not generate the unaligned
>code and just load a byte when loading member i.

Sure.

>This analysis, while not terribly complicated for a compiler writer, is
>certainly more work than just avoiding the problem altogether, which was my
>guess as to the reason why we have inefficient packing today.

More work, but not horribly more.

>It's also possible that someone thought the code bloat from unaligned code
>was worse than the data bloat from not packing the bitfields, assuming
>people would do their own packing if it was important.

I suspect this is it, but wanted to know for sure.
As an FYI, there is also an article on MSDN about accessing
members from asm statements needing to be in their own units,
but I can't believe this to be so compelling...

Andy Sinclair

Feb 28, 2002, 4:49:52 AM

Daniel James wrote:

>I wish the C and C++ standards committees would stop
>shilly-shallying around on this and either document the way
>bitfields are to be packed (depending on the endianity of the CPU,
>of course) or deprecate them.
>

I don't think deprecating bitfields would be popular among embedded
developers. They're an incredibly useful way of mapping onto
hardware. I wouldn't use them in Windows programming though.

Andy

Markus Mauhart

Feb 28, 2002, 10:24:43 AM

"Jason Shirk" <jas...@microsoft.com> wrote in message news:3c7d...@news.microsoft.com...

>
> Consider:
>
> struct S {
> char c1;
> char c2 : 1;
> int i : 31;
> };
>
> Assuming the layout of this class was:
>
> byte 0: c1
> byte 1: c2
> byte 4: i
>
> Loading i does not require unaligned code. But if it were:
>
> byte 0: c1
> byte 1: c2:1, i:0-6
> byte 2: i:7-14
> byte 3: i:15-22
> byte 4: i:23-30

That's interesting; from my old Borland days I had expected a completely
different strategy:
1* Take the sum of the widths of all consecutive bitfield members
[maybe one has to add: .. but not more than 32 bits]
2* When it fits into a single byte/word, use a byte/word for storage and alignment.
Otherwise use DWORDs for storage and alignment.
3* On reading an n-bit member: extract the n bits, interpret them as [U]INTn (according
to the member's signedness), and then widen to the member's type w/o changing
the value
(on writing, forget some bits)
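
Steps 2 and 3 above can be sketched as follows. This is an illustration of the strategy as described in this post, not Borland's actual implementation. The function names are hypothetical, and a 32-bit limit on the run of bitfields is assumed.

```cpp
#include <cstdint>

// Step 2: choose the storage unit from the total width of the run of
// consecutive bitfields (byte, word, or DWORD).
constexpr unsigned storage_bytes(unsigned total_bits) {
    return total_bits <= 8 ? 1u : total_bits <= 16 ? 2u : 4u;
}

// Step 3, unsigned member: keep the low n bits (1 <= n <= 32).
constexpr std::uint32_t read_unsigned(std::uint32_t raw, unsigned n) {
    return n >= 32 ? raw : raw & ((1u << n) - 1u);
}

// Step 3, signed member: treat the low n bits as an n-bit two's complement
// value and widen it without changing the value (1 <= n <= 32).
constexpr std::int32_t read_signed(std::uint32_t raw, unsigned n) {
    const std::uint32_t mask  = n >= 32 ? 0xFFFFFFFFu : (1u << n) - 1u;
    const std::uint32_t sign  = 1u << (n - 1);
    const std::uint32_t value = raw & mask;
    // Flip the sign bit, then subtract its weight; done in 64 bits so that
    // the n == 32 case cannot overflow.
    return static_cast<std::int32_t>(static_cast<std::int64_t>(value ^ sign) -
                                     static_cast<std::int64_t>(sign));
}
```

For instance, a run of 1 + 10 + 18 bits gives `storage_bytes(29) == 4`, and a 10-bit signed member holding the raw pattern `0x3FF` reads back as -1.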

IMO VC++ can simulate this allocation behaviour by using only equally sized
quantities, but with correct signedness:

enum E {e1=-1 ,e2=256};

struct BC_MODE
{
    X x ;
    bool m1 :1 ;
    E m2 :10 ;
    UINT32 m3 :18 ;
    X2 x2 ;
};

struct VC_MODE
{
    X x ;
    UINT32 m1 :1 ;
    INT32 m2 :10 ;
    UINT32 m3 :18 ;
    X2 x2 ;
};

My problem with this workaround is the loss of type information.

BTW, if the VC++ team would be so kind as to add this bitfield packing
flexibility, it would have another great benefit for me:
With other compilers like gcc or bc it is possible to minimize the
underlying integral type of enums, so one can pack enums into BYTEs
or WORDs.
When I realized that VC++ only has 32-bit enums, I tried to declare
enum-typed members as 'SMALL_ENUM_TYPE e :8', but with the current
VC++ bitfield allocation this didn't work either.
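
For readers with a newer toolchain: C++11, which postdates this thread, lets the underlying type of an enum be fixed in the declaration, so small enums no longer need a bitfield to be stored in a byte. A minimal sketch, with made-up names (`SmallEnum`, `Settings`):

```cpp
#include <cstdint>

// C++11: the underlying type is stated explicitly, so the enum is one byte.
enum class SmallEnum : std::uint8_t { off, on, automatic };

// Ordinary (non-bitfield) members now pack as small integers would.
struct Settings {
    SmallEnum     mode;
    SmallEnum     fallback;
    std::uint16_t level;
};
```

This keeps the type information that is lost when an enum is stored in a plain integer bitfield.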

Daniel James

Mar 1, 2002, 6:17:40 AM

In article <3c7dfc8...@news.r2g2.co.uk>, Andy Sinclair wrote:
> I don't think deprecating bitfields would be popular among
> embedded developers.

Remember that when something is deprecated it is still supported
by the language. Deprecation is just a way of saying "don't use
this in new code - we may take it away someday".

> They're an incredibly useful way of mapping onto hardware.

I think that's true in C - at least when using a compiler that
does pack the fields sensibly into machine words - but I think
you can do better in C++. In C++ you can write a class that
represents a hardware register (or set of registers) and provide
accessor methods for all the data members of the class. Writing
this class is a bit more laborious than simply defining a bitfield
struct that maps onto the register(s), but once written it can be
used as easily as the bitfield struct, and with more clarity. The
use of accessor methods is a more powerful technique, as
additional processing, validity checks, debugging output, etc.,
can be added if required.
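
A minimal sketch of such a class, using explicit shifts and masks instead of bitfields. The register layout is invented for illustration (bit 0 is a ready flag, bits 1 to 10 hold a channel number), and `StatusRegister` and its member names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical 32-bit device register wrapped in a class. The bit positions
// are fixed by the code, not by the compiler's bitfield allocation.
class StatusRegister {
public:
    constexpr explicit StatusRegister(std::uint32_t raw = 0) : raw_(raw) {}

    constexpr bool ready() const { return field(0, 1) != 0; }
    constexpr std::uint32_t channel() const { return field(1, 10); }

    // Returns a copy with the channel replaced; out-of-range bits are dropped.
    // A validity check or debug output could be added here.
    constexpr StatusRegister with_channel(std::uint32_t value) const {
        return StatusRegister((raw_ & ~(mask(10) << 1)) |
                              ((value & mask(10)) << 1));
    }

    constexpr std::uint32_t raw() const { return raw_; }

private:
    static constexpr std::uint32_t mask(unsigned width) {
        return width >= 32 ? 0xFFFFFFFFu : (1u << width) - 1u;
    }
    constexpr std::uint32_t field(unsigned shift, unsigned width) const {
        return (raw_ >> shift) & mask(width);
    }

    std::uint32_t raw_;
};
```

Usage reads much like the bitfield version, e.g. `StatusRegister(value).channel()`, but the layout is the same on every compiler.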

Even if the compiler you use now supports bitfields in a useful
and predictable manner, there's no guarantee that the compiler you
use next week will do it the same way (or at all), while a
carefully-written C++ class (not using bitfields) will be
portable.

If the language standard isn't going to define bitfields in a
portable manner I don't think it should do so at all.

Cheers,
Daniel.