Reading Struct not Located at Four-boundary

thomas

Aug 20, 2010, 4:33:11 AM

Hi,

I have a struct A (undefined, can be any form) located at the memory
pointed to by "char *ptr".
I want to read it with "(A*)ptr".

Now I wonder if the position of ptr may affect the behavior when
accessing the struct.

Consider the following condition:
-----------------------------------------------
_ _ _ _ | _ _ _ _ |
0 1 2 3 4 5 6 7
position of pointer "ptr" = 2.

If the first element of struct A is an int-type one, it will span
positions 2~5.
The CPU will load 0~3, and then 4~7 to get the data of the int-type
member.
But we C++ programmers don't need to care about the position of ptr, right?

(I remember that some accesses to unbounded memory positions may crash
the system, but I cannot remember in which cases. A little confused.)
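
Whether an address such as "position 2" satisfies a type's alignment can be checked numerically. The following is only a sketch: the names `is_aligned_for` and `pointer_at_offset` are hypothetical and not from the thread, and it assumes a C++11 compiler on which converting a pointer to `std::uintptr_t` yields the numeric address.

```cpp
#include <cstdint>

// Hypothetical helper: true if p is a multiple of alignof(T).
// Assumes std::uintptr_t exists and that the pointer-to-integer
// conversion yields the numeric address (common flat-memory platforms).
template <typename T>
bool is_aligned_for(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Returns a pointer 'offset' bytes into an int-aligned static buffer,
// mimicking the "ptr = 2" situation from the diagram above.
inline const char* pointer_at_offset(unsigned offset)
{
    alignas(int) static char storage[16] = {};
    return storage + offset;
}
```

On a platform where `alignof(int)` is 4, `is_aligned_for<int>(pointer_at_offset(2))` would be false, which is exactly the case the question is about.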

tni

Aug 20, 2010, 5:00:32 AM

On 2010-08-20 10:33, thomas wrote:

> I have a struct A(undefined, can be any form) located at memory
> pointed by "char *ptr".
> I want to read it with "(A*)ptr".
>
> Now I wonder if the position of ptr may affect the behavior when
> accessing the struct.
>
> Consider the following condition:
> -----------------------------------------------
> _ _ _ _ | _ _ _ _ |
> 0 1 2 3 4 5 6 7
> position of pointer "ptr" = 2.
>
> If the first element of struct A is an int-type one, it will span
> positions 2~5.
> The CPU will load 0~3, and then 4~7 to get the data of the int-type
> member.
> But we C++ programmers don't need to care the position of ptr, right?

It's not portable. On some platforms (e.g. Intel x86), unaligned access
works (but is slow), on some it doesn't (many RISC processors).

Paavo Helde

Aug 20, 2010, 5:00:19 AM

thomas <fresh...@gmail.com> wrote in news:bb2ff772-9706-4b51-bcbb-
0353eb...@l20g2000yqm.googlegroups.com:

> Hi,
>
> I have a struct A(undefined, can be any form) located at memory
> pointed by "char *ptr".
> I want to read it with "(A*)ptr".
>
> Now I wonder if the position of ptr may affect the behavior when
> accessing the struct.
>
> Consider the following condition:
> -----------------------------------------------
> _ _ _ _ | _ _ _ _ |
> 0 1 2 3 4 5 6 7
> position of pointer "ptr" = 2.
>
> If the first element of struct A is an int-type one, it will span
> positions 2~5.
> The CPU will load 0~3, and then 4~7 to get the data of the int-type
> member.

Accessing unaligned data is UB; depending on the platform, this may crash
or produce wrong results. On some platforms (notably x86) it may work,
but with a performance penalty.

For portable code, one should memcpy the data into a suitably aligned
buffer before casting it to int*.
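
A minimal sketch of the memcpy approach, under stated assumptions: the struct `A` below is a hypothetical stand-in for the poster's undefined type, `read_struct` is an invented name, and the bytes are assumed to have been produced from an `A` on the same platform (same layout and byte order).

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for the poster's struct A.
struct A
{
    int   id;
    short flags;
};

// Copies sizeof(A) bytes from a possibly misaligned address into a
// properly aligned local object, instead of dereferencing (A*)ptr.
inline A read_struct(const char* ptr)
{
    static_assert(std::is_trivially_copyable<A>::value,
                  "memcpy is only valid for trivially copyable types");
    A result;
    std::memcpy(&result, ptr, sizeof result);
    return result;
}
```

The copy costs a few instructions, but the source address no longer needs any particular alignment.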

> But we C++ programmers don't need to care the position of ptr, right?

We should care that our programs work (not only at the moment, but also
on other systems and 20 years from now).

Regards
Paavo

thomas

Aug 20, 2010, 5:35:35 AM

On Aug 20, 5:00 pm, Paavo Helde <myfirstn...@osa.pri.ee> wrote:
> thomas <freshtho...@gmail.com> wrote in news:bb2ff772-9706-4b51-bcbb-
> 0353eb723...@l20g2000yqm.googlegroups.com:

OK.. One thing I want to make sure is whether "new/malloc" will
guarantee that the address of the allocated memory will be aligned
properly?

Francesco S. Carta

Aug 20, 2010, 5:43:23 AM

thomas <fresh...@gmail.com>, on 20/08/2010 02:35:35, wrote:

> OK.. One thing I want to make sure is whether "new/malloc" will
> guarantee that the address of the allocated memory will be aligned
> properly?

I don't know about "malloc", but for "new" the answer is yes, it will be
properly allocated for the type with which "new" is called.

Obviously, you can't allocate via "new" for one type, then cast the
result to another type and expect it to be correctly aligned for the
latter type: some cases will work, some others won't, but no guarantee
there - if I recall correctly.

--
FSC - http://userscripts.org/scripts/show/59948
http://fscode.altervista.org - http://sardinias.com

Ian Collins

Aug 20, 2010, 5:43:40 AM

On 08/20/10 09:35 PM, thomas wrote:
>
> OK.. One thing I want to make sure is whether "new/malloc" will
> guarantee that the address of the allocated memory will be aligned
> properly?

Yes.
--
Ian Collins

Francesco S. Carta

Aug 20, 2010, 5:45:43 AM

Francesco S. Carta <entu...@gmail.com>, on 20/08/2010 11:43:23, wrote:

> thomas <fresh...@gmail.com>, on 20/08/2010 02:35:35, wrote:
>
>> OK.. One thing I want to make sure is whether "new/malloc" will
>> guarantee that the address of the allocated memory will be aligned
>> properly?
>
> I don't know about "malloc", but for "new" the answer is yes, it will be
> properly allocated for the type with which "new" is called.

(the above should read "properly _aligned_ for" etcetera)

Öö Tiib

Aug 20, 2010, 5:49:39 AM

Allocation functions return pointers to storage that is appropriately
aligned for objects of any type. You can provide your own allocation
functions, then these are also assumed to do the same.
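
One way to observe this guarantee is to inspect the address returned by the global operator new function. The sketch below assumes C++11 (`std::max_align_t`), and the function name is hypothetical. It only covers fundamental alignments, not extended ones such as those some SIMD types need.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>

// Allocates a block big enough for std::max_align_t with the global
// operator new function and reports whether the returned address is a
// multiple of alignof(std::max_align_t), the strictest fundamental
// alignment. The block is released before returning.
inline bool operator_new_is_max_aligned()
{
    void* p = ::operator new(sizeof(std::max_align_t));
    const bool ok =
        reinterpret_cast<std::uintptr_t>(p) % alignof(std::max_align_t) == 0;
    ::operator delete(p);
    return ok;
}
```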

Francesco S. Carta

Aug 20, 2010, 6:02:15 AM

Öö Tiib <oot...@hot.ee>, on 20/08/2010 02:49:39, wrote:

> On 20 aug, 12:35, thomas<freshtho...@gmail.com> wrote:

<snip>

>> OK.. One thing I want to make sure is whether "new/malloc" will
>> guarantee that the address of the allocated memory will be aligned
>> properly?
>
> Allocation functions return pointers to storage that is appropriately
> aligned for objects of any type.

I'm not sure I understand this correctly, and if I do, I was not aware
of this: does that mean that "new" always allocates using the stricter
alignment requirements for any possible type _regardless_ of the type it
is called with?

I strongly suspect I misunderstood your statement.

thomas

Aug 20, 2010, 6:06:07 AM

But if I use placement new, I can specify the address of a struct (which
may not be properly aligned).
It's at the programmer's own risk to do this, right?

Ian Collins

Aug 20, 2010, 6:54:35 AM

On 08/20/10 10:06 PM, thomas wrote:

> On Aug 20, 5:49 pm, Öö Tiib<oot...@hot.ee> wrote:
>>
>> Allocation functions return pointers to storage that is appropriately
>> aligned for objects of any type. You can provide your own allocation
>> functions, then these are also assumed to do the same.
>
> But if I use placement new, I can specify the address of a struct (may
> be not properly aligned).
> It's programmers' own risk to do this right?

Very much so.
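
To illustrate where that risk sits, here is a sketch in which the caller supplies correctly aligned storage through `alignas`. The type `Sample` and both function names are hypothetical; placement new itself performs no size or alignment check.

```cpp
#include <new>

// Hypothetical payload type for illustration.
struct Sample
{
    double value;
    int    count;
};

// Constructs a Sample in caller-provided storage. The caller must make
// sure 'storage' is at least sizeof(Sample) bytes and aligned to
// alignof(Sample); placement new does not check either condition.
inline Sample* construct_in(void* storage, double value, int count)
{
    return new (storage) Sample{value, count};
}

// Uses a local buffer whose alignment is requested explicitly.
inline double demo()
{
    alignas(Sample) unsigned char buffer[sizeof(Sample)];
    Sample* s = construct_in(buffer, 1.5, 3);
    // Sample is trivially destructible, so no destructor call is needed.
    return s->value * s->count;
}
```

Dropping the `alignas(Sample)` would leave the buffer with the alignment of `unsigned char`, and the construction would then be undefined behavior on the programmer's account.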

--
Ian Collins

Ian Collins

Aug 20, 2010, 7:00:43 AM

On 08/20/10 10:02 PM, Francesco S. Carta wrote:
> Öö Tiib <oot...@hot.ee>, on 20/08/2010 02:49:39, wrote:
>
>> On 20 aug, 12:35, thomas<freshtho...@gmail.com> wrote:
>
> <snip>
>
>>> OK.. One thing I want to make sure is whether "new/malloc" will
>>> guarantee that the address of the allocated memory will be aligned
>>> properly?
>>
>> Allocation functions return pointers to storage that is appropriately
>> aligned for objects of any type.
>
> I'm not sure I understand this correctly, and if I do, I was not aware
> of this: does that mean that "new" always allocates using the stricter
> alignment requirements for any possible type _regardless_ of the type it
> is called with?

operator new only knows the size of an object, not its type.

--
Ian Collins

Francesco S. Carta

Aug 20, 2010, 7:15:15 AM

OK, for the matter of correct alignment knowing the size is enough.

My question, somewhat, still stands, but I'm pretty sure I completely
misunderstood the original statement so never mind, sorry for the
additional noise.

Ian Collins

Aug 20, 2010, 7:35:08 AM

No, for alignment, the size is irrelevant. As Öö Tiib said, allocation
functions return pointers to storage that is appropriately aligned for
objects of any type.

See 5.3.4 New, paragraphs 10 and 12.

--
Ian Collins

Francesco S. Carta

Aug 20, 2010, 8:04:43 AM

Thanks for the reference, after reading it I think I understand how it
works, but now your first sentence above seems to clash with that
understanding.

For example, if I write "new char", the alignment requirements are
different from, say, "new long", how come that the size does not play
into this?

As I understand it now, the size of the requested type plays a
fundamental role in order to decide the minimal alignment requirements
to respect - exception made for char arrays and unsigned char arrays,
where the minimal requirement amounts to the largest possible type
fitting into that array.

What am I missing now?

Goran Pusic

Aug 20, 2010, 8:18:21 AM

On Aug 20, 10:33 am, thomas <freshtho...@gmail.com> wrote:
> Hi,
>
> I have a struct A(undefined, can be any form) located at memory
> pointed by "char *ptr".

Take a good look at your design: I bet you that type is not
"undefined", you have a finite set of types, don't you? and so, your
char* is actually

union meh
{
    TYPE1* p1;
    TYPE2* p2;
    TYPE3* p3;
    // etc.
};

(and void* is better than char*).

(But that is tangential).

> I want to read it with "(A*)ptr".
>
> Now I wonder if the position of ptr may affect the behavior when
> accessing the struct.
>
> Consider the following condition:
> -----------------------------------------------
> _ _ _ _ | _ _ _ _ |
> 0 1 2 3 4 5 6 7
> position of pointer "ptr" = 2.

Is it possible that your actual code is e.g.:

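void parseBinary(void* in)
{
    // The payload starts right after the 16-bit type tag.
    char* payload = static_cast<char*>(in) + sizeof(uint16);
    switch (*reinterpret_cast<uint16*>(in))
    {
    case T1:
    {
        // Braces give each case its own scope for the local pointer.
        TYPE1* p = reinterpret_cast<TYPE1*>(payload);
        use(*p);
        break;
    }
    case T2:
    {
        TYPE2* p = reinterpret_cast<TYPE2*>(payload);
        use(*p);
        break;
    }
    }
}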

If it is, say so! ;-)

As noted here, if this data is e.g. coming over the wire, you must
know whether your hardware platform supports alignment of raw data as
it is on the wire, and whether you can create appropriate packing with
the compiler (normally yes, but that's compiler-specific). If hardware
supports unaligned data, you can simply cast and avoid some copying.
If not, you must write appropriate conversion routines to convert raw
primitive types hidden in the "wire format" and copy that to your
actual data types.
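
Such conversion routines can be sketched as follows. Everything here is an assumption for illustration: the wire format is taken to be big-endian, the message layout (2-byte tag, then 4-byte payload) is invented, and none of the names come from the thread.

```cpp
#include <cstdint>

// Decodes a big-endian 16-bit value from raw bytes. Works at any
// address because it only ever reads single bytes.
inline std::uint16_t load_be16(const unsigned char* p)
{
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Decodes a big-endian 32-bit value the same way.
inline std::uint32_t load_be32(const unsigned char* p)
{
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}

// Hypothetical message layout: 2-byte type tag, then a 4-byte payload.
struct Message
{
    std::uint16_t tag;
    std::uint32_t payload;
};

inline Message parse_message(const unsigned char* wire)
{
    Message m;
    m.tag     = load_be16(wire);
    m.payload = load_be32(wire + 2);   // offset 2: misaligned for a direct cast
    return m;
}
```

Because the fields are rebuilt byte by byte, the same code handles both alignment and byte order, independently of the host CPU.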

Goran.

Pete Becker

Aug 20, 2010, 9:38:38 AM

On 2010-08-20 08:04:43 -0400, Francesco S. Carta said:

>
> Thanks for the reference, after reading it I think I understand how it
> works, but now your first sentence above seems to clash with that
> understanding.
>
> For example, if I write "new char", the alignment requirements are
> different from, say, "new long", how come that the size does not play
> into this?
>
> As I understand it now, the size of the requested type plays a
> fundamental role in order to decide the minimal alignment requirements
> to respect - exception made for char arrays and unsigned char arrays,
> where the minimal requirement amounts to the largest possible type
> fitting into that array.
>

Keep in mind that "new char" and operator new(1) are two different
things. "new char" calls operator new to get memory, then constructs
the char object (which in the case of char does nothing). operator new
allocates the requested amount of memory and returns a void* that
points to it. It doesn't know the type that you're allocating.

The requested size isn't a reliable guide to a type's alignment. For example:

struct a
{
    double d;
};

struct b
{
    char data[sizeof(double)];
};

Both structs are typically the same size, but the first might have
alignment requirements that are different from those of the second.
Char arrays usually have no alignment restrictions, while some
architectures require doubles to be aligned on 8-byte boundaries.
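
The difference can be made visible with `alignof` (C++11). In this sketch the two structs are renamed `WithDouble` and `WithBytes` to keep the example self-contained; the actual values are platform-dependent, so none are claimed here.

```cpp
#include <cstddef>

struct WithDouble
{
    double d;
};

struct WithBytes
{
    char data[sizeof(double)];
};

// Typically the same size, but the alignment requirements can differ.
constexpr std::size_t size_with_double  = sizeof(WithDouble);
constexpr std::size_t size_with_bytes   = sizeof(WithBytes);
constexpr std::size_t align_with_double = alignof(WithDouble);
constexpr std::size_t align_with_bytes  = alignof(WithBytes);
```

On many 64-bit platforms both sizes come out as 8 while the alignments are 8 and 1 respectively, but that is a property of the implementation, not of the language.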

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)

Francesco S. Carta

Aug 20, 2010, 10:03:20 AM

I believe you're sincerely trying to help me wrap my head around this,
but you're confusing me even more.

The standard says, in [expr.new] 5.3.4p10, that char arrays do have such
restrictions, but it says that related to char arrays created using
"new", that would mean that "new char[sizeof(double)]" should return an
address properly aligned for any object of the same size of a double.

Are you saying that the requirements are different for automatic and
static char arrays as opposed to dynamically created ones?

Pete Becker

Aug 20, 2010, 11:32:45 AM

Sorry, I made a bit of a muddle here. The key point is that alignment
is defined by the hardware; whatever the standard says about alignment
is pretty much handwaving, to avoid putting unnecessary restrictions on
implementations.

Those words in [expr.new] try to put some logic into what typically
hasn't been explicitly laid out in the past. The underlying principle
is that the size of the requested allocation might eliminate some
types, because they're too large to fit in the requested space; in that
case, the alignment of the returned pointer doesn't have to be
appropriate for those types, because the memory block is too small to
hold them. For example, the result of calling operator new(4) doesn't
have to be aligned appropriately for an 8-byte double, only for types
that occupy four bytes or less.

Because of the special role of char, I shouldn't have used it in my
example. Change the second struct to:

struct b
{
unsigned short data[sizeof(double) / sizeof(unsigned short)];
};

(and assume that the size of double is 8 bytes, and that the size of
unsigned short is 4 or 2, which are typical).

So both structs are the same size, and from a hardware perspective,
both structs can have different alignment requirements. But since
operator new has been asked for 8 bytes and doesn't have any more
information, it has to return a pointer that will work for either of
those types (as well as all other 8-byte types, but that's just a
difference in degree, not in kind).

Arrays of char and unsigned char are special because they're typically
used as byte buffers, so they have to be aligned for pretty much
anything that the user wants to stuff in there:

unsigned char *buffer = new unsigned char[sizeof(double)];
*(double*)buffer = 7.0;

For this to work, buffer has to be aligned appropriately for double.

Francesco S. Carta

Aug 20, 2010, 12:04:50 PM

No problem, the important thing is that the pieces are lining up neatly
in my mind, now.

> Those words in [expr.new] try to put some logic into what typically
> hasn't been explicitly laid out in the past. The underlying principle is
> that the size of the requested allocation might eliminate some types,
> because they're too large to fit in the requested space; in that case,
> the alignment of the returned pointer doesn't have to be appropriate for
> those types, because the memory block is too small to hold them. For
> example, the result of calling operator new(4) doesn't have to be
> aligned appropriately for an 8-byte double, only for types that occupy
> four bytes or less.
>
> Because of the special role of char, I shouldn't have used it in my
> example. Change the second struct to:
>
> struct b
> {
> unsigned short data[sizeof(double) / sizeof(unsigned short)];
> };
>
> (and assume that the size of double is 8 bytes, and that the size of
> unsigned short is 4 or 2, which are typical).
>
> So both structs are the same size, and from a hardware perspective, both
> structs can have different alignment requirements. But since operator
> new has been asked for 8 bytes and doesn't have any more information, it
> has to return a pointer that will work for either of those types (as
> well as all other 8-byte types, but that's just a difference in degree,
> not in kind).

OK, let's see if I got this straight. Since the size of those two
structs is the same, they will both fit correctly aligned in a space
allocated via "new char[sizeof(a)]" (or via "new char[sizeof(b)]", for
that matter).

But "new short[sizeof(double) / sizeof(short)]", alone and by itself,
has to satisfy the alignment requirements of "short", regardless of the
size of the array itself.

So in some sense, both the type and the size of that type do play an
important role in the alignment of the address returned by dynamic
allocators - although that role is implementation defined within the
limits mandated by the standard.

Thanks a lot for your explanations Pete.

James Kanze

Aug 20, 2010, 12:15:59 PM

On Aug 20, 9:33 am, thomas <freshtho...@gmail.com> wrote:

> I have a struct A(undefined, can be any form) located at memory
> pointed by "char *ptr".

Where did you get the char* from?

> I want to read it with "(A*)ptr".

If the pointer was originally an A*, then it should work.
Otherwise, you're looking for problems.

Note that you should be using reinterpret_cast here, so that
future readers of your code know that you're playing with fire.

> Now I wonder if the position of ptr may affect the behavior when
> accessing the struct.

> Consider the following condition:
> -----------------------------------------------
> _ _ _ _ | _ _ _ _ |
> 0 1 2 3 4 5 6 7
> position of pointer "ptr" = 2.

> If the first element of struct A is an int-type one, it will span
> positions 2~5.
> The CPU will load 0~3, and then 4~7 to get the data of the int-type
> member.
> But we C++ programmers don't need to care the position of ptr, right?

If the pointer initially came from an A*, it won't have position
2.

How did the data get into this array? If the data is from some
external source, alignment isn't the only problem you have to
worry about.

> (I remember some accesses to unbounded memory positions may
> get system crash, but I cannot remember in which case.
> A little confused.)

Out-of-bounds or misaligned accesses are undefined behavior, and can
cause your program to crash.

--
James Kanze

James Kanze

Aug 20, 2010, 12:37:15 PM

On Aug 20, 10:43 am, "Francesco S. Carta" <entul...@gmail.com> wrote:

> thomas <freshtho...@gmail.com>, on 20/08/2010 02:35:35, wrote:

> > OK.. One thing I want to make sure is whether "new/malloc" will
> > guarantee that the address of the allocated memory will be aligned
> > properly?

> I don't know about "malloc", but for "new" the answer is yes, it will be
> properly allocated for the type with which "new" is called.

If he was comparing to malloc, he probably meant the operator
new function. Both malloc and the operator new function are
guaranteed to return memory suitably aligned for any object.

> You can't obviously allocate via "new" for a type then cast
> the result to another type and finally expect it to be
> correctly aligned for the latter type: some cases will work,
> some others won't, but no guarantee there - if I recall
> correctly.

If you use a new expression to allocate, then go casting to some
unrelated type, you don't have any guarantees, except that you
can reinterpret_cast to char* and get a byte dump of the object
(and you can reinterpret_cast the char* back to the original
type, and use that pointer).
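
Both permitted operations can be sketched briefly. The helper names `hex_dump` and `round_trip` are hypothetical; the dump uses `unsigned char` so that byte values print without sign issues.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Returns the object representation of 'obj' as a hex string by
// viewing it through an unsigned char pointer, which is permitted for
// any object type.
template <typename T>
std::string hex_dump(const T& obj)
{
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&obj);
    std::string out;
    char buf[3];
    for (std::size_t i = 0; i < sizeof(T); ++i)
    {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(bytes[i]));
        out += buf;
    }
    return out;
}

// Round trip: T* -> char* -> T* yields the original pointer, which
// can then be used normally.
template <typename T>
T* round_trip(T* p)
{
    char* raw = reinterpret_cast<char*>(p);
    return reinterpret_cast<T*>(raw);
}
```

What the sketch deliberately does not do is cast the `char*` to some unrelated type, which is the case with no guarantees.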

--
James Kanze

Francesco S. Carta

Aug 20, 2010, 12:47:39 PM

James Kanze <james...@gmail.com>, on 20/08/2010 09:37:15, wrote:

> On Aug 20, 10:43 am, "Francesco S. Carta"<entul...@gmail.com> wrote:
>> thomas<freshtho...@gmail.com>, on 20/08/2010 02:35:35, wrote:
>
>>> OK.. One thing I want to make sure is whether "new/malloc" will
>>> guarantee that the address of the allocated memory will be aligned
>>> properly?
>
>> I don't know about "malloc", but for "new" the answer is yes, it will be
>> properly allocated for the type with which "new" is called.
>
> If he was comparing to malloc, he probably meant the operator
> new function. Both malloc and the operator new function are
> guaranteed to return memory suitably aligned for any object.

Suitably aligned for any object that fits into the allocated space, as I
seem to have understood, so far, from the other branches of this thread.

In any case, I was surely overlooking (once more) the difference between
using plain "new" and calling "the operator new function". How were they
commonly referred to? "new expression" and "operator new"? I never
recall this correctly and I have to resort to some weird periphrases
that don't really fit well.

>> You can't obviously allocate via "new" for a type then cast
>> the result to another type and finally expect it to be
>> correctly aligned for the latter type: some cases will work,
>> some others won't, but no guarantee there - if I recall
>> correctly.
>
> If you use a new expression to allocate, then go casting to some
> unrelated type, you don't have any guarantees, except that you
> can reinterpret_cast to char* and get a byte dump of the object
> (and you can reinterpret_cast the char* back to the original
> type, and use that pointer).

Yep, I recalled correctly, "no guarantee there".

Ian Collins

Aug 20, 2010, 4:59:21 PM

On 08/21/10 04:04 AM, Francesco S. Carta wrote:
>
> OK, let's see if I got this straight. Since the size of those two
> structs is the same, they will both fit correctly aligned in a space
> allocated via "new char[sizeof(a)]" (or via "new char[sizeof(b)]", for
> that matter).

I think you are still missing the point that size and alignment are
orthogonal.

> But "new short[sizeof(double) / sizeof(short)]", alone and by itself,
> has to satisfy the alignment requirements of "short", regardless of the
> size of the array itself.
>
> So in some sense, both the type and the size of that type do play an
> important role in the alignment of the address returned by dynamic
> allocators - although that role is implementation defined within the
> limits mandated by the standard.

No, the size is still irrelevant to the guarantee. Whether you are
allocating a double or an array of sizeof(double) char or a single char,
the alignment of the pointer will be the same. The pointer returned
will always be appropriately aligned for objects of any type.

--
Ian Collins

Pete Becker

Aug 20, 2010, 5:08:11 PM

On 2010-08-20 12:47:39 -0400, Francesco S. Carta said:

>
> In any case, I was surely overlooking (once more) the difference
> between using plain "new" and calling "the operator new function". How
> were they commonly referred to? "new expression" and "operator new"? I
> never recall this correctly and I have to resort to some weird
> periphrasis that don't really fit well.
>

Those of us who are into close reading call them the "new operator" and
"operator new", the second one being a library function that allocates
raw memory. The first one is a keyword, and the compiler generates code
to call operator new and construct an object at the returned address.
(We used to have a similar distinction between a "class template" and a
"template class", but that was too confusing; now they're "template
class" and "template instantiation", not necessarily in that order,
because the first two confuse me).

If you're really into technicalities, since a class can override
operator new, the actual code that the compiler generates for new T
typically calls T's constructor with a null pointer, and T's
constructor calls operator new. For objects that don't have dynamic
storage duration (that is, those with static, thread, or automatic
storage duration in C++0x) the compiler passes the address where the
object will be located, and the constructor simply uses that. Either
way, the constructor actually returns a value: the address of the newly
constructed object. But that's all implementation details.
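
The conceptual split between the two can be written out by hand. This sketch shows only the two logical steps (allocate raw memory, then construct), not the code any particular compiler generates; `Widget` and the function names are hypothetical.

```cpp
#include <new>

// Hypothetical type for illustration.
struct Widget
{
    int value;
    explicit Widget(int v) : value(v) {}
};

// Performs by hand the two steps that the expression 'new Widget(v)'
// performs: get raw memory from the operator new function, then
// construct the object in it with placement new.
inline Widget* create_by_hand(int v)
{
    void* raw = ::operator new(sizeof(Widget));   // step 1: raw memory
    return new (raw) Widget(v);                   // step 2: construction
                                                  // (this constructor cannot throw)
}

// Mirror image of a delete expression: destroy, then release memory.
inline void destroy_by_hand(Widget* w)
{
    w->~Widget();
    ::operator delete(w);
}
```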

Alf P. Steinbach /Usenet

Aug 20, 2010, 7:37:15 PM

* Ian Collins, on 20.08.2010 22:59:

> On 08/21/10 04:04 AM, Francesco S. Carta wrote:
>>
>> OK, let's see if I got this straight. Since the size of those two
>> structs is the same, they will both fit correctly aligned in a space
>> allocated via "new char[sizeof(a)]" (or via "new char[sizeof(b)]", for
>> that matter).
>
> I think you are still missing the point that size and alignment are
> orthogonal.

I'd miss it too, since alignment requirement /is/ a function of size.

That goes down to the hardware level, it's not a matter of formalism.

Regarding the formalism, specific alignment for a given object can be specified,
by language extensions or in C++0x within the standard language; such a specific
alignment must conform to the alignment requirement for the type; the alignment
requirement for the type in turn must conform to the alignment requirement for
the type's size, which is what matters to the HW.

>> But "new short[sizeof(double) / sizeof(short)]", alone and by itself,
>> has to satisfy the alignment requirements of "short", regardless of the
>> size of the array itself.
>>
>> So in some sense, both the type and the size of that type do play an
>> important role in the alignment of the address returned by dynamic
>> allocators - although that role is implementation defined within the
>> limits mandated by the standard.
>
> No, the size is still irrelevant to the guarantee.

In theory yes the size is irrelevant to the new operator's guarantee, since it
guarantees to return a pointer suitably aligned for any item.

However, in practice there can be hardware support for items of larger size than
standard C++ supports, then with special alignment requirements.

E.g., for SIMD instructions there is a 16-byte alignment requirement, and most
likely operator new will use an 8-byte alignment (at best), not sufficient.
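
A sketch of such an over-aligned type, using the C++0x/C++11 `alignas` specifier. `Vec8` and the 32-byte figure are hypothetical choices for illustration. Note that before C++17 a plain new expression was not required to honor an extended alignment like this; since C++17, new expressions for over-aligned types use an operator new overload that takes a `std::align_val_t` argument.

```cpp
#include <cstdint>

// Hypothetical SIMD-style type that asks for 32-byte alignment, which
// exceeds the strictest fundamental alignment on common platforms.
struct alignas(32) Vec8
{
    float lane[8];
};

// Objects with static or automatic storage duration honor alignas.
inline bool static_object_is_aligned()
{
    static Vec8 v = {};
    return reinterpret_cast<std::uintptr_t>(&v) % alignof(Vec8) == 0;
}
```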

> Whether you are
> allocating a double or an array of sizeof(double) char or a single char,
> the alignment of the pointer will be the same. The pointer returned will
> always be appropriately aligned for objects of any type.

Cheers,

- Alf (picking nits)

--
blog at <url: http://alfps.wordpress.com>

Gennaro Prota

Aug 20, 2010, 7:52:16 PM

On 20/08/2010 18.47, Francesco S. Carta wrote:
> James Kanze <james...@gmail.com>, on 20/08/2010 09:37:15, wrote:
>
>> On Aug 20, 10:43 am, "Francesco S. Carta"<entul...@gmail.com> wrote:
>>> thomas<freshtho...@gmail.com>, on 20/08/2010 02:35:35, wrote:
>>
>>>> OK.. One thing I want to make sure is whether "new/malloc" will
>>>> guarantee that the address of the allocated memory will be aligned
>>>> properly?
>>
>>> I don't know about "malloc", but for "new" the answer is yes, it will be
>>> properly allocated for the type with which "new" is called.
>>
>> If he was comparing to malloc, he probably meant the operator
>> new function. Both malloc and the operator new function are
>> guaranteed to return memory suitably aligned for any object.
>
> Suitably aligned for any object that fits into the allocated space, as I
> seem to have understood, so far, from the other branches of this thread.
>
> In any case, I was surely overlooking (once more) the difference between
> using plain "new" and calling "the operator new function". How were they
> commonly referred to? "new expression" and "operator new"?

Use the terminology of the standard ("new expression" and
"operator new function")!

<asides>
- The reason for the excitement and the consequent exclamation
point is that I do use it, and I feel lonely.

- If you really want to refer to the operator (as opposed to
the expression) you have to think more.
</asides>

I seem to recall someone advocating the usage of "operator new"
for the function(s) and "new operator" for the operator but to
me... Perhaps it's crystal clear to a native speaker? (Anyway,
it may work in some contexts, especially when italics or
different fonts are used properly, and I've seen it used by
Dewhurst with extreme clarity. It's just not something to
advocate in general.)

I think one may also use "new allocation function", which is
unambiguous too.

PS: quite obviously, too, if one says for instance "write
operator delete if you write operator new", like our Scott
Meyers does, it's clear that he's not talking about the
operators :-) Yet I'm picky enough to avoid such language anyway
(no, not avoiding C++, just such terminology ;-)).

--
Gennaro Prota | I'm available for your projects.
Breeze (preview): <https://sourceforge.net/projects/breeze/>

Ian Collins

Aug 20, 2010, 8:12:42 PM

On 08/21/10 11:37 AM, Alf P. Steinbach /Usenet wrote:
> * Ian Collins, on 20.08.2010 22:59:
>> On 08/21/10 04:04 AM, Francesco S. Carta wrote:
>>>
>>> OK, let's see if I got this straight. Since the size of those two
>>> structs is the same, they will both fit correctly aligned in a space
>>> allocated via "new char[sizeof(a)]" (or via "new char[sizeof(b)]", for
>>> that matter).
>>
>> I think you are still missing the point that size and alignment are
>> orthogonal.
>
> I'd miss it too, since alignment requirement /is/ a function of size.

No, it isn't, at least not to the size we are debating, that is the size
passed to the allocation function.

> That goes down to the hardware level, it's not a matter of formalism.

Correct. A 32 bit machine may be able to perform efficient byte aligned
operation on any type, so it could have an alignment requirement of one
for any type. Another 32 bit machine may be incapable of byte
addressing and therefore have an alignment requirement of four for any type.

A lot of 32 bit machines have the same alignment requirement (four) for
32 bit float and 64 bit double (or 32 bit long and 64 bit long long).

The alignment used by allocator functions is typically bigger than that
required by the machine. All the hosted environment allocators I've
checked use 16 or 32.

>> No, the size is still irrelevant to the guarantee.
>
> In theory yes the size is irrelevant to the new operator's guarantee,
> since it guarantees to return a pointer suitably aligned for any item.

In practice too. The guarantee is just that, a guarantee.

It might be clearer to say the size /passed/ to the allocation function
is irrelevant. The alignment used is derived from the hardware, not from
run-time information.

> Cheers,
>
> - Alf (picking nits)

Putting some back!

Cheers,

--
Ian Collins

Gennaro Prota

Aug 20, 2010, 8:36:49 PM

On 20/08/2010 17.32, Pete Becker wrote:
[...]

>> The standard says, in [expr.new] 5.3.4p10, that char arrays do have
>> such restrictions, but it says that related to char arrays created
>> using "new", that would mean that "new char[sizeof(double)]" should
>> return an address properly aligned for any object of the same size of a
>> double.
>>
>> Are you saying that the requirements are different for automatic and
>> static char arrays as opposed to dynamically created ones?
>
> Sorry, I made a bit of a muddle here. The key point is that alignment
> is defined by the hardware; whatever the standard says about alignment
> is pretty much handwaving, to avoid putting unnecessary restrictions on
> implementations.
>
> Those words in [expr.new] try to put some logic into what typically
> hasn't been explicitly laid out in the past. The underlying principle
> is that the size of the requested allocation might eliminate some
> types, because they're too large to fit in the requested space; in that
> case, the alignment of the returned pointer doesn't have to be
> appropriate for those types, because the memory block is too small to
> hold them. For example, the result of calling operator new(4) doesn't
> have to be aligned appropriately for an 8-byte double, only for types
> that occupy four bytes or less.

Actually this seems in contradiction with 3.7.3.1/2 and with the
note to 5.3.4/10 itself which says "aligned for objects of any
type". When will the standard writers employ the same principles
that hold for programming, such as DRY?

And I think, at this time of the night, the "difference between
the result of the new-expression and the address returned by the
allocation function" matter is just confusing me.

This stuff concretely affects what the programmer knows about
the pointer returned by an operator new function so, if after
some sleep, I still think that there's a contradiction I'll file
a DR (I don't file DRs for logical inconsistencies and such with
no concrete impact on real software. There are just too many of
those, IMHO, and the result is just wasting time, for the
committee and for language users, who would see other important
things delayed).

I'm particularly worried that someone just reads 3.7.3.1 and
assumes "aligned for everything" (which is correct: why should
he suspect that another section says otherwise?).

In this case, where should the DR go? Core or library? Both?
(I'm inclined to say "core"; the issue lies astride the two, though more
on the core side, as I see it.)

Ian Collins

Aug 20, 2010, 9:35:43 PM

On 08/21/10 12:36 PM, Gennaro Prota wrote:
> On 20/08/2010 17.32, Pete Becker wrote:
>>
>> Those words in [expr.new] try to put some logic into what typically
>> hasn't been explicitly laid out in the past. The underlying principle
>> is that the size of the requested allocation might eliminate some
>> types, because they're too large to fit in the requested space; in that
>> case, the alignment of the returned pointer doesn't have to be
>> appropriate for those types, because the memory block is too small to
>> hold them. For example, the result of calling operator new(4) doesn't
>> have to be aligned appropriately for an 8-byte double, only for types
>> that occupy four bytes or less.
>
> Actually this seems in contradiction with 3.7.3.1/2 and with the
> note to 5.3.4/10 itself which says "aligned for objects of any
> type".

I agree. 3.7.3.1 states that "The pointer returned shall be suitably
aligned so that it can be converted to a pointer of any complete object
type". The note in 5.3.4p10 is based on that.

The example in 5.3.4p12 appears to contradict what Pete says:

5.3.4p12 [Example:

— new T results in a call of operator new(sizeof(T)),

which follows on from the note in 5.3.4p10

[Note: Because allocation functions are assumed to
return pointers to storage that is appropriately aligned for objects of
any type, this constraint on array allocation overhead permits the
common idiom of allocating character arrays into which objects of other
types will later be placed. ]

Which implies to me that new(4) *does* have to be (or at least will be)
aligned appropriately for an 8-byte double, even though the space is too
small.
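
One way to see what a particular implementation actually does is to test the returned address directly. A minimal sketch, assuming a C++0x compiler for alignof and std::uintptr_t; is_aligned and small_block_is_aligned_for_double are hypothetical helper names, and the result is implementation-specific rather than something that settles the reading of the standard:

```cpp
#include <cstddef>
#include <cstdint>
#include <new>

// True if address p is a multiple of `alignment` (which must be non-zero).
inline bool is_aligned(const void* p, std::size_t alignment)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Requests a 4-byte block from the global allocation function and reports
// whether this implementation happens to align it for double.
inline bool small_block_is_aligned_for_double()
{
    void* p = ::operator new(4);
    bool const ok = is_aligned(p, alignof(double));
    ::operator delete(p);
    return ok;
}
```

A true result only shows what this allocator does for this request; it says nothing about what the standard requires.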

> This stuff concretely affects what the programmer knows about
> the pointer returned by an operator new function so, if after
> some sleep, I still think that there's a contradiction I'll file
> a DR (I don't file DRs for logical inconsistencies and such with
> no concrete impact on real software.

All the programmer has to know is the pointer is suitably aligned.

> I'm particularly worried that someone just reads 3.7.3.1 and
> assumes "aligned for everything" (which is correct: why should
> he suspect that another section says otherwise?).

Because none does?

--
Ian Collins

Pete Becker

Aug 21, 2010, 6:49:21 AM

On 2010-08-20 21:35:43 -0400, Ian Collins said:

>
> The example in 5.3.4p12 appears to contradict what Pete says:
>
> 5.3.4p12 [Example:
>
> — new T results in a call of operator new(sizeof(T)),

new T, calls operator new

>
> which follows on from the note in 5.3.4p10
> [Note: Because allocation functions are assumed to
> return pointers to storage that is appropriately aligned for objects of
> any type, this constraint on array allocation overhead permits the
> common idiom of allocating character arrays into which objects of other
> types will later be placed. ]

"array allocation overhead", i.e. new T [], calls operator new[]

>
> Which implies to me that new(4) *does* have to be (or at least will be)
> aligned appropriately for an 8-byte double, even though the space is too
> small.

Array new and plain new have different requirements.
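
The split between the two allocation functions can be observed with class-level overloads. A small sketch using only standard C++; Traced and its counters are hypothetical names used purely for illustration:

```cpp
#include <cstddef>
#include <new>

struct Traced
{
    static int scalar_calls;   // calls to Traced::operator new
    static int array_calls;    // calls to Traced::operator new[]

    int payload;

    static void* operator new(std::size_t n)
    {
        ++scalar_calls;
        return ::operator new(n);
    }
    static void* operator new[](std::size_t n)
    {
        ++array_calls;
        return ::operator new[](n);
    }
    static void operator delete(void* p)   { ::operator delete(p); }
    static void operator delete[](void* p) { ::operator delete[](p); }
};

int Traced::scalar_calls = 0;
int Traced::array_calls  = 0;
```

With these in place, `new Traced` goes through operator new and `new Traced[3]` goes through operator new[], so each counter moves independently.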

Francesco S. Carta

Aug 21, 2010, 7:30:48 AM

I am replying to nobody in particular, right now. It seems to me that
people more experienced than me have not yet found an agreement in this
very thread, so I'll stay out of the discussion because my intervention
will likely add confusion instead of dissipating it.

This is just an update about what I've worked out till now.

I have tracked down a misconception of mine that was influencing my
thought about all of this.

I've created a struct that occupies 64 bytes of storage, then I
dynamically allocated it and I checked its address to see if, as I was
expecting, it was a multiple of 64. As it turned out, it wasn't, showing
me that my assumption was stupid at best.

I have no idea where I got that misconception from; I could
very well have made it up all by myself.
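
The observation is easy to reproduce at compile time. A sketch assuming a C++0x compiler for alignof and static_assert; sixty_four_bytes is a hypothetical struct made of eight doubles, so its size is 64 only where sizeof(double) is 8:

```cpp
#include <cstddef>

struct sixty_four_bytes
{
    double data[8];
};

// The struct is as large as its eight doubles...
static_assert(sizeof(sixty_four_bytes) == 8 * sizeof(double),
              "no padding expected in an array-only struct");
// ...but it is only as strictly aligned as a single double.
static_assert(alignof(sixty_four_bytes) == alignof(double),
              "alignment follows the member type, not the total size");
```

So an address returned for such an object only has to be a multiple of the alignment of double, not of 64.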

I still need some study to clarify the concepts the standard uses
for "operator new", "new operator", their array versions, the flow of
these operations and so on, so I won't say anything more about a subject
that isn't clear enough to me yet. I'll come back to this later and
ask for further information if reading the relevant parts of the standard
does not suffice.

As for the implementation-defined part of this discussion, that is, the
actual values of the alignments: finding a method for determining these
values will at least help me understand the implementation's behavior,
which in turn should help me understand the abstract behavior mandated
by the standard.

I've found a method in the page linked below, and the explanations given
make sense to me. Any further reference, or a warning about any wrong
information given there, will be more than welcome:

http://www.monkeyspeak.com/alignment/

James Kanze

Aug 21, 2010, 11:08:21 AM

On Aug 20, 10:08 pm, Pete Becker <p...@versatilecoding.com> wrote:
> On 2010-08-20 12:47:39 -0400, Francesco S. Carta said:

[...]

> If you're really into technicalities, since a class can override
> operator new, the actual code that the compiler generates for new T
> typically calls T's constructor with a null pointer, and T's
> constructor calls operator new.

I know that this is a common implementation, but is it formally
conforming? I'm not sure that it is 100% clear, but by my reading,
the following program is conforming:

#include <cstddef>

class C
{
    void* operator new(std::size_t);  // private, declared but never defined
public:
    C();                              // assumed to be defined elsewhere
};

int
main()
{
    C aC;
}

With at least some compilers I've used, this (or something
similar---I don't have a compiler installed on this machine to
test) fails to link, because there is no definition of
C::operator new (and the compiler is using the implementation
you suggest).

A similar problem arises with delete.

--
James Kanze

James Kanze

Aug 21, 2010, 11:14:17 AM

On Aug 21, 12:37 am, "Alf P. Steinbach /Usenet"
<alf.p.steinbach+use...@gmail.com> wrote:
> * Ian Collins, on 20.08.2010 22:59:

> > On 08/21/10 04:04 AM, Francesco S. Carta wrote:

> >> OK, let's see if I got this straight. Since the size of those two
> >> structs is the same, they will both fit correctly aligned in a space
> >> allocated via "new char[sizeof(a)]" (or via "new char[sizeof(b)]", for
> >> that matter).

> > I think you are still missing the point that size and alignment are
> > orthogonal.

> I'd miss it too, since alignment requirement /is/ a function of size.

Not "a function of size", but "constrained by size". The
alignment isn't orthogonal, since the constraint exists
(alignment cannot be greater than size), but it's not a function
of size either, since as long as the constraint is met, there's
no problem.

> That goes down to the hardware level, it's not a matter of formalism.

Except that Ian has already posted a contradicting example:
char[sizeof(double)] and double have the same size (required by
the standard), but don't have the same alignment requirements.
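
The example can be checked mechanically. A sketch assuming C++0x alignof (applied to an array type, it yields the alignment of the element type); double_sized_buffer and double_alignment are hypothetical names:

```cpp
#include <cstddef>

typedef char double_sized_buffer[sizeof(double)];

// Same size by construction...
static_assert(sizeof(double_sized_buffer) == sizeof(double),
              "sizes match");
// ...but a char array only needs the alignment of char, which is 1.
static_assert(alignof(double_sized_buffer) == 1,
              "char arrays need no alignment beyond 1");

// Alignment of double on this implementation; commonly 4 or 8.
inline std::size_t double_alignment() { return alignof(double); }
```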

[...]

> Regarding the formalism, specific alignment for a given object

> In theory yes the size is irrelevant to the new operator's
> guarantee, since it guarantees to return a pointer suitably
> aligned for any item.

I'm not sure about this. I've heard two explanations (and I
can't remember which is right): a non-member allocation function
must return memory sufficiently aligned for any object, or it
must return memory sufficiently aligned for any object which
will fit.

--
James Kanze

Alf P. Steinbach /Usenet

Aug 21, 2010, 11:49:48 AM

* James Kanze, on 21.08.2010 17:14:

> On Aug 21, 12:37 am, "Alf P. Steinbach /Usenet"<alf.p.steinbach
> +use...@gmail.com> wrote:
>> * Ian Collins, on 20.08.2010 22:59:
>
>>> On 08/21/10 04:04 AM, Francesco S. Carta wrote:
>
>>>> OK, let's see if I got this straight. Since the size of those two
>>>> structs is the same, they will both fit correctly aligned in a space
>>>> allocated via "new char[sizeof(a)]" (or via "new char[sizeof(b)]", for
>>>> that matter).
>
>>> I think you are still missing the point that size and alignment are
>>> orthogonal.
>
>> I'd miss it too, since alignment requirement /is/ a function of size.
>
> Not "a function of size", but "constrained by size". The
> alignment isn't orthogonal, since the constraint exists
> (alignment cannot be greater than size), but it's not a function
> of size either, since as long as the constraint is met, there's
> no problem.

Well it gets thorny in the details. For the general case the basic HW alignment
on a given system is a function of size: size -> alignment requirement. But then
you have things like special instructions for special kinds of data, with
special alignment requirements -- outside of the general case, and not very
well covered, if at all, by C++ rules. Yeah, I could've been more precise.

>> That goes down to the hardware level, it's not a matter of formalism.
>
> Except that Ian has already posted a contradicting example:
> char[sizeof(double)] and double have the same size (required by
> the standard), but don't have the same alignment requirements.

No it's not a contradiction. The data item size for char[N], what you can deal
with via assignment, is 1. Just a matter of picking the Right Size. <g>

> [...]
>> Regarding the formalism, specific alignment for a given object
>> In theory yes the size is irrelevant to the new operator's
>> guarantee, since it guarantees to return a pointer suitably
>> aligned for any item.
>
> I'm not sure about this. I've heard two explanations (and I
> can't remember which is right): a non-member allocation function
> must return memory sufficiently aligned for any object, or it
> must return memory sufficiently aligned for any object which
> will fit.

C++98 §3.7.3.1/2 "The pointer returned shall be suitably aligned so that it can
be converted to a pointer of any complete object type and then used to access
the object or array in the storage allocated".

At the ultra-formal level I think this doesn't say anything, since "converted"
isn't defined (at least as far as I can see).

But from a merely practical-formal point of view it says the address must be
suitably aligned for any complete object type.

Cheers,

- Alf

Larry Evans

Aug 21, 2010, 12:06:36 PM

On 08/21/10 06:30, Francesco S. Carta wrote:
[snip]

> I've created a struct that occupies 64 bytes of storage, then I
> dynamically allocated it and I checked its address to see if, as I was
> expecting, it was a multiple of 64. As it turned out, it wasn't, showing
> me that my assumption was stupid at best.
>
> I have no idea about where I have taken that misconception from, I could
> very well have made it up all by myself.

[snip]

Hi Francesco.

I struggled a few yrs (months?) ago about that and think I've
figured it out.

The alignment of your 64 byte long structure is dependent on the
types contained in the structure.

Using variadic template:

http://www.osl.iu.edu/~dgregor/cpp/variadic-templates.html

notation, augmented with:

template
  < unsigned Index
  , typename... T
  >
struct at_c
{
    typedef
      //T...[Index] the Index-th type in T...
      type
    ;
};

assume:

template
  < typename... T
  >
struct C
{
    typename at_c<1,T...>::type m_1;
    typename at_c<2,T...>::type m_2;
    ...
    typename at_c<n,T...>::type m_n;
};

then alignof(C) is *only* a function of the alignments of
T... .

I'm pretty sure:

eq[1]: alignof(C) == fold(lcm,1,alignof(T)...)

where fold is the Haskell foldl or foldr,
which is basically std::accumulate.
alignof(T)... basically expands to:

alignof(T_1), alignof(T_2), ..., alignof(T_n)

NOTE, sizeof doesn't appear in eq[1]:.

If eq[1] is true, then the offsets for all the members can be
easily calculated by ensuring that the offset of m_i in C is
divisible by alignof(T_i). This works because the start of C is
at an address divisible by alignof(T_i) (alignof(C) is a multiple
of alignof(T_i), having been calculated as the lcm of all of the
alignof(T)...), so m_i will be located at an address divisible by
alignof(T_i).
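
The calculation can be written out at run time to make it concrete. A rough sketch, assuming members are laid out in declaration order with the smallest padding that works (what common ABIs do, though not something the standard promises); Member, Layout, compute_layout and the helper names are hypothetical:

```cpp
#include <cstddef>
#include <vector>

struct Member { std::size_t size; std::size_t align; };

struct Layout
{
    std::size_t align;                 // alignment of the whole struct
    std::size_t size;                  // size including trailing padding
    std::vector<std::size_t> offsets;  // offset of each member
};

inline std::size_t gcd_of(std::size_t a, std::size_t b)
{
    while (b != 0) { std::size_t const t = a % b; a = b; b = t; }
    return a;
}

inline std::size_t lcm_of(std::size_t a, std::size_t b)
{
    return a / gcd_of(a, b) * b;
}

// Smallest multiple of `align` that is not less than n.
inline std::size_t round_up(std::size_t n, std::size_t align)
{
    return (n + align - 1) / align * align;
}

inline Layout compute_layout(const std::vector<Member>& members)
{
    Layout out;
    out.align = 1;
    out.size  = 0;
    for (std::size_t i = 0; i < members.size(); ++i) {
        // eq[1]: fold lcm over the member alignments, starting from 1.
        out.align = lcm_of(out.align, members[i].align);
        // Place the member at the next offset divisible by its alignment.
        std::size_t const off = round_up(out.size, members[i].align);
        out.offsets.push_back(off);
        out.size = off + members[i].size;
    }
    // Pad the end so that consecutive array elements stay aligned.
    out.size = round_up(out.size, out.align);
    return out;
}
```

For members of (size, alignment) (1,1), (8,8), (4,4) this gives alignment 8, offsets 0, 8 and 16, and a total size of 24. Since alignments are powers of two in practice, the lcm is simply the largest member alignment.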

All this is implemented with templates here:

http://svn.boost.org/svn/boost/sandbox/variadic_templates/boost/composite_storage/

Now the one thing missing from this explanation is the role
of size. Now in order for all C's in:

C cvec[2];

to be properly aligned, sizeof(C) must also be a multiple
of alignof(C). Thus there may be some padding at the
end of C to achieve this. Likewise, in order to calculate
the offset that puts m_i at the proper alignment, some
padding may also be needed. I'm pretty sure:

Sorry if this seems complicated. It is, at least to me :(

HTH.

-Larry

Francesco S. Carta

Aug 21, 2010, 1:35:51 PM

Larry Evans <cpplj...@suddenlink.net>, on 21/08/2010 11:06:36, wrote:

> On 08/21/10 06:30, Francesco S. Carta wrote:
> [snip]
>> I've created a struct that occupies 64 bytes of storage, then I
>> dynamically allocated it and I checked its address to see if, as I was
>> expecting, it was a multiple of 64. As it turned out, it wasn't, showing
>> me that my assumption was stupid at best.
>>
>> I have no idea about where I have taken that misconception from, I could
>> very well have made it up all by myself.
> [snip]
>
> Hi Francesco.
>
> I struggled a few yrs (months?) ago about that and think I've
> figured it out.
>
> The alignment of your 64 byte long structure is dependent on the
> types contained in the structure.

Hi Larry, thank you for your post.

Indeed, I suspected something like that, by now I see that regardless of
what I am allocating, the "largest" alignment amounts to 32 bytes on my
system, so it all should come down to the most stringent alignment
required for fundamental types.

> Using variadic template:

I'm snipping all of your post for brevity as...

<snip>

> Sorry if this seems complicated. It is, at least to me :(

...believe me, all of that is not complicated for me, it is just plain
unintelligible: I never really put my hands on such advanced features of
the upcoming new C++ standard, I've only read about some of them in some
article here and there, I'm afraid I'll have to stay on the current
standard for some good time still, as there are too many basic concepts
that still trick me even if I already have acquired some good bits on
all the existing features.

Nonetheless I'll give it a shot and I'll come back here, because sooner
or later I'll have to face such stuff so maybe it's better to start
before my brain goes completely rusty :-)

Gennaro Prota

Aug 21, 2010, 1:40:48 PM

And it becomes particularly interesting when you add =delete to
the declaration (in C++0x, of course).
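
A sketch of that variation, assuming a compiler with C++0x deleted functions; StackOnly and use_automatic_object are hypothetical names. Since defining an automatic object never names operator new, the expectation is that this compiles and links, while any new expression for the class is rejected at compile time:

```cpp
#include <cstddef>

struct StackOnly
{
    // C++0x syntax: any use of this function is a compile-time error.
    static void* operator new(std::size_t) = delete;

    int value;
    explicit StackOnly(int v) : value(v) {}
};

inline int use_automatic_object()
{
    StackOnly s(42);   // automatic storage: operator new is not referenced
    return s.value;
    // StackOnly* p = new StackOnly(1);   // would not compile
}
```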

Larry Evans

Aug 21, 2010, 1:41:50 PM

On 08/21/10 11:06, Larry Evans wrote:
[snip]

An alternate implementation might decide on:

alignof(C) = alignof(T_1)

and then adjust the offsets,
(off_2, off_3,...,off_n)
of:

(m_2, m_3,...m_n)

to assure that, for any memory location:

loc_C

such that loc_C%alignof(C) == 0,

(loc_C+off_i)%alignof(T_i) == 0

However, I've not tried figuring out how
to do that.

Maybe someone else could or explain why it's
worse than the first method?
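
One way to probe the question is by brute force: for a given alignof(C) and member alignment, search for a single offset that works at every permitted loc_C. A sketch under that framing; fixed_offset_exists is a hypothetical helper, not part of the code linked above:

```cpp
#include <cstddef>

// Returns true if some offset satisfies
//   (loc_C + offset) % align_member == 0
// for every loc_C that is a multiple of align_c. Offsets are searched in
// [0, align_member) because larger ones repeat modulo align_member.
inline bool fixed_offset_exists(std::size_t align_c, std::size_t align_member)
{
    for (std::size_t off = 0; off < align_member; ++off) {
        bool works = true;
        // loc_C = k * align_c; its residues modulo align_member repeat
        // after at most align_member steps, so one period is enough.
        for (std::size_t k = 0; k < align_member && works; ++k) {
            std::size_t const loc_c = k * align_c;
            works = (loc_c + off) % align_member == 0;
        }
        if (works) return true;
    }
    return false;
}
```

Under this check, fixed_offset_exists(1, 8) is false: if C were aligned like a leading char, no fixed offset could keep a double member aligned at every valid address of C. An offset exists only when alignof(C) is a multiple of the member's alignment, which is what the lcm rule guarantees.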

-Larry

Francesco S. Carta

Aug 21, 2010, 2:31:26 PM

Francesco S. Carta <entu...@gmail.com>, on 21/08/2010 19:35:51, wrote:

> the "largest" alignment amounts to 32 bytes on my system

Scratch that. I thought it was so, but it was just a weird result of
some kind of template hack I've tried to implement on my own in order to
guess the actual alignment.

I still wonder if the method provided here can be trusted or not:

http://www.monkeyspeak.com/alignment/

Take a look at the following piece of code and its results on my setup
[MinGW 4.4.0, Windows 7, AMD Athlon(tm) II Dual-Core M300 2.00 GHz]

I wonder why "long double" gives a stride of 4 while "double" gives a
stride of 8 - where "stride" should stand for something like
"alignment", in the above linked page:

(maybe "long double" is stored as three floats or three longs in my
implementation?)

//-------
#include <iostream>

template <typename T>
struct Tchar {
    T t;
    char c;
};

#define strideof(T) \
    ((sizeof(Tchar<T>) > sizeof(T)) ? \
    sizeof(Tchar<T>)-sizeof(T) : sizeof(T))

#define PRINT_DATA_ABOUT(T) \
    std::cout << strideof(T) << ", " \
    << sizeof(T) << ", " << #T << std::endl

struct eight_doubles {
    double data[8];
};

int main() {
    std::cout << "stride, size, type" << std::endl;
    PRINT_DATA_ABOUT(char);
    PRINT_DATA_ABOUT(short);
    PRINT_DATA_ABOUT(int);
    PRINT_DATA_ABOUT(long);
    PRINT_DATA_ABOUT(float);
    PRINT_DATA_ABOUT(double);
    PRINT_DATA_ABOUT(long double);
    PRINT_DATA_ABOUT(eight_doubles);
}

/*
output:

stride, size, type
1, 1, char
2, 2, short
4, 4, int
4, 4, long
4, 4, float
8, 8, double
4, 12, long double
8, 64, eight_doubles

*/

//-------

Well, never mind. Just implementation details curiosities, I suppose.
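
For comparison, C++0x adds an alignof operator that reports the alignment directly. A sketch assuming a compiler that already supports alignof, constexpr and static_assert; TcharProbe and stride_of are hypothetical names for a function form of the macro above:

```cpp
#include <cstddef>

template <typename T>
struct TcharProbe {
    T t;
    char c;
};

// Same computation as the strideof macro, written as a function template.
template <typename T>
constexpr std::size_t stride_of()
{
    return sizeof(TcharProbe<T>) > sizeof(T)
        ? sizeof(TcharProbe<T>) - sizeof(T)
        : sizeof(T);
}

// For these two types the trick and the operator are expected to agree.
static_assert(stride_of<char>() == alignof(char), "char");
static_assert(stride_of<int>() == alignof(int), "int");
```

A stride of 4 for a 12-byte long double would be consistent with an 80-bit extended value padded to 12 bytes and aligned to 4, which is a common choice on 32-bit x86 targets; that reading is an inference from the numbers above, not something verified on this setup.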

Francesco S. Carta

Aug 21, 2010, 2:58:48 PM

Gennaro Prota <gennar...@yahoo.com>, on 21/08/2010 01:52:16, wrote:

> On 20/08/2010 18.47, Francesco S. Carta wrote:
>> James Kanze<james...@gmail.com>, on 20/08/2010 09:37:15, wrote:
>>
>>> On Aug 20, 10:43 am, "Francesco S. Carta"<entul...@gmail.com> wrote:
>>>> thomas<freshtho...@gmail.com>, on 20/08/2010 02:35:35, wrote:
>>>
>>>>> OK.. One thing I want to make sure is whether "new/malloc" will
>>>>> guarantee that the address of the allocated memory will be aligned
>>>>> properly?
>>>
>>>> I don't know about "malloc", but for "new" the answer is yes, it will be
>>>> properly allocated for the type with which "new" is called.
>>>
>>> If he was comparing to malloc, he probably meant the operator
>>> new function. Both malloc and the operator new function are
>>> guaranteed to return memory suitably aligned for any object.
>>
>> Suitably aligned for any object that fits into the allocated space, as I
>> seem to have understood, so far, from the other branches of this thread.
>>
>> In any case, I was surely overlooking (once more) the difference between
>> using plain "new" and calling "the operator new function". How were they
>> commonly referred to? "new expression" and "operator new"?
>
> Use the terminology of the standard ("new expression" and
> "operator new function")!

Indeed I should, I just need more study to actually associate some
well-grasped content to those labels ;-)

> <asides>
> - The reason for the excitement and the consequent exclamation
> point is that I do use it, and I feel lonely.

Of course you're not the only one who uses the standard terms. I
actually try to use them as long as I feel on firm ground.

> - If you really want to refer to the operator (as opposed to
> the expression) you have to think more.

This is not clear to me. Do you mean that I should think more if I
decide to add an "operator new" to some class? I actually tried, in the
past, when I was playing with some kind of memory pool allocator. I
realized, with the help of this group, that it was too advanced a task
for my knowledge back then (and the situation hasn't changed much since
then).

> </asides>
>
> I seem to recall someone advocating the usage of "operator new"
> for the function(s) and "new operator" for the operator but to
> me... Perhaps it's crystal clear to a native speaker? (Anyway,
> it may work in some contexts, especially when italics or
> different fonts are used properly, and I've seen it used by
> Dewhurst with extreme clarity. It's just not something to
> advocate in general.)
>
> I think one may also use "new allocation function", which is
> unambiguous too.

That seems good, yes.

Thanks for your notes Gennaro.

Francesco S. Carta

Aug 21, 2010, 3:14:24 PM

Pete Becker <pe...@versatilecoding.com>, on 20/08/2010 17:08:11, wrote:

> On 2010-08-20 12:47:39 -0400, Francesco S. Carta said:
>
>>
>> In any case, I was surely overlooking (once more) the difference
>> between using plain "new" and calling "the operator new function". How
>> were they commonly referred to? "new expression" and "operator new"? I
>> never recall this correctly and I have to resort to some weird
>> periphrasis that don't really fit well.
>>
>
> Those of us who are into close reading call them the "new operator" and
> "operator new", the second one being a library function that allocates
> raw memory. The first one is a keyword, and the compiler generates code
> to call operator new and construct an object at the returned address.
> (We used to have a similar distinction between a "class template" and a
> "template class", but that was too confusing; now they're "template
> class" and "template instantiation", not necessarily in that order,
> because the first two confuse me).

Yes, "template class" and "template instantiation" are quite a bit less
confusing and more explanatory, but now that I'm really thinking about
it, calling the plain "new" used in expressions the "new operator" makes
sense, and so does calling the library function and its overrides
(either the general or the per-class ones) "operator new", as the names
reflect their declarations... it's much less confusing now.

> If you're really into technicalities

I try to, not so successfully at times, but I try.

> since a class can override
> operator new, the actual code that the compiler generates for new T
> typically calls T's constructor with a null pointer, and T's constructor
> calls operator new. For objects that don't have dynamic storage duration
> (that is, those with static, thread, or automatic storage duration in
> C++0x) the compiler passes the address where the object will be located,
> and the constructor simply uses that. Either way, the constructor
> actually returns a value: the address of the newly constructed object.
> But that's all implementation details.

Well, I like to know them.

Actually I'm not scared by the amount of detail in such technical
subjects - the complexity of C++ has been lamented lots of times in lots
of places, but I never felt it was something I could not overcome with
some good effort - the point is just to understand these details in
order to fix them better in my memory - after all, if I am
able to speak several different natural languages without getting
confused (most of the time, at least) then C++ should fall within my
capabilities.

Thanks for your notes Pete.

Larry Evans

Aug 21, 2010, 3:32:27 PM

On 08/21/10 13:31, Francesco S. Carta wrote:
[snip]

> I still wonder if the method provided here can be trusted or not:
>
> http://www.monkeyspeak.com/alignment/
>
> Take a look at the following piece of code and its results on my setup
> [MinGW 4.4.0, Windows 7, AMD Athlon(tm) II Dual-Core M300 2.00 GHz]
>
> I wonder why "long double" gives a stride of 4 while "double" gives a
> stride of 8 - where "stride" should stand for something like
> "alignment", in the above linked page:
>
> (maybe "long double" is stored as three floats or three longs in my
> implementation?)
>
> //-------
> #include <iostream>
>
> template <typename T>
> struct Tchar {
> T t;
> char c;
> };
>
> #define strideof(T) \
> ((sizeof(Tchar<T>) > sizeof(T)) ? \
> sizeof(Tchar<T>)-sizeof(T) : sizeof(T))
>

I'm puzzled by the strideof(T) macro.
I would have thought sizeof(Tchar<T>) > sizeof(T) would always be
true. OTOH, maybe the compiler knows that T has some padding at
the end where it puts the 'char c'. In that case,
the code I mentioned earlier:

http://svn.boost.org/svn/boost/sandbox/variadic_templates/boost/composite_storage/

might not calculate the same size as the compiler since
it doesn't know what the padding of any type may be :(

[snip]

Francesco S. Carta

Aug 21, 2010, 5:16:42 PM

The author motivates the above macro exactly for that "padding reuse"
possibility, see the second paragraph:

[quote from http://www.monkeyspeak.com/alignment/ ]
There are two cases in which we might not get T's stride exactly. The
first is if the compiler (for some reason) chooses to pad the struct too
much. In that case, Tchar's size will be bumped up to some higher
multiple of T's stride. The result we get is not optimal, but it's still
guaranteed to be safe to use. Such excessive padding should probably be
considered a bug in the compiler, but our technique deals safely with
that case anyway.

The second case is if T already has some padding in it, and the compiler
chooses to reuse some of T's padding to store the extra char. In that
case, Tchar will be the same size as T, and our subtraction will yield
zero. When that happens, our macro defaults to sizeof(T), which is not
optimal but is at least known to be a multiple of T's stride. (There is
some dispute as to whether this case can arise--some believe that it is
illegal for a compiler to reuse padding like this. Either way, our
technique deals with it.)
[/quote]

(I'm quoting it here for the sake of possible later reference)

Gennaro Prota

Aug 21, 2010, 5:39:39 PM

On 21/08/2010 20.31, Francesco S. Carta wrote:
> Francesco S. Carta <entu...@gmail.com>, on 21/08/2010 19:35:51, wrote:
>
>> the "largest" alignment amounts to 32 bytes on my system
>
> Scratch that. I thought it was so, but it was just a weird result of
> some kind of template hack I've tried to implement on my own in order to
> guess the actual alignment.
>
> I still wonder if the method provided here can be trusted or not:
>
> http://www.monkeyspeak.com/alignment/

The article is not very precise. It doesn't exclude some types,
for instance (non-object types, etc.).

And when T is not a POD, there can be padding before t in

struct Foo { T t ; char c ; } ;

or even before c in

struct Foo { char c ; T t ; } ;

In the latter case, though, that's unlikely.

FWIW, I'm using a similar technique for "computing" a suitable
alignment down in my (experimental) version of fallible<> (the
non-experimental version has a T member and requires
default-construction, just like the Barton & Nackman one; the
new version drops the default-constructible requirement and uses
placement new---using, of course, a "suitably aligned" array of
unsigned chars).

Copy-pasting from there (beware: I have yet to review it---and
it started as a quick experiment):

#include <cstddef>

template< typename T >
struct align_of
{
    struct trick { // gps how to name it?
        char c ;
        T t ;
    } ;

    enum { v = sizeof( trick ) - sizeof( T ) } ;
    enum { has_padding_at_end = v > sizeof( T ) } ; // should name
                                                    // this better

    static std::size_t const value = has_padding_at_end
                                         ? sizeof( T )
                                         : v
                                         ;
} ;

template< typename T >
std::size_t const
align_of< T >::value ;

When T is a POD the whole trick structure (which contains only a
T and a char) is a POD as well: thus there can't be padding
before the first member. (And I find it slightly clearer if the
char appears first.)

When T is not a POD, that padding is allowed. I see no reason
for the compiler to exploit that possibility with the struct
above. Anyway, if it is that "capricious" the whole thing breaks
down (numerical "coincidences" apart :-)).

Of course, this template is not exposed: it's only used
internally, to try one after another the POD types in a
predefined "list", in order to find the first of them (if any)
that has the same alignment as the type T on which fallible<> is
instantiated (more precisely: for which align_of<>::value is the
same as align_of< T >::value). At the end of the dance, it
doesn't matter what the alignment really is, numerically: it
picks the POD and puts it in the same union that contains the
array of unsigned chars.
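
The last step might look roughly like the following. This is only a sketch of the union idea, not the fallible<> code itself: it assumes the matching POD has already been selected and is passed as Aligner, and aligned_slot, Sample and round_trip are hypothetical names:

```cpp
#include <cstddef>
#include <new>

// Raw storage for one T, aligned at least as strictly as Aligner.
template <typename T, typename Aligner>
union aligned_slot
{
    unsigned char bytes[sizeof(T)];
    Aligner       aligner;   // never read; present only for its alignment
};

struct Sample
{
    double d;
    explicit Sample(double v) : d(v) {}
};

inline double round_trip(double v)
{
    aligned_slot<Sample, double> slot;        // double plays the chosen POD
    Sample* s = new (slot.bytes) Sample(v);   // construct in place
    double const result = s->d;
    s->~Sample();                             // explicit destruction
    return result;
}
```

Because every member of a union starts at the union's own address, the unsigned char array inherits the alignment of the POD member, which is what makes the placement new safe here.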

Another thing I found perplexing in the article was

template <typename T>
struct Tchar {
T t;
char c;
};

#define strideof(T) \
((sizeof(Tchar<T>) > sizeof(T)) ? \
sizeof(Tchar<T>)-sizeof(T) : sizeof(T))

How can the size of the struct ever be smaller than, or even
equal to, sizeof( T )? [footnote]

(In theory, instead, the difference sizeof( Struct ) - sizeof(
T) may be greater than sizeof( T ). That's what the second
enumerator takes care of, though as noted in the comments I'm
not sure the name conveys the right thing.)

__

[footnote] I'm aware that in some cases an object may be
"smaller than its type" (virtual base classes etc.) but in this
case...

Gennaro Prota

Aug 21, 2010, 6:22:13 PM

On 21/08/2010 20.58, Francesco S. Carta wrote:
[...]

>>> In any case, I was surely overlooking (once more) the difference between
>>> using plain "new" and calling "the operator new function". How were they
>>> commonly referred to? "new expression" and "operator new"?
>>
>> Use the terminology of the standard ("new expression" and
>> "operator new function")!
>
> Indeed I should, I just need more study to actually associate some
> well-grasped content to those labels ;-)
>
>> <asides>
>> - The reason for the excitement and the consequent exclamation
>> point is that I do use it, and I feel lonely.
>
> Of course you're not the only one who uses the standard terms. I
> actually try to use them as long as I feel on firm ground.

Oh, but it wasn't addressed at you! And it was meant to be
humorous ("I feel lonely" :-)). I just find that for these two
things using the standard terminology isn't very common.

>> - If you really want to refer to the operator (as opposed to
>> the expression) you have to think more.
>
> This is not clear to me. Do you mean that I should think more if I
> decide to add an "operator new" to some class?

No no, it was one of my "off at a tangent" thoughts. I was
thinking that the standard names work well because one of them
avoids the term "operator" altogether, referring to the
expression, instead. But *if* you need to talk about the
operator (the "lexical thing"), together with the operator
function, then things become more awkward. I meant: you'd have
to think more about how to avoid confusion.

Just to clarify:

    3 + 2      additive "expression"
      ^--------- + operator

Similarly:

    new T()    new expression
    ^---------- new operator

An operator and an expression are two different things; it just
happens that in most cases we can "get away" by referring to the
expression (in fact, it usually makes more sense than referring
to the lexical notion, unless the topic of discussion is
compilers, etc.).
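
In code, the distinction between the expression and the functions it relies on can be spelled out by hand. A sketch of roughly what a new expression does (it ignores the cleanup a real new expression performs if the constructor throws); Point and the two function names are hypothetical:

```cpp
#include <new>

struct Point
{
    int x, y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

// The steps of `new Point(3, 2)` and `delete p`, written separately.
inline int sum_via_manual_new()
{
    void* raw = ::operator new(sizeof(Point));   // operator new function
    Point* p = new (raw) Point(3, 2);            // placement form constructs
    int const sum = p->x + p->y;
    p->~Point();                                 // what delete does first
    ::operator delete(raw);                      // operator delete function
    return sum;
}

// The usual spelling: one new expression, one delete expression.
inline int sum_via_new_expression()
{
    Point* p = new Point(3, 2);
    int const sum = p->x + p->y;
    delete p;
    return sum;
}
```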

I hope I managed to explain what crossed my mind :-)

Francesco S. Carta

Aug 21, 2010, 7:33:41 PM

Gennaro Prota <gennar...@yahoo.com>, on 21/08/2010 23:39:39, wrote:

> On 21/08/2010 20.31, Francesco S. Carta wrote:
>> Francesco S. Carta<entu...@gmail.com>, on 21/08/2010 19:35:51, wrote:
>>
>>> the "largest" alignment amounts to 32 bytes on my system
>>
>> Scratch that. I thought it was so, but it was just a weird result of
>> some kind of template hack I've tried to implement on my own in order to
>> guess the actual alignment.
>>
>> I still wonder if the method provided here can be trusted or not:
>>
>> http://www.monkeyspeak.com/alignment/
>
> The article is not very precise. It doesn't exclude some types,
> for instance (non-object types, etc.).

Which are the non-object types?

> And when T is not a POD, there can be padding before t in

I have skimmed the standard and I only found a mention about the fact
that a POD type cannot have initial padding, does that suffice to imply
that non-POD types can actually have initial padding or is there some
other explicit reference in the standard?

> struct Foo { T t ; char c ; } ;
>
> or even before c in
>
> struct Foo { char c ; T t ; } ;
>
> In the latter case, though, that's unlikely.
>
> FWIW, I'm using a similar technique for "computing" a suitable
> alignment down in my (experimental) version of fallible<> (the
> non-experimental version has a T member and requires
> default-construction, just like the Barton & Nackman one; the

I suppose you could compute the initial padding by subtracting the
address of the object from the address of the first member - assuming
that is doable and useful...

> Of course, this template is not exposed: it's only used
> internally, to try one after another the POD types in a
> predefined "list", in order to find the first of them (if any)
> that has the same alignment as the type T on which fallible<> is
> instantiated (more precisely: for which align_of<>::value is the
> same as align_of< T>::value). At the end of the dance, it
> doesn't matter what the alignment really is, numerically: it
> picks the POD and puts it in the same union that contains the
> array of unsigned chars.
>
> Another thing I found perplexing in the article was
>
> template<typename T>
> struct Tchar {
>     T t;
>     char c;
> };
>
> #define strideof(T) \
>     ((sizeof(Tchar<T>) > sizeof(T)) ? \
>     sizeof(Tchar<T>) - sizeof(T) : sizeof(T))
>
>
> How can the size of the struct ever be smaller than, or even
> equal to, sizeof( T )? [footnote]
>
> [footnote] I'm aware that in some cases an object may be
> "smaller than its type" (virtual base classes etc.) but in this
> case...

The author of that macro explains what that ternary test addresses, and
why: it guards against a possible (and presumably illegal) reuse of
padding. I also mentioned it in my other replies to the message you
have just quoted.

Francesco S. Carta

Aug 21, 2010, 7:49:05 PM

Gennaro Prota <gennar...@yahoo.com>, on 22/08/2010 00:22:13, wrote:

> On 21/08/2010 20.58, Francesco S. Carta wrote:
> [...]
>>>> In any case, I was surely overlooking (once more) the difference between
>>>> using plain "new" and calling "the operator new function". How were they
>>>> commonly referred to? "new expression" and "operator new"?
>>>
>>> Use the terminology of the standard ("new expression" and
>>> "operator new function")!
>>
>> Indeed I should, I just need more study to actually associate some
>> well-grasped content to those labels ;-)
>>
>>> <asides>
>>> - The reason for the excitement and the consequent exclamation
>>> point is that I do use it, and I feel lonely.
>>
>> Of course you're not the only one who uses the standard terms. I
>> actually try to use them as long as I feel on firm ground.
>
> Oh, but it wasn't addressed at you! And it was meant to be
> humorous ("I feel lonely" :-)). I just find that for these two
> things using the standard terminology isn't very common.

Sorry, I did understand that yours was a joke; I just forgot to put a
smiley between those two sentences ;-)

The second sentence was just to say that I try to speak "formally and
technically" despite being a DIY full-time hobbyist, whatever that may
mean ;-)

>
>>> - If you really want to refer to the operator (as opposed to
>>> the expression) you have to think more.
>>
>> This is not clear to me. Do you mean that I should think more if I
>> decide to add an "operator new" to some class?
>
> No no, it was one of my "off at a tangent" thoughts. I was
> thinking that the standard names work well because one of them
> avoids the term "operator" altogether, referring to the
> expression, instead. But *if* you need to talk about the
> operator (the "lexical thing"), together with the operator
> function, then things become more awkward. I meant: you'd have
> to think more about how to avoid confusion.
>
> Just to clarify:
>
>     3 + 2        additive "expression"
>       ^--------- + operator
>
> Similarly:
>
>     new T()      new expression
>     ^----------- new operator
>
> An operator and an expression are two different things; it just
> happens that in most cases we can "get away" by referring to the
> expression (in fact, it usually makes more sense than referring
> to the lexical notion, unless the topic of discussion is
> compilers, etc.).
>
> I hope I managed to explain what crossed my mind :-)
>

You did, fully.

As I see them now, and as I posted elsethread, "operator new" and "new
operator" do not look all that ambiguous anymore.

For the moment I feel that no additional thinking is needed to parse
them and to use them in my sentences - but maybe that will change
within a couple of weeks, when the whole subject has slipped to the
back of my mind ;-)

Gennaro Prota

Aug 22, 2010, 5:15:53 AM

Hmm, well, with the caveat that I've just been through a
sleepless night, I'll try to correct myself. (Ehm :-s)

[...]

> And when T is not a POD, there can be padding before t in
>
> struct Foo { T t ; char c ; } ;
>
> or even before c in
>
> struct Foo { char c ; T t ; } ;

True, but the sizeof() - sizeof() would still work.

For reference, we are trying to calculate the alignment of a
type T (or a multiple thereof) as:

struct Foo
{
char c ;
T t ;
} ;

align_of( T ) = sizeof( Foo ) - sizeof( T )

Hereby I'll call this "the technique".
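
A minimal sketch of the technique in C++, for concreteness. The names
AlignProbe and align_of_guess are invented for this illustration; the
value it yields is the alignment of T or a multiple of it, not
necessarily the minimum.

```cpp
#include <cstddef>

// Hypothetical probe type: a char followed by a T, laid out like Foo.
template <typename T>
struct AlignProbe {
    char c;
    T t;
};

// "The technique": sizeof( Foo ) - sizeof( T ).
// The result is the alignment of T or a multiple of it.
template <typename T>
struct align_of_guess {
    static const std::size_t value = sizeof(AlignProbe<T>) - sizeof(T);
};
```

On typical x86-64 ABIs align_of_guess<double>::value comes out as 8,
but the technique by itself does not promise the minimum alignment.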

It may be helpful for learning purposes to use the same example
that made me realize that I was in error (actually I knew all
the pieces that go into the puzzle, and shouldn't have needed
the example. Yet... I needed it :-/)

I was trying to make the technique fail, thinking of odd but
legal ways to add padding within the structure Foo.

Let's consider the case that the alignment of our T, as well as
its size in bytes, are 4. (In the following I'll use the
expression "at n" as a shorthand to "at n-byte boundaries").

One obvious thing for the compiler to do is to align any
instance of Foo at 4 as well, putting 4 bytes before t (one for
the char plus 3 padding bytes).

But given that the char may be moved happily around and be
placed "everywhere", the compiler might "decide" (I think) that
instances of Foo will always be allocated at addresses that are
multiples of 2 but not of 4. That is, addresses of the form
2(2k-1) = 4k-2 (with k = 1, 2, 3, 4, ...) or, which is the same,
4(k-1)+2.

Just to fix our attention, let's put k=1. In this case, it would
put the char and one padding byte at the addresses 0x2 and 0x3,
then the subobject t at 0x04.

I thought that this would make the technique fail because I'd
get sizeof( Foo ) = 6 => align_of( T ) = 6 - 4 = 2.

But I was wrong. If the compiler does so, it has to add more
padding at the end. Consider, in fact, another Foo instance
which immediately follows the first one (think of an array). If
sizeof( Foo ) were 6 then the second instance would have the
char and the padding byte at 0x08 and 0x09, and then t at 0x0a:
misaligned! To make things line up properly the compiler would
have to add another 2 bytes (or 6, 10, 14...) at the end of the
struct. Thus the technique would work anyway.
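
The address arithmetic in this example can be written down as
compile-time checks. This is only a sketch of the numbers used above
(first Foo at 0x2, t two bytes into Foo, T aligned at 4); t_address
and aligned_at are invented helper names, and C++11 constexpr is
assumed.

```cpp
#include <cstddef>

// Address of the t member of element i in an array of Foo, given the
// address of element 0, the array stride (sizeof( Foo )) and the offset
// of t inside Foo.
constexpr std::size_t t_address(std::size_t base, std::size_t stride,
                                std::size_t offset_of_t, std::size_t i) {
    return base + i * stride + offset_of_t;
}

constexpr bool aligned_at(std::size_t address, std::size_t alignment) {
    return address % alignment == 0;
}

// With sizeof( Foo ) == 6 the first t is fine, the second one is not.
static_assert(t_address(0x2, 6, 2, 0) == 0x4, "first t at 0x04");
static_assert(t_address(0x2, 6, 2, 1) == 0xa, "second t at 0x0a");
static_assert(!aligned_at(t_address(0x2, 6, 2, 1), 4), "0x0a is misaligned");

// With two more bytes at the end (sizeof( Foo ) == 8) every t lines up.
static_assert(aligned_at(t_address(0x2, 8, 2, 1), 4), "second t at 0x0c");
```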

I think that I got confused by thinking in terms of offset from
start only. In C, you may find a similar technique:

#define align_of( T ) offsetof( struct { char c; T t ; }, t )

This one *would* fail with my example. The sizeof() - sizeof()
is more powerful than offsetof, even in C.

It's also useful IMHO to spell out some guarantees on which all
the reasoning behind the technique rests:

- T must not be an array type (actually I haven't thought much
about this; it may work, but I think it's just easier to treat
the case separately, saying that the alignment of an array
object is that of its first element (recursively))

- T must not be a reference type (we'd have problems due to how
sizeof works on a reference type)

- T must be instantiatable, and arrays of T must be
instantiatable too. Thus, for instance, no abstract classes,
no types whose constructors are all =delete.

- alignment must be a function *of the type*. In fact, we don't
even instantiate Foo but reason (it's a multiple of this, it's
a multiple of that...) about the address at which its instances
will be allocated. In particular, we don't consider the case
that, say, instances within arrays are aligned at 8 and
"standalone instances" are aligned at 2. We consider *one*
alignment value.
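
Some of these restrictions can be turned into compile-time checks. The
following is a sketch only, assuming C++11 <type_traits> is available;
CheckedProbe, checked_align_of and array_aware_align_of are names made
up here. The last point (alignment being a function of the type) is
an assumption the code cannot verify.

```cpp
#include <cstddef>
#include <type_traits>

template <typename T>
struct CheckedProbe {
    char c;
    T t;
};

// The sizeof() - sizeof() technique, refusing the excluded kinds of T.
template <typename T>
struct checked_align_of {
    static_assert(!std::is_reference<T>::value,
                  "sizeof on a reference yields the size of the referred type");
    static_assert(!std::is_array<T>::value,
                  "treat arrays separately: use the element type");
    static_assert(std::is_object<T>::value && !std::is_abstract<T>::value,
                  "T must be an instantiable object type");
    static const std::size_t value = sizeof(CheckedProbe<T>) - sizeof(T);
};

// Arrays handled as suggested in the list: the alignment of an array is
// taken to be that of its (innermost) element type.
template <typename T>
struct array_aware_align_of
    : checked_align_of<typename std::remove_all_extents<T>::type> {};
```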

Juha Nieminen

Aug 22, 2010, 5:29:25 AM

Paavo Helde <myfir...@osa.pri.ee> wrote:
> Accessing unaligned data is UB; depending on the platform, this may crash
> or produce wrong results. On some platforms it may work (notably x86),
> but with a performance penalty.

UltraSparc is an example of a platform where trying to access unaligned
data will crash the program (with a "bus error" signal, to be precise).

Accessing unaligned data in portable code is indeed a no-no.
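
For the original question (a struct sitting at an arbitrary char*), the
usual portable alternative to "(A*)ptr" is to copy the bytes into a
properly aligned object. A minimal sketch, assuming the struct is
trivially copyable and a C++11 compiler; read_unaligned and the
members of A are invented for the illustration.

```cpp
#include <cstring>
#include <type_traits>

// Copy the bytes into an aligned local object instead of dereferencing a
// possibly misaligned pointer.
template <typename T>
T read_unaligned(const char* ptr) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy-based reads need a trivially copyable type");
    T result;
    std::memcpy(&result, ptr, sizeof(T));
    return result;
}

// Hypothetical record standing in for the original poster's struct A.
struct A {
    int id;
    short flags;
};
```

A call such as read_unaligned<A>(ptr) works wherever ptr points, at
the cost of a copy that compilers usually optimize well on platforms
that tolerate unaligned loads.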

Larry Evans

Aug 22, 2010, 6:23:10 AM

On 08/21/10 13:31, Francesco S. Carta wrote:
[snip]

> I still wonder if the method provided here can be trusted or not:
>
> http://www.monkeyspeak.com/alignment/
>
> Take a look at the following piece of code and its results on my setup
> [MinGW 4.4.0, Windows7, AMD Athlon(tm) II Dual-Core M300 2.00 Ghz]
>
> I wonder why "long double" gives a stride of 4 while "double" gives a
> stride of 8 - where "stride" should stand for something like
> "alignment", in the above linked page:
>
> (maybe "long double" is stored as three floats or three longs in my
> implementation?)

[snip]
See if:

http://www.boost.org/doc/libs/1_44_0/libs/type_traits/doc/html/boost_typetraits/reference/alignment_of.html

doesn't give a better number. Its implementation:

http://svn.boost.org/svn/boost/trunk/boost/type_traits/alignment_of.hpp

does seem to use a method similar to the monkeyspeak one in some
cases; however, the comments in that code do suggest it doesn't always
get the right answer. Also, it uses a different order of T and char:

template <typename T>
struct alignment_of_hack
{
    char c;
    T t;
    alignment_of_hack();
};

Larry Evans

Aug 22, 2010, 10:10:18 AM

On 08/22/10 04:15, Gennaro Prota wrote:
[snip]

> struct Foo
> {
> char c ;
> T t ;
> } ;
>

[snip]

> One obvious thing for the compiler to do is to align any
> instance of Foo at 4 as well, putting 4 bytes before t (one for
> the char plus 3 padding bytes).
>
> But given that the char may be moved happily around and be
> placed "everywhere", the compiler might "decide" (I think) that
> instances of Foo will always be allocated at addresses that are
> multiples of 2 but not of 4. That is, addresses of the form
> 2(2k-1) = 4k-2 (with k = 1, 2, 3, 4, ...) or, which is the same,
> 4(k-1)+2.
>
> Just to fix our attention, let's put k=1. In this case, it would
> put the char and one padding byte at the addresses 0x2 and 0x3,
> then the subobject t at 0x04.
>
> I thought that this would make the technique fail because I'd
> got sizeof( Foo ) = 6 => align_of( T ) = 6 - 4 = 2.

To be more explicit, let's call the Foo at the address designated by k=1
foo; then the 6 comes from:
offset(Foo,t)+sizeof(T) - &foo
= 0x04+sizeof(T) - 0x2
= 0x04+0x04 - 0x2
= 0x08 - 0x2
= 0x06

Although it may seem obvious to you (as it does now to me)
it wasn't obvious to me before where the 6 came from :(

>
> But I was wrong. If the compiler does so, it has to add more
> padding at the end. Consider, in fact, another Foo instance
> which immediately follows the first one (think of an array). If
> sizeof( Foo ) were 6 then the second instance would have the
> char and the padding byte at 0x08 and 0x09, and then t at 0x0a:
> misaligned! To make things line up properly the compiler would
> have to add another 2 bytes (or 6, 10, 14...) at the end of the
> struct. Thus the technique would work anyway.

So, what's the sizeof(Foo) then? I'm not understanding why
you say the technique would work anyway. Are you saying it would
work if sizeof(Foo) is a certain value? Then what is that value
that would make it work? I'm guessing that only sizeof(Foo) = 8
would work as suggested by the above quoted sentence, repeated here:

> One obvious thing for the compiler to do is to align any
> instance of Foo at 4 as well, putting 4 bytes before t (one for
> the char plus 3 padding bytes).

[snip]

Gennaro Prota

Aug 22, 2010, 2:45:50 PM

No. It came from 2 (the offset before t in the layout I
imagined) plus sizeof( T ): I thought, stupidly, that the
compiler could "terminate" the structure Foo just after t.

> Although it may seem obvious to you (as it does now to me)
> it wasn't obvious to me before where the 6 came from :(
>
>>
>> But I was wrong. If the compiler does so, it has to add more
>> padding at the end. Consider, in fact, another Foo instance
>> which immediately follows the first one (think of an array). If
>> sizeof( Foo ) were 6 then the second instance would have the
>> char and the padding byte at 0x08 and 0x09, and then t at 0x0a:
>> misaligned! To make things line up properly the compiler would
>> have to add another 2 bytes (or 6, 10, 14...) at the end of the
>> struct. Thus the technique would work anyway.
>
> So, what's the sizeof(Foo) then?

It can be 8, 12, 16, ... but not 6. See also what is called
Appendix 2 in the article Francesco linked to.

> I'm not understanding why
> you say the technique would work anyway.

Because both sizeof( Foo ) and sizeof( T ) are multiples of the
alignment of T. Thus their difference is, too.
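
The argument can be checked mechanically. The sketch below assumes a
C++11 compiler and uses alignof as the reference value; Foo is made a
template here, and difference_is_multiple is an invented name.

```cpp
#include <cstddef>

template <typename T>
struct Foo {
    char c;
    T t;
};

template <typename T>
struct difference_is_multiple {
    // Arrays of T must keep every element aligned, so sizeof( T ) is
    // k * alignment for some k.
    static_assert(sizeof(T) % alignof(T) == 0, "sizeof(T) is a multiple");
    // Foo<T> contains a T, and arrays of Foo<T> must keep every t aligned,
    // so sizeof( Foo ) is m * alignment for some m.
    static_assert(sizeof(Foo<T>) % alignof(T) == 0, "sizeof(Foo) is a multiple");
    // Hence the difference is (m - k) * alignment.
    static const std::size_t value = sizeof(Foo<T>) - sizeof(T);
    static const bool ok = (value > 0) && (value % alignof(T) == 0);
};
```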

Gennaro Prota

Aug 22, 2010, 2:56:33 PM

On 22/08/2010 1.33, Francesco S. Carta wrote:
> Gennaro Prota <gennar...@yahoo.com>, on 21/08/2010 23:39:39, wrote:
>
>> On 21/08/2010 20.31, Francesco S. Carta wrote:
>>> Francesco S. Carta<entu...@gmail.com>, on 21/08/2010 19:35:51, wrote:
>>>
>>>> the "largest" alignment amounts to 32 bytes on my system
>>>
>>> Scratch that. I thought it was so, but it was just a weird result of
>>> some kind of template hack I've tried to implement on my own in order to
>>> guess the actual alignment.
>>>
>>> I still wonder if the method provided here can be trusted or not:
>>>
>>> http://www.monkeyspeak.com/alignment/
>>
>> The article is not very precise. It doesn't exclude some types,
>> for instance (non-object types, etc.).
>
> Which are the non-object types?

See [basic.types]/5 and [basic.types]/9. (For the non-"non-"
version :-))

>> And when T is not a POD, there can be padding before t in
>
> I have skimmed the standard and I only found a mention about the fact
> that a POD type cannot have initial padding, does that suffice to imply
> that non-POD types can actually have initial padding or is there some
> other explicit reference in the standard?

I see no explicit reference, but the rules seem to be carefully
crafted to allow it. I'm sure someone will come up with an
example where it is necessary. I'm just too tired right now to
think of anything sensible.

[...]

>> Another thing I found perplexing in the article was
>>
>> template<typename T>
>> struct Tchar {
>>     T t;
>>     char c;
>> };
>>
>> #define strideof(T) \
>>     ((sizeof(Tchar<T>) > sizeof(T)) ? \
>>     sizeof(Tchar<T>) - sizeof(T) : sizeof(T))
>>
>>
>> How can the size of the struct ever be smaller than, or even
>> equal to, sizeof( T )? [footnote]
>>
>> [footnote] I'm aware that in some cases an object may be
>> "smaller than its type" (virtual base classes etc.) but in this
>> case...
>
> The author of that macro explains what (and why) has been addressed by
> that ternary test: it's something about a possible and presumably
> illegal padding reuse, I reported it also in the other replies to the
> message you have just quoted from me.

OK. It had never occurred to me but, apparently, it has been
asked, even to the C committee. See bullet g of

<http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_074.html>

I don't have the C90 standard so I can't tell what the content
of subclause 6.3 is. I suspect that it's something related to
accessing the object at runtime, though (such as memset()'ing t
and getting something in c). Could anyone provide a quotation?

So, if my guess is right, one might claim that if the struct is
never instantiated... Or he can invent cases that are slightly
different from question g in that DR... I'd say that the intent
is just to disallow it, and quickly skimming through a couple of
comp.std.c threads seems to confirm my gut feeling.

As Francis Glassborow put it in one of the posts that I read:

    I do not think you need specific text [that prohibits that
    reusage] because as you have already noted, such an
    arrangement would result in storage for a variable being
    changed without program instruction. Such behaviour must be
    wrong else we might as well all give up and go home.

In fact, when the "questioning" reaches such a fundamental level
I begin to hate C++. Especially since the standard doesn't
provide ground to prove these theorems, even if you are willing
to (but even if you are... how can you program if you have to
prove theorems in the language at every corner).

BTW, that DR also has some bad news in the answer to bullet a.
For the little that I've been able to gather from the new C++0x
requirements, this seems an incompatibility between C and C++
(because C++ seems to intend that the alignment is a function of
the type: an object of the same type has the same alignment. But
as you might have seen, I've asked for a confirmation about
this, without success. In fact, I'm looking for an effective way
to deal with it before C++0x is finalized. Some national body
should probably help, as a DR would just be dealt with after the
publication of the standard).

Larry Evans

Aug 22, 2010, 4:47:36 PM

The statement from my previous post:

alignof(C) == fold(lcm,1,alignof(T)...)

is suggested by:

> Can A(struct foo) be greater than the least common multiple of
> A(type_1), A(type_2), ..., A(type_n),
> where type_1 to type_n are the types of the elements of struct foo?

from the defect report (DR) reference Gennaro gave (thanks Gennaro)
in another post:

http://groups.google.com/group/comp.lang.c++/msg/a1d02e03e148dea3

More explicitly:

alignof(T)... in my post
corresponds to:
A(type_1), A(type_2), ..., A(type_n) from the DR.

alignof(C) in my post
corresponds to:
A(struct foo) from the DR.
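
As a sketch of what fold(lcm,1,alignof(T)...) means in code, here is a
recursive variadic template, assuming C++11 (variadic templates,
constexpr, alignof). The names gcd_of, lcm_of and lcm_alignof, and the
example struct C, are invented for this illustration. Whether
alignof(C) must equal this value is exactly what the DR question is
about; the code only computes the right-hand side.

```cpp
#include <cstddef>

constexpr std::size_t gcd_of(std::size_t a, std::size_t b) {
    return b == 0 ? a : gcd_of(b, a % b);
}

constexpr std::size_t lcm_of(std::size_t a, std::size_t b) {
    return a / gcd_of(a, b) * b;
}

// fold(lcm, 1, alignof(T)...) as a recursive template.
template <typename... Ts>
struct lcm_alignof;

template <>
struct lcm_alignof<> {
    static constexpr std::size_t value = 1;  // the fold's initial value
};

template <typename T, typename... Rest>
struct lcm_alignof<T, Rest...> {
    static constexpr std::size_t value =
        lcm_of(alignof(T), lcm_alignof<Rest...>::value);
};

// Hypothetical composite whose member types are fed to the fold:
// compare alignof(C) with lcm_alignof<char, int, double>::value.
struct C {
    char c;
    int i;
    double d;
};
```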

Does that make things clearer? I did write some justification
for this and uploaded it to the boost vault. IIRC, the
justification is just in a .txt file and I think (judging
from the description shown on the web) that it is located somewhere
in the zip file here:

http://www.boostpro.com/vault/index.php?action=downloadfile&filename=aligned_types.2.zip&directory=variadic_templates&

I'm pretty sure the .txt file doesn't use variadic template
notation; so, hopefully that won't be part of any roadblock to
your understanding.

HTH.

-Larry

Francesco S. Carta

Aug 22, 2010, 5:03:26 PM

Gennaro Prota <gennar...@yahoo.com>, on 22/08/2010 20:56:33, wrote:

> On 22/08/2010 1.33, Francesco S. Carta wrote:
>> Gennaro Prota<gennar...@yahoo.com>, on 21/08/2010 23:39:39, wrote:
>>
>>> On 21/08/2010 20.31, Francesco S. Carta wrote:
>>>> Francesco S. Carta<entu...@gmail.com>, on 21/08/2010 19:35:51, wrote:
>>>>
>>>>> the "largest" alignment amounts to 32 bytes on my system
>>>>
>>>> Scratch that. I thought it was so, but it was just a weird result of
>>>> some kind of template hack I've tried to implement on my own in order to
>>>> guess the actual alignment.
>>>>
>>>> I still wonder if the method provided here can be trusted or not:
>>>>
>>>> http://www.monkeyspeak.com/alignment/
>>>
>>> The article is not very precise. It doesn't exclude some types,
>>> for instance (non-object types, etc.).
>>
>> Which are the non-object types?
>
> See [basic.types]/5 and [basic.types]/9. (For the non-"non-"
> version :-))

Ah, yes, of course, silly me...

>>> And when T is not a POD, there can be padding before t in
>>
>> I have skimmed the standard and I only found a mention about the fact
>> that a POD type cannot have initial padding, does that suffice to imply
>> that non-POD types can actually have initial padding or is there some
>> other explicit reference in the standard?
>
> I see no explicit reference, but the rules seem to be carefully
> crafted to allow it. I'm sure someone will come up with an
> example where it is necessary. I'm just too tired right now to
> think of anything sensible.

I'm really curious to see an example where such initial padding could
be useful... is anybody else still keeping an eye on this thread?

Thanks for the references.

As I mentioned elsethread, I'm almost completely unaware of the details
of the new standard. I've only read a few articles here and there,
but sooner or later I'll have to dip my fingers in.

Unfortunately I cannot give any help with your reports... isn't there
some Italian entity or person in the committee that you could contact to
focus a bit of attention on those issues?

Well, maybe you just need to exercise some more patience; your reports
can still get feedback before the new publication, I think.

Or perhaps a DR is not the proper form in which to present those
issues, since you say DRs would only be addressed after the
publication... but I'm far from my firm ground, so I'd better
shut up before writing some (further) silliness ;-)

Öö Tiib

Aug 22, 2010, 6:24:36 PM

On 23 aug, 00:03, "Francesco S. Carta" <entul...@gmail.com> wrote:

> Gennaro Prota <gennaro.pr...@yahoo.com>, on 22/08/2010 20:56:33, wrote:
>
> >> I have skimmed the standard and I only found a mention about the fact
> >> that a POD type cannot have initial padding, does that suffice to imply
> >> that non-POD types can actually have initial padding or is there some
> >> other explicit reference in the standard?
>
> > I see no explicit reference, but the rules seem to be carefully
> > crafted to allow it. I'm sure someone will come up with an
> > example where it is necessary. I'm just too tired right now to
> > think of anything sensible.
>
> I'm really curious to see an example where such initial padding could
> come useful... anybody else still keeping an eye here?
>

Perhaps it is not about initial padding, but about stuff that may be
placed before the first data member (for example, for polymorphism or
RTTI). It is up to the implementation, so the standard is careful
neither to guarantee nor to deny that something is there.

Larry Evans

Aug 22, 2010, 7:59:33 PM

On 08/21/10 16:16, Francesco S. Carta wrote:
> Larry Evans <cpplj...@suddenlink.net>, on 21/08/2010 14:32:27, wrote:
>
>> On 08/21/10 13:31, Francesco S. Carta wrote:
>> [snip]
>>> I still wonder if the method provided here can be trusted or not:
>>>
>>> http://www.monkeyspeak.com/alignment/

[snip]

>> I'm puzzled by the strideof(T) macro.
>> I would have thought sizeof(Tchar<T>) > sizeof(T) would always be
>> true. OTOH, maybe the compiler knows that T has some padding at
>> the end where it puts the 'char c'.
>
> The author motivates the above macro exactly for that "padding reuse"
> possibility, see the second paragraph:
>
> [quote from http://www.monkeyspeak.com/alignment/ ]

[snip]

> The second case is if T already has some padding in it, and the compiler
> chooses to reuse some of T's padding to store the extra char. In that
> case, Tchar will be the same size as T, and our subtraction will yield
> zero. When that happens, our macro defaults to sizeof(T), which is not
> optimal but is at least known to be a multiple of T's stride. (There is
> some dispute as to whether this case can arise--some believe that it is
> illegal for a compiler to reuse padding like this. Either way, our
> technique deals with it.)
> [/quote]

[snip]
From http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_074.html,
response g) says such reuse of padding (called sharing of storage
in the ref) is not allowed.

Larry Evans

Aug 22, 2010, 8:27:08 PM

On 08/22/10 18:59, Larry Evans wrote:
> On 08/21/10 16:16, Francesco S. Carta wrote:

[snip]

>> [quote from http://www.monkeyspeak.com/alignment/ ]
> [snip]
>> The second case is if T already has some padding in it, and the compiler
>> chooses to reuse some of T's padding to store the extra char. In that
>> case, Tchar will be the same size as T, and our subtraction will yield
>> zero. When that happens, our macro defaults to sizeof(T), which is not
>> optimal but is at least known to be a multiple of T's stride. (There is
>> some dispute as to whether this case can arise--some believe that it is
>> illegal for a compiler to reuse padding like this. Either way, our
>> technique deals with it.)
>> [/quote]
> [snip]
> From http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_074.html,
> response g) says such reuse of padding (called sharing of storage
> in the ref) is not allowed.
>

I've a feeling of deja vu. This reuse of padding was actually
done to create a tagged union as described in this thread:

http://sourceforge.net/mailarchive/message.php?msg_name=hnevds%24sl7%241%40dough.gmane.org

In summary, Joel had a utree which was a tagged union; however,
he had to store the tag somewhere which means he had to create
something like:

struct tagged_union
{
    u_type u_part;
    char tag_part;
};

where u_type was "equivalent" to some union type.
However, that union type had some padding due to
alignment requirements and, to save space, Joel
wanted to put the tag_part in that padding.
This was done by storing the whole thing in a char
buffer and using template metaprogramming to
extract the u_part and tag_part.

Hope you find it interesting.

-regards,
Larry

James Kanze

Aug 23, 2010, 6:20:27 AM

On Aug 22, 10:03 pm, "Francesco S. Carta" <entul...@gmail.com> wrote:
> Gennaro Prota <gennaro.pr...@yahoo.com>, on 22/08/2010 20:56:33, wrote:

[...]

> >>> And when T is not a POD, there can be padding before t in

> >> I have skimmed the standard and I only found a mention
> >> about the fact that a POD type cannot have initial padding,
> >> does that suffice to imply that non-POD types can actually
> >> have initial padding or is there some other explicit
> >> reference in the standard?

> > I see no explicit reference, but the rules seem to be
> > carefully crafted to allow it. I'm sure someone will come up
> > with an example where it is necessary. I'm just too tired
> > right now to think of anything sensible.

> I'm really curious to see an example where such initial
> padding could come useful... anybody else still keeping an eye
> here?

It might depend on what you consider "padding". What is certain
is that if C is not a PODS, then there is no guarantee that the
address of a C object is the same as the address of its first
member. In a typical implementation, for example:

struct C
{
    int a;
    virtual ~C(); // So not PODS
};

C c;

, (void*)&c and (void*)&c.a will not compare equal. (Drop the
virtual destructor, and they are required to by the standard.)
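
A small sketch for trying this out on a given compiler. The type names
and the first_member_offset helper are invented here; the result for
the polymorphic type is implementation-specific, and only the POD case
is guaranteed to give zero.

```cpp
#include <cstddef>

struct Plain {          // POD: no initial padding allowed
    int a;
};

struct Polymorphic {    // not a POD: has a virtual function
    int a;
    virtual ~Polymorphic() {}
};

// Byte distance between the start of the object and its member a.
template <typename S>
std::ptrdiff_t first_member_offset(const S& s) {
    return reinterpret_cast<const char*>(&s.a) -
           reinterpret_cast<const char*>(&s);
}
```

On implementations that place a hidden vtable pointer at the start of
the object, first_member_offset on a Polymorphic typically yields the
size of a pointer; on a Plain it yields 0.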

Basically, all the standard is doing is allowing implementations
additional freedom with regards to layout if the type isn't POD.
All of the implementations I know use this freedom in some
specific cases.

--
James Kanze

Francesco S. Carta

Aug 23, 2010, 12:42:07 PM

<snip>

Indeed, I'm trying to sum up all these references to get a better grasp
on the subject, thanks for them Larry.

Francesco S. Carta

Aug 23, 2010, 12:44:15 PM

I understand, thank you for the note, Öö.

An explicit mention in the standard would have been nice, though, just
to avoid people wandering around looking for something that is there
just as an implication of something else ;-)

Francesco S. Carta

Aug 23, 2010, 12:45:01 PM

I see, thanks for the details James.

As I said elsethread, an explicit mention in the standard would have
been nice.

Gennaro Prota

Aug 24, 2010, 3:59:17 PM

On 22/08/2010 23.03, Francesco S. Carta wrote:
[...]

>> OK. It had never occurred to me but, apparently, it has been
>> asked, even to the C committee. See bullet g of
>>
>> <http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_074.html>

[...]

>> BTW, that DR also has some bad news in the answer to bullet a.
>> For the little that I've been able to gather from the new C++0x
>> requirements, this seems an incompatibility between C and C++
>> (because C++ seems to intend that the alignment is a function of
>> the type: an object of the same type has the same alignment. But
>> as you might have seen, I've asked for a confirmation about
>> this, without success. In fact, I'm looking for an effective way
>> to deal with it before C++0x is finalized. Some national body
>> should probably help, as a DR would just be dealt with after the
>> publication of the standard).
>>
>
> Thanks for the references.
>
> As I mentioned elsethread, I'm almost completely unaware of the details
> of the new standard, I only had a read to some articles here and there,
> but sooner or later I'll have to dip my fingers there.

I'm not very up-to-date either. But since I've been playing with
alignment lately (due to the experimental fallible<> I
mentioned elsewhere) I had a look at what C++0x has to say about
it, so that I could already code on the basis of what will be
guaranteed (and likely already holds in practice). Alas, the
result was the long message I posted here under the subject

"On alignment (final committee draft for C++0x and n1425 for C1X)"

> Unfortunately I cannot give any help about your reports... isn't there
> in the committee some Italian entity or person that you could contact to
> focus a bit of attention on those issues?

Don't you know that Italy doesn't exist programming-wise? We
were once decent at playing football, but as it concerns
programming and C++ in particular...

> Well maybe you just need to exercise some more patience, your reports
> can still get a feedback before the new publication, I think.
>
> Or eventually, I wonder whether the DR is not the proper shape to
> present those issues, as you say those should be addressed after the
> publication... but I'm far from my firm ground, so I'll be better
> shutting up before writing some (further) silliness ;-)

No, on the contrary! :-) IMHO the whole paragraph is so broken that
it should be fixed *before* publication. If you look at the core
or library DR lists you'll see that there are at times issues
that remain open for ten years or more. Add that even when the
whole thing is "fixed" it will require compiler vendors to
(monitor the committee decisions and) carefully check if their
implementation respects the resolution. (So: twenty years? Is
that a good guess?)

I had a vague idea to contact someone on the French body
(ahehm... James? Still keeping an eye?) but I'm also
increasingly feeling discouraged. It seems that few people care.
I already wasted a lot of time writing the post (although
noticing the problems took really a nanosecond).

Gennaro Prota

Aug 28, 2010, 1:48:55 PM

On 22/08/2010 20.45, Gennaro Prota wrote:
[...]

>> I'm not understanding why
>> you say the technique would work anyway.
>
> Because both sizeof( Foo ) and sizeof( T ) are multiples of the
> alignment of T. Thus their difference is, too.

I should have been more precise.

It "works" if the requirement is to get the true alignment or a
multiple of it.

In some cases, however (including my usage for the fallible<>
template), this requirement isn't useful.

What I was after was for the buffer in which I'll placement-new
my T to have exactly the same size as T.

Of course I can't just do

union {
unsigned char buffer[ sizeof( T ) ] ;
T dummy ;
} ;

because not all T's can go into a union. So, my idea was to
choose, for each T, a POD type T2 having *the same alignment* as
T (if it exists), and put that in the union. But if the align_of
template begins to "lie" I'm in trouble: it might tell me that
both T and T2 have an alignment of 4 whereas, for instance, T
really aligns at 4 but T2 has a true alignment of 2. Bang, we
are dead. (Note the switch from "I" to "we" :-))
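A minimal sketch of that idea, assuming a suitable POD "aligner" type has already been found. The names raw_storage and Widget are hypothetical, and picking int as the aligner for Widget is an assumption that happens to hold on common platforms, not something the language guarantees.

```cpp
#include <cstddef>
#include <new>

// Raw storage that is at least as large as T and aligned like Aligner.
// If Aligner really has the same alignment as T (and is no larger than T),
// the union ends up exactly as large as T.
template <typename T, typename Aligner>
union raw_storage {
    unsigned char buffer[sizeof(T)];
    Aligner dummy;
};

// A type with a user-provided constructor, so it cannot be put directly
// into a C++03 union.
struct Widget {
    int a;
    int b;
    Widget(int x, int y) : a(x), b(y) {}
};

// Construct a Widget inside the buffer, use it, then destroy it by hand.
inline int demo() {
    raw_storage<Widget, int> storage;
    Widget* w = new (storage.buffer) Widget(3, 4);
    int sum = w->a + w->b;
    w->~Widget();
    return sum;
}
```

If the aligner's true alignment were smaller than that of T, the placement new above would be undefined behaviour, which is the "bang" scenario.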

And what's worse we might be dead at intermittent times (not
necessarily the same moments for "I" and "we").

A scenario where it is still useful, instead, and probably the
one for which the various Boost and then C++0x facilities were
designed, is when you have a buffer that is *large enough* and
don't mind wasting some of it. In that case, for instance, you may
be able to do some simple math and discard some bytes at the start.
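The "simple math" could look roughly like the sketch below. The function names padding_for and first_aligned are hypothetical, and the code assumes that converting a pointer to std::uintptr_t yields a plain linear address, which is implementation-defined but holds on common platforms.

```cpp
#include <cstddef>
#include <cstdint>

// Number of bytes to skip from address addr to reach the next multiple of
// alignment (0 if addr is already aligned). alignment must be nonzero.
constexpr std::size_t padding_for(std::uintptr_t addr, std::size_t alignment) {
    return static_cast<std::size_t>((alignment - addr % alignment) % alignment);
}

// First suitably aligned position inside buffer. The caller has to make
// sure the buffer is large enough: object size plus (alignment - 1) bytes
// is always sufficient.
inline unsigned char* first_aligned(unsigned char* buffer, std::size_t alignment) {
    return buffer + padding_for(reinterpret_cast<std::uintptr_t>(buffer), alignment);
}
```

With the numbers from the original question, a pointer at position 2 and a four-byte boundary, padding_for(2, 4) is 2: two bytes are discarded and the object starts at position 4.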

But if I wanted to leave unused space then I'd just resort to
the old trick of using a union ("MaxAlign") with most of the
fundamental types in it. (Which is what I was doing; all this
dance started exactly as an experimental attempt to save
allocation space with respect to using MaxAlign).
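For readers who haven't met the trick, here is a minimal sketch. The exact member list of the MaxAlign in question isn't shown in this thread, so the members below are an assumption, and max_aligned_storage is a hypothetical name.

```cpp
#include <cstddef>

// A union of "most of the fundamental types": its alignment is at least as
// strict as that of any member. The member list here is illustrative only.
union MaxAlign {
    char c;
    short s;
    int i;
    long l;
    long long ll;
    float f;
    double d;
    long double ld;
    void* p;
    void (*fp)();
};

// Storage for a T that is safely aligned for any of the types above, at
// the cost of being at least sizeof(MaxAlign) bytes even for a tiny T.
template <typename T>
union max_aligned_storage {
    unsigned char buffer[sizeof(T)];
    MaxAlign aligner;
};
```

The wasted space is easy to see: max_aligned_storage<char> occupies sizeof(MaxAlign) bytes to hold a single char.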

So, one thing that I'd be glad to have now is a way to tell if
two types have the same alignment (meant as *minimum*
alignment), without necessarily knowing what the alignment is.
If you know of any technique to do this, please speak up
:-)
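One possible answer, assuming a compiler that already implements the C++0x alignof operator, is sketched below. The trait name same_alignment is hypothetical. The sketch does go through the numeric alignment internally, and whether alignof reports the *minimum* alignment on every platform is something to verify for the compiler at hand, not something this code guarantees.

```cpp
#include <cstddef>

// True when T and U have the same alignment requirement as reported by the
// compiler. same_alignment is a made-up name for this sketch.
template <typename T, typename U>
struct same_alignment {
    static const bool value = (alignof(T) == alignof(U));
};

// Signed and unsigned variants of an integer type share their alignment.
static_assert(same_alignment<int, unsigned int>::value,
              "int and unsigned int align identically");
```

A caller only ever sees the boolean, so code built on the trait need not know what the alignment actually is.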

Besides that, there's also a completely different approach which
has been wandering through my mind for a while. But I won't make
my ruminations public until I have some feeling that they may
really work.

Larry Evans

Aug 28, 2010, 2:47:34 PM

On 08/28/10 12:48, Gennaro Prota wrote:
[snip]

> But if the align_of
> template begins to "lie" I'm in trouble: it might tell me that
> both T and T2 have an alignment of 4 whereas, for instance, T
> really aligns at 4 but T2 has a true alignment of 2. Bang, we
> are dead. (Note the switch from "I" to "we" :-))

But if align_of "lies", isn't that a compiler bug?

[snip]

Larry Evans

Aug 28, 2010, 3:27:37 PM

On 08/28/10 12:48, Gennaro Prota wrote:
[snip]

> So, one thing that I'd be glad to have now is a way to tell if
> two types have the same alignment (meant as *minimum*
> alignment), without necessarily knowing what the alignment is.
> If you know of any technique to do this please voice your voice
> :-)

A few years ago, I had a look at boost::variant, which uses (IIRC)
boost::type_traits::aligned_storage. IIRC, somewhere in the code
there was a list of all fundamental types and their alignments,
and I think those alignments were compared to... Oh, now I don't
know. All I can suggest is to see how boost variant does it.

>
> Besides that, there's also a completely different approach which
> has been wandering though my mind for a while. But I won't make
> my ruminations public until I've some feeling that they may
> really work.
>

I'd be interested in seeing that when you feel it may really work.

BTW, what I found interesting is that both:
variant<T1,T2,...,Tn>
and:
tuple<T1,T2,...,Tn>
when implemented using an aligned_storage<Size,Align>,
can use the same Align argument. At least that's the way
it's done here:

http://svn.boost.org/svn/boost/sandbox/variadic_templates/boost/composite_storage/pack/

where instead of variant and tuple, the template names are:

container_one_of_maybe
container_all_of_aligned

The reason for the maybe is to allow an uninitialized variant.
That bothered me a bit, but then another thread:

http://groups.google.com/group/comp.std.c++/msg/a226f4a643b104ae

suggested a benefit, i.e. if operator= didn't work, then
the variant would revert to the uninitialized state.
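To illustrate the point about the shared Align argument, here is a rough sketch using C++0x variadic templates. It is not the code from the sandbox library: one_of_storage, all_of_storage, static_max, padded_sum and aligned_bytes are all hypothetical names, aligned_bytes merely stands in for aligned_storage<Size,Align>, and the tuple-like size is computed in a deliberately wasteful way (every element padded to the maximum alignment).

```cpp
#include <cstddef>

// Largest of a list of compile-time values.
template <std::size_t A, std::size_t... Rest>
struct static_max;

template <std::size_t A>
struct static_max<A> {
    static constexpr std::size_t value = A;
};

template <std::size_t A, std::size_t B, std::size_t... Rest>
struct static_max<A, B, Rest...> {
    static constexpr std::size_t value =
        static_max<(A > B ? A : B), Rest...>::value;
};

// Round n up to the next multiple of a.
constexpr std::size_t round_up(std::size_t n, std::size_t a) {
    return (n + a - 1) / a * a;
}

// Sum of the element sizes, each padded to Align (a conservative layout).
template <std::size_t Align, typename... Ts>
struct padded_sum;

template <std::size_t Align>
struct padded_sum<Align> {
    static constexpr std::size_t value = 0;
};

template <std::size_t Align, typename T, typename... Rest>
struct padded_sum<Align, T, Rest...> {
    static constexpr std::size_t value =
        round_up(sizeof(T), Align) + padded_sum<Align, Rest...>::value;
};

// Stand-in for aligned_storage<Size, Align>.
template <std::size_t Size, std::size_t Align>
struct aligned_bytes {
    alignas(Align) unsigned char bytes[Size];
};

// Variant-like storage: room for the largest element only.
template <typename... Ts>
struct one_of_storage {
    static constexpr std::size_t align = static_max<alignof(Ts)...>::value;
    static constexpr std::size_t size = static_max<sizeof(Ts)...>::value;
    typedef aligned_bytes<size, align> type;
};

// Tuple-like storage: room for every element, with the same Align.
template <typename... Ts>
struct all_of_storage {
    static constexpr std::size_t align = static_max<alignof(Ts)...>::value;
    static constexpr std::size_t size = padded_sum<align, Ts...>::value;
    typedef aligned_bytes<size, align> type;
};
```

Only the Size argument differs between the two: the maximum element size for the variant-like case, a padded sum for the tuple-like case.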

-regards,
Larry