On 1/14/13 10:12 AM, fmatthew5876 wrote:
>> Let T be "long double", implemented as a 10 byte IEEE float. It
>> might have a "hard" alignment requirement of 2, but this type of
>> access causes some slow down to fetch, so the compiler might impose
>> an opportunistic 16 byte alignment. Because of the rules for
>> arrays, m[1] will be aligned for a valid, but slow access, but y
>> might be spaced with 6 byte of padding so that it can use the
>> faster access.
>
> In that case we would have sizeof(long double) == 16. I think they
> would use internal padding to fulfill the alignment constraint. The
> long double itself may only use the first/last n-bits of the storage
> internally (endian), essentially being internally padded.
No, the proposed implementation reserves only 10 bytes for the long
double. If the long double is followed by other, smaller items, they
might get packed into the same 16 byte chunk. If sizeof(long double)
were made 16 in this case, then 6 bytes would be wasted for every long
double allocated, since the padding bytes could NOT be shared with
other variables.
> Again we have this notion of arrays as second class citizens. An
> array whether allocated dynamically or created statically on the
> stack is the fastest possible data structure to use when iterating
> over a collection of elements 1 at a time.
Where is there any promise that an array is the "fastest possible data
structure"? In the case I am describing there is a natural trade-off:
the 10 byte value can be allocated so as to be either "fast" or
"small". For single values fast may well be the better choice,
especially since the wasted space can often be reclaimed by reordering
where variables are stored (which is allowed if they are not in a
struct). For an array, the size consideration becomes more important.
The wasted space might not even be that much in the single variable
case, as the alignment rule might well be that speed is lost only if
the 10 byte number crosses a 16 byte boundary, which gives even more
options for not wasting the space.
> Finally also consider the new alignof() operator. alignof(T) is one
> value per type T. It doesn't change if T is allocated on the stack,
> as a class member, dynamically allocated, or put into an array.
alignof() does not promise that such an alignment is "optimal", only
that it is allowed. In the case described, alignof(long double) is 2;
this does not preclude the compiler from adding more padding to
optimize placement.
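To make the "allowed, not optimal" point concrete, here is a small
sketch. The helper name is_aligned and the two demo functions are my
own inventions for illustration, and the alignas(32) request stands in
for whatever stricter placement a compiler might pick on its own (32
may be an extended alignment on some platforms, so treat that part as
an assumption). The point is only that a particular object can sit on
a stricter boundary while alignof(long double) stays whatever it was.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: true if p lies on a multiple of `alignment`
// (which must be a power of two). std::align leaves an already
// aligned pointer unchanged, so no integer math on pointers is needed.
inline bool is_aligned(const void* p, std::size_t alignment) {
    void* q = const_cast<void*>(p);
    std::size_t space = alignment;  // enough room to reach the next boundary
    return std::align(alignment, 1, q, space) == p;
}

// alignof(T) is a floor: every long double object must satisfy it.
inline bool local_meets_minimum() {
    long double y = 0.0L;
    return is_aligned(&y, alignof(long double));
}

// One object placed on a stricter boundary than the type requires.
// alignof(long double) itself is unaffected by this.
inline bool overaligned_local_meets_both() {
    alignas(32) long double y = 0.0L;
    return is_aligned(&y, 32) && is_aligned(&y, alignof(long double));
}
```

Calling local_meets_minimum() and overaligned_local_meets_both() from
main() should both give true. Neither tells you what alignment is
"fast" on your machine; that is not something the language exposes.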
For some machines, reading a multi-byte value from an improperly
aligned address causes a malfunction (a trap, or a wrong value read).
Implementations for these machines must have a value of alignof() big
enough to make the access work. Other machines may support the
non-aligned read, but at some cost. On these the implementation has a
choice of what to do with alignof(): it could be 1, indicating that
the unaligned access is permissible, or it could be higher, ignoring
that the hardware could support the unaligned access. If it chooses
the higher value, the compiler must then honor that value itself when
allocating objects. Note that if the implementation chose to make the
value of alignof() 1 in this case, there is no requirement in the
standard that would prohibit the addition of padding bytes between
objects to make accesses to the multi-byte object more efficient.
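If you want to see what your own implementation does with padding
inside a struct, something like the following works. The struct Sample
and the function padding_before_value are made up for this example.
The two static_asserts are the only things the language guarantees
here; the actual amount of padding is up to the implementation, so I
deliberately don't state what it will be.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical struct, used only to inspect member placement.
struct Sample {
    char tag;
    long double value;
    char tail;
};

// Number of padding bytes the implementation put between tag and value.
inline std::size_t padding_before_value() {
    return offsetof(Sample, value) - (offsetof(Sample, tag) + sizeof(char));
}

// Guaranteed: the member meets the minimum alignment of its type...
static_assert(offsetof(Sample, value) % alignof(long double) == 0,
              "value must meet its minimum alignment");
// ...and does not overlap the member before it. How much *more* padding
// is added beyond that is left to the implementation.
static_assert(offsetof(Sample, value) >= sizeof(char),
              "members may not overlap");
```

Printing padding_before_value() on a few different compilers or
targets is a quick way to see that the answer is a property of the
implementation, not of the language.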
>> Note also in your proof, you have invoked undefined behavior, as &y
>> - &x is not defined, - for pointers only has defined behavior if
>> the two operands are members of the same array. (Also note that by
>> definition &m[1] - &m[0] == 1, since the difference of pointers
>> work as the converse of pointer addition, and (&m[0])+1 = &m[1].
>
> I believe if you cast them to character pointers you can do pointer
> arithmetic on any pointers. You could also cast to intptr_t. That's
> what my intention was.
Casting the pointers to char pointers will give the difference within
the array as sizeof(T), that is true. If x and y were unrelated
variables, the cast does not make their difference defined. I would
have to look more closely at the standard to see whether the fact that
x and y are members of the same struct makes taking the difference of
their addresses cast to char pointers defined (since the struct can be
viewed as an array of char).
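For the array case, the part that is not in dispute, a sketch looks
like this. The function names are mine. Both pointers point into the
same array object, which is what keeps the subtraction within the
rules; I am assuming the usual reading that the array's bytes may be
walked through char pointers.

```cpp
#include <cassert>
#include <cstddef>

// Distance between m[0] and m[1] in elements: 1 by definition,
// since (&m[0]) + 1 == &m[1].
template <typename T>
std::ptrdiff_t element_distance_in_array() {
    T m[2] = {};
    return &m[1] - &m[0];
}

// The same distance measured in bytes, through char pointers into the
// same array object. Array elements are contiguous, so this is sizeof(T).
template <typename T>
std::ptrdiff_t byte_distance_in_array() {
    T m[2] = {};
    return reinterpret_cast<const char*>(&m[1]) -
           reinterpret_cast<const char*>(&m[0]);
}
```

Nothing comparable can be written for two unrelated variables x and y
without stepping outside what the standard defines.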
Note that casting to intptr_t does NOT make your result hold. About
all the standard promises for intptr_t is that you can (if the type
exists) convert a pointer to intptr_t and then convert that exact same
value back to a pointer that will then be equivalent. There is no
promise that math on the intptr_t relates in any way to math on the
pointer. This particularly won't hold for some cases on segmented
architectures.
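A sketch of what you can and cannot count on (round_trips is a name I
made up, and std::intptr_t is an optional type, so this assumes your
implementation provides it):

```cpp
#include <cassert>
#include <cstdint>

// The one guarantee: pointer -> intptr_t -> same pointer type gives
// back a pointer equal to the original.
inline bool round_trips(int* p) {
    std::intptr_t n = reinterpret_cast<std::intptr_t>(p);
    int* back = reinterpret_cast<int*>(n);
    return back == p;
}

// What is NOT guaranteed (and so is not done here): that
// reinterpret_cast<int*>(n + sizeof(int)) equals p + 1, or that the
// difference of two such integers equals a byte distance. The mapping
// from pointers to integers is implementation defined.
```

On a flat address space the arithmetic usually does come out the way
you expect, but that is your platform being kind, not the standard.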
>> The standard imposes few limitations on what the implementation may
>> chose for alignment rules. There is no explicit requirement that
>> "Implementation alignment requirements" force object to be at their
>> minimum alignment requirement. Also, the phrase "Implementation
>> alignment requirements might cause two adjacent members not to be
>> allocated immediately after each other;" is written in a permissive
>> manner, not restrictive, and contains no solid constraints on the
>> implementation (it contains no MUST or SHALL, or even a SHOULD),
>> but to me seems more an explanatory comment warning the programmer
>> that they should NOT make the assumption that elements are packed
>> next to each other.
>
> This is true, I admit the most shakey part of the argument is the
> rather liberal interpretation of the standard with regards to
> structure padding. Its true that it doesn't say you cannot add
> padding for other reasons.
>
> It would be nice if the standard offered more clarity on this
> subject. Casting through unions is a popular and useful technique.
The key is that the standard only promises what it is willing to force
all implementations to do. The standard intentionally gives
implementations room to do things to generate better code for their
platform. In general, this form of "type punning" is going to be
implementation dependent anyway, so having the language standard leave
the behavior "undefined" isn't that bad, if you have other
documentation/standards defining it.
The important part is that your program works, and that you know
where/when it will work. I suspect that the percentage of programs
that are "strictly conforming", relying on no implementation defined
behavior, is vanishingly small. Implementation defined behavior isn't
bad, but depending on it without knowing it can be, as you can be
caught unawares if that behavior changes (due to upgrades, porting to
new systems, etc).