C++ [expr.reinterpret.cast]p7 (N4527) defines pointer conversions in terms
of conversions from void*:
An object pointer can be explicitly converted to an object pointer
of a different type. When a prvalue v of object pointer type is
converted to the object pointer type “pointer to cv T”, the result
is static_cast<cv T*>(static_cast<cv void*>(v)).
C++ [expr.static.cast]p13 says of conversions from void*:
A prvalue of type “pointer to cv1 void” can be converted to a
prvalue of type “pointer to cv2 T” .... If the original pointer value
represents the address A of a byte in memory and A satisfies the alignment
requirement of T, then the resulting pointer value represents the same
address as the original pointer value, that is, A. The result of any
other such pointer conversion is unspecified.
The clear intent of these rules is that the implementation may assume
that any pointer is adequately aligned for its pointee type, because any
attempt to actually create such a pointer has undefined behavior. It is
very likely that, if we found a hole in those rules that seemed to permit
the creation of unaligned pointers, we could go to the committees and
have that hole closed. The language policy here is clear.
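As a small illustration of the two quoted rules, here is a sketch that checks that the reinterpret_cast spelling and the double static_cast spelling produce the same pointer for a suitably aligned buffer. The helper name castsAgree is hypothetical, and the example assumes a conventional byte-addressed target.

```cpp
#include <cstdint>

// Hypothetical helper (not from the standard or from Clang): returns true
// when both spellings of the conversion produce the same pointer value.
bool castsAgree(unsigned char *bytes) {
  // [expr.reinterpret.cast]p7: this conversion ...
  std::uint32_t *direct = reinterpret_cast<std::uint32_t *>(bytes);
  // ... is defined as this pair of static_casts through void*.
  std::uint32_t *viaVoid =
      static_cast<std::uint32_t *>(static_cast<void *>(bytes));
  return direct == viaVoid;
}
```

Only the aligned case is exercised here, because [expr.static.cast]p13 leaves the result unspecified when the address does not satisfy the alignment requirement of T.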
There are architectures where this policy is mandatory. The classic
example is (I believe) the Cray C90, which provides word-addressed 32-bit
pointers. Pointers to most types can use this native representation directly.
char*, however, requires sub-word addressing, which means void* and char*
are actually 64 bits in order to permit the storage of the sub-word offset.
An int* therefore literally cannot express an arbitrary void*.
Less dramatically, there are architectural features that clearly depend
on alignment. It's unreasonable to expect processors to support atomic
accesses that straddle the basic unit of their cache coherence implementations.
Supporting small unaligned accesses has a fairly marginal cost in extra
hardware, but as accesses grow to 128 bits or larger, those costs can spiral
out of control. These restrictions are fairly widely understood by compiler
users.
Everything below is mushier. It's clearly advantageous for the compiler to
be able to make stronger assumptions about alignment when accessing memory.
ISAs often allow more efficient accesses to properly-aligned memory; for
example, 32-bit ARM can perform a 64-bit memory access in a single
instruction, but the address is required to be 8-byte-aligned. Alignment
also affects compiler decisions even when the architecture doesn't enforce
it; for example, it can be profitable to combine two adjacent loads into
a single, wider load, but this will often slow down code if the wider load is
no longer properly aligned.
As is the case with most forms of undefined behavior, programmers have at
best an abstract appreciation for the positive effects of these optimizations,
but they have a very concrete understanding of the disruptive life effects
of being forced to fix crashes from mis-alignment.
Our standard response in LLVM/Clang is to explain the undefined behavior
rule, explain the benefits it provides, and politely ask users to, well,
deal with it. And that's appropriate; most forms of undefined behavior
under the standard(s) are reasonable requests with reasonable code
workarounds. However, we have also occasionally looked at a particular
undefined behavior rule and decided that there's a real usability problem
with enforcing it as written. In these cases, we use our power as
implementors to make some subset of that behavior well-defined in order to
fix that problem. For example, we did this for TBAA, because we recognized
that certain "obvious" aliasing violations were idiomatic and only had
awkward workarounds under the standard.
There's a similar problem here. Much like TBAA, fixing it doesn't require
completely abandoning the idea of enforcing type-based alignment assumptions.
It does, however, require a significant adjustment to the language rule.
The problem is this: the standards make it undefined behavior to even
create an unaligned pointer. Therefore, as soon as I've got such a
pointer, I'm basically doomed; it is no longer possible to locally
work around the problem. I have to change the type of the pointer to
something that requires less alignment, and not just where I'm using
it, or even just within my function, but all the way up to wherever it
came from.
For example, suppose I've got this function:
void processBuffer(const int32_t *buffer, size_t length) {
...
}
I get a bug report saying that my function is crashing, and I decide
that the right fix is to make the function handle unaligned buffers
correctly. Maybe that's a binary-compatibility requirement, or maybe the
buffer is usually coming from a serialized format that doesn't guarantee
alignment, and it's clearly unreasonable to copy the buffer just to satisfy
my function.
So how can I make this function handle unaligned buffers? The type of the
argument itself means that being passed an unaligned buffer has undefined
behavior. Now, I can change that parameter to use an unaligned typedef:
typedef int32_t unaligned_int32_t __attribute__((aligned(1)));
void processBuffer(const unaligned_int32_t *buffer, size_t length) {
...
}
But this has severe problems. First off, this is a GCC/Clang extension; a lot
of programmers feel uncomfortable adopting that, especially to fix a problem
that's in principle common across compilers. Second, alignment attributes
are not really part of the type system, which means that they can be
silently dropped by any number of things, including both major features
like templates and just day-to-day quality-of-implementation stuff like the
common-type logic of the conditional operator. And finally, my callers
still have undefined behavior, and I really need to go audit all of them
to make sure they're using the same sort of typedef. This is not a reliable
solution to the bug.
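For comparison, one portable alternative (not something proposed in this message) is to stop naming int32_t in the parameter type at all and read each element with memcpy. This is only a sketch: the function name sumUnalignedInt32 and the choice to sum the elements are assumptions made for illustration, and it still requires changing the signature that callers see.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for processBuffer: sums `length` int32_t values that
// start at an arbitrary byte address. No int32_t pointer to the buffer is
// ever formed, so no alignment requirement applies to `bytes`.
std::int64_t sumUnalignedInt32(const unsigned char *bytes, std::size_t length) {
  std::int64_t total = 0;
  for (std::size_t i = 0; i < length; ++i) {
    std::int32_t value;
    // Copy one element into a properly aligned local before using it.
    std::memcpy(&value, bytes + i * sizeof(value), sizeof(value));
    total += value;
  }
  return total;
}
```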
Furthermore, the compiler doesn't really care whether the pointer is
abstractly aligned independent of any access to memory. There aren't very
many interesting alignment-based optimizations on pointer values as mere
values. In principle, we could optimize operations that cast the pointer
to an integral type and examine the low bits, but those operations are not
very common, and when they're there, it's probably for a good reason;
that's the kind of optimization that is very likely to just create miscompiles
without really showing any benefit.
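The kind of code this paragraph has in mind might look like the sketch below, where the helper name isAlignedForInt32 is hypothetical. A compiler that assumed every int32_t* value is aligned could fold the check to true, which would defeat the very test the programmer wrote.

```cpp
#include <cstdint>

// Hypothetical runtime check: casts the pointer to an integer and examines
// the low bits of the address.
bool isAlignedForInt32(const std::int32_t *p) {
  const std::uintptr_t address = reinterpret_cast<std::uintptr_t>(p);
  return (address & (alignof(std::int32_t) - 1)) == 0;
}
```

The example assumes a target where alignof(std::int32_t) is greater than 1 and where pointer-to-integer conversion exposes the byte address.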
Therefore, I would like to propose that Clang formally adopt a significantly
weaker language rule for enforcing the alignment of pointers. The basic
idea is this:
It is not undefined behavior to create a pointer that is less aligned
than its pointee type. Instead, it is only undefined behavior to
access memory through a pointer that is less aligned than its pointee
type.
That is, the only thing that matters is the type when you actually perform
the access, not any type the pointer might have had at some earlier point
during execution.
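A minimal sketch of what the weaker rule is meant to permit: the pointer below may be misaligned for uint32_t, but it is never used for a load or store, so under the proposed rule nothing undefined happens. The helper name roundTripsWithoutAccess is hypothetical, and the comparison assumes a conventional flat address space (the standard text quoted earlier leaves the converted value unspecified).

```cpp
#include <cstdint>

// Hypothetical helper: forms a possibly misaligned uint32_t* and converts it
// straight back, without ever accessing memory through it.
bool roundTripsWithoutAccess(unsigned char *bytes) {
  // Creating the pointer: permitted under the proposed rule.
  std::uint32_t *typed = reinterpret_cast<std::uint32_t *>(bytes);
  // No dereference of `typed` occurs; only the pointer value is inspected.
  return reinterpret_cast<unsigned char *>(typed) == bytes;
}
```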
Notably, I believe that this rule doesn't require any changes in our
current behavior, so adopting it is just a restriction on future compiler
optimization. For the most part, LLVM IR only attaches alignment to loads,
stores, and specific intrinsics like llvm.memcpy; there is no way to say
that a pointer value is expected to have a particular alignment. The
one exception that I'm aware of is that an indirect parameter can have
an expected alignment. However, Clang currently only sets this for
by-value arguments that the calling convention says to pass indirectly,
and that remains acceptable under this new rule because it's an ABI rule
rather than a constraint on programmer behavior (other than assembly
programmers). The rule just means that we can't start setting it on
arbitrary pointer parameters.
It is also a very portable rule; I'm not aware of any compilers that do
try to take advantage of the formal alignment of pointer values independent
of access.
The key question in this new rule is what counts as an "access". I'll spell
this out in more detail, but it's mostly intuitive: anything that ultimately
requires a load or store. The only thing that's perhaps questionable is that
we'd like to treat calls to library functions that access memory as if they
were direct accesses to their arguments. For example, we'd like to assume
that the pointer arguments to memcpy are properly aligned for their types
(that is, their explicit types, before the implicit conversion to void*) so
that we can generate a more efficient copy operation. This analysis
currently relies on the language rule that pointers may not be misaligned;
preserving it requires us to treat calls to library functions as special,
which of course we already do. Programmers can still suppress this
assumption by explicitly casting the arguments to void*.
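To make the memcpy distinction concrete, here is a sketch with two hypothetical wrappers. Under the proposed rule the first call counts as an access through uint32_t*, while the explicit casts in the second suppress the alignment assumption.

```cpp
#include <cstdint>
#include <cstring>

// Typed arguments: under the proposed rule the compiler may treat this call
// as an access through uint32_t*, so both pointers must be suitably aligned.
void copyWordAligned(std::uint32_t *dst, const std::uint32_t *src) {
  std::memcpy(dst, src, sizeof(std::uint32_t));
}

// Explicit casts to void*: the explicit type of each argument no longer
// carries an alignment requirement, so misaligned pointers are acceptable.
void copyWordAnyAlignment(std::uint32_t *dst, const std::uint32_t *src) {
  std::memcpy(static_cast<void *>(dst), static_cast<const void *>(src),
              sizeof(std::uint32_t));
}
```

Both functions compile to the same library call today; the difference is only in what the compiler would be entitled to assume about the arguments.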
Here's the proposed new rule, expressed more formally:
---
It is well-defined behavior to construct a pointer to memory that
is less aligned than the alignment of the pointee type (if a complete
type). However, it is undefined behavior to “access” an expression that
is an r-value of type T* or an l-value of type T if T is a complete type
and the memory is less aligned than T.
An r-value expression of pointer type is accessed if:
- it is dereferenced (with *) and the resulting l-value is accessed,
- it is implicitly converted to another pointer type and the
result is accessed,
- it undergoes pointer addition and the result is accessed,
- it is passed to a function in the C standard library that is known
to access the memory,
- in C++, it is converted to a pointer to a virtual base, or
- in C++, it is explicitly cast (other than by a reinterpret_cast) to
a related class pointer type and the result is accessed.
An l-value expression is accessed if:
- it undergoes an lvalue-to-rvalue conversion (i.e. it is loaded),
- it is the LHS of an assignment operator (including the
compound assignments),
- it is the base of a member access (with .) and the resulting l-value
is accessed (recall that x->y is defined as (*x).y),
- its address is taken (with &) and the resulting pointer is accessed,
- in C++, it is implicitly converted to be an l-value to a base type
and the result is accessed,
- in C++, it is converted to be an l-value of a virtual base type,
- in C++, it is used as the "this" argument of a call to a
non-static member function, or
- in C++, a reference is bound to it (which includes explicit
casts to reference type).
These are the cases covered by the language standard. There is a
very long tail of other kinds of expression that obviously access memory,
like the atomic and overflow builtins, which I can't reasonably enumerate.
The intent should be obvious, but I'm willing to spell it out in other
cases where necessary.
Note that this definition is *syntactic*, meaning that it is expressed
in terms of the components of a single statement. This means that an
access that might be undefined behavior if written as a single statement:
highlyAlignedStruct->charMember = 0;
may not be undefined behavior if split across two statements:
char *member = &highlyAlignedStruct->charMember;
*member = 0;
In effect, the compiler promises to never propagate alignment assumptions
between statements through its knowledge of how a pointer was constructed.
This is necessary in order to allow local workarounds to be reliable.
Note also that this definition does not propagate through explicit casts,
other than class-hierarchy casts in C++. Again, this is a deliberate
choice to make misalignment workarounds more straightforward.
But note that this rule does still allow the compiler to make stronger
abstract assumptions about the alignment of C++ references and the
"this" pointer.
---
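The split-statement workaround described in the rule can be sketched as follows. The types Inner and Outer and the helper clearCharMember are hypothetical, and the packed attribute is a GCC/Clang extension. Packing Outer places a real Inner object at offset 1, so the pointer passed in refers to a valid but underaligned object. GCC and Clang may warn about taking the address of a packed member.

```cpp
#include <cstddef>
#include <cstdint>

// Inner normally requires the alignment of uint32_t.
struct Inner {
  std::uint32_t word;
  char charMember;
};

// GCC/Clang extension: packing puts `inner` at offset 1, so the Inner object
// inside an Outer is valid but underaligned.
struct __attribute__((packed)) Outer {
  char pad;
  Inner inner;
};
static_assert(offsetof(Outer, inner) == 1, "inner is expected at offset 1");

// Hypothetical helper using the two-statement form from the text.
void clearCharMember(Inner *maybeMisaligned) {
  // Member access only computes an address; nothing is loaded or stored.
  char *member = &maybeMisaligned->charMember;
  // The only access is through char*, which has no alignment requirement.
  *member = 0;
}
```

Under the ISO rules this is not guaranteed; the sketch only shows the behaviour the proposed Clang rule is intended to make reliable.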
Please let me know what you think.
John.
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Thanks for the very detailed explanation. I’m not a language expert, but I felt like I could understand what you want to do here.
In particular, as a user, I’m surprised it doesn’t already work (in the standard) the way you said it should. That is:
It is not undefined behavior to create a pointer that is less aligned
than its pointee type. Instead, it is only undefined behavior to
access memory through a pointer that is less aligned than its pointee
type.
I thought the above was how it should work now. It’s the behaviour I expect when I write code already, and as a user I’d hope that this continues to be what we implement.
So yeah, +1 from me.
Thanks,
Pete
Sounds like a good idea.
For example, on Hexagon, the long vectors (64 and 128 bytes long)
normally need to be aligned to a boundary that is a multiple of their
size. There exist, however, instructions to load/store vectors at an
unaligned address, although they have some restrictions that the aligned
instructions don't. Treating vectors as if they had to be aligned,
while allowing unaligned pointers, makes perfect sense.
-Krzysztof
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
For the sake of completeness, I'll mention one exception. If the pointer (or its type via a typedef) has the __attribute__((align_value(N))) attribute, then we do emit alignment attributes on the pointer values themselves and use that information in later optimizations. This is by design, but given that it is explicitly opt-in, I feel this falls into a different category than the situations you've described.
Realistically, if we ever were to implement optimizations based on default type alignments, we'd need a flag to turn off those assumptions (just like we have a flag to turn off strict aliasing assumptions).
-Hal
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
Sure, that seems reasonable. It’s the default language rule I’m concerned about.
John.
It removes the specific clauses from the quotes above about alignment and instead re-imposes alignment requirements based on the immediate form of the expression in several other places that touch memory. I intentionally did not layer this onto the existing definition of access because non-static data members are formally accessed even if you don’t touch the underlying memory, and I don’t feel that the compiler should be allowed to assume alignment in those situations. If you feel that there’s a better formalization that still captures that, I’m open to it.
(Note that sometimes the only way we detect the UB stemming from member access on a non-object -- for instance, with UBSan -- is because the pointer is misaligned. Your list can be read as suggesting that the UBSan alignment check for member access would violate our guarantees.)
Part of my point is indeed an acknowledgement that valid objects can exist at misaligned addresses, and it should not be UB to perform a member access into them as long as the memory isn't accessed. Consider, say, a pointer to a serialized data structure held in an unaligned buffer. I am trying to say that code which drills into that data structure via that pointer and then works around the lack of alignment on the resulting address is not buggy; you seem to be suggesting that it is, and that the user has a responsibility to ensure that all of their pointer arithmetic is done on properly-aligned pointers. I don’t think that's a defensible model.
> (Note that sometimes the only way we detect the UB stemming from member access on a non-object -- for instance, with UBSan -- is because the pointer is misaligned. Your list can be read as suggesting that the UBSan alignment check for member access would violate our guarantees.)
That’s true; I do not think the UBSan alignment check should be kicking in when we’re not accessing memory.
[1]: That's not completely true, as it's possible to create an object that is underaligned for its type:
struct __attribute__((packed)) A {
  char k;
  struct B { int n; } b;
} a;
A::B *p = &a.b; // Misaligned pointer, now guaranteed OK
int main() {
  int *q = &p->n; // UB? UBSan diagnoses this member access
  return *q; // Obviously UB
}
It seems that we do need to have some syntactic rules for how far the known alignment propagates to handle this case; your proposed rules don't do the right thing here.
On 15 Jan 2016, at 08:14, John McCall via llvm-dev <llvm...@lists.llvm.org> wrote:
>
> The question at hand is whether we should require the user to write this:
> misaligned_A_B *p = &a.b;
> instead of, say:
> A::B *p = &a.b;
> int x = *(misaligned_int*) &p->n;
> because we want to reserve the right to invoke undefined behavior and propagate our “knowledge” that p is 4-byte-aligned to “improve” the 1-byte-aligned access on the next line.
>
> My contention is that this is a clean and elegantly simple formal model that is disastrously bad for actual users because it is no longer possible to locally work around a mis-alignment bug without tracking the entire history of the pointer. It is the sort of compiler policy that gets people to roll their eyes and ask for new options called things like -fno-strict-type-alignment which gradually get adopted by 90% of projects.
I’ve had the misfortune to look at a lot of code that does unaligned access over the last few years. By far the most common reason for it that I’ve seen is networking code that uses packed structures to represent packets. For example:
struct __attribute__((packed)) somePacket
{
  uint8_t a;
  uint32_t b;
  // ...
};
In your model, what happens when:
- I use field b directly?
- I take the address of field b and store it in an int* variable?
David
> - I use field b directly?
The compiler recognizes that this access is to a valid but underaligned uint32_t object and generates code assuming a lower alignment. This doesn’t change, except inasmuch as we gain a formal model that accepts the existence of valid-but-underaligned objects.
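The first case, direct use of field b, can be sketched as below. The helper name readB is hypothetical, and the packed attribute is the GCC/Clang extension used in the question. Because the access names the packed member directly, the compiler knows the lvalue may be underaligned and emits a load that tolerates it.

```cpp
#include <cstddef>
#include <cstdint>

// GCC/Clang extension: with packing, b sits at offset 1 instead of 4.
struct __attribute__((packed)) somePacket {
  std::uint8_t a;
  std::uint32_t b;
};
static_assert(offsetof(somePacket, b) == 1, "b is expected at offset 1");

// Hypothetical helper: reads the field through the packed struct, so the
// access is made with the reduced alignment of the member.
std::uint32_t readB(const somePacket &packet) {
  return packet.b;
}
```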
> - I take the address of field b and store it in an int* variable?
It’s not undefined behavior to form that pointer. It is, however, still undefined behavior to access the object through that int*, because that type assumes a higher alignment. (The undefined behavior buys us a lot here: otherwise, LLVM would have to assume that all pointers are unaligned unless it could prove that they point to aligned memory. That’s prohibitive.) However, if you don’t access the object as an int*, and instead access it in a less-aligned way, there’s no undefined behavior and the code is guaranteed to work.
For example, given this:
uint32_t *pb = &packet->b;
Under my model, this code would still have undefined behavior and might trap on an alignment-enforcing system:
uint32_t b = *pb;
This code would still have undefined behavior, because the formal type of the access is still uint32_t here:
uint32_t b;
memcpy(&b, pb, sizeof(b));
This code is fine:
uint32_t b;
memcpy(&b, (const char*) pb, sizeof(b));
As is this code:
__attribute__((aligned(1))) typedef uint32_t unaligned_uint32_t;
uint32_t b = *(unaligned_uint32_t*) pb;
Note that, under the language standards, both of the last two examples have undefined behavior: there’s no concept of a valid unaligned object at all, and if you shoe-horned one in, it would probably be undefined behavior to take its address. Clang would be allowed to say “okay, you took the address of this, and we can assume it was actually properly aligned despite being the address of a less-aligned object” and then propagate that assumption to the later accesses to promote their assumed alignment. The goal of my model (and perhaps I’ve mis-formalized it, but I think the goal is quite clear) is just to forswear this capability in the compiler.
John.
That is, for clarity, clang does now and should always continue to allow the following, despite the spec saying it doesn't need to:
char *x = malloc(10);
int *y = (int*)&x[1]; // assign a misaligned int*.
char c = *(char*)y; // but accessed only as char*, so no problem.
However, one part of your suggested rules that both I and Richard questioned was the requirement that the expression "&p->n" be valid, even if "p" is misaligned for its type. I still don't think that it's necessary or even particularly useful to start allowing that. And, note, that would be an actual change in behavior, not a clarification/formalization of existing behavior.
That is:
1) Is it valid to do "p->n", when p is not a valid object which is properly aligned for its type?
2) Assuming that's not valid, does adding a & cause it to then be valid, via some special case? E.g. the rule in C that states that "&a[n]" translates to "(a + n)", and "&*a" translates to "a", regardless of the value or validity of "a". (Without that rule, &*a would be invalid, too, if "a" was null or misaligned.)
Just to reiterate, I think the issue here is **not** about whether clang can make alignment assumptions in later code, it's about whether the member access expression *itself* is valid. (If it's not valid, then what happens in later code is irrelevant.)
We already have a warning for almost this cast, but it’s flawed in two respects:
- It’s noisy, so people normally turn it off.
- It’s silenced by an explicit cast.
The latter is problematic again in your model, because programmers now get no warning when they do the dangerous thing.
> This code would still have undefined behavior, because the formal type of the access is still uint32_t here:
> uint32_t b;
> memcpy(&b, pb, sizeof(b));
>
> This code is fine:
> uint32_t b;
> memcpy(&b, (const char*) pb, sizeof(b));
This makes me nervous, because presumably void* has an alignment of 1 and now we have different behaviour depending on whether we perform an implicit or explicit cast.
I’m also somewhat uncomfortable with the idea that assigning a uint32_t* temporary to a uint32_t* variable increases its alignment. I’d be tempted to propose modelling the alignment more explicitly in the C type system, so that &pb is not a uint32_t*, it’s a __alignment__(1) uint32_t* and can’t be implicitly cast to a uint32_t*. That would mean that we could explicitly warn on casts that increased the alignment and provide a type for b that would preserve the (lack of) alignment. For example:
typedef __attribute__((aligned(1))) uint32_t unaligned_uint32_t;
struct foo
{
char a;
uint32_t b;
}
__attribute__((packed));
struct foo packet;
...
uint32_t *pb = &packet.b;
unaligned_uint32_t *upb = &packet.b;
Currently, this code is accepted by clang. I would propose that:
// This should not be allowed (or, if it is, with a warning that can be turned into an error)
uint32_t *pb = &packet.b;
// This should be permitted, but the static analyser should complain
uint32_t *pb = (uint32_t*)&packet.b;
// This should be permitted to silently work
unaligned_uint32_t *upb = &packet.b;
I believe that you would get most of this from clang by implicitly providing the alignment information to members of packed structs. This would mean that the type of packet.b would implicitly be unaligned_uint32_t, not uint32_t.
> As is this code:
> __attribute__((aligned(1))) typedef uint32_t unaligned_uint32_t;
> uint32_t b = *(unaligned_uint32_t*) pb;
>
> Note that, under the language standards, both of the last two examples have undefined behavior: there’s no concept of a valid unaligned object at all, and if you shoe-horned one in, it would probably be undefined behavior to take its address. Clang would be allowed to say “okay, you took the address of this, and we can assume it was actually properly aligned despite being the address of a less-aligned object” and then propagate that assumption to the later accesses to promote their assumed alignment. The goal of my model (and perhaps I’ve mis-formalized it, but I think the goal is quite clear) is just to forswear this capability in the compiler.
This is, as you say, a language extension, but it’s one that’s been supported by GCC since at least the 2.x days and existing code relies on it.
David