After discussing this with Chandler offline last week, here's the proposed solution: instead of having both !alias.scope and !noalias metadata, we'll have only !alias.scope metadata and an intrinsic: i8* @llvm.noalias(i8* ptr, !metadata !?) where the metadata argument corresponds to a list of !alias.scopes. The idea being that the pointer returned by this intrinsic, and all pointers derived from it, are assumed not to alias with memory accesses tagged with any of the associated !alias.scope metadata entries.
This intrinsic needs to carry control dependencies (it cannot be hoisted out of a loop, for example) -- in this sense it is very much like @llvm.assume. And like @llvm.assume, we'll need to add logic to various passes to ignore it as appropriate so that it does not block optimizations unnecessarily.
From a different scope? No. Here's why:
int i = ...;
T a, b;
T * y1 = ..., y2 = ...
{
T * restrict x = y1;
a = x[i];
}
{
T * restrict x = y2;
b = x[i];
}
which becomes:
int i = ...;
T a, b;
T * y1 = ..., y2 = ...
T * x1 = @llvm.noalias(y1, !scope1);
a = x1[i]; // alias.scope !scope1
T * x2 = @llvm.noalias(y2, !scope2);
b = x2[i]; // alias.scope !scope2
but can we assume that 'x1[i]' does not alias with 'x2[i]'? No.
Now one possible design here is to solve this problem by mutual dominance, but to do that, we'd need to make code motion around the @llvm.noalias calls highly restricted. I'd like to allow for as much data motion as possible.
>
>
> 2. Can you describe the case for explicitly preventing the intrinsic
> being hoisted out of a loop? Since the intrinsic generates a new
> pointer, wouldn't the interesting cases be handled by avoiding
> commoning of the intrinsic?
>
No, here's why. Take the following example:
T * y = ...;
for (int i = 0; i < 1600; ++i) {
T * restrict x = y;
T t = x[i];
t += 1;
x[i] = t;
}
which becomes something like:
T * y = ...;
for (int i = 0; i < 1600; ++i) {
T * x = @llvm.noalias(y, !scope1)
T t = x[i]; // alias.scope !scope1
t += 1;
x[i] = t; // alias.scope !scope1
}
now we need to make sure that the call to @llvm.noalias is not hoisted out of the loop. In order to do that, we tag it as writing (at least to y). The problem is that, as noted in the original e-mail, when looking across loop iterations (as the loop vectorizer must do), it is important to be able to test whether or not the call to @llvm.noalias dominates the loop in question. This example is over-simplified (because there is only one pointer used in the loop), but imagine there are several such pointers used. Does that make sense?
Thanks again,
Hal
This intrinsic needs to carry control dependencies (it cannot be hoisted out of a loop, for example) -- in this sense it is very much like @llvm.assume. And like @llvm.assume, we'll need to add logic to various passes to ignore it as appropriate so that it does not block optimizations unnecessarily.
We should do something to make this simpler. I think we should have an intrinsic inst base class that assume, lifetime, and other intrinsics which do not represent actual code in the final program derive from so that we don't have to update these lists all over the place. If we need 2 tiers to model assume & noalias as distinct from the lifetime or other intrinsics, fine. We should have high-level categories that can be tested and updated.
----- Original Message -----
> From: "Raul Silvera" <rsil...@google.com>
> To: "Hal Finkel" <hfi...@anl.gov>
> Cc: "Chandler Carruth" <chan...@google.com>, "LLVM Developers Mailing List" <llv...@cs.uiuc.edu>
> Sent: Friday, November 14, 2014 10:34:39 AM
> Subject: Re: [LLVMdev] Upcoming Changes/Additions to Scoped-NoAlias metadata
>
>
>
> Hal, a couple of questions:
>
>
> 1. Do you preserve alias to stores+loads derived through a different
> @llvm.noalias call which has different scope metadata? Couldn't that
> be treated similarly to S+Ls derived from other identified objects?
From a different scope? No. Here's why:
int i = ...;
T a, b;
You don't have x1 and x2 in your example, assuming you mean:
int i = 0;
T A;
T * y2 = ...
{
T * x1 = &A;
a = x1[i];
}
{
T * restrict x2 = y2;
b = x2[i];
}
then the answer is no, the fact that x2 is restrict qualified does not allow us to conclude that &x2[i] and &x1[i] don't alias.
No, it doesn't. The fact that x2 is restrict does not mean that it does not alias with any other potential accesses from variables live in its block. It only means it does not alias with other accesses with that occur in the block where x2 is live. There is no access to A or x1 in that block, so we can say nothing about it.
>
>
> This example is flawed anyway because it only has loads,
Yes, but I'm ignoring that for the time being.
> so the
> aliasing isn't that interesting. Something more realistic:
>
>
>
T A, B;
T * x1 = .... // either &A or &B
T * y2 = .... // maybe &A
{
T * restrict x2 = y2;
*x1 = ...
*x2 = ...
}
>
>
>
> In this case you'll be able to tell *x1 doesn't alias *x2, right?
In this case, yes, we can conclude that x1 and x2 don't alias (because *x1 and *x2 cannot both legally refer to the same object).
> How about if you add restrict to x1?
The conclusion is the same, but if you add restrict to x1, you don't need it on x2. x2 is definitely not based on x1, so if x1 is restrict, then we know that x1 and x2 don't alias.
Thanks again,
Hal
> You don't have x1 and x2 in your example, assuming you mean:
>
> int i = 0;
> T A;
> T * y2 = ...
> {
> T * x1 = &A;
> a = x1[i];
> }
> {
> T * restrict x2 = y2;
> b = x2[i];
> }
>
> It should, no? by virtue of x2 being restrict you know that *x2
> doesn't alias A, and *x1 is A.
No, it doesn't. The fact that x2 is restrict does not mean that it does not alias with any other potential accesses from variables live in its block. It only means it does not alias with other accesses with that occur in the block where x2 is live. There is no access to A or x1 in that block, so we can say nothing about it.
T A, B;
T * x1 = .... // either &A or &B
T * y2 = .... // maybe &A
{
T * restrict x2 = y2;
*x1 = ...
*x2 = ...
}
>
> In this case you'll be able to tell *x1 doesn't alias *x2, right?
In this case, yes, we can conclude that x1 and x2 don't alias (because *x1 and *x2 cannot both legally refer to the same object).
> How about if you add restrict to x1?
The conclusion is the same, but if you add restrict to x1, you don't need it on x2. x2 is definitely not based on x1, so if x1 is restrict, then we know that x1 and x2 don't alias.
I don't understand exactly what you're saying here. You can do that at the source level where you still have the original blocks. The problem is that, at the IR level, these blocks don't remain separate basic blocks, and the distinction then matters.
>
>
> Going further, logically the intrinsic should return a pointer to a
> new object, disjoint from all other live objects. It is not aliased
> to A, and is well defined even if it contains &A because A is not
> referenced in the scope.
This is essentially what is done, but only for accesses in the scope (or some sub-scope). I don't think the semantics allow for what you're suggesting. The specific language from 6.7.3.1p4 says:
[from C]
During each execution of B, let L be any lvalue that has &L based on P. If L is used to
access the value of the object X that it designates, ...,
then the following requirements apply: ... Every other lvalue
used to access the value of X shall also have its address based on P.
[end from C]
Where B is defined in 6.7.3.1p2 to be, essentially, the block in which the relevant declaration appears. And we can really only extrapolate from that to the other access in that block, and not to the containing block.
This does require dataflow barriers on
> entrance/exits to the scope, but those can be made no worse than the
> original code.
These don't turn into general scheduling barriers anyway. They'll be tagged as writing to memory, yes, but like with @llvm.assume, they'll get special treatment in BasicAA and a few other places so they don't hurt code motion too badly.
>
>
>
> Aliasing x2 to A is not only unnecessary, but also pessimistic
It is pessimistic, but only in the sense that the restrict qualifier does not say anything about it.
> because in general you do not have access to the dynamic scope of
> the restricted pointer.
>
>
>
>
> T A, B;
> T * x1 = .... // either &A or &B
> T * y2 = .... // maybe &A
> {
> T * restrict x2 = y2;
> *x1 = ...
> *x2 = ...
> }
>
> >
> > In this case you'll be able to tell *x1 doesn't alias *x2, right?
>
> In this case, yes, we can conclude that x1 and x2 don't alias
> (because *x1 and *x2 cannot both legally refer to the same object).
>
> > How about if you add restrict to x1?
>
> The conclusion is the same, but if you add restrict to x1, you don't
> need it on x2. x2 is definitely not based on x1, so if x1 is
> restrict, then we know that x1 and x2 don't alias.
>
> Agreed. So will your approach be able to catch both cases? It seemed
> to me it wouldn't be able to catch the second one because it would
> have a different scope, but probably I'm missing something.
Yes, it will catch it. Just as in the current metadata design, the scope of each access is really a list of scopes. The accesses in the inner blocks get tagged with both the inner and the outer scopes, so they pick up the restrict from the outer scope.
>
>
> Thanks for your patience,
>
Not a problem; I appreciate the feedback!
-Hal
>
>
> >
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
> > You don't have x1 and x2 in your example, assuming you mean:
> >
> > int i = 0;
> > T A;
> > T * y2 = ...
> > {
> > T * x1 = &A;
> > a = x1[i];
> > }
> > {
> > T * restrict x2 = y2;
> > b = x2[i];
> > }
> >
> > It should, no? by virtue of x2 being restrict you know that *x2
> > doesn't alias A, and *x1 is A.
>
> No, it doesn't. The fact that x2 is restrict does not mean that it
> does not alias with any other potential accesses from variables live
> in its block. It only means it does not alias with other accesses
> with that occur in the block where x2 is live. There is no access to
> A or x1 in that block, so we can say nothing about it.
>
>
>
> It does. You can assume x2 is not aliased to A and still get
> well-defined semantics, precisely because A is not referenced in the
> scope of x2. That refinement would only get you into trouble if A is
> referenced in the scope of x2, which would trigger UB.
I don't understand exactly what you're saying here. You can do that at the source level where you still have the original blocks. The problem is that, at the IR level, these blocks don't remain separate basic blocks, and the distinction then matters.
> Going further, logically the intrinsic should return a pointer to a
> new object, disjoint from all other live objects. It is not aliased
> to A, and is well defined even if it contains &A because A is not
> referenced in the scope.
This is essentially what is done, but only for accesses in the scope (or some sub-scope). I don't think the semantics allow for what you're suggesting. The specific language from 6.7.3.1p4 says:
[from C]
During each execution of B, let L be any lvalue that has &L based on P. If L is used to
access the value of the object X that it designates, ...,
then the following requirements apply: ... Every other lvalue
used to access the value of X shall also have its address based on P.
[end from C]
Where B is defined in 6.7.3.1p2 to be, essentially, the block in which the relevant declaration appears. And we can really only extrapolate from that to the other access in that block, and not to the containing block.
I preserve them only weakly... I don't want full barriers; in fact, I plan to add InstCombine code to combine calls to @llvm.noalias (it will also take a list of scopes, not just one, so this is possible). The goal is to have as few barriers as possible.
> so t here are no pairs of accesses to incorrectly reorder as all
> accesses to A in
> the block are done through P. You just need to delimit the block
> with dataflow barriers
> , summar iz
> ing the effect of the block at entry/exit.
Okay, I think I agree with you assuming that we put in entry/exit barriers to preserve the block boundaries. I'd specifically like to avoid that, however.
>
>
>
> This is similar to the way dummy args are implemented on Fortran
> compilers, extended to arbitrary scopes.
Interesting.
Thanks again,
Hal
_______________________________________________
I preserve them only weakly... I don't want full barriers; in fact, I plan to add InstCombine code to combine calls to @llvm.noalias (it will also take a list of scopes, not just one, so this is possible). The goal is to have as few barriers as possible.
Thanks for explaining, I now understand what you're proposing.
>
> This approach makes the restrict attribute effective against all live
> variables without having to examine the extent of the scope to
> collect all references, which is in general impractical.
I think you've misunderstood this. For restrict-qualified local variables, every memory access within the containing block (which is everything in the function for function argument restrict-qualified pointers) get tagged with the scope. This is trivial to determine.
> It also
> removes the need for scope metadata, as there would be no need to
> name the scopes.
Indeed.
>
>
> Anyway, this is just a general alternate design, since you were
> asking for one.
Yes, and thank you for doing so.
> I'm sure still would take some time/effort to map it
> onto the LLVM framework.
That does not seem too difficult, the question is really just whether or not it gives us what we need...
>
So in this scheme, we'd have the following:
void foo(T * restrict a, T * restrict b) {
*a = *b;
}
T * x = ..., *y = ..., *z = ..., *w = ...;
foo(x, y);
foo(z, w);
become:
T * x = ..., *y = ..., *z = ..., *w = ...;
T * a1 = @llvm.noalias.start(x); // model: reads from x (with a general write control dep).
T * b1 = @llvm.noalias.start(y);
*a1 = *b1;
@llvm.noalias.end(a1, x); // model: reads from a1, writes to x.
@llvm.noalias.end(b1, y);
T * a2 = @llvm.noalias.start(z);
T * b2 = @llvm.noalias.start(w);
*a2 = *b2;
@llvm.noalias.end(a2, z);
@llvm.noalias.end(b2, w);
This does indeed seem generally equivalent to the original proposal in the sense that the original proposal has an implicit ending barrier at the last relevant derived access, and here we have explicit ending barriers. The advantage is the lack of metadata (and associated implementation complexity). The disadvantage is that we have additional barriers to manage, and these are write barriers on the underlying pointers. It is not clear to me this would make too much difference, so long as we aggressively hoisted the ending barriers to just after the last use based on their 'noalias' operands.
So this is relatively appealing, and I think would not be a bad way to model C99 restrict (extending the scheme to handle mutually-ambiguous restrict-qualified pointers from aggregates seems straightforward). It does not, however, cover cases where the region of guaranteed disjointness (for lack of a better term) is not continuous. This will come up when implementing a scheme such as that in the current C++ alias-set proposal (N4150). To construct a quick example, imagine that our implementation of std::vector is annotated such that (assuming the standard allocator) each std::vector object's internal storage has a distinct alias set, and we have:
std::vector<T> x, y;
...
T * q = &x[0];
for (int i = 0; i < 1600; ++i) {
x[i] = y[i];
*q += x[i];
}
so here we know that the memory accesses inside the operator[] from x and y don't alias, but the alias-set attribute does not tell us about the relationship between those accesses and the *q. The point of dominance, however, needs to associated with the declaration of x and y (specifically, we want to preserve the dominance over the loop). A start/end barrier scheme localized around the inlined operator[] functions would not do that, and placing start/end barriers around the entire live region of x and y would not be correct. I can, however, represent this using the metadata scheme.
Thanks again,
Hal
>
>
>
> Regards,
Within the current AA framework, this question has no well-defined meaning. AA only answers queries sensibly regarding values that are simultaneously live and referenced at some point in the function's control flow. Metadata aside, even with the current function argument attributes, querying alias('B', '*b') will respond with NoAlias (for both values, llvm::isIdentifiedObject will return true -- it will return true for both GlobalVariables and for noalias arguments).
This is another example of why you cannot use our current AA framework for IPA, doing so simply does not make sense. The metadata/intrinsics scheme I've proposed does not change this.
>
>
> Also, what if the reference is in a side path, like this:
>
>
> T A, B;
> void foo(T *restrict a, T *restrict b, bool c) {
> T tmp = *b;
> if (c) {
> A = tmp;
> }
> *a = *b;
> }
>
>
> Will you alias A to *a and *b? Seems to me you have to, as foo(&A,
> &B, false) has well-defined semantics.
No, and it does not do so now. Querying A or B (global variables) vs. *a or *b will return NoAlias. There is no point in the CFG of this function where an object accessed through *a or *b can legally be A or B.
Now to the proposed metadata/intrinsics, let's assume that A and B are local variables in the caller. It is true that *a and *b will be based on a @llvm.noalias result, but that noalias result will only be used if the access is tagged with the associated metadata scope (which will be true for *a, *b and the A within the inlined function, but not true otherwise, thus the semantics will be identical after inlining.
> On the scheme I proposed
> there is no need, as the exit barrier would separate *a from
> references to A after the call. Not aliasing A *b will save the
> reload of *b after the conditional.
I agree your scheme will also capture this.
Thanks again,
Hal
_______________________________________________
I was planning on keeping the current logic within the inliner that will adjust for that... if you inline a callsite with scope metadata attached, it will add it to all of the memory accessing instructions that result from the inlining.
>
>
> About N4150, I agree your scheme is needed to implement it.
Okay, thanks!
> Another
> idea for extending C99 restrict to C++ might be to annotate pointer
> class members as "owned" or "unique", to allow the restrict property
> to propagate through.
>
Agreed. I like the simple expressive power of the current proposal -- we'll see if I still feel that way once I've completed the implementation ;)
-Hal
_______________________________________________