I think this is an interesting problem.
I'd probably lean towards the use of a separate attribute, but not strongly so.
The example which makes me prefer the separate attribute would be a function with an out-param. It's very unlikely that an out-param will be read on the exception path. Being able to perform DSE for such out params seems quite interesting.
However, I'll note that the same problem can be framed as an
escape problem. That is, we have an annotation not that a value
is dead on the exception path, but that it hasn't been captured on
entry to the routine. Then, we can apply local reasoning to show
that the first store can't be visible to may_unwind, and eliminate
it.
I'd want to give the escape framing more thought as that seems potentially more general. Does knowing that an argument does not point to escaped memory on entry help on all of your motivating examples?
Philip
_______________________________________________ LLVM Developers mailing list llvm...@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
On Mon, Dec 6, 2021 at 10:51 PM Philip Reames <list...@philipreames.com> wrote:
> I think this is an interesting problem.
> I'd probably lean towards the use of a separate attribute, but not strongly so.
> The example which makes me prefer the separate attribute would be a function with an out-param. It's very unlikely that an out-param will be read on the exception path. Being able to perform DSE for such out params seems quite interesting.
Right. I think it's mainly a question of whether we'd be able to infer the attribute in practice, in cases where it's not annotated by the frontend (which it should do in the sret case). I think this is possible at least for the case where all calls to the function pass in an alloca to this parameter (or another argument with nounwindread I guess) and don't use a landingpad for the call. However, I believe we do inference in this direction (RPO rather than PO) only in the module optimization pipeline, which means that DSE/MemCpyOpt might not be able to make use of the inferred information.
> However, I'll note that the same problem can be framed as an escape problem. That is, we have an annotation not that a value is dead on the exception path, but that it hasn't been captured on entry to the routine. Then, we can apply local reasoning to show that the first store can't be visible to may_unwind, and eliminate it.
I don't think this would solve the same problem. In the given examples the pointer is already not visible to @may_unwind because it is noalias. "noalias" here is a weaker version of "not captured before": The pointer may be captured, but it's illegal to write through the captured pointer, which is sufficient for alias analysis. The problem with unwinding is that after the unwind, the calling function may read the stored value in a landingpad, which does not require any capture of the pointer.
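To make the last point concrete, here is a minimal C++ sketch of the situation, using exceptions as the unwinding mechanism. All names (`may_unwind`, `fill`, `caller_reads_on_unwind`) are hypothetical and only stand in for the IR-level examples discussed in this thread. The pointer is never captured, yet the caller can still observe the first store after the unwind.

```cpp
#include <stdexcept>

// Hypothetical stand-in for @may_unwind: throws only when asked to.
void may_unwind(bool do_throw) {
    if (do_throw) throw std::runtime_error("unwind");
}

// Out-param style callee. On the normal path the first store is overwritten,
// so it is dead unless someone reads *out after an unwind.
void fill(int *out, bool do_throw) {
    *out = 1;              // candidate for DSE
    may_unwind(do_throw);
    *out = 2;
}

// Caller that reads the out-param on the exception path. No capture of the
// pointer is involved; the caller simply owns the memory.
int caller_reads_on_unwind(bool do_throw) {
    int result = 0;
    try {
        fill(&result, do_throw);
    } catch (const std::runtime_error &) {
        return result;     // observes the first store
    }
    return result;
}
```

Removing the first store in `fill` would change what `caller_reads_on_unwind(true)` returns, which is exactly the read that the proposed attribute would rule out.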
On 12/7/21 12:29 AM, Nikita Popov wrote:
> On Mon, Dec 6, 2021 at 10:51 PM Philip Reames <list...@philipreames.com> wrote:
>> I think this is an interesting problem.
>> I'd probably lean towards the use of a separate attribute, but not strongly so.
>> The example which makes me prefer the separate attribute would be a function with an out-param. It's very unlikely that an out-param will be read on the exception path. Being able to perform DSE for such out params seems quite interesting.
> Right. I think it's mainly a question of whether we'd be able to infer the attribute in practice, in cases where it's not annotated by the frontend (which it should do in the sret case). I think this is possible at least for the case where all calls to the function pass in an alloca to this parameter (or another argument with nounwindread I guess) and don't use a landingpad for the call. However, I believe we do inference in this direction (RPO rather than PO) only in the module optimization pipeline, which means that DSE/MemCpyOpt might not be able to make use of the inferred information.
A couple of points here:
- I often see attributes for which inference isn't a goal as being under-specified. It's too easy not to think about all the corner cases up front, and that bites us later. I think it's important to have specified the valid inference rules as part of the initial definition discussion, even if we don't implement them. It forces us to think through the subtleties.
- I think your alloca rule can be extended to any unescaped allocation for which we can identify all accesses and show that none are reachable on the exceptional path. The trivial call rule (there is no exceptional path) is one sub-case of that. This backwards walk may seem expensive, but I think we already do it in DSE, and could leave converting call-site attributes to function attributes to a later RPO phase.
- You're right that we don't really do RPO today. See the first point. I wouldn't want to add such a phase just for this.
Aside: sret has the under-specified problem today. I have no idea when it would be legal to infer sret.
I ran across something a bit similar, and thought I'd share the
case for purposes of idea generation.
For a function w/out-params, it's common to have cases where the out-params are not actually used by the callee. I've recently been making some improvements for the cases where the out-param is the only thing holding the call live (D115829), but if we actually use the return value, we're left with a dead write (inside the callee) which we can't seem to eliminate without inlining.
As an example:
declare void @use(i1)

define i1 @callee(i32* %out) {
  store i32 1, i32* %out   ; dead if no caller reads %out after the call
  ret i1 true
}

define void @test() {
  %a = alloca i32
  %res = call i1 @callee(i32* %a)
  call void @use(i1 %res)
  ret void
}
If we had something similar in spirit to your "nounwindread" but applied to
the normal return path (e.g. "noreadonreturn"), we could in
principle leverage this to simplify the callee. DSE has all the
information today to annotate "noreadonreturn" arguments at the
call site. We could have an IPO transform which merges the
information from all call sites, and drops the writes. (We could
also e.g. specialize if not all call sites had the param as dead.)
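As a rough source-level picture of that transform, here is a C++ analogue of the IR example. The function names are hypothetical, and `callee_no_out_store` is only what such an IPO pass might produce, assuming every call site were known to never read the out-param after the call returns.

```cpp
// Original callee: writes the out-param and returns a flag.
bool callee(int *out) {
    *out = 1;
    return true;
}

// Hypothetical result of the transform: the store is dropped because no
// call site reads *out on the normal return path.
bool callee_no_out_store(int * /*out*/) {
    return true;
}

// Caller that only uses the return value, like @test in the IR.
bool test_original() {
    int a;
    return callee(&a);
}

// Same caller after the callee has been simplified (or specialized).
bool test_specialized() {
    int a;
    return callee_no_out_store(&a);
}
```

Both callers compute the same result; the only difference is the dead write that cannot be removed today without inlining.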
This particular case isn't strongly motivated enough to bother building out infrastructure for, but it's interesting that another somewhat analogous use case has popped up.
Philip
Who and when is the attribute added? If it is implied by sret that's
a good start. For the non-sret deduction it seems very specialized.
I mean, now we have something for the unwind case but not a different
"early exit" or if it is read/writeonly rather than readnone.
The argument about invoke vs. call instruction call sites only holds for
sret args anyway, so maybe what you are designing here is too sret specific.
Long term I'd like us to have a proper "side-effect" encoding with values
and that could include conditions, e.g.,
```
sideeffects( write(_unknown_, %arg), read(_unknown_),
unwind{write(_unknown_), read(_unknown_)},
cond(load %arg eq 0, {read(%arg)})
)
```
While this is still long away (probably), I'm not convinced an attribute
that is specific to unwind *and* readnone is the right intermediate step.
It should compose better with readonly/writeonly/readnone at least.
All that said, would your deduction strategy alone solve the problem?
So, the cases you care about could they be optimized by looking at the
call sites and determining if none is an invoke?
~ Johannes
On 1/4/22 03:39, Nikita Popov wrote:
> On Mon, Jan 3, 2022 at 6:33 PM Johannes Doerfert <johannes...@gmail.com>
> wrote:
>
>> I somewhat missed this thread and while I should maybe respond
>> to a few of the other mails too I figured I start with a conceptual
>> question I have reading this:
>>
>> Who and when is the attribute added? If it is implied by sret that's
>> a good start. For the non-sret deduction it seems very specialized.
>> I mean, now we have something for the unwind case but not a different
>> "early exit" or if it is read/writeonly rather than readnone.
>>
> I'm mainly interested in frontend-annotated cases here, rather than deduced
> ones. The primary use case there is adding it to sret arguments (and only
> changing sret semantics would be "good enough" for me, I guess). However,
> there is another frontend-annotated case I have my eyes on, which is move
> arguments in rust. These could be modeled by a combination of
> noreadonunwind and noreadonreturn to indicate that the value will not be
> used after the call at all, regardless of how it exits. (This would be kind
> of similar to a byval argument, just without the ABI implication that an
> actual copy gets inserted.)
OK. That's interesting. I'm not fluent enough in rust, can you
elaborate what the semantics there would be, maybe an IR example?
Spitballing: `byval(nocopy, %a)` might be worth thinking about
given the short description.
>
> Note that as proposed, the noreadonunwind attribute would be the "writeonly
> on unwind" combination (and noreadonreturn the "writeonly on return"
> combination). I can see that there are conjugated "readonly on unwind" and
> "readonly on return" attributes that could be defined here, but I can't
> think of any circumstances under which these would actually be useful for
> optimization purposes. How would the presence or absence of later writes
> impact optimization in the current function?
Just as an example, `readonly on unwind` allows you to do GVN/CSE
from prior to the call to the "unwind path". Return then on the
"return path". That is not inside the call but in the caller.
Does that make sense?
>
> The argument about invoke vs. call instruction call sites only holds for
>> sret args anyway, so maybe what you are designing here is too sret
>> specific.
>>
> Not sure I follow, why would that argument only hold for sret?
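```
/* Assumed to be declared elsewhere: may_unwind() might unwind,
   unwind() always does. */
void may_unwind(void);
void unwind(void);

static void I_will_unwind(int *A) {
  *A = 42;
  may_unwind();
  *A = 4711;
  unwind();
}

void someone_might_catch_me_as_I_also_unwind(int *A) {
  /* call */ I_will_unwind(A);
}
```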
Maybe I misunderstood your idea but doesn't the above show how
we have only call instruction call sites and we still cannot
assume the store is dead on the unwind path? If you check
transitively throughout the entire call chain it's different,
but that is not how I read your first mail. I figured it works
for sret because the memory does not outlive the caller.
~ Johannes
```
s = "foobar".to_string();
// other code
virtual_memset(s, sizeof(s), 0);
ret void
```
Now DSE will do the work for us.
It is not clear if we could do something similar for the other cases
though.
Whatever we do, I can see how this is information that is worth encoding.
Maybe I am confused but I thought something like this pseudo-code
could be optimized, readonly_on_return is similar.
```
int a = 42;
invoke foo(/* readonly_on_unwind */ &a);
lp:
return a; // a == 42
cont:
return a; // a unknown
```
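A runnable C++ rendering of that pseudo-code might look as follows. The callee `foo` and its `do_throw` parameter are hypothetical; `foo` is written so that it never stores through its argument before throwing, which is what a "readonly on unwind" annotation would promise.

```cpp
#include <stdexcept>

// Hypothetical callee: it never writes *p before throwing, so it behaves as
// "readonly on unwind" for p. On normal return it does write.
void foo(int *p, bool do_throw) {
    if (do_throw) throw std::runtime_error("unwind");
    *p = 7;
}

int caller(bool do_throw) {
    int a = 42;
    try {
        foo(&a, do_throw);
    } catch (const std::runtime_error &) {
        return a;  // with the attribute, this load could be folded to 42
    }
    return a;      // unknown to the caller without looking into foo
}
```

The forwarding of 42 into the handler happens in the caller, not inside the call, which is the point of the example.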
>
>>> The argument about invoke vs. call instruction call sites only holds for
>>>> sret args anyway, so maybe what you are designing here is too sret
>>>> specific.
>>>>
>>> Not sure I follow, why would that argument only hold for sret?
>> ```
>> static void I_will_unwind(int *A) {
>> *A = 42;
>> may_unwind();
>> *A = 4711;
>> unwind();
>> }
>> void someone_might_catch_me_as_I_also_unwind(int *A) {
>> /* call */ I_will_unwind(A);
>> }
>> ```
>>
>> Maybe I misunderstood your idea but doesn't the above show how
>> we have only call instruction call sites and we still cannot
>> assume the store is dead on the unwind path? If you check
>> transitively throughout the entire call chain it's different,
>> but that is not how I read your first mail. I figured it works
>> for sret because the memory does not outlive the caller.
>>
> Ah yes, this was imprecise in the original mail. We need that a) it's only
> used with call (if we don't want to analyze the unwind paths to be more
> precise) and b) is noreadonunwind itself. Where the latter might be because
> it's based on an argument with that attribute, or because it's an alloca,
> which is always noreadonunwind.
Right, something along those lines.
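Spelled out as a toy model, the refined rule could look like the C++ sketch below. This is not LLVM's API: `PtrOrigin`, `CallSite` and `canInferNoReadOnUnwind` are hypothetical names, and the sketch assumes `sites` lists every call site of the callee (for example because it has internal linkage).

```cpp
#include <vector>

// Toy model: where the pointer passed at a call site comes from.
enum class PtrOrigin { Alloca, NoReadOnUnwindArg, Unknown };

struct CallSite {
    bool is_invoke;    // true if the call site has an unwind destination
    PtrOrigin origin;  // provenance of the pointer argument
};

// Returns true if the parameter could be marked noreadonunwind under the
// rule above: (a) only plain calls, and (b) every passed pointer is itself
// noreadonunwind (an alloca, or an argument carrying the attribute).
bool canInferNoReadOnUnwind(const std::vector<CallSite> &sites) {
    for (const CallSite &cs : sites) {
        if (cs.is_invoke) return false;                     // violates (a)
        if (cs.origin == PtrOrigin::Unknown) return false;  // violates (b)
    }
    return true;
}
```

A more precise version would analyze the unwind paths of invokes instead of rejecting them outright.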
~ Johannes
I think the answer to this is "never", because sret is considered an ABI attribute -- though to be honest I'm not really clear in which way it actually affects the call ABI.
So I misunderstood the entire idea :D
I am not sure how I feel about such a new "category" of attributes.
Can we try to throw around some ideas before we commit to it?
One of the weirdest things is that the semantics are somewhat
described in terms of the caller. Making it callee-centric might
help, e.g., `poison_on_unwind(%a)` to indicate the value in the
unwind case is "not read" or rather, replaced with poison.
You noted in the other mail that we don't want to make unwind paths
explicit (which I agree with, FWIW). However, in the caller we already
have them explicitly, no? So what about materializing the "virtual memset"/
lifetime.end stuff there? Frontends (like Rust) can do it based on the
callee type so that should not be a problem.
Thoughts?
Or we lift these things into the inter-procedural space ;)
>
> At this point I'm starting to lean towards not introducing a separate
> attribute for this, and only tweaking sret semantics to specify it can't be
> read on unwind. That's the original motivation, and I'm not sure solving
> something more general is really worthwhile, as this turned out to be
> trickier than I expected.
Making sret imply this seems fine to me.
> For the Rust move arguments, I think that the proper modeling is likely a
> stronger form of "noalias" rather than this kind of
> noreadafterunwind/noreadafterreturn attributes. Normally, "noalias" only
> means that it is noalias for the duration of the call. Rust move arguments
> remain "noalias" after the call. That is, even after we return or unwind,
> accesses to the memory can only happen through pointers based on the
> original noalias argument (which would have to be captured at that point).
> I just realized that this is similar (the same?) as the semantics for
> "noalias" on return values, which is not the same as "noalias" on
> arguments. Not sure what to call this concept though. really_noalias :)
noalias on args becomes restrict, and noalias on return value
semantics will become noalias?
Anyway, we should keep this in mind for later.