[llvm-dev] sret read after unwind

Nikita Popov via llvm-dev

Dec 4, 2021, 5:39:39 AM
to llvm-dev
Hi,

Consider the following IR:

declare void @may_unwind()
define void @test(i32* noalias sret(i32) %out) {
    store i32 0, i32* %out
    call void @may_unwind()
    store i32 1, i32* %out
    ret void
}

Currently, we can't remove the first store as dead, because the @may_unwind() call may unwind, and the caller might read %out at that point, making the first store visible.
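The same situation can be sketched in C++, purely as an illustration. Assumptions: `may_unwind` is made to always throw, `read_after_unwind` is a hypothetical caller, and a plain `int *` stands in for the sret pointer (a real frontend would normally not read an sret slot after unwinding, which is the point of the proposal).

```cpp
#include <cassert>    // only needed for the usage check mentioned below
#include <stdexcept>

// Hypothetical stand-in for an opaque callee; here it always unwinds.
void may_unwind() { throw std::runtime_error("unwind"); }

// Mirrors @test: two stores separated by a call that may unwind.
void test(int *out) {
    *out = 0;      // first store: dead only if nobody reads *out after an unwind
    may_unwind();  // unwinds, so the second store is never executed
    *out = 1;
}

// Hypothetical caller that reads the out-parameter on the unwind path.
int read_after_unwind() {
    int out = -1;
    try {
        test(&out);
    } catch (const std::runtime_error &) {
        return out;  // observes the value written by the first store
    }
    return out;
}
```

Calling `read_after_unwind()` yields 0 here, so removing the first store in `test` would change observable behavior for this caller.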

Similarly, it prevents call slot optimization in the following example, because the call may unwind and make an early write to the sret argument visible:

declare void @may_unwind(i32*)
declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)
define void @test(i32* noalias sret(i32) %arg) {
    %tmp = alloca i32
    call void @may_unwind(i32* nocapture %tmp)
    %tmp.8 = bitcast i32* %tmp to i8*
    %arg.8 = bitcast i32* %arg to i8*
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %arg.8, i8* align 4 %tmp.8, i64 4, i1 false)
    ret void
}

I would like to address this in some form. The easiest way would be to change LangRef to specify that sret arguments cannot be read on unwind paths. I think that matches how sret arguments are generally used.

Alternatively, this could be handled using a separate attribute that can be applied to any argument, something along the lines of "i32* nounwindread sret(i32) %arg". The benefit would be that this is decoupled from sret ABI semantics and could potentially be inferred (e.g. if the function is only ever used with call and not invoke, this should be a given).

Any thoughts on this? Is this a problem worth solving, and if yes, would a new attribute be preferred over restricting sret semantics?

Regards,
Nikita

Philip Reames via llvm-dev

Dec 6, 2021, 4:51:57 PM
to Nikita Popov, llvm-dev

I think this is an interesting problem.

I'd probably lean towards the use of a separate attribute, but not strongly so.

The example which makes me prefer the separate attribute would be a function with an out-param.  It's very unlikely that an out-param will be read on the exception path.  Being able to perform DSE for such out params seems quite interesting.

However, I'll note that the same problem can be framed as an escape problem.  That is, we have an annotation not that a value is dead on the exception path, but that it hasn't been captured on entry to the routine.  Then, we can apply local reasoning to show that the first store can't be visible to may_unwind, and eliminate it. 

I'd want to give the escape framing more thought as that seems potentially more general.  Does knowing that an argument does not point to escaped memory on entry help on all of your motivating examples?

Philip

_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Nikita Popov via llvm-dev

Dec 7, 2021, 3:30:18 AM
to Philip Reames, llvm-dev
On Mon, Dec 6, 2021 at 10:51 PM Philip Reames <list...@philipreames.com> wrote:

> I think this is an interesting problem.
>
> I'd probably lean towards the use of a separate attribute, but not strongly so.
>
> The example which makes me prefer the separate attribute would be a function with an out-param.  It's very unlikely that an out-param will be read on the exception path.  Being able to perform DSE for such out params seems quite interesting.

Right. I think it's mainly a question of whether we'd be able to infer the attribute in practice, in cases where it's not annotated by the frontend (which it should do in the sret case). I think this is possible at least for the case where all calls to the function pass in an alloca to this parameter (or another argument with nounwindread I guess) and don't use a landingpad for the call. However, I believe we do inference in this direction (RPO rather than PO) only in the module optimization pipeline, which means that DSE/MemCpyOpt might not be able to make use of the inferred information.

> However, I'll note that the same problem can be framed as an escape problem.  That is, we have an annotation not that a value is dead on the exception path, but that it hasn't been captured on entry to the routine.  Then, we can apply local reasoning to show that the first store can't be visible to may_unwind, and eliminate it.

I don't think this would solve the same problem. In the given examples the pointer is already not visible to @may_unwind because it is noalias. "noalias" here is a weaker version of "not captured before": The pointer may be captured, but it's illegal to write through the captured pointer, which is sufficient for alias analysis. The problem with unwinding is that after the unwind, the calling function may read the stored value in a landingpad, which does not require any capture of the pointer.

Regards,
Nikita

Philip Reames via llvm-dev

Dec 13, 2021, 1:39:59 PM
to Nikita Popov, llvm-dev


On 12/7/21 12:29 AM, Nikita Popov wrote:
> Right. I think it's mainly a question of whether we'd be able to infer the attribute in practice, in cases where it's not annotated by the frontend (which it should do in the sret case). I think this is possible at least for the case where all calls to the function pass in an alloca to this parameter (or another argument with nounwindread I guess) and don't use a landingpad for the call. However, I believe we do inference in this direction (RPO rather than PO) only in the module optimization pipeline, which means that DSE/MemCpyOpt might not be able to make use of the inferred information.

A couple of points here:

  1. I often find that attributes for which inference isn't a goal end up underspecified.  It's too easy not to think about all the corner cases up front, and that bites us later.  I think it's important to have specified the valid inference rules as part of the initial definition discussion, even if we don't implement them.  It forces us to think through the subtleties.
  2. I think your alloca rule can be extended to any unescaped allocation for which we can identify all accesses and show that none are reachable on the exceptional path.  The trivial call rule (there is no exceptional path) is one sub-case of that.  This backwards walk may seem expensive, but I think we already do it in DSE, and could leave converting callsite attributes to function attributes to a later RPO phase.
  3. You're right that we don't really do RPO today.  See point (1).  I wouldn't want to add such a phase just for this.

Aside: sret has the under-specified problem today.  I have no idea when it would be legal to infer sret. 

>> However, I'll note that the same problem can be framed as an escape problem.  That is, we have an annotation not that a value is dead on the exception path, but that it hasn't been captured on entry to the routine.  Then, we can apply local reasoning to show that the first store can't be visible to may_unwind, and eliminate it.
>
> I don't think this would solve the same problem. In the given examples the pointer is already not visible to @may_unwind because it is noalias. "noalias" here is a weaker version of "not captured before": The pointer may be captured, but it's illegal to write through the captured pointer, which is sufficient for alias analysis. The problem with unwinding is that after the unwind, the calling function may read the stored value in a landingpad, which does not require any capture of the pointer.

You are completely correct, particularly on that last point.  Don't know what I was thinking when I first responded.

Nikita Popov via llvm-dev

Dec 13, 2021, 3:13:14 PM
to Philip Reames, llvm-dev
On Mon, Dec 13, 2021 at 7:39 PM Philip Reames <list...@philipreames.com> wrote:


> A couple of points here:
>
>   1. I often find that attributes for which inference isn't a goal end up underspecified.  It's too easy not to think about all the corner cases up front, and that bites us later.  I think it's important to have specified the valid inference rules as part of the initial definition discussion, even if we don't implement them.  It forces us to think through the subtleties.
>   2. I think your alloca rule can be extended to any unescaped allocation for which we can identify all accesses and show that none are reachable on the exceptional path.  The trivial call rule (there is no exceptional path) is one sub-case of that.  This backwards walk may seem expensive, but I think we already do it in DSE, and could leave converting callsite attributes to function attributes to a later RPO phase.
>   3. You're right that we don't really do RPO today.  See point (1).  I wouldn't want to add such a phase just for this.

In terms of detailed semantics, I think the main interesting question is what exactly "no read on unwind" means. I see two general approaches. The first is that reading (or possibly accessing) the argument memory after an unwind is immediate undefined behavior. The other is that the behavior is "as if" the argument memory is overwritten with poison on unwind. This means that the memory can be read without UB, but the result cannot depend on any value written into it during the call. For example, if the argument memory is fully overwritten after the call and read again afterwards, that would still be nounwindread. I'd personally lean towards the latter interpretation, in that it is more generally applicable without giving up any useful optimization power that I see.
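A rough C++ sketch of a caller that would remain acceptable under the second ("as if overwritten with poison") reading. Everything here is illustrative: `write_then_unwind` and `overwrite_then_read` are hypothetical names, and C++ has no poison values, so the sketch only shows that the caller's read does not depend on anything the callee stored.

```cpp
#include <cassert>    // only needed for the usage check mentioned below
#include <stdexcept>

// Hypothetical callee: writes through its argument, then unwinds.
void write_then_unwind(int *out) {
    *out = 0;
    throw std::runtime_error("unwind");
}

// The caller fully overwrites the memory after the unwind before reading it,
// so the value it reads cannot depend on any store made during the call.
int overwrite_then_read() {
    int out = -1;
    try {
        write_then_unwind(&out);
    } catch (const std::runtime_error &) {
        out = 7;     // full overwrite after the unwind
        return out;  // reads only the caller's own store
    }
    return out;
}
```

`overwrite_then_read()` returns 7 whether or not the callee's store is eliminated, which is why this pattern need not be treated as a violation. Under the first reading (any access after unwind is UB), even this caller would be disallowed if "access" includes writes.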

The other question would be what "argument memory" is. This could either be the whole underlying "allocated object" associated with the argument, or the size of the memory region would have to be specified as an attribute argument. So something like "i32* noalias nounwindread(4) %out" to say that the four bytes starting at the passed pointer are not read on unwind. I'd lean towards the former here, because it is simpler in terms of analysis/reasoning, and works even if we don't know the exact access location, just the underlying object.

> Aside: sret has the under-specified problem today.  I have no idea when it would be legal to infer sret.

I think the answer to this is "never", because sret is considered an ABI attribute -- though to be honest I'm not really clear in which way it actually affects the call ABI.

Regards,
Nikita

Philip Reames via llvm-dev

Dec 16, 2021, 5:05:08 PM
to Nikita Popov, llvm-dev

I ran across something a bit similar, and thought I'd share the case for purposes of idea generation. 

For a function with out-params, it's common to have cases where the out-params are not actually used by the caller.  I've recently been making some improvements for the cases where the out-param is the only thing holding the call live (D115829), but if we actually use the return value, we're left with a dead write (inside the callee) which we can't seem to eliminate without inlining.

As an example:

declare void @use(i1)

define i1 @callee(i32* %out) {
  store i32 1, i32* %out
  ret i1 true
}

define void @test() {
  %a = alloca i32
  %res = call i1 @callee(i32* %a)
  call void @use(i1 %res)
  ret void
}

If we had something similar in spirit to your "nounwindread" but applied to the normal return path (e.g. "noreadonreturn"), we could in principle leverage this to simplify the callee.  DSE has all the information today to annotate "noreadonreturn" arguments at the call site.  We could have an IPO transform which merges the information from all call sites, and drops the writes.  (We could also e.g. specialize if not all had the param as dead.)

This particular case isn't strongly motivated enough to bother building out infrastructure for, but it's interesting that another somewhat analogous use case has popped up.

Philip

On 12/4/21 2:39 AM, Nikita Popov via llvm-dev wrote:

Johannes Doerfert via llvm-dev

Jan 3, 2022, 12:33:35 PM
to Nikita Popov, llvm-dev
I somewhat missed this thread, and while I should maybe respond
to a few of the other mails too, I figured I'd start with a conceptual
question I have reading this:

Who adds the attribute, and when? If it is implied by sret that's
a good start. For the non-sret deduction it seems very specialized.
I mean, now we have something for the unwind case but not for a different
"early exit", or for the case where it is readonly/writeonly rather than readnone.

The argument about invoke vs. call instruction call sites only holds for
sret args anyway, so maybe what you are designing here is too sret specific.

Long term I'd like us to have a proper "side-effect" encoding with values
and that could include conditions, e.g.,

```
sideeffects(       write(_unknown_, %arg), read(_unknown_),
            unwind{write(_unknown_), read(_unknown_)},
            cond(load %arg eq 0, {read(%arg)})
           )
```

While this is still long away (probably), I'm not convinced an attribute
that is specific to unwind *and* readnone is the right intermediate step.
It should compose better with readonly/writeonly/readnone at least.

All that said, would your deduction strategy alone solve the problem?
So, the cases you care about could they be optimized by looking at the
call sites and determining if none is an invoke?

~ Johannes

Nikita Popov via llvm-dev

Jan 4, 2022, 4:39:48 AM
to Johannes Doerfert, llvm-dev
On Mon, Jan 3, 2022 at 6:33 PM Johannes Doerfert <johannes...@gmail.com> wrote:
> I somewhat missed this thread and while I should maybe respond
> to a few of the other mails too I figured I start with a conceptual
> question I have reading this:
>
> Who and when is the attribute added? If it is implied by sret that's
> a good start. For the non-sret deduction it seems very specialized.
> I mean, now we have something for the unwind case but not a different
> "early exit" or if it is read/writeonly rather than readnone.

I'm mainly interested in frontend-annotated cases here, rather than deduced ones. The primary use case there is adding it to sret arguments (and only changing sret semantics would be "good enough" for me, I guess). However, there is another frontend-annotated case I have my eyes on, which is move arguments in rust. These could be modeled by a combination of noreadonunwind and noreadonreturn to indicate that the value will not be used after the call at all, regardless of how it exits. (This would be kind of similar to a byval argument, just without the ABI implication that an actual copy gets inserted.)

Note that as proposed, the noreadonunwind attribute would be the "writeonly on unwind" combination (and noreadonreturn the "writeonly on return" combination). I can see that there are conjugated "readonly on unwind" and "readonly on return" attributes that could be defined here, but I can't think of any circumstances under which these would actually be useful for optimization purposes. How would the presence or absence of later writes impact optimization in the current function?

> The argument about invoke vs. call instruction call sites only holds for
> sret args anyway, so maybe what you are designing here is too sret specific.

Not sure I follow, why would that argument only hold for sret?

Regards,
Nikita

Johannes Doerfert via llvm-dev

Jan 4, 2022, 11:27:11 AM
to Nikita Popov, llvm-dev

On 1/4/22 03:39, Nikita Popov wrote:
> On Mon, Jan 3, 2022 at 6:33 PM Johannes Doerfert <johannes...@gmail.com>
> wrote:
>
>> I somewhat missed this thread and while I should maybe respond
>> to a few of the other mails too I figured I start with a conceptual
>> question I have reading this:
>>
>> Who and when is the attribute added? If it is implied by sret that's
>> a good start. For the non-sret deduction it seems very specialized.
>> I mean, now we have something for the unwind case but not a different
>> "early exit" or if it is read/writeonly rather than readnone.
>>
> I'm mainly interested in frontend-annotated cases here, rather than deduced
> ones. The primary use case there is adding it to sret arguments (and only
> changing sret semantics would be "good enough" for me, I guess). However,
> there is another frontend-annotated case I have my eyes on, which is move
> arguments in rust. These could be modeled by a combination of
> noreadonunwind and noreadonreturn to indicate that the value will not be
> used after the call at all, regardless of how it exits. (This would be kind
> of similar to a byval argument, just without the ABI implication that an
> actual copy gets inserted.)

OK. That's interesting. I'm not fluent enough in rust, can you
elaborate what the semantics there would be, maybe an IR example?

Spitballing: `byval(nocopy, %a)` might be worth thinking about
given the short description.


>
> Note that as proposed, the noreadonunwind attribute would be the "writeonly
> on unwind" combination (and noreadonreturn the "writeonly on return"
> combination). I can see that there are conjugated "readonly on unwind" and
> "readonly on return" attributes that could be defined here, but I can't
> think of any circumstances under which these would actually be useful for
> optimization purposes. How would the presence or absence of later writes
> impact optimization in the current function?

Just as an example, `readonly on unwind` allows you to do GVN/CSE
from prior to the call to the "unwind path". Return then on the
"return path". That is not inside the call but in the caller.
Does that make sense?


>
>> The argument about invoke vs. call instruction call sites only holds for
>> sret args anyway, so maybe what you are designing here is too sret
>> specific.
>>
> Not sure I follow, why would that argument only hold for sret?

```
static void I_will_unwind(int *A) {
  *A = 42;
  may_unwind();
  *A = 4711;
  unwind();
}
void someone_might_catch_me_as_I_also_unwind(int *A) {
  /* call */ I_will_unwind(A);
}
```

Maybe I misunderstood your idea but doesn't the above show how
we have only call instruction call sites and we still cannot
assume the store is dead on the unwind path? If you check
transitively throughout the entire call chain it's different,
but that is not how I read your first mail. I figured it works
for sret because the memory does not outlive the caller.
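The snippet above can be turned into a runnable C++ sketch along these lines. Assumptions for illustration only: `may_unwind` is a no-op in this run, `unwind()` is replaced by a `throw`, and `outer_catches` is a hypothetical caller further up the chain that was not part of the original example.

```cpp
#include <cassert>    // only needed for the usage check mentioned below
#include <stdexcept>

// Hypothetical: may_unwind happens not to unwind in this run.
static void may_unwind() {}

static void I_will_unwind(int *A) {
    *A = 42;
    may_unwind();
    *A = 4711;                            // looks dead: the function always unwinds next
    throw std::runtime_error("unwind");   // stands in for unwind()
}

// The only call site is a plain call with no landing pad (no try/catch).
void someone_might_catch_me_as_I_also_unwind(int *A) {
    /* call */ I_will_unwind(A);
}

// A caller further up the chain catches the exception and reads the memory.
int outer_catches() {
    int a = 0;
    try {
        someone_might_catch_me_as_I_also_unwind(&a);
    } catch (const std::runtime_error &) {
        return a;  // reads the memory after the unwind
    }
    return a;
}
```

`outer_catches()` returns 4711, so looking only at the direct call sites of `I_will_unwind` is not sufficient; the pointer passed in must itself be known not to be read after unwinding.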

~ Johannes

Nikita Popov via llvm-dev

Jan 4, 2022, 11:58:07 AM
to Johannes Doerfert, llvm-dev
On Tue, Jan 4, 2022 at 5:27 PM Johannes Doerfert <johannes...@gmail.com> wrote:

On 1/4/22 03:39, Nikita Popov wrote:
> On Mon, Jan 3, 2022 at 6:33 PM Johannes Doerfert <johannes...@gmail.com>
> wrote:
>
>> I somewhat missed this thread and while I should maybe respond
>> to a few of the other mails too I figured I start with a conceptual
>> question I have reading this:
>>
>> Who and when is the attribute added? If it is implied by sret that's
>> a good start. For the non-sret deduction it seems very specialized.
>> I mean, now we have something for the unwind case but not a different
>> "early exit" or if it is read/writeonly rather than readnone.
>>
> I'm mainly interested in frontend-annotated cases here, rather than deduced
> ones. The primary use case there is adding it to sret arguments (and only
> changing sret semantics would be "good enough" for me, I guess). However,
> there is another frontend-annotated case I have my eyes on, which is move
> arguments in rust. These could be modeled by a combination of
> noreadonunwind and noreadonreturn to indicate that the value will not be
> used after the call at all, regardless of how it exits. (This would be kind
> of similar to a byval argument, just without the ABI implication that an
> actual copy gets inserted.)

> OK. That's interesting. I'm not fluent enough in rust, can you
> elaborate what the semantics there would be, maybe an IR example?

To give a silly example, take these two functions in rust (https://rust.godbolt.org/z/9cvefedsP):

pub fn test1(mut s: String) {
    s = "foobar".to_string();
}
pub fn test2(s: &mut String) {
    *s = "foobar".to_string();
}

From an ABI perspective, these are basically the same. In both cases rust will lower this to passing in a String*. However, because String is a non-Copy type, any call "test1(s)" will move the "s" variable, which means that "s" cannot be used after the call anymore. For that reason, the store "s =" would be definitely dead in the first example, and usually not dead in the second example.
 
> Spitballing: `byval(nocopy, %a)` might be worth thinking about
> given the short description.

Yeah, I guess that would work -- though I'd rather not mix ABI and optimization attributes in that way...

>
> Note that as proposed, the noreadonunwind attribute would be the "writeonly
> on unwind" combination (and noreadonreturn the "writeonly on return"
> combination). I can see that there are conjugated "readonly on unwind" and
> "readonly on return" attributes that could be defined here, but I can't
> think of any circumstances under which these would actually be useful for
> optimization purposes. How would the presence or absence of later writes
> impact optimization in the current function?

> Just as an example, `readonly on unwind` allows you to do GVN/CSE
> from prior to the call to the "unwind path". Return then on the
> "return path". That is not inside the call but in the caller.
> Does that make sense?

Let me check if I understood the idea right: We have an invoke with a hypothetical "readonly on unwind" / "no write on unwind" attribute. In the landing pad, there is a non-analyzable write and the pointer has previously escaped, and then later there is a read from the argument. The non-analyzable write blocks AA, while the "readonly on unwind" guarantee could make a sufficiently smart AA see that this write cannot write into the argument memory. Is that the idea? I feel like "readonly on unwind" isn't the right way to model that situation, rather the argument could be marked as invariant in the unwind path using one of our existing ways to denote invariance.

But I suspect I still didn't quite get what you have in mind here. An example would help.

>
>> The argument about invoke vs. call instruction call sites only holds for
>> sret args anyway, so maybe what you are designing here is too sret
>> specific.
>>
> Not sure I follow, why would that argument only hold for sret?

> ```
> static void I_will_unwind(int *A) {
>    *A = 42;
>    may_unwind();
>    *A = 4711;
>    unwind();
> }
> void someone_might_catch_me_as_I_also_unwind(int *A) {
>    /* call */ I_will_unwind(A);
> }
> ```
>
> Maybe I misunderstood your idea but doesn't the above show how
> we have only call instruction call sites and we still cannot
> assume the store is dead on the unwind path? If you check
> transitively throughout the entire call chain it's different,
> but that is not how I read your first mail. I figured it works
> for sret because the memory does not outlive the caller.

Ah yes, this was imprecise in the original mail. We need that a) the function is only used with call (if we don't want to analyze the unwind paths to be more precise) and b) the pointer passed in is noreadonunwind itself. The latter might hold because it's based on an argument with that attribute, or because it's an alloca, which is always noreadonunwind.

Regards,
Nikita

Johannes Doerfert via llvm-dev

Jan 4, 2022, 12:15:58 PM
to Nikita Popov, llvm-dev
I see. Again just as an idea: what if we make the "overwriting stores"
explicit instead of using attributes?

```
s = "foobar".to_string();
// other code
virtual_memset(s, sizeof(s), 0);
ret void
```

Now DSE will do the work for us.
It is not clear if we could do something similar for the other cases
though.

Whatever we do, I can see how this is information that is worth encoding.

Maybe I am confused, but I thought something like this pseudo-code
could be optimized; readonly_on_return is similar.

```
int a = 42;
invoke foo(/* readonly_on_unwind */ &a);
lp:
  return a; // a == 42
cont:
  return a; // a unknown
```


>
>>> The argument about invoke vs. call instruction call sites only holds for
>>>> sret args anyway, so maybe what you are designing here is too sret
>>>> specific.
>>>>
>>> Not sure I follow, why would that argument only hold for sret?
>> ```
>> static void I_will_unwind(int *A) {
>> *A = 42;
>> may_unwind();
>> *A = 4711;
>> unwind();
>> }
>> void someone_might_catch_me_as_I_also_unwind(int *A) {
>> /* call */ I_will_unwind(A);
>> }
>> ```
>>
>> Maybe I misunderstood your idea but doesn't the above show how
>> we have only call instruction call sites and we still cannot
>> assume the store is dead on the unwind path? If you check
>> transitively throughout the entire call chain it's different,
>> but that is not how I read your first mail. I figured it works
>> for sret because the memory does not outlive the caller.
>>
> Ah yes, this was imprecise in the original mail. We need that a) it's only
> used with call (if we don't want to analyze the unwind paths to be more
> precise) and b) is noreadonunwind itself. Where the latter might be because
> it's based on an argument with that attribute, or because it's an alloca,
> which is always noreadonunwind.

Right, something along those lines.

~ Johannes

Reid Kleckner via llvm-dev

Jan 4, 2022, 5:40:51 PM
to Nikita Popov, llvm-dev
On Mon, Dec 13, 2021 at 12:13 PM Nikita Popov via llvm-dev <llvm...@lists.llvm.org> wrote:
> I think the answer to this is "never", because sret is considered an ABI attribute -- though to be honest I'm not really clear in which way it actually affects the call ABI.

IMO the main effect of the sret attribute is to copy the struct pointer into the return register on function exit. Consider the difference here:

struct Big { int words[32]; };
Big foo() {  return Big{}; }
void bar(Big *b) { *b = Big{}; }

`foo` includes a copy from RDI to RAX, but bar does not.

This is what makes it an ABI attribute, and not an inferable attribute.

Nikita Popov via llvm-dev

Jan 5, 2022, 8:42:05 AM
to Johannes Doerfert, llvm-dev
Yeah, doing this kind of memset would encode the necessary information for the return case -- I don't think it would be a good fit for the unwind case, because it would require you to make all the unwind paths in the function explicit. I think we already have this "virtual_memset" and it's called llvm.lifetime.end -- just that nobody really understands its semantics when it comes to non-alloca objects. In this case we'd have to have the lifetime.start and lifetime.end in separate functions, which would probably be all kinds of trouble :)

Okay, I guess there's a possible point of confusion here: The intention here is that the attribute encodes a requirement on accesses *after* the call. readonly_on_unwind would allow only reading "a" in "lp", but does not prevent foo() from modifying "a" even if it ultimately unwinds. With that in mind, I don't think the attribute would enable any additional optimization here.

So maybe the proposed attribute is better named as "noreadafterunwind" rather than "noreadonunwind".
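To make the distinction concrete, here is a small C++ sketch (illustrative only; the body of `foo` and the name `read_in_landing_pad` are assumptions, not something from the thread). The callee writes to the argument and then unwinds, and the caller only reads in the handler, which is all that a restriction on accesses *after* the call would constrain.

```cpp
#include <cassert>    // only needed for the usage check mentioned below
#include <stdexcept>

// Hypothetical callee: it is still free to write before it unwinds.
void foo(int *a) {
    *a = 7;
    throw std::runtime_error("unwind");
}

// The caller only reads "a" on the unwind path and never writes it there.
int read_in_landing_pad() {
    int a = 42;
    try {
        foo(&a);
    } catch (const std::runtime_error &) {
        return a;  // sees the callee's write, not the initial 42
    }
    return a;
}
```

`read_in_landing_pad()` returns 7 rather than 42, so forwarding the pre-call value of `a` into the handler would be incorrect even though the handler itself performs no writes.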

Regards,
Nikita

Nikita Popov via llvm-dev

Jan 6, 2022, 9:42:25 AM
to Johannes Doerfert, llvm-dev
Something I realized while looking through existing code is that there are two different cases of "not visible after unwind/return" that usually get handled: For allocas, this statement is absolute, and independent of whether the alloca is captured. For returns from allocation functions, this only holds if the pointer does not escape. If it does escape, then a write before unwind/return might be visible to the caller.

For the sret unwind case, we can also make an absolute statement. But for the Rust move argument case above, we can't. The pointer can be captured (by moving it further) and accessed by the caller. One case we can treat like an alloca, the other like a malloc().

Not really sure what to make of this yet.

Regards,
Nikita

Johannes Doerfert via llvm-dev

Jan 6, 2022, 12:47:09 PM
to Nikita Popov, llvm-dev

On 1/5/22 07:41, Nikita Popov wrote:
>> Maybe I am confused but I thought something like this pseudo-code
>> could be optimized, readonly_on_return is similar.
>>
>> ```
>> int a = 42;
>> invoke foo(/* readonly_on_unwind */ &a);
>> lp:
>> return a; // a == 42
>> cont:
>> return a; // a unknown
>> ```
>>
> Okay, I guess there's a possible point of confusion here: The intention
> here is that the attribute encodes a requirement on accesses *after* the
> call. readonly_on_unwind would allow only reading "a" in "lp", but does not
> prevent foo() from modifying "a" even if it ultimately unwinds. With that
> in mind, I don't think the attribute would enable any additional
> optimization here.
>
> So maybe the proposed attribute is better named as "noreadafterunwind"
> rather than "noreadonunwind".

So I misunderstood the entire idea :D

I am not sure how I feel about such a new "category" of attributes.
Can we try to throw around some ideas before we commit to it?

One of the weirdest things is that the semantics are somewhat
described in terms of the caller. Making it callee-centric might
help, e.g., `poison_on_unwind(%a)` to indicate the value in the
unwind case is "not read" or rather, replaced with poison.

You noted in the other mail that we don't want to make unwind paths
explicit (which I agree with, FWIW). However, in the caller we already
have them explicitly, no? So what about materializing the "virtual memset"/
lifetime.end stuff there? Frontends (like Rust) can do it based on the
callee type so that should not be a problem.

Thoughts?

Nikita Popov via llvm-dev

Jan 7, 2022, 5:46:03 AM
to Johannes Doerfert, llvm-dev
Yeah, the caller-dependent semantics are unusual. poison_on_unwind is an interesting idea, though also a bit odd in that attributes usually describe a constraint.
 
> You noted in the other mail that we don't want to make unwind paths
> explicit (which I agree with, FWIW). However, in the caller we already
> have them explicitly, no? So what about materializing the "virtual memset"/
> lifetime.end stuff there? Frontends (like Rust) can do it based on the
> callee type so that should not be a problem.

We need the information in the callee though, so that optimizations that reason about unwinds (DSE, MemCpyOpt, LICM etc) running on the callee can benefit. They wouldn't have information about what is going on in the caller (apart from what we can provide via function attributes).

At this point I'm starting to lean towards not introducing a separate attribute for this, and only tweaking sret semantics to specify it can't be read on unwind. That's the original motivation, and I'm not sure solving something more general is really worthwhile, as this turned out to be trickier than I expected.

For the Rust move arguments, I think that the proper modeling is likely a stronger form of "noalias" rather than these kinds of noreadafterunwind/noreadafterreturn attributes. Normally, "noalias" only means that it is noalias for the duration of the call. Rust move arguments remain "noalias" after the call. That is, even after we return or unwind, accesses to the memory can only happen through pointers based on the original noalias argument (which would have to be captured at that point). I just realized that this is similar to (the same as?) the semantics for "noalias" on return values, which is not the same as "noalias" on arguments. Not sure what to call this concept though. really_noalias :)

Regards,
Nikita

Johannes Doerfert via llvm-dev

Jan 7, 2022, 4:06:26 PM
to Nikita Popov, llvm-dev

Or we lift these things into the inter-procedural space ;)


>
> At this point I'm starting to lean towards not introducing a separate
> attribute for this, and only tweaking sret semantics to specify it can't be
> read on unwind. That's the original motivation, and I'm not sure solving
> something more general is really worthwhile, as this turned out to be
> trickier than I expected.

Making sret imply this seems fine to me.


> For the Rust move arguments, I think that the proper modeling is likely a
> stronger form of "noalias" rather than this kind of
> noreadafterunwind/noreadafterreturn attributes. Normally, "noalias" only
> means that it is noalias for the duration of the call. Rust move arguments
> remain "noalias" after the call. That is, even after we return or unwind,
> accesses to the memory can only happen through pointers based on the
> original noalias argument (which would have to be captured at that point).
> I just realized that this is similar (the same?) as the semantics for
> "noalias" on return values, which is not the same as "noalias" on
> arguments. Not sure what to call this concept though. really_noalias :)

noalias on args becomes restrict, and noalias on return value
semantics will become noalias?

Anyway, we should keep this in mind for later.

Nikita Popov via llvm-dev

Jan 20, 2022, 3:24:17 AM
to Johannes Doerfert, llvm-dev
For the record, this variant is up at https://reviews.llvm.org/D116998.

Regards,
Nikita