[llvm-dev] Revisiting/refining the definition of optnone with interprocedural transformations


David Blaikie via llvm-dev

Apr 18, 2021, 12:37:12 PM
to llvm-dev, Roman Lebedev, Paul Robinson, Florian Hahn
While trying to reproduce a debug info issue, I ran into something like the following (I don't have the exact example at the moment, and I think the original was more aggressive than this one):

__attribute__((optnone)) int f1() {
  return 3;
}
int main() {
  return f1();
}

(actually I think in my case I had a variable to hold the return value from f1, with the intent that this variable's location couldn't use a constant - a load from a volatile variable would probably have provided similar functionality in this case)
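For illustration, the volatile-load variant mentioned above might look like the sketch below. The names `source` and `caller` are made up for this example and are not from the original test case; the point is only that a volatile read has to happen at run time, so the caller's variable cannot simply be described by a constant.

```cpp
// Hypothetical variant of the example above: the value comes from a volatile
// load, so the compiler cannot fold it to a constant at the call site.
volatile int source = 3;

int f1() {
  return source;  // volatile read: must be performed at run time
}

int caller() {
  int result = f1();  // variable intended to hold a non-constant location
  return result;
}
```

This sidesteps the optnone question entirely, which is why it would only have been a workaround for the debug info experiment.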

LLVM (& specifically Sparse Conditional Constant Propagation, llvm/lib/Transforms/Scalar/SCCP.cpp) optimizes this code noting that f1 always returns 3, so rather than using the return value from the call to f1, it ends up hardcoding the return value:

define dso_local i32 @main() local_unnamed_addr #1 {
entry:
  %call = tail call i32 @_Z2f1v()
  ret i32 3
}


I consider this a bug - in that optnone is used to implement -O0 for LTO, so it seemed to me that the correct behavior is for an optnone function to behave as though it were compiled in another object file outside the purview of optimizations - interprocedural or intraprocedural.

So I sent https://reviews.llvm.org/D100353 to fix that.

Florian pointed out that this wasn't quite specified in the LangRef, which says this about optnone:

This function attribute indicates that most optimization passes will skip this function, with the exception of interprocedural optimization passes. Code generation defaults to the “fast” instruction selector. This attribute cannot be used together with the alwaysinline attribute; this attribute is also incompatible with the minsize attribute and the optsize attribute.

This attribute requires the noinline attribute to be specified on the function as well, so the function is never inlined into any caller. Only functions with the alwaysinline attribute are valid candidates for inlining into the body of this function.
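Read as a checklist, the quoted LangRef rules amount to something like the following. This is a toy model only: `ToyAttrs` and `checkOptnoneRules` are hypothetical names, not LLVM's verifier, and the real checks live in LLVM itself.

```cpp
#include <string>

// Toy model of a function's attribute set; not LLVM's real data structures.
struct ToyAttrs {
  bool optnone = false;
  bool noinline = false;
  bool alwaysinline = false;
  bool minsize = false;
  bool optsize = false;
};

// Returns an empty string if the combination is allowed by the quoted
// LangRef text, otherwise a short description of the violated rule.
std::string checkOptnoneRules(const ToyAttrs &A) {
  if (!A.optnone)
    return "";  // the rules below only constrain optnone functions
  if (!A.noinline)
    return "optnone requires noinline";
  if (A.alwaysinline)
    return "optnone is incompatible with alwaysinline";
  if (A.minsize)
    return "optnone is incompatible with minsize";
  if (A.optsize)
    return "optnone is incompatible with optsize";
  return "";
}
```

Note that nothing in this checklist says anything about how callers may reason about the function, which is the gap discussed next.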


So the spec of optnone is unclear about whether interprocedural optimizations should treat optnone functions in any particular way (or arguably it explicitly rules out any special treatment).

So I was going to update the wording to rephrase this to say "Interprocedural optimizations should treat this function as though it were defined in an isolated module/object." (perhaps "interprocedural optimizations should treat optnone functions as opaque" or "as though they were only declarations")

The choice of this direction was based on my (possibly incorrect or debatable) understanding of optnone, that it was equivalent to the function being in a separate/non-lto object. (this seems consistent with the way optnone is used to implement -O0 under lto - you could imagine a user debugging a binary, using -O0 for the code they're interested in debugging, and potentially using an interactive debugger to change some state in the function causing it to return a different value - which would get quite confusing if the return value was effectively hardcoded into the caller)

What are folks' thoughts on this?

- Dave

Roman Lebedev via llvm-dev

Apr 18, 2021, 12:43:35 PM
to David Blaikie, llvm-dev, Florian Hahn
There's 'noipa' attribute in GCC, currently it is not supported by clang.
Theoretically, how would one implement it?

With your proposal, the clang `noipa` attribute could be lowered
to `optnone` on the whole function. To me that seems like
too much of a hammer, should that be the path forward.

Would it not be best to not conflate the two,
and just introduce the `noipa` attribute?

Roman

_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

David Blaikie via llvm-dev

Apr 18, 2021, 1:07:01 PM
to Roman Lebedev, llvm-dev, Florian Hahn
On Sun, Apr 18, 2021 at 9:43 AM Roman Lebedev <lebed...@gmail.com> wrote:
There's 'noipa' attribute in GCC, currently it is not supported by clang.
Theoretically, how would one implement it?

If we wanted to do this really robustly, I guess we might have to introduce some sort of "here's the usual way to check if this is a definition/get the body of the function" (which for noipa it says "there is no body/don't look here") and "no, really, I need the definition" (for actual code generation).

Though I'm not advocating for that - I'm OK with a more ad-hoc/best-effort implementation targeting the -O0/debugging assistant __attribute__((optnone)) kind of use case - happy to fix cases as they come up to improve the user experience for these situations.

Maybe we could get away with generalizing this by having an optnone (or noipa) function appear "interposable" even though it doesn't have a real interposable linkage? That should hinder/disable any IPA.

Hmm, looks like it would be best for GlobalValue::isDefinitionExact to return false in this case (whatever we end up naming it); /maybe/ mayBeDerefined should return true too.

Yeah, I guess if we can implement such a robust generalization, then it'd probably be OK/easy enough to implement both noipa and "optnone implies noipa", the same way optnone implies noinline (well, I guess noipa would subsume the noinline implication - if the function isn't exact, the inliner won't inline it, so there wouldn't be any need for the explicit noinline).
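A rough sketch of that idea, using a toy model rather than LLVM's real classes: `ToyFunction`, `appearsInterposable`, `isDefinitionExactToy` and `hasBodyForCodegen` are all hypothetical names, and the real GlobalValue queries consider more linkage kinds than this.

```cpp
// Toy stand-in for a function definition; names only loosely mirror the
// GlobalValue queries mentioned above and are not LLVM API.
struct ToyFunction {
  bool hasBody = true;
  bool interposableLinkage = false;  // e.g. a weak definition
  bool optnone = false;
  bool noipa = false;
};

// True for a real interposable linkage and, under the idea discussed here,
// also for optnone/noipa functions that merely "appear" interposable.
bool appearsInterposable(const ToyFunction &F) {
  return F.interposableLinkage || F.optnone || F.noipa;
}

// The "usual" query an interprocedural pass would make before inspecting
// the body of a callee.
bool isDefinitionExactToy(const ToyFunction &F) {
  return F.hasBody && !appearsInterposable(F);
}

// The "no, really, I need the definition" query for code generation.
bool hasBodyForCodegen(const ToyFunction &F) { return F.hasBody; }
```

The split into two queries is the design point: IPA sees an inexact definition and backs off, while code generation still emits the body.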
 
With your proposal, the clang `noipa` attribute could be lowered
to `optnone` on the whole function. To me that seems like
too much of a hammer, should that be the path forward.

I agree that lowering noipa to optnone would be a very aggressive form of noipa. Likely, if we want to support noipa, we would support it separately and then either lower -O0 (& maybe __attribute__((optnone))) to optnone+noipa+noinline (since optnone already implies noinline) or make optnone imply noipa/be a superset of it implicitly (if we do have noipa, it's probably best to have "optnone requires noipa" the same way "optnone requires noinline", rather than an implicit superset sort of thing).

I think that'd certainly be appropriate for -O0, and I'd argue it'd be appropriate for __attribute__((optnone)) because I think it'd be what people expect/is consistent with the motivation for the attribute (for debuggability - so you wouldn't want a caller to not fill in parameters/pass in garbage because it knows the implementation doesn't matter, or not use the result because it knows what the result should be).
 
Would it not be best to not conflate the two,
and just introduce the `noipa` attribute?

I think we'd still want to conflate them for user-facing functionality, even if they were separable at the IR level.

- Dave
 

David Blaikie via llvm-dev

Apr 18, 2021, 9:40:35 PM
to Roman Lebedev, llvm-dev, Florian Hahn
Prototyping the idea of "isDefinitionExact" returning false for optnone (whether or not we split it out into noipa), I've tripped over something it seems I created 5 years ago:

I added some IPO support for optnone to GlobalsModRef: https://github.com/llvm/llvm-project/commit/c662b501508200076e581beb9345a7631173a1d8#diff-55664e96a7ce3533b46f12c6906acecb2bd9a599e2b79c97506af4b1b4873fa1 - so it wouldn't deduce properties of an optnone function.

But I then made a follow-up commit (without a lot of context as to why, unfortunately :/ ) that allowed GlobalsModRef to use existing attributes on an optnone function: https://github.com/llvm/llvm-project/commit/7a9b788830da0a426fb0ff0a4cec6d592bb026e9#diff-55664e96a7ce3533b46f12c6906acecb2bd9a599e2b79c97506af4b1b4873fa1

But it seems that making the function definition inexact breaks the unit test added in the latter commit. I suppose then it's an open question whether existing attributes on an inexact definition should be used at all? (I don't know what motivated me to support them for optnone)

Oh, and here's a change from Chandler around the same time similarly blocking some ipo for optnone: https://github.com/llvm/llvm-project/commit/0fb998110abcf3d67495d12f854a1576b182d811#diff-cc618a9485181a9246c4e0367dc9f1a19d3cb6811d1e488713f53a753d3da60c - in this case preventing FunctionAttrs from deriving the attributes for an optnone function. That functionality looks like it can be subsumed by the inexact approach - applying inexact to optnone and removing the change in Chandler's patch still passes the tests. (hmm, tested - not quite, but more work to do there)

Johannes Doerfert via llvm-dev

Apr 18, 2021, 11:29:21 PM
to David Blaikie, Roman Lebedev, llvm-dev, Florian Hahn
I'm very much in favor of `noipa`. It comes up every few months
and it would be widely useful. I'd expose it via Clang and -O0 could
set it as well (for the LTO case).

When it comes to inexact definitions, optnone functions, and existing
attributes, I'd be in favor of 1) always allowing the use of existing
attributes, and 2) not deriving new ones for an inexact or optnone
definition.

This is how the Attributor determines if a function-level attribute could
be derived or if we should only stick with the existing information:

    /// Determine whether the function \p F is IPO amendable
    ///
    /// If a function is exactly defined or it has alwaysinline attribute
    /// and is viable to be inlined, we say it is IPO amendable
    bool isFunctionIPOAmendable(const Function &F) {
      return F.hasExactDefinition() ||
             InfoCache.InlineableFunctions.count(&F);
    }

So, if the above check doesn't hold we will not add new attributes but
we will still use existing ones. This seems to me the right way to allow
users/frontends to provide information selectively.

That said, right now the Attributor will not propagate any information
from an optnone function or derive new information. Nevertheless, I'd be
in favor of allowing existing information to be used for IPO.
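The two rules above can be sketched with a toy model. Everything here is hypothetical (`ToyFn`, `isAmendableToy`, `tryDerive`, `callSiteMayAssume`); it only mirrors the shape of the check quoted above, and treating `optnone` as "not amendable" is an assumption taken from the prose, not from the quoted code.

```cpp
#include <set>
#include <string>

// Toy function record; not the Attributor's real data structures.
struct ToyFn {
  bool exactDefinition = true;      // stand-in for F.hasExactDefinition()
  bool viableAlwaysInline = false;  // stand-in for the InlineableFunctions set
  bool optnone = false;
  std::set<std::string> attrs;      // attributes already present on F
};

// Shape of the quoted check, plus the assumption that optnone functions are
// never amended.
bool isAmendableToy(const ToyFn &F) {
  return (F.exactDefinition || F.viableAlwaysInline) && !F.optnone;
}

// Rule 2: newly derived attributes are attached only to amendable functions.
bool tryDerive(ToyFn &F, const std::string &Attr) {
  if (!isAmendableToy(F))
    return false;
  F.attrs.insert(Attr);
  return true;
}

// Rule 1: attributes that already exist are always usable at call sites,
// regardless of optnone or linkage.
bool callSiteMayAssume(const ToyFn &Callee, const std::string &Attr) {
  return Callee.attrs.count(Attr) != 0;
}
```

With this split, a frontend or user can still hand information to IPO selectively by writing the attribute explicitly.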

~ Johannes

David Blaikie via llvm-dev

Apr 18, 2021, 11:52:09 PM
to Johannes Doerfert, llvm-dev, Florian Hahn
On Sun, Apr 18, 2021 at 8:29 PM Johannes Doerfert <johannes...@gmail.com> wrote:
I'm very much in favor of `noipa`. It comes up every few months
and it would be widely useful.

Out of curiosity, what sort of uses do you have in mind for it?
 
I'd expose it via Clang and -O0 could
set it as well (for the LTO case).

When it comes to inexact definitions, optnone functions, and existing
attributes, I'd be in favor of 1) always allowing the use of existing attributes,

I'm not sure what you mean by this ^ - could you rephrase/elaborate?
 
and 2) not deriving new ones for an inexact or optnone definition.

Also this ^ I'm similarly confused/unclear about.
 
This is how the Attributor determines if a function-level attribute could
be derived or if we should only stick with the existing information:

     /// Determine whether the function \p F is IPO amendable
     ///
     /// If a function is exactly defined or it has alwaysinline attribute
     /// and is viable to be inlined, we say it is IPO amendable
     bool isFunctionIPOAmendable(const Function &F) {
       return F.hasExactDefinition() ||
              InfoCache.InlineableFunctions.count(&F);
     }

So, if the above check doesn't hold we will not add new attributes but
we will still use existing ones. This seems to me the right way to allow
users/frontends to provide information selectively.

Yep, that sounds right to me (if you put attributes on an optnone/noipa function, they should be usable/used - but none should be discovered/added later by inspection of the implementation of such a function) - currently doesn't seem to be the case for the (old pass manager?) FunctionAttrs pass, so I have to figure some things out there.
 
That said, right now the Attributor will not propagate any information
from an optnone function or derive new information. Nevertheless, I'd be
in favor of allowing existing information to be used for IPO.

*nod* I think I'm with you there.

- Dave
 

Johannes Doerfert via llvm-dev

Apr 19, 2021, 1:30:16 AM
to David Blaikie, llvm-dev, Florian Hahn

On 4/18/21 10:51 PM, David Blaikie wrote:
> On Sun, Apr 18, 2021 at 8:29 PM Johannes Doerfert <
> johannes...@gmail.com> wrote:
>
>> I'm very much in favor of `noipa`. It comes up every few months
>> and it would be widely useful.
>
> Out of curiosity, what sort of uses do you have in mind for it?

Most times people basically want `noinline` to also mean "no
interprocedural optimization", but without `optnone`. So, your
function is optimized but actually called and the call result
is used, no constants are propagated etc.

Example:

```
__attribute__((noipa))
int foo() { return 1 + 2; }
int bar() { return foo(); }
```
should become

```
__attribute__((noipa))
int foo() { return 3; }
int bar() { return foo(); }
```
which it does not right now.
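To try this with today's compilers, something like the following could be used. It is a sketch under assumptions: the `NOIPA` macro is made up for this example, it relies on `__has_attribute` to pick GCC's `noipa` where available, and the `noinline` fallback is weaker (it blocks inlining but not other interprocedural optimizations). It also assumes a GCC-compatible compiler that accepts `__attribute__`.

```cpp
// Pick GCC's noipa where the compiler knows it; otherwise fall back to
// noinline, which is only an approximation of the desired semantics.
#if defined(__has_attribute)
#if __has_attribute(noipa)
#define NOIPA __attribute__((noipa))
#endif
#endif
#ifndef NOIPA
#define NOIPA __attribute__((noinline))
#endif

// The body may still be optimized (1 + 2 folded to 3), but with real noipa
// the caller must not learn that the result is 3.
NOIPA int foo() { return 1 + 2; }

// Expected to keep the call and use its result rather than a constant.
int bar() { return foo(); }
```

Whether `bar` actually keeps the call can only be checked by inspecting the generated code; the observable result is 3 either way.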


>
>> I'd expose it via Clang and -O0 could
>> set it as well (for the LTO case).
>>
>> When it comes to inexact definitions, optnone functions, and existing
>> attributes,
>> I'd be in favor of 1) always allowing the use of existing attributes,
>>
> I'm not sure what you mean by this ^ - could you rephrase/elaborate?
>
>
>> and 2) not deriving new ones for an inexact or optnone definition.
>>
> Also this ^ I'm similarly confused/unclear about.

So if you have a call of F, and F has attribute A, we can use
that fact at the call site, regardless of the definition of F.
F could be `optnone` or with non-exact linkage, but the information
attached to it is still usable.

If we go for the above, we can never derive/attach information
for non-exact linkage definitions. That way we prevent IPO from
using information that might be invalid if the definition is replaced.

It is all about where you disturb the IPO deduction in this case. I think
it is more beneficial to not attach new things, but an argument could be
made to allow that but no propagation. Both have benefits; it's not 100%
clear which is more desirable at the end of the day.


>
>
>> This is how the Attributor determines if a function-level attribute
>> could be derived or if we should only stick with the existing information:
>>
>> /// Determine whether the function \p F is IPO amendable
>> ///
>> /// If a function is exactly defined or it has alwaysinline attribute
>> /// and is viable to be inlined, we say it is IPO amendable
>> bool isFunctionIPOAmendable(const Function &F) {
>>   return F.hasExactDefinition() ||
>>          InfoCache.InlineableFunctions.count(&F);
>> }
>>
>> So, if the above check doesn't hold we will not add new attributes but
>> we will still use existing ones. This seems to me the right way to allow
>> users/frontends to provide information selectively.
>>
> Yep, that sounds right to me (if you put attributes on an optnone/noipa
> function, they should be usable/used - but none should be discovered/added
> later by inspection of the implementation of such a function) - currently
> doesn't seem to be the case for the (old pass manager?) FunctionAttrs pass,
> so I have to figure some things out there.

That is what I tried to say above, I think.

In the end, I want to know that foo does not access memory but
bar could for all we know:

```
__attribute__((pure, optnone))         // or non-exact linkage
void pure_optnone() { /* empty */ }

__attribute__((optnone))               // or non-exact linkage
void optnone() { /* empty */ }

void foo() { pure_optnone(); }

void bar() { optnone(); }
```
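The call-site reasoning for this example could be modeled roughly as below. This is a toy model with hypothetical names (`ToyCallee`, `callMayAccessMemory`, `callerMayAccessMemory`); "declared no memory access" stands in for whatever attribute the user or frontend wrote on the callee.

```cpp
#include <vector>

// Toy callee record; not LLVM's real data structures.
struct ToyCallee {
  bool declaredNoMemoryAccess = false;  // attribute written by user/frontend
  bool opaqueBody = false;              // optnone or non-exact linkage
  bool bodyAccessesMemory = false;      // only knowable by reading the body
};

// Conservative answer to "may a call to this callee access memory?"
bool callMayAccessMemory(const ToyCallee &C) {
  if (C.declaredNoMemoryAccess)
    return false;  // existing attribute: always usable
  if (C.opaqueBody)
    return true;   // must not peek at the body, so assume the worst
  return C.bodyAccessesMemory;
}

// A caller may access memory if any of its calls may (ignoring the caller's
// own loads and stores, which this toy model does not represent).
bool callerMayAccessMemory(const std::vector<ToyCallee> &Callees) {
  for (const ToyCallee &C : Callees)
    if (callMayAccessMemory(C))
      return true;
  return false;
}
```

Under this model, `foo` is known not to access memory because its only callee carries the attribute, while `bar` stays conservative even though the callee's body is empty.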

~ Johannes

Reid Kleckner via llvm-dev

Apr 19, 2021, 12:29:51 PM
to David Blaikie, llvm-dev, Florian Hahn
The thread is long and I haven't read it all, but I like the approach of:
- add a new noipa LLVM IR attribute (feel free to bikeshed the name)
- make clang optnone imply noipa (maybe in LLVM too, but I haven't thought hard about it)

Mehdi AMINI via llvm-dev

Apr 19, 2021, 5:35:44 PM
to Johannes Doerfert, llvm-dev, Florian Hahn
On Sun, Apr 18, 2021 at 8:29 PM Johannes Doerfert via llvm-dev <llvm...@lists.llvm.org> wrote:
I'm very much in favor of `noipa`. It comes up every few months
and it would be widely useful. I'd expose it via Clang and -O0 could
set it as well (for the LTO case).

When it comes to inexact definitions, optnone functions, and existing
attributes,
I'd be in favor of 1) always allowing the use of existing attributes,
and 2) not deriving new ones for an inexact or optnone definition.

+1 from me on this FWIW! 

-- 
Mehdi

David Blaikie via llvm-dev

Apr 19, 2021, 5:41:49 PM
to Johannes Doerfert, llvm-dev, Florian Hahn
On Sun, Apr 18, 2021 at 10:30 PM Johannes Doerfert <johannes...@gmail.com> wrote:
>
>
> On 4/18/21 10:51 PM, David Blaikie wrote:
> > On Sun, Apr 18, 2021 at 8:29 PM Johannes Doerfert <
> > johannes...@gmail.com> wrote:
> >
> >> I'm very much in favor of `noipa`. It comes up every few months
> >> and it would be widely useful.
> >
> > Out of curiosity, what sort of uses do you have in mind for it?
>
> Most times people basically want `noinline` to also mean "no
> interprocedural optimization", but without `optnone`. So, your
> function is optimized but actually called and the call result
> is used, no constants are propagated etc.
>
> Example:
>
> ```
> __attribute__((noipa))
> int foo() { return 1 + 2; }
> int bar() { return foo(); }
> ```
> should become
>
> ```
> __attribute__((noipa))
> int foo() { return 3; }
> int bar() { return foo(); }
> ```
> which it does not right now.

I'm curious what the use case is you've come across (the justification
for the GCC implementation of noipa was mostly for compiler testing -
which is my interest in having these semantics (under optnone or
otherwise) - so just curious what other use cases I should have in
mind, etc)

> >> I'd expose it via Clang and -O0 could
> >> set it as well (for the LTO case).
> >>
> >> When it comes to inexact definitions, optnone functions, and existing
> >> attributes,
> >> I'd be in favor of 1) always allowing the use of existing attributes,
> >>
> > I'm not sure what you mean by this ^ - could you rephrase/elaborate?
> >
> >
> >> and 2) not deriving new ones for an inexact or optnone definition.
> >>
> > Also this ^ I'm similarly confused/unclear about.
>
> So if you have a call of F, and F has attribute A, we can use
> that fact at the call site, regardless of the definition of F.
> F could be `optnone` or with non-exact linkage, but the information
> attached to it is still usable.

+1 SGTM.

> If we go for the above we can never derive/attach information
> for a non-exact linkage definitions. That way we prevent IPO from
> using information that might be invalid if the definition is replaced.

Yup, sounds good.

> It is all about where you disturb the ipo deduction in this case, I think
> it is more beneficial to not attach new things but an argument could be
> made to allow that but no propagation.

Allow adding them, but never using them? Yeah, that doesn't seem
especially helpful/useful - the attributes are entirely for IPO, so if
you want to block IPO it seems best not to add them.

Got it,

I'll see about posting an implementation of noipa and switching
__attribute__((optnone)) over to lower to LLVM's optnone+noipa rather
than optnone+noinline.

Happy if someone wants to add clang support for an
__attribute__((noipa)) lowering to that LLVM noipa once it's in (maybe
I'll do it, guess it's probably fairly cheap/easy).

- Dave

Johannes Doerfert via llvm-dev

Apr 19, 2021, 7:32:37 PM
to David Blaikie, llvm-dev, Florian Hahn

I looked for `noipa` in my inbox, here are some results that
show different use cases people brought up since March 2020:

https://reviews.llvm.org/D75815#1939277
https://bugs.llvm.org/show_bug.cgi?id=46463
https://reviews.llvm.org/D93838#2472155
https://reviews.llvm.org/D97971#2608302

Another use case is runtime call detection in the presence of definitions.
So, we detect `malloc` and also various OpenMP runtime calls, which works
fine because those are usually declarations. However, sometimes they are
not and then we can easily end up with signatures that do not match what we
expect anymore. At least that happens if we link in the OpenMP GPU runtime
into an application.


>>>> I'd expose it via Clang and -O0 could
>>>> set it as well (for the LTO case).
>>>>
>>>> When it comes to inexact definitions, optnone functions, and existing
>>>> attributes,
>>>> I'd be in favor of 1) always allowing the use of existing attributes,
>>>>
>>> I'm not sure what you mean by this ^ - could you rephrase/elaborate?
>>>
>>>
>>>> and 2) not deriving new ones for an inexact or optnone definition.
>>>>
>>> Also this ^ I'm similarly confused/unclear about.
>> So if you have a call of F, and F has attribute A, we can use
>> that fact at the call site, regardless of the definition of F.
>> F could be `optnone` or with non-exact linkage, but the information
>> attached to it is still usable.
> +1 SGTM.
>
>> If we go for the above we can never derive/attach information
>> for a non-exact linkage definitions. That way we prevent IPO from
>> using information that might be invalid if the definition is replaced.
> Yup, sounds good.
>
>> It is all about where you disturb the ipo deduction in this case, I think
>> it is more beneficial to not attach new things but an argument could be
>> made to allow that but no propagation.
> Allow adding them, but never using them? Yeah, that doesn't seem
> especially helpful/useful - the attributes are entirely for IPO, so if
> you want to block IPO it seems best not to add them.

We could use them *inside* the function, but we can make that work
differently as well. IPO seems the more important target.

FWIW, I think `noipa` should not imply `noinline`, unsure if you
had that in mind or not.


> Happy if someone wants to add clang support for an
> __attribute__((noipa)) lowering to that LLVM noipa once it's in (maybe
> I'll do it, guess it's probably fairly cheap/easy).

Agreed, I won't volunteer right now, I doubt that I'll get to it
anytime soon. That said, I actually would like to use `noipa`, see
above.

~ Johannes


David Blaikie via llvm-dev

Apr 21, 2021, 10:29:41 PM
to Johannes Doerfert, llvm-dev, Florian Hahn
Implemented a first pass at adding noipa to IR/bitcode along with the basic
functionality: noipa implies "may be derefined"/not
is-definition-exact. https://reviews.llvm.org/D101011

On Mon, Apr 19, 2021 at 4:32 PM Johannes Doerfert

Ah, thanks for all the links/context!

Ah, right. Yeah, agreed.

Do you think it should require that noipa also carries noinline? (the
way optnone currently requires noinline) Or should we let the
non-inlining fall out naturally from the non-exact definition
property?

Johannes Doerfert via llvm-dev

Apr 22, 2021, 12:09:09 AM
to David Blaikie, llvm-dev, Florian Hahn

On 4/21/21 9:29 PM, David Blaikie wrote:
>
>>> I'll see about posting an implementation of noipa and switching
>>> __attribute__((optnone)) over to lower to LLVM's optnone+noipa rather
>>> than optnone+noinline.
>> FWIW, I think `noipa` should not imply `noinline`, unsure if you
>> had that in mind or not.
> Do you think it should require that noipa also carries noinline? (the
> way optnone currently requires noinline) Or should we let the
> non-inlining fall out naturally from the non-exact definition
> property?
>
So, non-exact definitions do not prevent inlining. You can even
create an internal copy and use that for IPO, think of it as
"inline-then-outline".

That said, I believe it is a mistake that `optnone` requires
`noinline`. There is no reason for it to do so on the IR level.
If you argue C-level `optnone` should imply `noinline`, that is
something worth discussing, though on the IR level we can
decouple them. One use case, for example: the not-optimized version
is called from functions that are `optnone` themselves, while
other call sites are inlined and your function is optimized there. So
you can use the two attributes to do context-sensitive `optnone`.

Circling back to `noipa`, I'm very much in favor of letting it
compose freely with the others, at least in the IR. So, it does
not require, nor imply `noinline` or `optnone`. Similarly,
`noinline` does not imply `noipa`, neither does `optnone`. The
latter might be surprising but I imagine I can use function
attributes of an `optnone` function at the call site but I will
not if the function is `noipa`.

Others might have different opinions though.

~ Johannes

via llvm-dev

Apr 22, 2021, 10:43:59 AM
to johannes...@gmail.com, dbla...@gmail.com, f...@fhahn.com, llvm...@lists.llvm.org
> That said, I believe it is a mistake that `optnone` requires
> `noinline`. There is no reason for it to do so on the IR level.
> If you argue C-level `optnone` should imply `noinline`, that is
> a something worth discussing, though on the IR level we can
> decouple them. Use case, for example, the not-optimized version
> is called from functions that are `optnone` themselves while
> other call sites are inlined and your function is optimized. So
> you can use the two attributes to do context sensitive `optnone`.

The original intent for `optnone` was to imitate the -O0 pipeline
to the extent that was feasible. The -O0 pipeline (as constructed
by Clang) runs just the always-inliner, not the regular inliner;
so, functions marked `optnone` should not be inlined. The way
to achieve that effect most simply is to have `optnone` require
`noinline` and that's what we did.

If we have `optnone` stop requiring `noinline` and teach the
inliner to inline an `optnone` callee only into an `optnone` caller,
then we are violating the intent that `optnone` imitate -O0, because
that inlining would not have happened at -O0.

--paulr

Johannes Doerfert via llvm-dev

Apr 22, 2021, 11:42:27 AM
to paul.r...@sony.com, dbla...@gmail.com, f...@fhahn.com, llvm...@lists.llvm.org

On 4/22/21 9:43 AM, paul.r...@sony.com wrote:
>> That said, I believe it is a mistake that `optnone` requires
>> `noinline`. There is no reason for it to do so on the IR level.
>> If you argue C-level `optnone` should imply `noinline`, that is
>> a something worth discussing, though on the IR level we can
>> decouple them. Use case, for example, the not-optimized version
>> is called from functions that are `optnone` themselves while
>> other call sites are inlined and your function is optimized. So
>> you can use the two attributes to do context sensitive `optnone`.
> The original intent for `optnone` was to imitate the -O0 pipeline
> to the extent that was feasible. The -O0 pipeline (as constructed
> by Clang) runs just the always-inliner, not the regular inliner;
> so, functions marked `optnone` should not be inlined. The way
> to achieve that effect most simply is to have `optnone` require
> `noinline` and that's what we did.
>
> If we have `optnone` stop requiring `noinline` and teach the
> inliner to inline an `optnone` callee only into an `optnone` caller,
> then we are violating the intent that `optnone` imitate -O0, because
> that inlining would not have happened at -O0.

I think I initially read this wrong, hence the part below.
After reading it again, I have one question: Why would the
inliner inline something that is not `always_inline` into
an `optnone` caller? That would violate the idea of `optnone`,
IMHO, regardless if the callee is `optnone` or not. That is
why I don't believe `noinline` on the callee is necessary
for your use case.

--- I misread and I wrote this, might be useful still ---

Let's look at an example. I show it in C but what I am arguing
about is still IR, as described earlier, C is different.

```
__attribute__((optnone))
void foo() { ... }
__attribute__((optnone, noinline))
void bar() { foo(); ... }
void baz() { foo(); bar(); ... }
```
Here, the user has combined optnone and noinline to get distinct
effects, all of which you might want:
 - foo is not optimized, not inlined into bar, but inlined into baz
 - bar is not optimized and not inlined into baz
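The inlining decisions listed above could be modeled as a small predicate. This is only a sketch of the proposed decoupled semantics, with hypothetical names (`ToyFn`, `mayInline`); it is not how LLVM's inliner is implemented, and today's IR verifier would reject `optnone` without `noinline`.

```cpp
// Toy attribute record for a function; not LLVM's real data structures.
struct ToyFn {
  bool optnone = false;
  bool noinline = false;
  bool alwaysinline = false;
};

// Would the inliner be allowed to inline Callee into Caller under the
// decoupled semantics argued for in this message?
bool mayInline(const ToyFn &Callee, const ToyFn &Caller) {
  if (Callee.noinline)
    return false;  // explicit request on the callee
  if (Callee.alwaysinline)
    return true;   // honored even inside optnone callers
  if (Caller.optnone)
    return false;  // optnone callers inline nothing else
  return true;     // otherwise left to the usual heuristics
}
```

The design point is that the caller's `optnone` and the callee's `noinline` are checked independently, so the callee's own `optnone` never enters the decision.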

I hope this makes sense.

~ Johannes

via llvm-dev

Apr 22, 2021, 12:05:35 PM
to johannes...@gmail.com, dbla...@gmail.com, f...@fhahn.com, llvm...@lists.llvm.org
The inliner should be ignoring `optnone` callers, so it would
never inline *anything* into an `optnone` caller. (Other than
an `alwaysinline` function.)

I had read this:

> >> I believe it is a mistake that `optnone` requires `noinline`.

and the case that came to mind is inlining an `optnone` callee
into a not-`optnone` caller. The inlined copy would then be
treated to further optimization, which violates the idea of
`optnone`.

Now, the inliner already knows to avoid `noinline` callees, so
attaching `noinline` to `optnone` functions was (at the time)
considered an optimal way to avoid the problematic case. We
could instead teach the inliner to skip `optnone` callees, and
that would allow us to eliminate the requirement that `optnone`
functions must also be `noinline`. I am unclear why redefining
`optnone` to _imply_ `noinline` (rather than _require_ `noinline`)
is better, but then I don't work much with attributes.

The notion of allowing an `optnone` caller to inline an `optnone`
callee sounds like it would also violate the intent of `optnone`
in that it should imitate -O0, where inlining is confined to
`alwaysinline` callees, and `optnone` is defined to conflict with
`alwaysinline` (because if you always inline something, you are
allowing it to have subsequent optimizations same as the caller,
which conflicts with `optnone`).

So, if you want to undo the _requirement_ that `optnone` must
have `noinline`, but then redefine `optnone` such that it can't
be inlined anywhere, you've done something that seems to have no
practical effect. Maybe that helps Attributor in some way, but
I don't see any other reason to be making this change.

Johannes Doerfert via llvm-dev

Apr 22, 2021, 12:23:14 PM
to paul.r...@sony.com, dbla...@gmail.com, f...@fhahn.com, llvm...@lists.llvm.org

My point is, it already does ignore `optnone` callers
and inlines only `alwaysinline` calls into them:

https://clang.godbolt.org/z/fznbjTEd5


>
> I had read this:
>
>>>> I believe it is a mistake that `optnone` requires `noinline`.
> and the case that came to mind is inlining an `optnone` callee
> into a not-`optnone` caller. The inlined copy would then be
> treated to further optimization, which violates the idea of
> `optnone`.

But that is a composition issue. If you do not want to
inline an `optnone` callee into a non-`optnone` caller,
then add `noinline` to the callee. If you don't mind if
it is inlined into non-`optnone` callers and optimized
in there, then don't. My last email contained an example
to show the different cases, you can mix and match IR
attributes to get what you want. Requiring them to be
tied is not improving anything but just restricting the
options.


> Now, the inliner already knows to avoid `noinline` callees, so
> attaching `noinline` to `optnone` functions was (at the time)
> considered an optimal way to avoid the problematic case. We
> could instead teach the inliner to skip `optnone` callees, and
> that would allow us to eliminate the requirement that `optnone`
> functions must also be `noinline`. I am unclear why redefining
> `optnone` to _imply_ `noinline` (rather than _require_ `noinline`)
> is better, but then I don't work much with attributes.

The inliner will not inline calls into an `optnone` caller
if it is not necessary. As said before, that would violate
the `optnone` idea for the caller, no matter what the callee
looks like. So requiring  `noinline` on the callee seems
to me like a workaround or an oversight.

It is better to not require them together because you can
actually describe more distinct scenarios. Please take
another look at my example in the last email, it shows
what is possible if you split them. Furthermore, `optnone`
does by design imply `noinline` for the call sites in the
caller, or at least nobody argued that it shouldn't. Thus,
requiring `noinline` on the callee is simply unnecessary
as it does not add any value.


>
> The notion of allowing an `optnone` caller to inline an `optnone`
> callee sounds like it would also violate the intent of `optnone`
> in that it should imitate -O0, where inlining is confined to
> `alwaysinline` callees, and `optnone` is defined to conflict with
> `alwaysinline` (because if you always inline something, you are
> allowing it to have subsequent optimizations same as the caller,
> which conflicts with `optnone`).

Nobody said `optnone` callers should inline calls that are
not always_inline, at least so far I have not seen that
argument be made anywhere. I'll just skip this paragraph.


>
> So, if you want to undo the _requirement_ that `optnone` must
> have `noinline`, but then redefine `optnone` such that it can't
> be inlined anywhere, you've done something that seems to have no
> practical effect. Maybe that helps Attributor in some way, but
> I don't see any other reason to be making this change.

I do not want to say `optnone` cannot be inlined. `noinline`
says it cannot be inlined. If you want it to not be inlined,
use `noinline`, if you want it to not be optimized in its
own function, use `optnone`, if you want both, use both.

The practical effect was literally showcased in my last email,
please go back and look at the example.

I don't know what the Attributor has to do with this, I'm happy
to hear your thoughts on that though :)

~ Johannes

via llvm-dev

Apr 22, 2021, 12:44:17 PM
to johannes...@gmail.com, dbla...@gmail.com, f...@fhahn.com, llvm...@lists.llvm.org
> Let's look at an example. I show it in C but what I am arguing
> about is still IR, as described earlier, C is different.
>
> ```
> __attribute__((optnone))
> void foo() { ... }
> __attribute__((optnone, noinline))
> void bar() { foo(); ... }
> void baz() { foo(); bar(); ... }
> ```
> Here, the user has utilized optnone and noinline to get different
> kinds of distinct effects that you could all want:
>  - foo is not optimized, not inlined into bar, but inlined into baz

foo's non-inlined instance is not optimized; but, the instance that
is inlined into baz *is* optimized. How does that obey `optnone`?

>  - bar is not optimized and not inlined into baz
>
> I hope this makes sense.
>
> ~ Johannes

The use-case for `optnone` is to allow selectively not-optimizing
a function, which I've seen used only to permit better debugging
of that function. Inlining optimizes (some instances of) the
function, against the coder's express wishes, and interfering with
the better debugging enabled by not-optimizing. I don't see how
that is beneficial to the coder, or any other use-case. If you
have a practical use-case I would love to hear it.

Yes, I do see that separating the concerns allows this weird case
of a sometimes-optimized function, but I don't see any benefit.
Certainly it would be super confusing to the coder, and at the
Clang level I would strenuously oppose decoupling these.

Apologies for mentioning Attributor; I have no idea how it works,
and I was rather idly speculating why you want to decouple the
optnone and noinline attributes.

David Blaikie via llvm-dev

Apr 22, 2021, 12:49:44 PM
to Johannes Doerfert, Florian Hahn, llvm-dev
There seems to be a bunch of confusion and probably some
conflation/ambiguity about whether we're talking about IR constructs
or the C attributes.

Johannes - I assume your claim is restricted mostly to the IR? That
having optnone not require or imply noinline improves orthogonality of
features and that there are reasonable use cases where one might want
optnone while allowing inlining (of the optnone function) or optnone
while disallowing inlining (of the optnone function)

Paul - I think you're mostly thinking about/interested in the specific
source level/end user use case that motivated the initial
implementation of optnone. Where, I tend to agree - inlining an
optnone function is not advantageous to the user. Though it's possible
Johannes's argument could be generalized from IR to C and still
apply: orthogonal features are more powerful and the user can always
compose them together to get what they want. (good chance they're
using attributes behind macros for ease of use anyway - they're a bit
verbose to write by hand all the time)

There's also the -O0 use of optnone these days: clang puts optnone on
all functions when compiling with -O0, the intent being to treat such
functions as though they were compiled in a separate object file
without optimizations (that's me projecting what I /think/ the mental
model should be). That use, similarly, will probably want to
keep the current behavior (no ipa/inlining and no optimization -
however that's phrased).

Essentially the high level use cases of optnone all look like "imagine
if I compiled this in a separate object file without LTO" - apparently
even noipa+optnone isn't enough for that, maybe (I need to test that
more, based on a comment from Johannes earlier that inexact
definitions don't stop the inliner... )? Sounds like maybe it's more
like what I'm thinking of is "what if this function had weak linkage"
(ie: could be replaced by a totally different one)?

Johannes Doerfert via llvm-dev

Apr 22, 2021, 12:54:21 PM
to paul.r...@sony.com, dbla...@gmail.com, f...@fhahn.com, llvm...@lists.llvm.org

On 4/22/21 11:44 AM, paul.r...@sony.com wrote:
>> Let's look at an example. I show it in C but what I am arguing
>> about is still IR, as described earlier, C is different.
>>
>> ```
>> __attribute__((optnone))
>> void foo() { ... }
>> __attribute__((optnone, noinline))
>> void bar() { foo(); ... }
>> void baz() { foo(); bar(); ... }
>> ```
>> Here, the user has utilized optnone and noinline to get different
>> kinds of distinct effects that you could all want:
>>  - foo is not optimized, not inlined into bar, but inlined into baz
> foo's non-inlined instance is not optimized; but, the instance that
> is inlined into baz *is* optimized. How does that obey `optnone`?

`optnone`  -> make sure the code in this symbol is not optimized.
`noinline` -> make sure the code in this symbol is not copied
              into another symbol.

Two separate ideas, if you want both, use both attributes,
nobody argues against that use case. See below for a "real world"
use case.
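
To make the quoted foo/bar/baz example compilable, here is a sketch with the elided bodies replaced by trivial placeholder arithmetic (an assumption of this sketch, not part of the original example). The `OPTNONE` guard macro is likewise an assumption, so that the file also builds with compilers that lack the attribute. Note that Clang's source-level `__attribute__((optnone))` currently also marks the function `noinline`, so the difference between `foo` and `bar` would only become visible at the IR level under the decoupled rule.

```c
/* Guard macros are an assumption of this sketch. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif
#if __has_attribute(optnone)
#define OPTNONE __attribute__((optnone))
#else
#define OPTNONE
#endif

/* foo: optnone only. The code inside the symbol foo is not optimized;
   under the decoupled rule a copy could still be inlined into baz. */
OPTNONE int foo(int x) {
  return x + 1; /* placeholder body */
}

/* bar: optnone and noinline. Not optimized, and never copied into a caller.
   The call to foo is not inlined here because bar itself is optnone. */
OPTNONE __attribute__((noinline)) int bar(int x) {
  return foo(x) * 2; /* placeholder body */
}

/* baz: an ordinary, optimized function that calls both. */
int baz(int x) {
  return foo(x) + bar(x); /* placeholder body */
}
```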


>>  - bar is not optimized and not inlined into baz
>>
>> I hope this makes sense.
>>
>> ~ Johannes
> The use-case for `optnone` is to allow selectively not-optimizing
> a function, which I've seen used only to permit better debugging
> of that function. Inlining optimizes (some instances of) the
> function, against the coder's express wishes, and interfering with
> the better debugging enabled by not-optimizing. I don't see how
> that is beneficial to the coder, or any other use-case. If you
> have a practical use-case I would love to hear it.

Now you bring in the C level. I explicitly, and multiple times,
said I argue on IR level. If you want C `__attribute__((optnone))`
to imply `noinline`, that would be fine with me. However, on
IR level there is no reason to tie them together.

Even on C it is not clear. Think of a context sensitive problem
in a large application. You want pristine code for some calling
contexts but fast code for others. Right now, there is no way to
do that, except maybe using `__attribute__((flatten))` on all
callees that need to be fast. However, once you decouple the two
attributes you can say that for some call sites you don't want it
to be inlined but for others you do. The callers you don't want it
inlined into are probably `optnone` themselves, so there is
no inlining happening anyway, no need to say anything special for
them.


>
> Yes, I do see that separating the concerns allows this weird case
> of a sometimes-optimized function, but I don't see any benefit.
> Certainly it would be super confusing to the coder, and at the
> Clang level I would strenuously oppose decoupling these.

I'd assume coders are capable of understanding the difference
between `optnone` and `noinline` and how they compose. That said,
I am only arguing on the IR level anyway and the conversation about what
`__attribute__((optnone))` should be is a different one.


>
> Apologies for mentioning Attributor; I have no idea how it works,
> and I was rather idly speculating why you want to decouple the
> optnone and noinline attributes.

No worries.

~ Johannes

Johannes Doerfert via llvm-dev

Apr 22, 2021, 1:04:35 PM
to David Blaikie, Florian Hahn, llvm-dev

On 4/22/21 11:49 AM, David Blaikie wrote:
> There seems to be a bunch of confusion and probably some
> conflation/ambiguity about whether we're talking about IR constructs
> or the C attributes.
>
> Johannes - I assume your claim is restricted mostly to the IR? That
> having optnone not require or imply noinline improves orthogonality of
> features and that there are reasonable use cases where one might want
> optnone while allowing inlining (of the optnone function) or optnone
> while disallowing inlining (of the optnone function)

Yes, IR level is my concern. I'd be in favor of matching it in C
but I don't care enough to go through the discussions.

My latest use case: `s/optnone//g` should not result in a verifier
error when you just want to try out an optimization on a single function.


>
> Paul - I think you're mostly thinking about/interested in the specific
> source level/end user use case that motivated the initial
> implementation of optnone. Where, I tend to agree - inlining an
> optnone function is not advantageous to the user. Though it's possible
> Johannes's argument could be generalized from IR to C and still
> apply: orthogonal features are more powerful and the user can always
> compose them together to get what they want. (good chance they're
> using attributes behind macros for ease of use anyway - they're a bit
> verbose to write by hand all the time)

I'd give the user the capability building blocks and let them work
with that, but again, this is not my main concern right now.


> There's also the -O0 use of optnone these days (clang puts optnone on
> all functions when compiling with -O0 - the intent being to treat such
> functions as though they were compiled in a separate object file
> without optimizations (that's me projecting what I /think/ the mental
> model should be) - which, similarly, I think will probably want to
> keep the current behavior (no ipa/inlining and no optimization -
> however that's phrased).
>
> Essentially the high level use cases of optnone all look like "imagine
> if I compiled this in a separate object file without LTO" - apparently
> even noipa+optnone isn't enough for that, maybe (I need to test that
> more, based on a comment from Johannes earlier that inexact
> definitions don't stop the inliner... )? Sounds like maybe it's more
> like what I'm thinking of is "what if this function had weak linkage"
> (ie: could be replaced by a totally different one)?

Yes, weak should do the trick. That said, if you want "separate
object file without LTO", go with `noinline` + `noipa`. This will
make the call edges optimization barriers. If you want to also not
optimize the function, add `optnone` as required.
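
A minimal sketch of the weak-linkage idea, reusing `f1` from the start of the thread. The wrapper name `call_f1` is hypothetical. The assumption here is an ELF-style toolchain where `__attribute__((weak))` yields a definition that another object file may replace at link time, so the optimizer should not fold the call to the constant 3 based on this body alone.

```c
/* Weak definition: a strong definition of f1 in another object file would
   win at link time, so callers cannot rely on this particular body. */
__attribute__((weak)) int f1(void) {
  return 3;
}

/* Hypothetical caller: expected to keep a real call and use its result
   rather than hardcoding 3. */
int call_f1(void) {
  return f1();
}
```

Whether the call result is actually used can be checked in the emitted IR or assembly; this sketch only shows where the attribute goes.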

FWIW, if all functions are `optnone`, `noinline` and `noipa` are
not needed, that is the -O0 case. More specifically, if your caller
is `optnone`, `noinline` and `noipa` are not needed (for that caller).

~ Johannes

David Blaikie via llvm-dev

Apr 22, 2021, 1:21:24 PM
to Johannes Doerfert, Florian Hahn, llvm-dev
On Thu, Apr 22, 2021 at 10:04 AM Johannes Doerfert
<johannes...@gmail.com> wrote:
>
>
> On 4/22/21 11:49 AM, David Blaikie wrote:
> > There seems to be a bunch of confusion and probably some
> > conflation/ambiguity about whether we're talking about IR constructs
> > or the C attributes.
> >
> > Johannes - I assume your claim is restricted mostly to the IR? That
> > having optnone not require or imply noinline improves orthogonality of
> > features and that there are reasonable use cases where one might want
> > optnone while allowing inlining (of the optnone function) or optnone
> > while disallowing inlining (of the optnone function)
>
> Yes, IR level is my concern. I'd be in favor of matching it in C
> but I don't care enough to go through the discussions.
>
> My latest use case: `s/optnone//g` should not result in a verifier
> error when you just want to try out an optimization on a single function.

I guess you meant adding optnone, rather than removing it? (removing
optnone shouldn't cause any verifier errors, does it?)

Starts to feel like a long list to get what seems like one holistic
concept ("treat it as though it were a separate non-LTO object file" /
"treat it as though this function had weak linkage"), but it's
probably OK/a fine thing to do (not like there's a high cost to having
multiple attributes since the attribute lists are shared, etc) - just
psychologically for me, there seems to be one core concept and
stitching it together from several attributes makes me worry that
there are gaps (as there have been/what's motivating this discussion -
though it certainly sounds like there will be fewer gaps after this
work, for sure).

> FWIW, if all functions are `optnone`, `noinline` and `noipa` are
> not needed, that is the -O0 case. More specifically, if your caller
> is `optnone`, `noinline` and `noipa` are not needed (for that caller).

Right - though LTO is the case that motivated adding optnone for -O0,
so it would be respected even under LTO - so in that case we'd want
noinline and noipa, by the sounds of it.

- Dave

Johannes Doerfert via llvm-dev

Apr 22, 2021, 1:32:29 PM
to David Blaikie, Florian Hahn, llvm-dev

On 4/22/21 12:21 PM, David Blaikie wrote:
> On Thu, Apr 22, 2021 at 10:04 AM Johannes Doerfert
> <johannes...@gmail.com> wrote:
>>
>> On 4/22/21 11:49 AM, David Blaikie wrote:
>>> There seems to be a bunch of confusion and probably some
>>> conflation/ambiguity about whether we're talking about IR constructs
>>> or the C attributes.
>>>
>>> Johannes - I assume your claim is restricted mostly to the IR? That
>>> having optnone not require or imply noinline improves orthogonality of
>>> features and that there are reasonable use cases where one might want
>>> optnone while allowing inlining (of the optnone function) or optnone
>>> while disallowing inlining (of the optnone function)
>> Yes, IR level is my concern. I'd be in favor of matching it in C
>> but I don't care enough to go through the discussions.
>>
>> My latest use case: `s/optnone//g` should not result in a verifier
>> error when you just want to try out an optimization on a single function.
> I guess you meant adding optnone, rather than removing it? (removing
> optnone shouldn't cause any verifier errors, does it?)

Yes, right, my bad. Removing `noinline` or adding `optnone` can
get you in trouble.

If you want "separate non-LTO" behavior by design, put it in a
different file and don't compile that file with LTO ;)

Inside a single TU there is no "separate non-LTO" idea, it is
a single TU after all. To get the same effect we build it from
blocks that have a meaning in the single TU case. Sure, there
might be gaps left, that usually means there is something missing
in the single TU case as well so a new attribute is in order.


>> FWIW, if all functions are `optnone`, `noinline` and `noipa` are
>> not needed, that is the -O0 case. More specifically, if your caller
>> is `optnone`, `noinline` and `noipa` are not needed (for that caller).
> Right - though LTO is the case that motivated adding optnone for -O0,
> so it would be respected even under LTO - so in that case we'd want
> noinline and noipa, by the sounds of it.

If you compile only one file with -O0 and then LTO it with other files
that do not have -O0, you probably want to add `noinline` + `noipa`
at the "entry points" that you want to debug. I think -O0 could
reasonably add all three attributes anyway, if you say O0 they all
make sense (to me). If you want more control, you need to seed them
manually.

~ Johannes

via llvm-dev

Apr 22, 2021, 1:37:46 PM
to dbla...@gmail.com, johannes...@gmail.com, f...@fhahn.com, llvm...@lists.llvm.org
> There seems to be a bunch of confusion and probably some
> conflation/ambiguity about whether we're talking about IR constructs
> or the C attributes.
>
> Johannes - I assume your claim is restricted mostly to the IR? That
> having optnone not require or imply noinline improves orthogonality of
> features and that there are reasonable use cases where one might want
> optnone while allowing inlining (of the optnone function) or optnone
> while disallowing inlining (of the optnone function)

Even at the IR level, I'd argue that inlining a function marked optnone
is violating the contract that the function should not be optimized;
because once it's inlined somewhere else, there's no control over the
optimization applied to the inlined instance.

Not to say we couldn't redefine the IR optnone that way, but it really
feels wrong to have 'optnone' actually mean 'optsometimes'.

> Paul - I think you're mostly thinking about/interested in the specific
> source level/end user use case that motivated the initial
> implementation of optnone. Where, I tend to agree - inlining an
> optnone function is not advantageous to the user. Though it's possible
> Johannes's argument could be generalized from IR to C and still
> apply: orthogonal features are more powerful and the user can always
> compose them together to get what they want. (good chance they're
> using attributes behind macros for ease of use anyway - they're a bit
> verbose to write by hand all the time)

I'm obviously finding it hard to imagine a real use-case for that...
I mean, sure you can lay out cases and say in a rather theoretical
way, here's this interesting thing that happens when you do this.
Interesting things are interesting, but are they practical/useful?
Any non-speculative, real-world applications? The YAGNI principle
applies here.

(I believe the original inspiration was an MSVC feature, actually.)

As long as the existing Clang __attribute__((optnone)) semantics don't
change (i.e., continue to imply noinline) it won't affect my users;
but I would *really* not want to change something like that on them,
without a bona fide use-case that could be readily explained.

Here's a real-world case that might help explain my resistance.
Sony has a downstream feature that allows suppressing debug-info for
inlined functions; the argument is that these are generally small,
easily verifiable, and debugging sessions that keep popping down into
them are annoying and distracting from looking at the real problem.

Our initial implementation depended on whether the function was
actually inlined. For one thing, it was easy to identify inlined
scopes, and just not emit them. However, this was a terrible user
experience, because whether step-in did or didn't happen was dependent
on how the optimizer happened to feel that day. Programmers had no
control over their debugging experience.

We changed this so that programmers could tell, by looking at their
source code, whether debug info would be suppressed. In effect it's
a command-line option that implicitly adds 'nodebug' to a given set
of cases (methods defined in-class, 'inline' keyword).

So, anything that smacks of "you get different things depending on
whether the compiler decided to inline your function" just makes me
twitch.

And that's what the "optnone doesn't mean noinline" proposal does.

> There's also the -O0 use of optnone these days (clang puts optnone on
> all functions when compiling with -O0 - the intent being to treat such
> functions as though they were compiled in a separate object file
> without optimizations (that's me projecting what I /think/ the mental
> model should be) - which, similarly, I think will probably want to
> keep the current behavior (no ipa/inlining and no optimization -
> however that's phrased).

The -O0 case was so that you can mix -O0 with LTO and have it stick.
Your as-if seems like a reasonable model for it.

Thanks,

Johannes Doerfert via llvm-dev

Apr 22, 2021, 1:57:07 PM
to paul.r...@sony.com, dbla...@gmail.com, f...@fhahn.com, llvm...@lists.llvm.org

On 4/22/21 12:37 PM, paul.r...@sony.com wrote:
>> There seems to be a bunch of confusion and probably some
>> conflation/ambiguity about whether we're talking about IR constructs
>> or the C attributes.
>>
>> Johannes - I assume your claim is restricted mostly to the IR? That
>> having optnone not require or imply noinline improves orthogonality of
>> features and that there are reasonable use cases where one might want
>> optnone while allowing inlining (of the optnone function) or optnone
>> while disallowing inlining (of the optnone function)
> Even at the IR level, I'd argue that inlining a function marked optnone
> is violating the contract that the function should not be optimized;
> because once it's inlined somewhere else, there's no control over the
> optimization applied to the inlined instance.
>
> Not to say we couldn't redefine the IR optnone that way, but it really
> feels wrong to have 'optnone' actually mean 'optsometimes'.

It does not. Take `noinline` as an example. A `noinline` function
is not inlined, so far so good. Now a caller of a `noinline`
function might be inlined all over the place anyway.
What I try to say is that function attributes apply to the function,
not to the rest of the world. If you want to say: do not optimize this
code ever, not here nor anywhere else, use `optnone` + `noinline`. If
you want the function symbol to contain unoptimized code so
you can debug it, use `optnone`.

Let's make my context sensitive debugging example more concrete:

static void f1(int x) { ... }
static void f2(int x) { ...; f1(x); ... }
static void f3(int x) { ...; f2(x); ... }
static void f4(int x) { ...; f3(x); ... }

static void broken() { f4(B); }
static void working() {
  for (int i = 0; i < 1<<20; ++i)
    f4(A);
}

void entry() {
  working();
  broken();
}

So, let's assume we crash somewhere in f1 when we reach it from
broken but not from working. To debug we want to avoid optimizing
f1-4 and broken. To do that we can add `optnone` to all 5 functions
and `noinline` to broken. The effect will be that we have untouched
code in the call chain we want to debug while we potentially/probably
have reasonably fast code in the context of working which allows us
to actually run this fast.

Right now, you can get that effect if you use `__attribute__((flatten))`
on working, however it reverses the problem. You are required to "mark"
all context that should be fast, not the ones you want to debug. Both
can be useful (IMHO).
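
A compilable sketch of the annotations described above, for illustration only. The elided bodies are replaced by a trivial counter, and `A`, `B`, `total`, and the `OPTNONE` guard macro are assumptions of this sketch. Because Clang's source-level `__attribute__((optnone))` currently also implies `noinline`, compiling this today would not inline f1-f4 into `working`; the sketch only shows where the attributes would go if the two were decoupled.

```c
/* Guard macros are an assumption of this sketch. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif
#if __has_attribute(optnone)
#define OPTNONE __attribute__((optnone))
#else
#define OPTNONE
#endif

enum { A = 1, B = 2 }; /* hypothetical stand-ins for the arguments A and B */
static long total;     /* placeholder state so the calls have an effect */

/* The call chain to debug: optnone only, so the out-of-line copies stay
   unoptimized. */
OPTNONE static void f1(int x) { total += x; }
OPTNONE static void f2(int x) { f1(x); }
OPTNONE static void f3(int x) { f2(x); }
OPTNONE static void f4(int x) { f3(x); }

/* The failing context: unoptimized and never merged into entry. */
OPTNONE __attribute__((noinline)) static void broken(void) { f4(B); }

/* The hot context: no attributes, so it may be optimized freely. */
static void working(void) {
  for (int i = 0; i < 1 << 20; ++i)
    f4(A);
}

void entry(void) {
  working();
  broken();
}
```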


>
>> Paul - I think you're mostly thinking about/interested in the specific
>> source level/end user use case that motivated the initial
>> implementation of optnone. Where, I tend to agree - inlining an
>> optnone function is not advantageous to the user. Though it's possible
>> Johannes's argument could be generalized from IR to C and still
>> apply: orthogonal features are more powerful and the user can always
>> compose them together to get what they want. (good chance they're
>> using attributes behind macros for ease of use anyway - they're a bit
>> verbose to write by hand all the time)
> I'm obviously finding it hard to imagine a real use-case for that...
> I mean, sure you can lay out cases and say in a rather theoretical
> way, here's this interesting thing that happens when you do this.
> Interesting things are interesting, but are they practical/useful?
> Any non-speculative, real-world applications? The YAGNI principle
> applies here.

What about the above? I can totally imagine something like this.


> (I believe the original inspiration was an MSVC feature, actually.)
>
> As long as the existing Clang __attribute__((optnone)) semantics don't
> change (i.e., continue to imply noinline) it won't affect my users;
> but I would *really* not want to change something like that on them,
> without a bona fide use-case that could be readily explained.
>
> Here's a real-world case that might help explain my resistance.
> Sony has a downstream feature that allows suppressing debug-info for
> inlined functions; the argument is that these are generally small,
> easily verifiable, and debugging sessions that keep popping down into
> them are annoying and distracting from looking at the real problem.
>
> Our initial implementation depended on whether the function was
> actually inlined. For one thing, it was easy to identify inlined
> scopes, and just not emit them. However, this was a terrible user
> experience, because whether step-in did or didn't happen was dependent
> on how the optimizer happened to feel that day. Programmers had no
> control over their debugging experience.
>
> We changed this so that programmers could tell, by looking at their
> source code, whether debug info would be suppressed. In effect it's
> a command-line option that implicitly adds 'nodebug' to a given set
> of cases (methods defined in-class, 'inline' keyword).
>
> So, anything that smacks of "you get different things depending on
> whether the compiler decided to inline your function" just makes me
> twitch.
>
> And that's what the "optnone doesn't mean noinline" proposal does.

Let's take a step back for a second and assume we would have
always said `optnone` + `noinline` gives you exactly what you
get right now with `optnone`. I think we can explain that to
people, we can say, `optnone` will prevent optimization "inside
this symbol" and `noinline` will prevent the code from being copied
into another symbol. Every use case you have could be served by
adding these two attributes instead of the one you do now. Everyone
would be as happy as they are, all the benefits would be exactly
the same, no behavior change if you use the two together. That said,
it would open up the door for context sensitive debugging. Now you
can argue nobody will ever want to debug only a certain path through
their program, but I find that position requires a justification more
than the opposite which assumes people will find a way to benefit from
it.


>
>> There's also the -O0 use of optnone these days (clang puts optnone on
>> all functions when compiling with -O0 - the intent being to treat such
>> functions as though they were compiled in a separate object file
>> without optimizations (that's me projecting what I /think/ the mental
>> model should be) - which, similarly, I think will probably want to
>> keep the current behavior (no ipa/inlining and no optimization -
>> however that's phrased).
> The -O0 case was so that you can mix -O0 with LTO and have it stick.
> Your as-if seems like a reasonable model for it.

I totally think -O0 can imply all three attributes, optnone, noipa,
noinline. That is not an issue as far as I'm concerned.

~ Johannes

via llvm-dev

Apr 22, 2021, 2:40:59 PM
to johannes...@gmail.com, dbla...@gmail.com, f...@fhahn.com, llvm...@lists.llvm.org
> > Not to say we couldn't redefine the IR optnone that way, but it really
> > feels wrong to have 'optnone' actually mean 'optsometimes'.
>
> It does not.

A point of phrasing: Please do not tell me how I feel.
I say it feels wrong, and your denial does not help the conversation.

The word "none" means "none." It does not mean "sometimes." Can we
agree on that much?

Redefining a term "xyz-none" to mean "xyz-sometimes" feels wrong.
If you want an attribute that means "optsometimes" then it should
be a new attribute, with a name that reflects its actual semantics.
I am not opposed to that, but my understanding is that we have been
arguing about the definition of the existing attribute.

> Take `noinline` as an example. A `noinline` function
> is not inlined, so far so good.

And if the compiler decides it is useful to make copies/clones of
the function, those aren't inlined either. The copies retain their
original attributes and semantics. (Perhaps the compiler can make
copies to take advantage of argument propagation, or some such. I
do not think this proposition is unreasonable.)

> Now a caller of a `noinline`
> function might be inlined all over the place anyway.
> What I try to say is that function attributes apply to the function,
> not to the rest of the world. If you want to say: do not optimize this
> code ever, not here nor anywhere else, use `optnone` + `noinline`. If
> you want the function symbol to contain unoptimized code so
> you can debug it, use `optnone`.

You are making a severe distinction between the copy of the function
that happens not to be inlined, and the copies that have been inlined,
such that the inlined copies have lost their original properties.
But just as the copies of the `noinline` function retain `noinline`
and are not inlined, I argue that the `optnone` function copies ought
to retain `optnone` and not be optimized.

LLVM does not have a way to not-optimize part of a function, so we
achieve the goal by not inlining `optnone` functions.

I dispute that the inlined copies of an `optnone` function should
lose that much of their original characteristics, and the rest of
the disagreement follows from there. But I have a suggestion to
offer below.


> Let's take a step back for a second and assume we would have
> always said `optnone` + `noinline` gives you exactly what you
> get right now with `optnone`. I think we can explain that to
> people, we can say, `optnone` will prevent optimization "inside
> this symbol" and `noinline` will prevent the code from being copied
> into another symbol. Every use case you have could be served by
> adding these two attributes instead of the one you do now. Everyone
> would be as happy as they are, all the benefits would be exactly
> the same, no behavior change if you use the two together. That said,
> it would open up the door for context sensitive debugging. Now you
> can argue nobody will ever want to debug only a certain path through
> their program, but I find that position requires a justification more
> than the opposite which assumes people will find a way to benefit from
> it.

I don't think "inside this symbol" is meaningful to most programmers.
They see methods/functions, and the internal operation of compilers
(e.g., making copies of functions) are relatively mysterious. I say
this as someone who has spent many decades helping programmers use my
compilers.

I don't dispute that you can invent a scenario where it could be useful;
I reserve the right to be unpersuaded that it would occur often enough
that people would think of and make use of the feature.

> I totally think -O0 can imply all three attributes, optnone, noipa,
> noinline.

I totally think -O0 can imply { opt-sometimes, noipa, noinline }; and
this combination can be an upgrade path away from the existing optnone.

Can we proceed on that basis?

via llvm-dev

Apr 22, 2021, 2:41:40 PM
to johannes...@gmail.com, dbla...@gmail.com, f...@fhahn.com, llvm...@lists.llvm.org
I'll just say,

> If you want "separate non-LTO" behavior by design, put it in a
> different file and don't compile that file with LTO ;)

that is impractical in many build systems, and an excessive amount
of work if you want to selectively build some code with -O0
because you're debugging something at the moment.

David Blaikie via llvm-dev

Apr 22, 2021, 2:54:43 PM
to Paul Robinson, Florian Hahn, llvm-dev

Most of this is a pretty academic discussion - and probably more
heat/angst/difficulty than is needed right now, as much as I do care
about both perspectives (orthogonality of features vs. usability for the
common case).

I'm going to add noipa, and I'm going to wire it up to optnone in
clang. It's possible one of two things happens there: Either we wire up
noipa the same way noinline is (optnone /requires/ noinline now, and
so it'd /require/ noipa) or we change LLVM IR to remove that
constraint/tie between optnone and noinline, and add noipa in that way
too. (the third option of having optnone require one but not both of
these attributes isn't a state I'd want to get in) - though clang -O0
and clang __attribute__((optnone)) would both still lower to
optnone+noinline+noipa regardless of whether LLVM enforces the
connection between them or not.
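
For readers following along, here is a minimal source-level sketch of the
situation being discussed. It assumes Clang for the `optnone` spelling (other
compilers fall back to `noinline` only, which is an assumption of this
sketch). The macro name `NO_OPT` and the function `call_f1` are hypothetical,
and `noipa` is not shown because in this thread it is only a proposed IR
attribute.

```cpp
// Sketch only: NO_OPT is a hypothetical helper macro, not part of any API.
#if defined(__clang__)
// Clang's source-level attribute; per this thread it lowers to the IR
// attributes optnone + noinline.
#define NO_OPT __attribute__((optnone))
#elif defined(__GNUC__)
// Assumption: without optnone, at least keep the function out of line.
#define NO_OPT __attribute__((noinline))
#else
#define NO_OPT
#endif

NO_OPT int f1() { return 3; }

int call_f1() {
  // The concern raised in the thread: an interprocedural pass may notice
  // that f1 always returns 3 and use the constant here instead of the
  // call's result, even though f1 itself is left unoptimized.
  int result = f1();
  return result;
}
```

The observable result is the same either way; the difference only shows up in
the generated code and in what a debugger can see for `result`.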

- Dave

via llvm-dev

Apr 22, 2021, 3:06:23 PM4/22/21
to dbla...@gmail.com, f...@fhahn.com, llvm...@lists.llvm.org
Because I put my constructive suggestion at the end of a long
email, I'll repeat it with more clarity here:

`optnone` is what it is.

Define a new `opt-sometimes` that means what Johannes suggests.
(A better name is more than welcome!)
Clang's __attribute__((optnone)) can migrate to meaning
{ opt-sometimes, noipa, noinline }.

`optnone` can be retired, existing only in the bitcode upgrader
which replaces it with { opt-sometimes, noipa, noinline }.

Changing Clang's attribute to mean something else can be put off
to another day.
--paulr

Johannes Doerfert via llvm-dev

Apr 22, 2021, 3:25:58 PM4/22/21
to paul.r...@sony.com, dbla...@gmail.com, f...@fhahn.com, llvm...@lists.llvm.org

On 4/22/21 2:06 PM, paul.r...@sony.com wrote:
> Because I put my constructive suggestion at the end of a long
> email, I'll repeat it with more clarity here:
>
> `optnone` is what it is.
>
> Define a new `opt-sometimes` that means what Johannes suggests.
> (A better name is more than welcome!)

The point is, I never suggested to change the meaning of
`optnone` (in IR). I argue to change the requirement for
it to always go with `noinline`. `optnone` itself is not
changed, you get exactly the same behavior you got before,
and `noinline` is also not changed. They are simply not
required to go together.

If you look at the uses of `optnone` in LLVM, you will not
find the passes to look for `noinline` as well, nor do they
need to. The decision to (not) act is based on `optnone`,
which is not changed at all.

~ Johannes

Johannes Doerfert via llvm-dev

Apr 22, 2021, 3:29:14 PM4/22/21
to paul.r...@sony.com, dbla...@gmail.com, f...@fhahn.com, llvm...@lists.llvm.org

On 4/22/21 1:41 PM, paul.r...@sony.com wrote:
> I'll just say,
>
>> If you want "separate non-LTO" behavior by design, put it in a
>> different file and don't compile that file with LTO ;)
> that is impractical in many build systems, and an excessive amount
> of work if you want to selectively build some code with -O0
> because you're debugging something at the moment.

Fair, it was also not my suggestion how to approach this, that
came in the following paragraph.

Generally, I'd prefer if we do not pick sentences that end with
a smiley out of context; there is little gain in that.

~ Johannes

David Blaikie via llvm-dev

Apr 22, 2021, 3:35:16 PM4/22/21
to Johannes Doerfert, Florian Hahn, llvm-dev
On Thu, Apr 22, 2021 at 12:29 PM Johannes Doerfert
<johannes...@gmail.com> wrote:
>
>
> On 4/22/21 1:41 PM, paul.r...@sony.com wrote:
> > I'll just say,
> >
> >> If you want "separate non-LTO" behavior by design, put it in a
> >> different file and don't compile that file with LTO ;)
> > that is impractical in many build systems, and an excessive amount
> > of work if you want to selectively build some code with -O0
> > because you're debugging something at the moment.
>
> Fair, it was also not my suggestion how to approach this, that
> came in the following paragraph.
>
> Generally, I'd prefer if we do not pick sentences that end with
> a smiley out of context; there is little gain in that.

Generally, in conversations that are already a bit heated (by
confusion and otherwise) comments like this come off to me as further
inflammatory (whereas humor in other situations can improve social
bonds/connection) - belittling an argument that's trying to be made in
good faith.

- Dave

Johannes Doerfert via llvm-dev

Apr 22, 2021, 3:46:11 PM4/22/21
to paul.r...@sony.com, dbla...@gmail.com, f...@fhahn.com, llvm...@lists.llvm.org

On 4/22/21 1:40 PM, paul.r...@sony.com wrote:
>>> Not to say we couldn't redefine the IR optnone that way, but it really
>>> feels wrong to have 'optnone' actually mean 'optsometimes'.
>> It does not.
> A point of phrasing: Please do not tell me how I feel.
> I say it feels wrong, and your denial does not help the conversation.

I feel you are interpreting my words in a way that makes them
sound worse than I would imagine outside observers do interpret
them, especially as they come with context and not as standalone
as it looks in your reply.

That said, I do not wish to tell you how you feel, should feel,
or anything else in that direction for that matter. If my words
come across as such, apologies. I will try to work on that.


>
> The word "none" means "none." It does not mean "sometimes." Can we
> agree on that much?

We can. Unsure why you would imagine I do not know the meaning of
"none" or "sometimes" for that matter. I think we can establish I
know basic words to avoid these kinds of questions in the future :)


> Redefining a term "xyz-none" to mean "xyz-sometimes" feels wrong.

Agreed. I do not believe I'm proposing to do that.


> If you want an attribute that means "optsometimes" then it should
> be a new attribute, with a name that reflects its actual semantics.

Agreed. I am always in favor of attributes that have a single
specific meaning and a suitable name. I don't have a use case
for "optsometimes" just yet but generally speaking I'm all for
composable attributes that do not conflate ideas.


> I am not opposed to that, but my understanding is that we have been
> arguing about the definition of the existing attribute.

I don't think we do, especially since I do not think I want to
change the definition of `optnone`, at least the part that
all use cases in LLVM that I'm aware of are looking at. So, passes
would still be skipped if a function is `optnone` as the description
in the lang ref says. What would be different is that you have the
option, not the obligation, to pair it with `noinline`. If you do,
you get the `noinline` effect. If you don't, you don't. The `optnone`
effect stays the same either way.


>> Take `noinline` as an example. A `noinline` function
>> is not inlined, so far so good.
> And if the compiler decides it is useful to make copies/clones of
> the function, those aren't inlined either. The copies retain their
> original attributes and semantics. (Perhaps the compiler can make
> copies to take advantage of argument propagation, or some such. I
> do not think this proposition is unreasonable.)
>
>> Now a caller of a `noinline`
>> function might be inlined all over the place anyway.
>> What I try to say is that function attributes apply to the function,
>> not to the rest of the world. If you want to say: do not optimize this
>> code ever, not here nor anywhere else, use `optnone` + `noinline`. If
>> you want the function symbol to contain unoptimized code so
>> you can debug it, use `optnone`.
> You are making a severe distinction between the copy of the function
> that happens not to be inlined, and the copies that have been inlined,
> such that the inlined copies have lost their original properties.
> But just as the copies of the `noinline` function retain `noinline`
> and are not inlined, I argue that the `optnone` function copies ought
> to retain `optnone` and not be optimized.

If you want `optnone` functions to not be copied/inlined, use
`noinline`. We have an attribute for that and we literally require
it right now to get the effect you want. It is not `optnone` that
prevents copies which are then optimized, it is `noinline`. I do
not propose to change that one bit.


>
> LLVM does not have a way to not-optimize part of a function, so we
> achieve the goal by not inlining `optnone` functions.

Agreed.


>
> I dispute that the inlined copies of an `optnone` function should
> lose that much of their original characteristics, and the rest of
> the disagreement follows from there. But I have a suggestion to
> offer below.
>
>
>> Let's take a step back for a second and assume we would have
>> always said `optnone` + `noinline` gives you exactly what you
>> get right now with `optnone`. I think we can explain that to
>> people, we can say, `optnone` will prevent optimization "inside
>> this symbol" and `noinline` will prevent the code to be copied
>> into another symbol. Every use case you have could be served by
>> adding these two attributes instead of the one you do now. Everyone
>> would be as happy as they are, all the benefits would be exactly
>> the same, no behavior change if you use the two together. That said,
>> it would open up the door for context sensitive debugging. Now you
>> can argue nobody will ever want to debug only a certain path through
>> their program, but I find that position requires a justification more
>> than the opposite which assumes people will find a way to benefit from
>> it.
> I don't think "inside this symbol" is meaningful to most programmers.
> They see methods/functions, and the internal operation of compilers
> (e.g., making copies of functions) is relatively mysterious. I say
> this as someone who has spent many decades helping programmers use my
> compilers.

When I say symbol I mean function/method. So "inside this function
or method" is what I tried to say. People can deal with that concept.

> I don't dispute that you can invent a scenario where it could be useful;
> I reserve the right to be unpersuaded that it would occur often enough
> that people would think of and make use of the feature.

I don't claim people will jump on it, nor can I predict how many will
use it at all. What I'm saying is that it can be useful and that is by itself
a good enough reason (for me) to expose the functionality. Time, and
users, will tell us if they use it or not. Furthermore, I did say that
the C level can be untouched if we really want to, not that I'm in favor
of that but it is certainly a harder sell. The IR level change I am
advocating for is just a verifier condition, nothing else, it doesn't
even leak into the user space. I'm not sure why this is so controversial.


>
>> I totally think -O0 can imply all three attributes, optnone, noipa,
>> noinline.
> I totally think -O0 can imply { opt-sometimes, noipa, noinline }; and
> this combination can be an upgrade path away from the existing optnone.
>
> Can we proceed on that basis?

I don't know what `opt-sometimes` is nor how it differentiates itself from
`optnone`. I'm also unsure why you would not go with `optnone`, `noinline`,
and `noipa` for -O0, isn't that exactly what you wanted to have all along?

~ Johannes

via llvm-dev

Apr 22, 2021, 7:00:45 PM4/22/21
to johannes...@gmail.com, dbla...@gmail.com, f...@fhahn.com, llvm...@lists.llvm.org
Hi Johannes,

I've taken some time to try to understand your viewpoint,
and I will give some more of the history as I remember it;
hopefully that will help. And a suggestion at the end.

> The point is, I never suggested to change the meaning of
> `optnone` (in IR). I argue to change the requirement for
> it to always go with `noinline`. `optnone` itself is not
> changed, you get exactly the same behavior you got before,
> and `noinline` is also not changed. They are simply not
> required to go together.

Okay.

I can understand looking at the as-implemented handling of
'optnone' and taking that as the intended meaning. So from that
perspective, lacking the history, it does look to you like you are
not suggesting a change in its meaning.

Actually, removing the requirement to tie them together *is*
something that I would consider a semantic change, will require an
update to the LangRef and verifier, and so on. This would be more
obvious if we had originally done either of two things with the
same net semantic effect as we have now:
- make optnone *imply* noinline (like 'naked' does IIUC) instead
of *requiring* it.
- make the inliner check for optnone on the callee, instead of
simply tying the two attributes together.

I hope this helps explain why I see optnone+noinline as simply
two parts of one unified feature.

History:

The meaning of 'optnone' was always intended as, don't optimize,
in as many ways as we can manage. I'd rather have not had the
coupling with noinline, but it was the best way forward at the
time to achieve the effect we needed. I spent too many months of
my life getting this accepted at all...
Maybe my definition of optnone in the LangRef is inadequate; but
I am quite sure I understand the original intent.

Which includes this:

When it comes to the interaction of optnone and inlining, there
were of course four cases to consider: caller and callee, and each
might or might not have optnone.

1) caller - N, callee - N
2) caller - N, callee - Y
3) caller - Y, callee - N
4) caller - Y, callee - Y

1) normal case. Inlining and other opts are okay.
2) callee is optnone, so we didn't want it inlined and optimized.
3) caller is optnone, so no inlining happens.
4) caller is optnone, so no inlining happens.

Cases 3/4 are handled by checking for optnone in the inliner pass,
just like a hundred other passes do. This is boilerplate and was
added in bulk to all those passes. Having it all be boilerplate
was important in getting the reviews accepted.

Case 2 was handled by coupling optnone to noinline. As I said,
this was a practical thing done at the time, not because we ever
intended the semantics of optnone to mean anything else. It got
the overall feature accepted, which was the key thing for Sony.

So, what we implemented achieved the effect we wanted, even if
that implementation didn't mean the new attribute *by itself* had
exactly all the effects we wanted.

Yes, decoupling optnone (as implemented) from noinline would allow
case 2 to inline the optnone callee (bar) into its caller (foo), and optimize the result, if that
somehow seems desirable. But, that result is very much against
the original intent of optnone, and is why I have been giving you
such a hard time about this.
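
The four cases, and what decoupling would change, can be sketched as a toy
model. This is not LLVM's API: `FnAttrs`, `verifierAccepts`, `mayInline`, and
the constants are hypothetical names invented for illustration, and the model
ignores `alwaysinline` and every other inlining heuristic.

```cpp
// Toy model of the optnone/noinline interaction described above.
// All names are hypothetical; this is not how LLVM is implemented.
struct FnAttrs {
  bool optnone;
  bool noinline;
};

// Models the current verifier rule: optnone requires noinline.
bool verifierAccepts(const FnAttrs &f) { return !f.optnone || f.noinline; }

// Models the inliner's decision for a call from `caller` to `callee`.
bool mayInline(const FnAttrs &caller, const FnAttrs &callee) {
  if (caller.optnone) return false;   // cases 3 and 4: the pass skips optnone callers
  if (callee.noinline) return false;  // case 2: handled only through noinline
  return true;                        // case 1: normal inlining
}

constexpr FnAttrs kPlain{false, false};     // no attributes
constexpr FnAttrs kOptnone{true, true};     // optnone as coupled today
constexpr FnAttrs kDecoupled{true, false};  // optnone without noinline
```

In this model `mayInline(kPlain, kOptnone)` is false, which is case 2 as
implemented. With the verifier rule dropped, `kDecoupled` becomes a legal
callee and `mayInline(kPlain, kDecoupled)` is true, which is the outcome
objected to above.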

I will continue to insist that something called 'optnone' cannot
properly fail to apply to all instances, inlined or not. But if
we're willing to rename the attribute, that problem is solved.


I'm assuming there would be resistance to doing either of the two
things I mentioned at the top (make optnone *imply* noinline, or
modify the inliner to check for optnone on the callee) as that
doesn't seem to be the direction people are moving in.

So the suggestion is:
- reword the definition of optnone to be something that would be
better named "nolocalopt";
- ideally, actually rename the attribute (because I still say
that "none" does not mean "unless we've inlined it");
- and yes, if you must, decouple it from noinline and remove that
paragraph from the LangRef description.

Clang will still pass both IR attributes, so end users won't see
any feature regressions. David can add 'noipa' and we can make
Clang pass that as well.

Johannes Doerfert via llvm-dev

Apr 22, 2021, 8:10:04 PM4/22/21
to paul.r...@sony.com, dbla...@gmail.com, f...@fhahn.com, llvm...@lists.llvm.org

Works for me.

~ Johannes

Fangrui Song via llvm-dev

Apr 25, 2021, 5:38:26 PM4/25/21
to Johannes Doerfert, paul.r...@sony.com, f...@fhahn.com, llvm...@lists.llvm.org

I see that the clang attribute 'optnone' patch rC205255 (in 2014) added
both the IR 'optnone' and 'noinline' attributes.

If the clang attribute 'optnone' (for debugging purposes) is to be renamed,
I humbly suggest we may consider implementing __attribute__((optimize("O0")))
(limited to "O0" only; other values are not accepted).

https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html says
"The optimize attribute should be used for debugging purposes only. It
is not suitable in production code." which matches our debugging-only
purpose.

-O0 code already emits 'optnone' and 'noinline' for non-alwaysinline
functions, so we may not need a new attribute.
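
Until something like that exists in Clang, code that has to build with both
compilers can hide the difference behind a macro. A minimal sketch, assuming
GCC's `optimize("O0")` attribute and Clang's `optnone` attribute; the macro
name `DEBUG_NO_OPT` and the function `sum_to` are hypothetical:

```cpp
// Hypothetical portability macro; the name is invented for this sketch.
// Clang also defines __GNUC__, so it has to be tested first.
#if defined(__clang__)
#define DEBUG_NO_OPT __attribute__((optnone, noinline))
#elif defined(__GNUC__)
#define DEBUG_NO_OPT __attribute__((optimize("O0"), noinline))
#else
#define DEBUG_NO_OPT  // assumption: no equivalent known, expand to nothing
#endif

// A function someone wants to step through in a debugger while the rest
// of the translation unit is built with optimizations enabled.
DEBUG_NO_OPT int sum_to(int n) {
  int total = 0;
  for (int i = 1; i <= n; ++i) {
    total += i;
  }
  return total;
}
```

The two spellings are not guaranteed to behave identically; the macro only
gives a single place to change if the attributes discussed here evolve.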

David Blaikie via llvm-dev

Apr 25, 2021, 6:03:47 PM4/25/21
to Fangrui Song, Florian Hahn, llvm-dev

I don't have any plans to rename or add clang attributes - though that
one could be added as an alias for whatever the optnone clang
attribute does.

Johannes Doerfert via llvm-dev

Apr 26, 2021, 10:18:03 AM4/26/21
to Fangrui Song, paul.r...@sony.com, f...@fhahn.com, llvm...@lists.llvm.org
I mentioned that somewhere else, maybe not on the list though, but we
are looking into the option to select the optimization level per
function. Basically, remove the limit to O0 in
`__attribute__((optimize("OX")))`. I'll start a new thread once we are
closer, where we will explain why exposing it to the user is only one
use case. Long story short, `optimize("O0")` would be nice and we could
use it for debugging as suggested ;)

~ Johannes

via llvm-dev

Apr 26, 2021, 11:48:02 AM4/26/21
to mas...@google.com, johannes...@gmail.com, f...@fhahn.com, llvm...@lists.llvm.org
> I see that the clang attribute 'optnone' patch rC205255 (in 2014) added
> both the IR 'optnone' and 'noinline' attributes.
>
> If the clang attribute 'optnone' (for debugging purposes) is to be
> renamed,
> I humbly suggest we may consider implementing
> __attribute__((optimize("O0")))
> (limited to "O0" only; other values are not accepted).
>
> https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html says
> "The optimize attribute should be used for debugging purposes only. It
> is not suitable in production code." which matches our debugging only
> purposes.
>
> -O0 code already emits 'optnone' and 'noinline' for non-alwaysinline
> functions, so we may not need a new attribute.

Renaming or adding a clang attribute should be proposed on its own
thread on cfe-dev, as it will not have the proper visibility buried
on llvm-dev at the end of a long thread like this one.
