[llvm-dev] How to prevent optimizing away a call + its arguments


Kuba Mracek via llvm-dev

Jun 21, 2017, 8:09:19 PM
to Dmitry Vyukov via llvm-dev, Joe Groff
Hi llvm-dev,

I have a C function:

__attribute__((__visibility__("default")))
__attribute__((used))
__attribute__((noinline))
void please_do_not_optimize_me_away(int arg1, void *arg2) {
asm volatile("" :::);
}

(the purpose is that this function will be used dynamically at runtime, perhaps by interposing the function, or via the debugger)

I really thought this would not get optimized out, but I've realized (the hard way) that LLVM will happily optimize a call to this function and replace all arguments with undef, because it figures out that they're not really needed.

I'm going to fix this by passing the arguments explicitly as inputs to the asm, but is this behavior expected? Is there a more reasonable way (an attribute) to tell the compiler that it should not assume anything about the body of the function, should not assume that the body does nothing, and should not optimize out the arguments?

Thanks,
Kuba
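
A minimal sketch of the situation described above, for anyone who wants to reproduce it. The names `hook_without_inputs` and `compute_and_report` are hypothetical and only serve the illustration. Because the hook body never reads its parameters, an optimizer that looks across the call boundary may conclude the argument values are irrelevant. Compiling with something like `clang -O2 -S -emit-llvm` and reading the call site is one way to check what a given compiler version actually does.

```c
/* Hypothetical hook: the body ignores both parameters, so nothing in it
 * tells the optimizer that callers must pass real values. */
__attribute__((__visibility__("default")))
__attribute__((used))
__attribute__((noinline))
void hook_without_inputs(int arg1, void *arg2) {
    __asm__ volatile("" :::);
}

/* Hypothetical caller: computes a value and reports it through the hook. */
int compute_and_report(int n) {
    int value = n * 2;
    hook_without_inputs(value, &value);
    return value;
}
```

The program's visible behavior is the same whether or not the arguments survive, which is exactly why the optimizer feels free to drop them.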

_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Mehdi AMINI via llvm-dev

Jun 21, 2017, 8:25:10 PM
to Kuba Mracek, Dmitry Vyukov via llvm-dev, Joe Groff
Hi Kuba,

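Try:

__attribute__((optnone))

See
https://clang.llvm.org/docs/AttributeReference.html#optnone-clang-optnone
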
-- 
Mehdi

Joerg Sonnenberger via llvm-dev

Jun 22, 2017, 10:35:33 AM
to llvm...@lists.llvm.org
On Wed, Jun 21, 2017 at 05:25:04PM -0700, Mehdi AMINI via llvm-dev wrote:
> Hi Kuba,
>
> Try:
>
> __attribute__((optnone))
>
> See
> https://clang.llvm.org/docs/AttributeReference.html#optnone-clang-optnone

Actually, it should be enough to use:

__attribute__((noinline))
void please_do_not_optimize_me_away(int arg1, void *arg2) {
asm volatile("":::"memory");
}

Creating a real barrier is important.

Joerg

David Blaikie via llvm-dev

Jun 22, 2017, 1:36:07 PM
to Joerg Sonnenberger, llvm...@lists.llvm.org, Chandler Carruth
optnone should work, but really noinline should probably (Chandler: Can you confirm: is it reasonable to model noinline as "no interprocedural analysis across this function boundary" (so FunctionAttrs should do the same thing for noinline as it does for optnone, for example? ie: not derive any new attributes) - allowing the function to be optimized internally (unlike optnone) but not allowing interprocedural analysis inside the function to be used in callers (unlike optnone)) work as well?

David Majnemer via llvm-dev

Jun 22, 2017, 1:41:16 PM
to David Blaikie, llvm-dev
noinline should, and does, not mean "do not do IPO"; we will still do IPCP. The easiest way to defeat IPO is to use __attribute__((weak)) as it makes isDefinitionExact false: https://godbolt.org/g/VVBcgF
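
A sketch of the `weak` approach, assuming a toolchain and object format that support weak definitions (ELF does). The names `weak_hook` and `call_weak_hook` are hypothetical. Since a weak definition may be replaced by a different one at link time, the compiler cannot assume this body is the one that will run, so it has no basis for dropping the call or its arguments.

```c
/* Hypothetical hook marked weak: another definition may replace it at
 * link time, so callers cannot be optimized against this body. */
__attribute__((weak))
__attribute__((noinline))
void weak_hook(int arg1, void *arg2) {
    __asm__ volatile("" ::: "memory");
}

/* Hypothetical caller: the arguments have to be passed for real because
 * the eventual callee is unknown at compile time. */
int call_weak_hook(int n) {
    int value = n + 1;
    weak_hook(value, &value);
    return value;
}
```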

Joerg Sonnenberger via llvm-dev

Jun 22, 2017, 1:45:18 PM
to llvm...@lists.llvm.org
On Thu, Jun 22, 2017 at 05:35:51PM +0000, David Blaikie wrote:
> optnone should work, but really noinline should probably (Chandler: Can you
> confirm: is it reasonable to model noinline as "no interprocedural analysis
> across this function boundary" (so FunctionAttrs should do the same thing
> for noinline as it does for optnone, for example? ie: not derive any new
> attributes) - allowing the function to be optimized internally (unlike
> optnone) but not allowing interprocedural analysis inside the function to
> be used in callers (unlike optnone)) work as well?

I don't think it is reasonable to expect "noinline" to mean "must not do
IPA". There are different reasons for using "noinline": ensuring a stack
frame, forcing outlining of "cold" code etc. Many of those reasons are
perfectly fine to still allow IPA. Debug hooks fall into two categories:
making sure that the call happens (noinline should allow that) and
making sure that the debugger can actually do something at this point
(noinline should not have to allow that).

David Blaikie via llvm-dev

Jun 22, 2017, 2:03:55 PM
to Joerg Sonnenberger, llvm...@lists.llvm.org
On Thu, Jun 22, 2017 at 10:45 AM Joerg Sonnenberger via llvm-dev <llvm...@lists.llvm.org> wrote:
> On Thu, Jun 22, 2017 at 05:35:51PM +0000, David Blaikie wrote:
> > optnone should work, but really noinline should probably (Chandler: Can you
> > confirm: is it reasonable to model noinline as "no interprocedural analysis
> > across this function boundary" (so FunctionAttrs should do the same thing
> > for noinline as it does for optnone, for example? ie: not derive any new
> > attributes) - allowing the function to be optimized internally (unlike
> > optnone) but not allowing interprocedural analysis inside the function to
> > be used in callers (unlike optnone)) work as well?
>
> I don't think it is reasonable to expect "noinline" to mean "must not do
> IPA". There are different reasons for using "noinline": ensuring a stack
> frame, forcing outlining of "cold" code etc. Many of those reasons are
> perfectly fine to still allow IPA. Debug hooks fall into two categories:
> making sure that the call happens (noinline should allow that)

noinline (and, in fact, even optnone) doesn't make sure the call happens: various forms of IPA can cause a call to go away without actually inlining.

(The simplest example is one that even the inliner got wrong, and that I fixed recently, which is why any of this comes to mind and I have any context on it: the inliner removed a call to an optnone+readnone function without consulting the inliner heuristic (this was in the alwaysinliner), because it assumed the operation was so cheap that no inliner heuristic would ever disagree, basically ;) )

But some other optimization could/would still remove a noinline+readnone function because it's a trivially dead instruction (assuming the result isn't used). So noinline doesn't preserve the call - because some IPA can, in some cases, be as powerful as inlining-ish.
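
To illustrate the `readnone` point, here is a sketch using the GCC/Clang `const` function attribute, which, as far as I know, is how a C programmer would get a function modeled as `readnone` in Clang's IR. The function names are hypothetical. Because `pure_probe` is declared to have no side effects and its result is discarded in `caller_ignoring_result`, an optimizer is free to delete that call even though the function is `noinline`. Whether it actually does depends on the compiler and optimization level.

```c
/* Hypothetical side-effect-free function: noinline keeps it out of line,
 * but const promises that calling it has no observable effect. */
__attribute__((noinline, const))
int pure_probe(int x) {
    return x * x;
}

/* The result is unused, so the call is trivially dead and may vanish. */
int caller_ignoring_result(int n) {
    pure_probe(n);
    return n + 1;
}

/* The result is used, so the computation has to survive in some form. */
int caller_using_result(int n) {
    return pure_probe(n) + 1;
}
```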

Davide Italiano via llvm-dev

Jun 22, 2017, 2:11:28 PM
to David Blaikie, llvm-dev
On Thu, Jun 22, 2017 at 11:03 AM, David Blaikie via llvm-dev

I agree, but I still don't think it's `noinline`'s job to prevent this
from happening. It sounds weird (and is probably a POLA violation) to have
`noinline` prevent interprocedural constant propagation.
As for `optnone`, I'm surprised it is not powerful enough to prevent this
from happening, modulo bugs of course. Do you have other examples?

--
Davide

David Blaikie via llvm-dev

Jun 22, 2017, 2:17:12 PM
to Davide Italiano, llvm-dev

There were bugs. I fixed them. :) (specifically it was a combination of FunctionAttrs proving readnone on an optnone function - fixed. And the alwaysinliner killing trivially dead calls (so any function call with readnone, even without alwaysinline, could be 'inlined' (removed) by the alwaysinliner) - also fixed)
 


Davide Italiano via llvm-dev

Jun 22, 2017, 2:39:23 PM
to David Blaikie, llvm-dev


On Jun 22, 2017 10:36 AM, "David Blaikie via llvm-dev" <llvm...@lists.llvm.org> wrote:
> optnone should work, but really noinline should probably (Chandler: Can you confirm: is it reasonable to model noinline as "no interprocedural analysis across this function boundary" (so FunctionAttrs should do the same thing for noinline as it does for optnone, for example? ie: not derive any new attributes) - allowing the function to be optimized internally (unlike optnone) but not allowing interprocedural analysis inside the function to be used in callers (unlike optnone)) work as well?


IMHO, no. The only semantics of noinline is that the inliner passes shouldn't do anything with that function. Inhibiting IPO seems like it could be a valid use case, but that would need a different attribute (e.g. noipo).

--
Davide

Kuba Mracek via llvm-dev

Jun 22, 2017, 2:40:32 PM
to Joerg Sonnenberger, Joerg Sonnenberger via llvm-dev
> Actually, it should be enough to use:
>
> __attribute__((noinline))
> void please_do_not_optimize_me_away(int arg1, void *arg2) {
> asm volatile("":::"memory");
> }
>
> Creating a real barrier is important.

This doesn't work – the call still gets turned into please_do_not_optimize_me_away(undef, undef).

> __attribute__((optnone))

optnone works, but I'm actually surprised by this. I would expect that it would only affect the generated code of that function...

Is it guaranteed to work? Or is my safest bet still to use:

__attribute__((noinline))
void please_do_not_optimize_me_away(int arg1, void *arg2) {
asm volatile("" :: "r" (arg1), "r" (arg2) : "memory");
}

(The other benefit compared to optnone is that this will actually generate a nice empty function. Using optnone generates code that stores the arguments to the stack.)

Kuba
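
A sketch of how a call site would look with the asm-input variant, assuming GCC-style extended asm as supported by GCC and Clang. The caller `report_progress` is hypothetical. Listing `arg1` and `arg2` as `"r"` inputs turns the body into a real use of both parameters, so callers have to materialize them.

```c
__attribute__((noinline))
void please_do_not_optimize_me_away(int arg1, void *arg2) {
    /* Empty asm that claims to read both arguments and to touch memory. */
    __asm__ volatile("" :: "r"(arg1), "r"(arg2) : "memory");
}

/* Hypothetical caller: passes a step number and a pointer to local state. */
int report_progress(int step) {
    int state = step * 10;
    please_do_not_optimize_me_away(step, &state);
    return state;
}
```

The asm emits no instructions, so the hook still compiles to an essentially empty function.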

David Blaikie via llvm-dev

Jun 22, 2017, 2:51:35 PM
to Kuba Mracek, Joerg Sonnenberger, Joerg Sonnenberger via llvm-dev
On Thu, Jun 22, 2017 at 11:40 AM Kuba Mracek via llvm-dev <llvm...@lists.llvm.org> wrote:
> > Actually, it should be enough to use:
> >
> > __attribute__((noinline))
> > void please_do_not_optimize_me_away(int arg1, void *arg2) {
> > asm volatile("":::"memory");
> > }
> >
> > Creating a real barrier is important.
>
> This doesn't work – the call still gets turned into please_do_not_optimize_me_away(undef, undef).
>
> > __attribute__((optnone))
>
> optnone works, but I'm actually surprised by this. I would expect that it would only affect the generated code of that function...
>
> Is it guaranteed to work?

Modulo bugs, yes - optnone should have the same behavior as if you put the function definition in another file and compiled that file with -O0.
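
A sketch of the `optnone` variant. `optnone` is Clang-specific, so the `HOOK_ATTRS` macro and its fallback for other compilers are assumptions added here only to keep the snippet portable. The fallback does not give the same guarantee. The names `optnone_hook` and `call_optnone_hook` are hypothetical.

```c
/* optnone is a Clang attribute; other compilers only get noinline here,
 * which is weaker and is an assumption made for portability. */
#if defined(__clang__)
#define HOOK_ATTRS __attribute__((optnone, noinline))
#else
#define HOOK_ATTRS __attribute__((noinline))
#endif

/* Hypothetical hook: under Clang it is treated as if compiled at -O0. */
HOOK_ATTRS
void optnone_hook(int arg1, void *arg2) {
    (void)arg1;
    (void)arg2;
}

/* Hypothetical caller of the hook. */
int call_optnone_hook(int n) {
    int value = n - 1;
    optnone_hook(value, &value);
    return value;
}
```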
 

Matthias Braun via llvm-dev

Jun 22, 2017, 2:55:43 PM
to Kuba Mracek, Joerg Sonnenberger via llvm-dev
It looks like what you are trying to do here is define a weak function. Marking the function as weak should have the desired effect, though last time I looked, Apple platforms had only limited support for weak functions...

- Matthias

Joerg Sonnenberger via llvm-dev

Jun 22, 2017, 3:07:08 PM
to llvm...@lists.llvm.org
On Thu, Jun 22, 2017 at 06:03:39PM +0000, David Blaikie wrote:
> On Thu, Jun 22, 2017 at 10:45 AM Joerg Sonnenberger via llvm-dev <
> llvm...@lists.llvm.org> wrote:
>
> > On Thu, Jun 22, 2017 at 05:35:51PM +0000, David Blaikie wrote:
> > > optnone should work, but really noinline should probably (Chandler: Can
> > you
> > > confirm: is it reasonable to model noinline as "no interprocedural
> > analysis
> > > across this function boundary" (so FunctionAttrs should do the same thing
> > > for noinline as it does for optnone, for example? ie: not derive any new
> > > attributes) - allowing the function to be optimized internally (unlike
> > > optnone) but not allowing interprocedural analysis inside the function to
> > > be used in callers (unlike optnone)) work as well?
> >
> > I don't think it is reasonable to expect "noinline" to mean "must not do
> > IPA". There are different reasons for using "noinline": ensuring a stack
> > frame, forcing outlining of "cold" code etc. Many of those reasons are
> > perfectly fine to still allow IPA. Debug hooks fall into two categories:
> > making sure that the call happens (noinline should allow that)
>
>
> noinline (& in fact, even optnone) doesn't make sure the call happens -
> various forms of IPA can cause a call to go away without actually inlining.

I'm not saying it does that. But I am saying that is why someone might
want to use it. That's why I gave the example with the memory clobber --
that is known to work for both GCC and Clang and fits here in the sense
that (1) it can't be duplicated (2) it contains a side effect. Now
whether this is the semantic we want to have for noinline is a different
question.

Joerg Sonnenberger via llvm-dev

Jun 22, 2017, 3:10:23 PM
to Joerg Sonnenberger via llvm-dev
On Thu, Jun 22, 2017 at 11:31:41AM -0700, Kuba Mracek wrote:
> > Actually, it should be enough to use:
> >
> > __attribute__((noinline))
> > void please_do_not_optimize_me_away(int arg1, void *arg2) {
> > asm volatile("":::"memory");
> > }
> >
> > Creating a real barrier is important.
>
> This doesn't work – the call still gets turned into please_do_not_optimize_me_away(undef, undef).

If you also want it to preserve the arguments (that wasn't clear to me),
just add them as arguments to the asm statement?

Joerg
