[llvm-dev] [RFC] remove the llvm.expect intrinsic

Sanjay Patel via llvm-dev

unread,

Apr 22, 2016, 12:20:14 PM4/22/16

to llvm-dev, cfe...@lists.llvm.org

I've proposed removing the llvm.expect intrinsic:
http://reviews.llvm.org/D19300

The motivation for this change is in:
http://reviews.llvm.org/D19299

For reference:
1. We created an intrinsic that's only reason for existing is to improve perf, but the intrinsic can harm optimization by interfering with transforms in other passes.

2. To solve that, we created a pass to always transform the intrinsic into metadata at a very early stage in LLVM. But now every program is paying a compile-time tax (running the LowerExpectIntrinsic pass) for a feature that is rarely used.

A possible front-end replacement transformation for a source-level "builtin_expect()" is in D19299: I think a front-end can always directly create metadata for this kind of programmer hint rather than using an intermediate representation (the intrinsic). Likewise, if there's some out-of-tree IR pass that is creating an llvm.expect, it should be able to create branch weight metadata directly instead.

Please let me know if you see any problems with this proposal or the patches.

For reference, here's the original post-commit review thread for llvm.expect:
https://marc.info/?l=llvm-commits&m=130997676129580&w=4

Philip Reames via llvm-dev

unread,

Apr 22, 2016, 12:40:26 PM4/22/16

to Sanjay Patel, llvm-dev, cfe...@lists.llvm.org

On 04/22/2016 09:20 AM, Sanjay Patel via llvm-dev wrote:

I've proposed removing the llvm.expect intrinsic:
http://reviews.llvm.org/D19300

The motivation for this change is in:
http://reviews.llvm.org/D19299

For reference:
1. We created an intrinsic that's only reason for existing is to improve perf, but the intrinsic can harm optimization by interfering with transforms in other passes.

I don't follow this at all. Given expects are eagerly lowered to metadata, where's the interaction? In particular, the expect lowering overrules any metadata on the associated branch/switch. This is exactly what you'd expect for a user annotation interacting with PGO data.

2. To solve that, we created a pass to always transform the intrinsic into metadata at a very early stage in LLVM. But now every program is paying a compile-time tax (running the LowerExpectIntrinsic pass) for a feature that is rarely used.

Er, what cost? Given this is a single linear pass over the IR - and could actually be made essentially free by checking to see if the module has any uses of expect - I'm suspicious of this compile time argument. Have you actually seen this in profiles?

A possible front-end replacement transformation for a source-level "builtin_expect()" is in D19299: I think a front-end can always directly create metadata for this kind of programmer hint rather than using an intermediate representation (the intrinsic). Likewise, if there's some out-of-tree IR pass that is creating an llvm.expect, it should be able to create branch weight metadata directly instead.

This seems like a reasonable proposal. The expect intrinsic does give us a mechanism to express value profiling predictions, but we don't appear to actually use that today. My bias would be to leave it in place, but I'm not going to object strongly if the consensus goes the other way.

Please let me know if you see any problems with this proposal or the patches.

For reference, here's the original post-commit review thread for llvm.expect:
https://marc.info/?l=llvm-commits&m=130997676129580&w=4

_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Sanjay Patel via llvm-dev

unread,

Apr 22, 2016, 1:27:49 PM4/22/16

to Philip Reames, llvm-dev, cfe...@lists.llvm.org

On Fri, Apr 22, 2016 at 10:39 AM, Philip Reames <list...@philipreames.com> wrote:

On 04/22/2016 09:20 AM, Sanjay Patel via llvm-dev wrote:

I've proposed removing the llvm.expect intrinsic:
http://reviews.llvm.org/D19300

The motivation for this change is in:
http://reviews.llvm.org/D19299

For reference:
1. We created an intrinsic that's only reason for existing is to improve perf, but the intrinsic can harm optimization by interfering with transforms in other passes.

I don't follow this at all. Given expects are eagerly lowered to metadata, where's the interaction? In particular, the expect lowering overrules any metadata on the associated branch/switch. This is exactly what you'd expect for a user annotation interacting with PGO data.

I agree that a user annotation should override PGO data. This is why I initially proposed that builtin_expect() would extend on '[un]predictable' metadata rather than 'prof' metadata in D19299.

But I think that's a separate discussion now; we use 'prof' metadata for this purpose today, and I'm not trying to change that in these patches. If the user-specified representation of builtin_expect() is overwritten by PGO data, I think that's a bug independent of the current proposal/patches.

2. To solve that, we created a pass to always transform the intrinsic into metadata at a very early stage in LLVM. But now every program is paying a compile-time tax (running the LowerExpectIntrinsic pass) for a feature that is rarely used.

Er, what cost? Given this is a single linear pass over the IR - and could actually be made essentially free by checking to see if the module has any uses of expect - I'm suspicious of this compile time argument. Have you actually seen this in profiles?

No - I expect the actual overhead is in the noise. The real objection to the LowerExpectIntrinsic pass is that it is completely unnecessary. This was raised in the original review. We shouldn't have two mechanisms to represent exactly the same thing. This is also why my initial implementation for builtin_unpredictable() was rejected:
http://reviews.llvm.org/D12341

I assumed there was some good reason for LowerExpectIntrinsic to exist, so I copied that design. As noted in that review, my assumption was wrong. And it was suggested then that we should remove LowerExpectIntrinsic but nobody had gotten around to it. With these patches, I'd like to finally fix this.

Reid Kleckner via llvm-dev

unread,

Apr 22, 2016, 1:37:27 PM4/22/16

to Sanjay Patel, Nick Lewycky, llvm-dev, cfe-dev

Clang still appears to use llvm.expect. I think if you can show that it's trivial to update clang with a patch, then yeah, moving back down to one representation for this sounds reasonable.

Sanjay Patel via llvm-dev

unread,

Apr 22, 2016, 1:40:06 PM4/22/16

to Reid Kleckner, llvm-dev, cfe-dev, Nick Lewycky

Hi Reid -

The intent of D19299 is to remove all Clang refs to llvm.expect. Do you see any holes after applying that patch?

Mehdi Amini via llvm-dev

unread,

Apr 22, 2016, 1:40:58 PM4/22/16

to Sanjay Patel, llvm-dev, cfe...@lists.llvm.org

On Apr 22, 2016, at 10:27 AM, Sanjay Patel via llvm-dev <llvm...@lists.llvm.org> wrote:

On Fri, Apr 22, 2016 at 10:39 AM, Philip Reames <list...@philipreames.com> wrote:

On 04/22/2016 09:20 AM, Sanjay Patel via llvm-dev wrote:
I've proposed removing the llvm.expect intrinsic:
http://reviews.llvm.org/D19300

The motivation for this change is in:
http://reviews.llvm.org/D19299

For reference:
1. We created an intrinsic that's only reason for existing is to improve perf, but the intrinsic can harm optimization by interfering with transforms in other passes.
I don't follow this at all. Given expects are eagerly lowered to metadata, where's the interaction? In particular, the expect lowering overrules any metadata on the associated branch/switch. This is exactly what you'd expect for a user annotation interacting with PGO data.

I agree that a user annotation should override PGO data.

PGO is also a user input: the user is basically saying "I want the code to be optimized for *this* use case".

So interestingly I would have thought the opposite: PGO overrides the source code annotation.

Here are a couple of reasons why:

- libraries can be used by different client and what is common in one case might not for another.

- code evolves, and user can fail to revisit assumption about the common case

- the user can be wrong, PGO should not (?).

--

Mehdi

Mehdi Amini via llvm-dev

unread,

Apr 22, 2016, 1:42:41 PM4/22/16

to Sanjay Patel, llvm-dev, cfe...@lists.llvm.org

On Apr 22, 2016, at 10:39 AM, Mehdi Amini <mehdi...@apple.com> wrote:

On Apr 22, 2016, at 10:27 AM, Sanjay Patel via llvm-dev <llvm...@lists.llvm.org> wrote:

On Fri, Apr 22, 2016 at 10:39 AM, Philip Reames <list...@philipreames.com> wrote:

On 04/22/2016 09:20 AM, Sanjay Patel via llvm-dev wrote:
I've proposed removing the llvm.expect intrinsic:
http://reviews.llvm.org/D19300

The motivation for this change is in:
http://reviews.llvm.org/D19299

For reference:
1. We created an intrinsic that's only reason for existing is to improve perf, but the intrinsic can harm optimization by interfering with transforms in other passes.
I don't follow this at all. Given expects are eagerly lowered to metadata, where's the interaction? In particular, the expect lowering overrules any metadata on the associated branch/switch. This is exactly what you'd expect for a user annotation interacting with PGO data.

I agree that a user annotation should override PGO data.

PGO is also a user input: the user is basically saying "I want the code to be optimized for *this* use case".
So interestingly I would have thought the opposite: PGO overrides the source code annotation.
Here are a couple of reasons why:
- libraries can be used by different client and what is common in one case might not for another.
- code evolves, and user can fail to revisit assumption about the common case
- the user can be wrong, PGO should not (?).

For this last point, whatever information prevails in the end, it may be valuable to report to the user some optimization hints about the mismatch between the PGO measurement and the annotation.

Sanjay Patel via llvm-dev

unread,

Apr 22, 2016, 1:48:24 PM4/22/16

to Mehdi Amini, llvm-dev, cfe-dev

On Fri, Apr 22, 2016 at 11:41 AM, Mehdi Amini <mehdi...@apple.com> wrote:

PGO is also a user input: the user is basically saying "I want the code to be optimized for *this* use case".
So interestingly I would have thought the opposite: PGO overrides the source code annotation.
Here are a couple of reasons why:
- libraries can be used by different client and what is common in one case might not for another.
- code evolves, and user can fail to revisit assumption about the common case
- the user can be wrong, PGO should not (?).

For this last point, whatever information prevails in the end, it may be valuable to report to the user some optimization hints about the mismatch between the PGO measurement and the annotation.

I like that idea. My feeling is that builtin_expect() should be treated like a super-power. If it's wrong, it's the programmer's fault. But I understand your point that PGO is another user input.

Note that the GCC docs (because there's no corresponding clang doc for this afaict) do warn:
"You may use __builtin_expect to provide the compiler with branch prediction information. In general, you should prefer to use actual profile feedback for this (-fprofile-arcs), as programmers are notoriously bad at predicting how their programs actually perform."

Reid Kleckner via llvm-dev

unread,

Apr 22, 2016, 1:55:11 PM4/22/16

to Sanjay Patel, llvm-dev, cfe-dev, Nick Lewycky

Sorry, I didn't realize that was the clang side.

I think it's kind of ugly that the frontend has to pattern match and lower this intrinsic in three places: if, switch, and general case. The existing intrinsic is nice because it directly represents what the user can write, and lets the optimizer decide how to best represent it. As David Li mentioned, we may want to do value profiling in the future, and if we remove this intrinsic, we will have to come and revisit this code.

I think a single early pass to lower this intrinsic is a low cost to pay for that simplicity.

Sanjay Patel via llvm-dev

unread,

Apr 22, 2016, 4:50:59 PM4/22/16

to Reid Kleckner, David Li, llvm-dev, cfe-dev, Nick Lewycky

I, of course, thought the ~100 lines added by D19299 was a reasonable trade for the ~800 lines removed in D19300.

David Li (and anyone else following along), do you still like those patches after hearing this objection or should I abandon?

Richard Smith via llvm-dev

unread,

Apr 22, 2016, 5:31:25 PM4/22/16

to Sanjay Patel, llvm-dev, cfe-dev, David Li, Nick Lewycky

As I recall, the reason we chose to emit this via an intrinsic from Clang rather than directly as profile information was to catch cases where the builtin is *not* the immediate operand of a control flow construct, but would be after some simple optimization. For instance:

bool b = __builtin_expect(n > 5, true);

if (b) { ... }

// ...

if (b) { ... }

or

if (x && __builtin_expect(y, true)) { ... }

It sounds like subsequent changes to LLVM have already made most of these cases not work, so some of the motivation for the current approach may have evaporated. What optimizations do we still perform before lowering the expect intrinsics?

_______________________________________________
cfe-dev mailing list
cfe...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Xinliang David Li via llvm-dev

unread,

Apr 22, 2016, 5:34:19 PM4/22/16

to Mehdi Amini, llvm-dev, David Blaikie via cfe-dev

yes, PGO data should overrule, and it is a common practice. The annotation can be stale, and the fixed branch prob implied by the annotation does not help build up the profile counts/frequency that match the real profile.

David

Xinliang David Li via llvm-dev

unread,

Apr 22, 2016, 5:49:45 PM4/22/16

to Sanjay Patel, llvm-dev, cfe-dev, David Li, Nick Lewycky

Your patch does make the handling of __builtin_unpredictable and __builtin_expect 'unified' in some way.

However given the point brought up by Reid, and the rationale by Richard, as well as consideration of the easiness for enhancement in the future, I think keep the current mechanism is a better approach.

The test cases from Richard are broken with current lowering, but it is just bugs to be fixed.

David

On Fri, Apr 22, 2016 at 1:50 PM, Sanjay Patel via cfe-dev <cfe...@lists.llvm.org> wrote:

Martin J. O'Riordan via llvm-dev

unread,

Apr 22, 2016, 6:18:39 PM4/22/16

to Xinliang David Li, Sanjay Patel, llvm-dev, David Li, Nick Lewycky

Sorry for jumping in so late into this discussion, but I genuinely believe that removing this is a bad idea.

My reason for saying this is going to sound very strange, but I think that we need to understand a bit more about how this is being handled.

A while back one of my customers asked me if there was a method for advising the compiler how an if-statement was likely to resolve, and I said “sure” and told them to use ‘__builtin_expect’.

At the time I thought that the compiler was probably predicting a higher probability or true/false than was actually the case for a particular instance, as my customer was telling me that the compiler was optimising in favour of the less probable case (something the customer knew, but the information could not be determined from the code).

This was fair enough, these things happen and the customer (or profile directed feedback) has more knowledge than the default inferences in the compiler.

They added the ‘__builtin_expect’ to provide their domain specific expert knowledge, and the performance did indeed improve as expected, with the compiler preferring the most likely branch that the programmer had indicated.

The surprise though, was when we experimentally changed the outcome of the ‘__builtin_expect’ to say the exact opposite of what the actual case was. That is, to invert the truth. The program was more performant with the “wrong” truth than it was with no statement of truth. When we told the compiler that a particular outcome was more probable than when in fact it was less, the performance was better than when we said nothing. And when we told it the actual probable outcome, it was more performance still. So telling the outcome of the branch as being more likely true, or more likely false, was better than not telling the compiler anything at all.

I must admit, this was a considerable surprise for me, but it does mean that there is something changing in that area of the code that responds differently when ‘__builtin_expect’ is used versus the inferred probabilities.

It is not something that I have investigated further as it is not a specific area that I can prioritise my efforts, but I think that it is something I have to raise awareness off in the context of this thread where removing this builtin is being proposed. At the moment I have a lot of programs that are benefitting from the explicit use of this builtin, even when the programmer directed outcome is wrong. So before we can remove this builtin, we need to explain why there is a difference on behaviour when it is present and stating a particular outcome versus it being omitted and the inferred outcome being the same.

MartinO

Sanjay Patel via llvm-dev

unread,

Apr 22, 2016, 6:45:58 PM4/22/16

to Martin....@movidius.com, llvm-dev, cfe-dev, David Li, Nick Lewycky

Hi Martin -

1. Nobody is proposing to remove __builtin_expect(); we're just debating the best way to lower that programmer hint. The question is whether we need an *llvm.expect* intrinsic or if some kind of metadata will be adequate.

2. The fact that using __builtin_expect() improved perf regardless of whether you chose the right or wrong prediction isn't too far-fetched IMO - it means that a branch (and HW prediction) was the better choice in that situation versus a select / conditional move. The compiler generated the branch, and the HW branch predictor did the rest.

3. The original motivation for this proposed cleanup is a bug where __builtin_expect() is being ignored:
https://llvm.org/bugs/show_bug.cgi?id=27344

Martin J. O'Riordan via llvm-dev

unread,

Apr 22, 2016, 6:58:32 PM4/22/16

to Sanjay Patel, llvm-dev, cfe-dev, David Li, Nick Lewycky

Thanks for the clarification Sanjay - I had somehow misunderstood the intent, but your response clear this up.

Sorry for the confusion,

MartinO

Sanjay Patel via llvm-dev

unread,

Apr 29, 2016, 5:49:03 PM4/29/16

to Martin....@movidius.com, llvm-dev, cfe-dev, David Li, Nick Lewycky

A linkfest summary of the current state of __builtin_expect()...

Abandoned:
http://reviews.llvm.org/D19299
http://reviews.llvm.org/D19300

Committed:
http://reviews.llvm.org/D19435
http://reviews.llvm.org/D19488

The motivating bug for all of the above is resolved fixed:
https://llvm.org/bugs/show_bug.cgi?id=27344

The failing examples that I noted in this thread and elsewhere are open:
https://llvm.org/bugs/show_bug.cgi?id=27541

https://llvm.org/bugs/show_bug.cgi?id=27542

I also linked this existing bug to the above:
https://llvm.org/bugs/show_bug.cgi?id=19230

And we still need to fix places in the optimizer where we drop branch weight metadata, eg:
http://reviews.llvm.org/rL267624
http://reviews.llvm.org/rL267813

Reply all

Reply to author

Forward