This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Improve likelihood attribute branch weights
ClosedPublic

Authored by Mordante on Sep 26 2020, 10:23 AM.

Details

Summary

Bruno De Fraine discovered some issues with D85091. The branch weights generated for logical not and ternary conditional were wrong. The logical and and logical or differed from the code generated of __builtin_predict.

Adjusted the generated code for the likelihood to match __builtin_predict. The patch is based on Bruno's suggestions.

Diff Detail

Event Timeline

Mordante requested review of this revision.Sep 26 2020, 10:23 AM
Mordante created this revision.
bdf accepted this revision.Sep 30 2020, 2:40 AM
bdf added a subscriber: bdf.

Looks good. Good idea to add tests to verify that we match __builtin_except().

This revision is now accepted and ready to land.Sep 30 2020, 2:40 AM
aaron.ballman added inline comments.Sep 30 2020, 5:25 AM
clang/lib/CodeGen/CodeGenFunction.cpp
1639

Why LH_None in this case? (Also, given that it's the default value, should you skip passing this at all? Or should the parameter not have a default value, perhaps?)

1694–1700

This comment is now a bit stale.

Mordante marked 2 inline comments as done.Oct 3 2020, 5:48 AM
Mordante added inline comments.
clang/lib/CodeGen/CodeGenFunction.cpp
1639

This should be LH_None since this is the likelihood of C ? T : F and there's no likelihood for this expression. When used in if(C ? T : F) the if can have an likelihood attribute for the if. That's why the other two calls in this block have a likelihood. I'll add comment to make it clear why this is correct. I felt using the explicit value here instead of the defaulted value was clearer. This would be the only call in this function not to explicitly use a value.

Since EmitBranchOnBoolExpr is also used at other places I prefer to keep the default argument.

1694–1700

Agreed, I'll remove it.

aaron.ballman accepted this revision.Oct 3 2020, 7:06 AM

LGTM modulo comment nits.

clang/lib/CodeGen/CodeGenFunction.cpp
1639

Thank you for the explanation!

This revision was landed with ongoing or failed builds.Oct 4 2020, 5:24 AM
This revision was automatically updated to reflect the committed changes.
Mordante marked 2 inline comments as done.
jmorse added a subscriber: jmorse.Oct 5 2020, 9:59 AM

Hi -- We (Sony) are running into a bit of difficulty with the test for this change, as it relies on the configuration of the -O1 optimisation pipeline. Would it be possible to reduce down to a frontend test, and then tests for whatever passes are to interpret the IR produced by clang?

Hi -- We (Sony) are running into a bit of difficulty with the test for this change, as it relies on the configuration of the -O1 optimisation pipeline. Would it be possible to reduce down to a frontend test, and then tests for whatever passes are to interpret the IR produced by clang?

Can you explain the kind of issues you're having?
Currently the code, like PGO, requires at least -O1 to be effective. At -O0 the attributes don't have any effect.
I looked at other CodeGen tests using __builtin_expect. They are using -O1 -disable-llvm-passes, would that solve your issue?

jmorse added a comment.EditedOct 8 2020, 5:54 AM

Can you explain the kind of issues you're having?

At the shallowest level, our -O1 produces different IR and fails the test, which is more or less our problem; however my understanding is that tests in the LLVM project / subprojects should aim to test as little amount of code as possible. Relying on all of -O1 makes it a brittle test -- changes to any optimisation pass enabled in -O1 could cause this test to fail spuriously.

Instead, I believe the test should be in two parts:

  • One checking clang produces the correct /unoptimised/ IR output
  • One or more checking that the consuming IR passes do-the-right-thing

An example is (some of) the TBAA tests -- as you suggest, they're using -O1 -disable-llvm-passes to check that clang produces stores with !tbaa metadata attached. Then over in the LLVM optimisation passes there are tests checking that GVN / LICM etc correctly consume TBAA metadata. Note that I'm unfamiliar with how branch weights work, but I believe the principle is the same.

bdf added a comment.Oct 8 2020, 6:26 AM

Can you explain the kind of issues you're having?

At the shallowest level, our -O1 produces different IR and fails the test, which is more or less our problem; however my understanding is that tests in the LLVM project / subprojects should aim to test as little amount of code as possible. Relying on all of -O1 makes it a brittle test -- changes to any optimisation pass enabled in -O1 could cause this test to fail spuriously.

Instead, I believe the test should be in two parts:

  • One checking clang produces the correct /unoptimised/ IR output
  • One or more checking that the consuming IR passes do-the-right-thing

As I see, the intent of the test is not so much to verify a certain expected output, but more to verify that two styles of likelihood hints in C code produce the same code structure and branch weights. Theses styles are likely/unlikely-annotations, and use of __builtin_expect in the if condition. But the processing of these two is quite different:

  • for likely/unlikely annotations, branch weights are added immediately in the initial CodeGen
  • __builtin_expect is first translated straightforward to an expect intrinsic, then processed by a later lower-expect pass

To make the test less brittle, would it be possible to explicitly select only the optimization passes that are needed?

In D88363#2319242, @bdf wrote:

Can you explain the kind of issues you're having?

At the shallowest level, our -O1 produces different IR and fails the test, which is more or less our problem; however my understanding is that tests in the LLVM project / subprojects should aim to test as little amount of code as possible. Relying on all of -O1 makes it a brittle test -- changes to any optimisation pass enabled in -O1 could cause this test to fail spuriously.

Instead, I believe the test should be in two parts:

  • One checking clang produces the correct /unoptimised/ IR output
  • One or more checking that the consuming IR passes do-the-right-thing

As I see, the intent of the test is not so much to verify a certain expected output, but more to verify that two styles of likelihood hints in C code produce the same code structure and branch weights. Theses styles are likely/unlikely-annotations, and use of __builtin_expect in the if condition. But the processing of these two is quite different:

  • for likely/unlikely annotations, branch weights are added immediately in the initial CodeGen
  • __builtin_expect is first translated straightforward to an expect intrinsic, then processed by a later lower-expect pass

To make the test less brittle, would it be possible to explicitly select only the optimization passes that are needed?

Indeed verifying the output of the likelihood attributes against __builtin_expect is exactly what's required. But I think I can make the test less brittle by using the following command. This only runs the lower expect pass, which lowers the __builtin_expect.
RUN: %clang_cc1 -O1 -disable-llvm-passes -emit-llvm %s -o - -triple=x86_64-linux-gnu | opt --lower-expect -S | FileCheck %s

I'll work on a patch to solve the issue.

I created D89204 which hopefully fixes Sony's issue.