[CodeGenPrepare] use branch weight metadata to decide if a select should be turned into a branch
ClosedPublic

Authored by spatel on Apr 25 2016, 10:35 AM.

Details

Summary

This is part of solving PR27344:
https://llvm.org/bugs/show_bug.cgi?id=27344

As noted in the bug report, I have a couple of questions about how to implement this. I've taken my best guesses at those questions in this patch:

  1. Should SimplifyCFG use metadata and not create a select in the first place for an obviously predictable branch? Or should CGP be responsible for undoing that transform?

I decided that CGP should undo the transform for the same reason that earlier patches have used the same mechanism: it's possible that passes between SimplifyCFG and CGP may be able to optimize the IR further with a select in place.

  1. Since we're relying on branch weight metadata, we need a TLI hook to determine just how lopsided that data must be before favoring branches over a select. What's a good default value for that ratio?

I selected >99% taken or not taken as the default threshold for a highly predictable branch. Even the most limited HW branch predictors will be correct on this branch almost all the time, so even a massive mispredict penalty perf loss would be overcome by the win from all the times the branch was predicted correctly.

As a follow-up, we could make the default target hook less conservative by using the SchedMachineModel's MispredictPenalty. Or we could just let targets override the default by implementing the hook with that and other target-specific options. Note that trying to statically determine mispredict rates for close-to-balanced profile weight data is generally impossible if the HW is sufficiently advanced. Ie, 50/50 taken/not-taken might still be 100% predictable.

Finally, note that this patch as-is will not solve PR27344 because the current __builtin_unpredictable() branch weight default values are 4 and 64. I proposed to change that in D19435.

Diff Detail

Repository
rL LLVM
spatel retitled this revision from to [CodeGenPrepare] use branch weight metadata to decide if a select should be turned into a branch.Apr 25 2016, 10:35 AM
spatel updated this object.
spatel added a subscriber: llvm-commits.
This revision is now accepted and ready to land.Apr 25 2016, 1:14 PM
hfinkel added inline comments.Apr 25 2016, 1:28 PM
lib/CodeGen/CodeGenPrepare.cpp
4558 ↗(On Diff #54874)

Does this handle both extremes (i.e. almost always true and almost always false)?

spatel added inline comments.Apr 25 2016, 1:51 PM
lib/CodeGen/CodeGenPrepare.cpp
4558 ↗(On Diff #54874)

Yes - since we're using the max value of true or false, it will work.

It was just laziness for me to not include that test case. Let me add it and update the patch. Thanks!

hfinkel accepted this revision.Apr 25 2016, 2:05 PM
hfinkel added inline comments.
lib/CodeGen/CodeGenPrepare.cpp
4558 ↗(On Diff #54874)

Ah, sounds good. LGTM then too.

spatel updated this revision to Diff 54903.Apr 25 2016, 2:11 PM

Patch updated:

  1. Added test case to show that likely true and likely false are both handled.
  2. Made auto-generated checks a bit more flexible by using regex.
davidxl added inline comments.Apr 25 2016, 2:24 PM
include/llvm/Target/TargetLowering.h
268 ↗(On Diff #54874)

hard coding default value like this make it hard to do performance experiment -- suggest an internal option to control

lib/CodeGen/CodeGenPrepare.cpp
4551 ↗(On Diff #54874)

Why is unpredictable meta data is not looked at here?

lib/IR/Metadata.cpp
1263 ↗(On Diff #54874)

This fix can be committed as a different patch?

test/CodeGen/X86/cmov-into-branch.ll
99 ↗(On Diff #54874)

Worth adding a test case where probability is < 1% ?

davidxl added inline comments.Apr 25 2016, 2:26 PM
lib/CodeGen/CodeGenPrepare.cpp
4558 ↗(On Diff #54903)

Hal, this is an example that profile meta data is directly looked at instead of via BPI (I don't have an issue with it -- but just a note this scenario exists).

hfinkel added inline comments.Apr 25 2016, 2:33 PM
include/llvm/Target/TargetLowering.h
268 ↗(On Diff #54903)

Agreed. A command-line option to override the default would be significantly more convenient.

lib/CodeGen/CodeGenPrepare.cpp
4558 ↗(On Diff #54903)

I was thinking that :-) -- but this use did not seem to fit into BPI's interface. If/when we start preserving BPI, we'd need to update here, however.

spatel added inline comments.Apr 25 2016, 2:45 PM
include/llvm/Target/TargetLowering.h
268 ↗(On Diff #54903)

Yes - will fix and update the patch.

lib/CodeGen/CodeGenPrepare.cpp
4551 ↗(On Diff #54903)

Good point - we use it in splitBranchCondition() but forgot to update this path.
I'll fix that ahead of this patch.

lib/IR/Metadata.cpp
1263 ↗(On Diff #54903)

Sure - I'll check-in the extra check + comment + variable name changes and update this.

test/CodeGen/X86/cmov-into-branch.ll
118 ↗(On Diff #54903)

Hmm - do you want to check for something different than 'weighted_select1()' above?

davidxl added inline comments.Apr 25 2016, 4:04 PM
test/CodeGen/X86/cmov-into-branch.ll
118 ↗(On Diff #54903)

Ok - I mis-read the case -- what needs to be covered is already covered. A minor suggestion -- the tests (select2, select3) are intended to test that meta data can drive decision to create branch -- but not necessarily to test the actual default 99%,1% threshold. For this reason, it might be better to make !1 and !2 prof data more extreme so that the tests are more robust.

flyingforyou accepted this revision.Apr 26 2016, 3:26 AM

LGTM.

test/CodeGen/X86/cmov-into-branch.ll
101 ↗(On Diff #54903)

Please, remove trailing white space.

spatel marked 6 inline comments as done.Apr 26 2016, 8:43 AM
spatel added inline comments.
lib/CodeGen/CodeGenPrepare.cpp
4551 ↗(On Diff #54903)
lib/IR/Metadata.cpp
1263 ↗(On Diff #54903)
test/CodeGen/X86/cmov-into-branch.ll
118 ↗(On Diff #54903)

I actually want these tests to perform double-duty:

  1. They check that profile data is driving the codegen.
  2. They check the exact boundary where that decision changes.

If we make the prof data more extreme, then someone can come by in the future and change the default threshold without breaking any tests. I don't think we want that. We want a change in the default value to be justified and testable.

spatel updated this revision to Diff 55014.Apr 26 2016, 8:57 AM
spatel marked an inline comment as done.

Patch updated:

  1. Added a cl::opt and code comment for the default min percentage threshold to enable the transform.
  2. Rebased to account for rL267491 and rL267504 .
  3. Removed trailing whitespace in test file.
davidxl accepted this revision.Apr 26 2016, 9:14 AM

lgtm

This revision was automatically updated to reflect the committed changes.