This is an archive of the discontinued LLVM Phabricator instance.

[SimplifyCFG] Speculatively flatten CFG based on profiling metadata
ClosedPublic

Authored by reames on Sep 22 2015, 2:18 PM.

Details

Summary

If we have a series of branches which are all unlikely to fail, we can possibly combine them into a single check on the fastpath combined with a bit of dispatch logic on the slowpath. We don't want to do this unconditionally since it requires speculating instructions past a branch, but if the profiling metadata on the branch indicates profitability, this can reduce the number of checks needed along the fast path.

The canonical example this is trying to handle is removing the second bounds check implied by the Java code: a[i] + a[i+1]. Note that it can currently only do so for really simple conditions and the values of a[i] can't be used anywhere except in the addition. (i.e. the load has to have been sunk already and not prevent speculation.) I plan on extending this transform over the next few days to handle alternate sequences.

Note that this builds on the "implies" function introduced in http://reviews.llvm.org/D13040. Please ignore its definition here and direct comments to that review. Based on Sanjoy's feedback, I'm going to introduce this logic into InstSimplify and refactor both patches based on that.

Diff Detail

Event Timeline

reames updated this revision to Diff 35416.Sep 22 2015, 2:18 PM
reames retitled this revision from to [SimplifyCFG] Speculatively flatten CFG based on profiling metadata.
reames updated this object.
reames added reviewers: hfinkel, sanjoy, majnemer.
reames added a subscriber: llvm-commits.
reames updated this revision to Diff 35428.Sep 22 2015, 3:32 PM

Update based on http://reviews.llvm.org/D13074 (i.e. pulling the implication logic into InstSimplify for easy testing)

hfinkel added inline comments.Sep 23 2015, 1:22 PM
lib/Transforms/Utils/SimplifyCFG.cpp
2364

either check -> any of the checks

2393

I'd rather that this 100 ratio be a command-line parameter for easier experimentation.

2414

10 should be a command-line parameter too.

reames added inline comments.Sep 23 2015, 5:26 PM
lib/Transforms/Utils/SimplifyCFG.cpp
2364

Happy to make the requested change, but given this applies to a pair of branches, not quite clear why you think the new wording is more clear.

2393

Will do.

2414

Will do.

hfinkel added inline comments.Sep 23 2015, 5:40 PM
lib/Transforms/Utils/SimplifyCFG.cpp
2364

The comment starts with, "If we have a series of tests leading to a frequently executed fallthrough... we can test all the conditions at once...". You're right, but maybe we should just add something like, "Here, we deal with this one pair of tests at a time." So that it becomes clear that you're now talking about only two checks.

sanjoy added inline comments.Sep 23 2015, 5:53 PM
lib/Transforms/Utils/SimplifyCFG.cpp
2382

I'd drop the TryTo. LLVM usually names such functions directly (InlineFunction, UnrollLoop etc.).

2405

Why is this needed? Since you're changing PBI's condition to be BI's condition, and BI's condition implies PBI's condition, I don't see what gets speculatively executed.

2507

auto * is clearer, since the type is obvious from the dyn_cast.

reames added inline comments.Sep 28 2015, 10:41 AM
lib/Transforms/Utils/SimplifyCFG.cpp
2405

The problem is that instructions from BI's block are speculated above PBI. If you had some code like:
if (i > length) return;
%p = load a[i];
if (i + 1 > length) return;

After the transform, we'd end up with:
%p = load a[i];
if (i + 1 > length) {if (i > length) return; else return;}

We can't necessarily speculate the load above the first condition.

2507

Agreed, but for the record, I just moved this line.

reames updated this revision to Diff 35895.Sep 28 2015, 10:57 AM

Address review comments by Hal and Sanjoy.

sanjoy edited edge metadata.Sep 28 2015, 12:10 PM

Minor nits inline. With those addressed, this LGTM (I'd still wait for @hfinkel to take a look).

lib/Transforms/Utils/SimplifyCFG.cpp
2354

Use dyn_cast_or_null to directly cast to Constant.

2385

Nit: spacing is a little off here.

2405

SGTM. :)

2412

Should CtxI be nullptr here? We're speculating it above PBI, so I think we cannot assume control dependence on PBI.

2417

Minor nit: InvokeInst should have been caught with isa<TerminatorInst>. In fact, we can pretty much assert(!isa<InvokeInst>(I)) since the only terminator allowed in BB is BI.

2424

[This is optional to fix, since the actual rewriting is fairly simple]

I'd put a small "before" and "after" IR snippet here, using the C++ identifiers in IR, like:

// Before:
//
//  br %whichCond, label %bb, label %fail1  ;; == PBB
//
// bb:
//  br %cond, label %success, label %fail2  ;; == BB
//

etc.

I think that will make the graph rewriting clearer.

2427

I'd call these FailBI and FailPBI.

reames updated this revision to Diff 35913.Sep 28 2015, 3:07 PM
reames edited edge metadata.

Respond to Sanjoy's review comments. Given he gave a contingent LGTM, I'm going to let this sit for a day in case anyone has final comments, then submit.

This revision was automatically updated to reflect the committed changes.