Page MenuHomePhabricator

[SimplifyCFG] Speculatively flatten CFG based on profiling metadata

Authored by reames on Sep 22 2015, 2:18 PM.



If we have a series of branches which are all unlikely to fail, we can possibly combine them into a single check on the fastpath combined with a bit of dispatch logic on the slowpath. We don't want to do this unconditionally since it requires speculating instructions past a branch, but if the profiling metadata on the branch indicates profitability, this can reduce the number of checks needed along the fast path.

The canonical example this is trying to handle is removing the second bounds check implied by the Java code: a[i] + a[i+1]. Note that it can currently only do so for really simple conditions and the values of a[i] can't be used anywhere except in the addition. (i.e. the load has to have been sunk already and not prevent speculation.) I plan on extending this transform over the next few days to handle alternate sequences.

Note that this builds on the "implies" function introduced in Please ignore its definition here and direct comments to that review. Based on Sanjoy's feedback, I'm going to introduce this logic into InstSimplify and refactor both patches based on that.

Diff Detail


Event Timeline

reames updated this revision to Diff 35416.Sep 22 2015, 2:18 PM
reames retitled this revision from to [SimplifyCFG] Speculatively flatten CFG based on profiling metadata.
reames updated this object.
reames added reviewers: hfinkel, sanjoy, majnemer.
reames added a subscriber: llvm-commits.
reames updated this revision to Diff 35428.Sep 22 2015, 3:32 PM

Update based on (i.e. pulling the implication logic into InstSimplify for easy testing)

hfinkel added inline comments.Sep 23 2015, 1:22 PM
2353 ↗(On Diff #35428)

either check -> any of the checks

2382 ↗(On Diff #35428)

I'd rather that this 100 ratio be a command-line parameter for easier experimentation.

2403 ↗(On Diff #35428)

10 should be a command-line parameter too.

reames added inline comments.Sep 23 2015, 5:26 PM
2353 ↗(On Diff #35428)

Happy to make the requested change, but given this applies to a pair of branches, not quite clear why you think the new wording is more clear.

2382 ↗(On Diff #35428)

Will do.

2403 ↗(On Diff #35428)

Will do.

hfinkel added inline comments.Sep 23 2015, 5:40 PM
2353 ↗(On Diff #35428)

The comment starts with, "If we have a series of tests leading to a frequently executed fallthrough... we can test all the conditions at once...". You're right, but maybe we should just add something like, "Here, we deal with this one pair of tests at a time." So that it becomes clear that you're now talking about only two checks.

sanjoy added inline comments.Sep 23 2015, 5:53 PM
2371 ↗(On Diff #35428)

I'd drop the TryTo. LLVM usually names such functions directly (InlineFunction, UnrollLoop etc.).

2394 ↗(On Diff #35428)

Why is this needed? Since you're changing PBI's condition to be BI's condition, and BI's condition implies PBI's condition, I don't see what gets speculatively executed.

2493 ↗(On Diff #35428)

auto * is clearer, since the type is obvious from the dyn_cast.

reames added inline comments.Sep 28 2015, 10:41 AM
2394 ↗(On Diff #35428)

The problem is that instructions from BI's block are speculated above PBI. If you had some code like:
if (i > length) return;
%p = load a[i];
if (i + 1 > length) return;

After the transform, we'd end up with:
%p = load a[i];
if (i + 1 > length) {if (i > length) return; else return;}

We can't necessarily speculate the load above the first condition.

2493 ↗(On Diff #35428)

Agreed, but for the record, I just moved this line.

reames updated this revision to Diff 35895.Sep 28 2015, 10:57 AM

Address review comments by Hal and Sanjoy.

sanjoy edited edge metadata.Sep 28 2015, 12:10 PM

Minor nits inline. With those addressed, this LGTM (I'd still wait for @hfinkel to take a look).

2354 ↗(On Diff #35895)

Use dyn_cast_or_null to directly cast to Constant.

2385 ↗(On Diff #35895)

Nit: spacing is a little off here.

2405 ↗(On Diff #35895)

SGTM. :)

2412 ↗(On Diff #35895)

Should CtxI be nullptr here? We're speculating it above PBI, so I think we cannot assume control dependence on PBI.

2417 ↗(On Diff #35895)

Minor nit: InvokeInst should have been caught with isa<TerminatorInst>. In fact, we can pretty much assert(!isa<InvokeInst>(I)) since the only terminator allowed in BB is BI.

2424 ↗(On Diff #35895)

[This is optional to fix, since the actual rewriting is fairly simple]

I'd put a small "before" and "after" IR snippet here, using the C++ identifiers in IR, like:

// Before:
//  br %whichCond, label %bb, label %fail1  ;; == PBB
// bb:
//  br %cond, label %success, label %fail2  ;; == BB


I think that will make the graph rewriting clearer.

2427 ↗(On Diff #35895)

I'd call these FailBI and FailPBI.

reames updated this revision to Diff 35913.Sep 28 2015, 3:07 PM
reames edited edge metadata.

Respond to Sanjoy's review comments. Given he gave a contingent LGTM, I'm going to let this sit for a day in case anyone has final comments, then submit.

This revision was automatically updated to reflect the committed changes.