[CodeGen] Peel off the dominant case in switch statement in lowering
ClosedPublic

Authored by xur on Tue, Oct 24, 4:03 PM.

Details

Summary

This patch peels off the top case in switch statement into a branch if the probability exceeds a threshold.
This will help the branch prediction and avoids the extra compares when lowering into chain of branches.
A previous patch of the same idea was on CFG simplification pass:
https://reviews.llvm.org/D37940
This patch is implemented in SelectionDAG.

Diff Detail

Repository
rL LLVM
xur created this revision.Tue, Oct 24, 4:03 PM
hans added a comment.Wed, Oct 25, 3:31 AM

Thanks! I think doing it here rather than in SimplifyCFG is much better.

I think this can be made both more powerful and simpler, though.

First, I think we want to peel a little later, at least after adjacent cases have been merged into clusters. For example if we have case 0: case 1: case 2: foo; and the three together dominate the switch, we probably want to peel that whole cluster instead of only peeling one case. We could even do this after clustering for jump tables, but maybe we don't want to do that. I.e. if one case dominates heavily, maybe it's worth peeling the case separately rather than peeling the jump table out of the switch. On the other hand, if it dominates completely, that jump table should be well predicted...

In any case, to do the peeling I think we should re-use the current lowering mechanism. This should be done at some point before creating the WorkList. Loop through Clusters to find one that dominates. If we find it, call lowerWorkItem with a single work item which represents that cluster to lower it, update the blocks etc. Then remove the peeled cluster from Clusters and proceed as usual.

What do you think?

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
139 ↗(On Diff #120144)

Could we have just one flag instead of two separate ones? I.e. one flag that enables peeling and sets the threshold (and setting it to 0 would turn it off or something)?

141 ↗(On Diff #120144)

How was the value 66 chosen?

xur updated this revision to Diff 120702.Fri, Oct 27, 2:47 PM

Integrated the review comments and suggestion from Hans.

Thanks,

-Rong

hans added a comment.Wed, Nov 1, 10:39 AM

Thanks! I think that's looking better.

Sorry for the slow reply; I was travelling the last couple of days. Some comments inline.

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
138 ↗(On Diff #120702)

Perhaps the option name could make it more clear that this is about switch peeling. Maybe "-switch-peel-threshold"?

9839 ↗(On Diff #120702)

No need for 'inline'. Can you add a comment that explains what this does?

9847 ↗(On Diff #120702)

Please comment on what the return value is, that is updates Clusters, and what the PelledProb value is.

9849 ↗(On Diff #120702)

Maybe call it peelDominantCaseCluster instead, since it's operating on the CaseClusters.

9865 ↗(On Diff #120702)

Why are you skipping clusters where Low != High?

9866 ↗(On Diff #120702)

This requires CC.Prob so be strictly greater than the DominantCasePercentThreshold, not just greater or equal. Is that intentional? I'd assume that if the threshold is 80% probability, and we have a case with 80% probability, we want to peel it.

9880 ↗(On Diff #120702)

Is PeeledSwitchMBB the MBB that contains the peeled switch (i.e. the switch minus the most dominant case), or the one containing the peeled case? I wonder if we could find better names, or add comments to make this clear.

9883 ↗(On Diff #120702)

Maybe move this down to where it's used (when creating W).

9917 ↗(On Diff #120702)

Could we just update SwitchMBB instead, if that's the one that has the rest of the switch? (I'm a little confused by what SwitchMBB actually is at this point).

9994 ↗(On Diff #120702)

When would DefaultMBB get replaced? If it can, it needs a test case.

141 ↗(On Diff #120144)

I'm still curious about this.

test/CodeGen/Generic/switch-lower-peel-top-case.ll
13 ↗(On Diff #120702)

Checking the debug printouts doesn't seem very nice. Can't we check the resulting machine-IR and make sure it has the correct probabilities?

xur marked 7 inline comments as done.Wed, Nov 1, 1:23 PM

Hans, Thanks for the detailed reviews. Please check if the updated patch addresses your reviews.

One change not in the reviews is I move the code after rangeify clusters. This is together with the suggested change that removing Low != High check.

I also change the code a bit so that if the single case is removed, generate BR to the default directly.

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
9847 ↗(On Diff #120702)

Also changed the parameter for better clarity.

9865 ↗(On Diff #120702)

I was concerning that this would create two jumps for the range. But
this turns out not to be the case. I moved this condition.

9866 ↗(On Diff #120702)

you are right! changed it to "<"

9880 ↗(On Diff #120702)

Containing the peeled switch statement. I added a comment here.

9917 ↗(On Diff #120702)

My first try was to update SwitchMBB directly, but it did not work. It seems that we need SwitchMBB as the placehold to decide if we defer the case handling (i.e visitSwitchCase() or push_back to SwitchCases.)

9994 ↗(On Diff #120702)

We replace the unreachable default with most popular destination. This actually exposed by an existing test case.

141 ↗(On Diff #120144)

We have a case of ~70% that we would like to peel in our internal test. I lower a few percent as the buffering.

xur updated this revision to Diff 121176.Wed, Nov 1, 1:24 PM
xur marked 2 inline comments as done.

Integrated with Hans's review comments.

hans added a comment.Wed, Nov 1, 2:24 PM

Starting to look really good.

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
9881 ↗(On Diff #121176)

nit: "If there is only case is peeled." sentence looks weird.

9890 ↗(On Diff #121176)

Actually, if there's only one cluster, maybe we shouldn't be peeling. I mean, it doesn't make much sense right? And maybe that would help downstream code that could then assume that there are still cases left after peeling.

test/CodeGen/Generic/switch-lower-peel-top-case.ll
12 ↗(On Diff #121176)

This could probably be simplified a bit. If %n were an i32 instad of i16 you don't need the sext. You could drop local_unnamed_addr.

The case labels look pretty random; maybe chose cases with more purpose, and rename the target blocks to match. It would also be good to test the case where two adjacent cases are merged into a cluster and peeled.

xur updated this revision to Diff 121204.Wed, Nov 1, 4:04 PM
xur marked 2 inline comments as done.

Here is the patch that Integrated with the review comments from Hans.

Thanks!

-Rong

hans added inline comments.Wed, Nov 1, 4:17 PM
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
9994 ↗(On Diff #120702)

I still don't understand why it matters here if DefaultMBB has changed?

test/CodeGen/Generic/switch-lower-peel-top-case.ll
12 ↗(On Diff #121176)

The part about case values, and test for peeling two adjacent cases still applies.

xur added inline comments.Wed, Nov 1, 4:38 PM
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
9994 ↗(On Diff #120702)

When the DefaultMBB is changed (replaced with the most popular case), the branch probability not longer consistent. If I scale the original default case branch probability, it could create a insane probability (greater than 100%).

9881 ↗(On Diff #121176)

No need with the new patch.

9890 ↗(On Diff #121176)

Ah. Right. Current logic handles it well. I will have a check that bailout earlier.

test/CodeGen/Generic/switch-lower-peel-top-case.ll
12 ↗(On Diff #121176)

This is extracted from a real program. But I agree that it should be simplified by removing the unrelated details. I will change.

12 ↗(On Diff #121176)

Sorry I might misread your earlier comments. Do you mean I should change the case value?

hans added inline comments.Fri, Nov 3, 8:55 AM
test/CodeGen/Generic/switch-lower-peel-top-case.ll
12 ↗(On Diff #121176)

Yes.

"The case labels look pretty random; maybe chose cases with more purpose, and rename the target blocks to match. It would also be good to test the case where two adjacent cases are merged into a cluster and peeled."

xur updated this revision to Diff 121797.Mon, Nov 6, 3:04 PM

Add a test case that tests merging adjacent cases.

Thanks!

hans added inline comments.Mon, Nov 6, 4:20 PM
test/CodeGen/Generic/switch-lower-peel-top-case.ll
72 ↗(On Diff #121797)

Actually, I think this test would pass even without switch peeling.
Since there are only three clusters, it will do straight-line comparisons, with the most probable one (i.e. 85-86) first.
I'd suggest adding a few more cases, so that the remaining cases are lowered with a binary tree. That will show more clearly that the switch is getting peeled.

xur updated this revision to Diff 121924.Tue, Nov 7, 10:03 AM

Added more switch cases to make the lowering generate binary search compares.
Thanks to Hans's suggestion.

I also moved the test case to from test/CodeGenGeneric/ test/CodeGen/X86/ as it now checks the IR for x86.

hans accepted this revision.Tue, Nov 7, 10:07 AM

lgtm

Thanks for working on this!

This revision is now accepted and ready to land.Tue, Nov 7, 10:07 AM
This revision was automatically updated to reflect the committed changes.