This patch peels off the top case in switch statement into a branch if the probability exceeds a threshold.
This will help the branch prediction and avoids the extra compares when lowering into chain of branches.
A previous patch of the same idea was on CFG simplification pass:
https://reviews.llvm.org/D37940
This patch is implemented in SelectionDAG.
Details
Diff Detail
Event Timeline
Thanks! I think doing it here rather than in SimplifyCFG is much better.
I think this can be made both more powerful and simpler, though.
First, I think we want to peel a little later, at least after adjacent cases have been merged into clusters. For example if we have case 0: case 1: case 2: foo; and the three together dominate the switch, we probably want to peel that whole cluster instead of only peeling one case. We could even do this after clustering for jump tables, but maybe we don't want to do that. I.e. if one case dominates heavily, maybe it's worth peeling the case separately rather than peeling the jump table out of the switch. On the other hand, if it dominates completely, that jump table should be well predicted...
In any case, to do the peeling I think we should re-use the current lowering mechanism. This should be done at some point before creating the WorkList. Loop through Clusters to find one that dominates. If we find it, call lowerWorkItem with a single work item which represents that cluster to lower it, update the blocks etc. Then remove the peeled cluster from Clusters and proceed as usual.
What do you think?
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
---|---|---|
139 | Could we have just one flag instead of two separate ones? I.e. one flag that enables peeling and sets the threshold (and setting it to 0 would turn it off or something)? | |
141 | How was the value 66 chosen? |
Thanks! I think that's looking better.
Sorry for the slow reply; I was travelling the last couple of days. Some comments inline.
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
---|---|---|
138 | Perhaps the option name could make it more clear that this is about switch peeling. Maybe "-switch-peel-threshold"? | |
141 | I'm still curious about this. | |
9838 | No need for 'inline'. Can you add a comment that explains what this does? | |
9846 | Please comment on what the return value is, that is updates Clusters, and what the PelledProb value is. | |
9848 | Maybe call it peelDominantCaseCluster instead, since it's operating on the CaseClusters. | |
9864 | Why are you skipping clusters where Low != High? | |
9865 | This requires CC.Prob so be strictly greater than the DominantCasePercentThreshold, not just greater or equal. Is that intentional? I'd assume that if the threshold is 80% probability, and we have a case with 80% probability, we want to peel it. | |
9879 | Is PeeledSwitchMBB the MBB that contains the peeled switch (i.e. the switch minus the most dominant case), or the one containing the peeled case? I wonder if we could find better names, or add comments to make this clear. | |
9882 | Maybe move this down to where it's used (when creating W). | |
9923 | Could we just update SwitchMBB instead, if that's the one that has the rest of the switch? (I'm a little confused by what SwitchMBB actually is at this point). | |
10001 | When would DefaultMBB get replaced? If it can, it needs a test case. | |
test/CodeGen/Generic/switch-lower-peel-top-case.ll | ||
14 | Checking the debug printouts doesn't seem very nice. Can't we check the resulting machine-IR and make sure it has the correct probabilities? |
Hans, Thanks for the detailed reviews. Please check if the updated patch addresses your reviews.
One change not in the reviews is I move the code after rangeify clusters. This is together with the suggested change that removing Low != High check.
I also change the code a bit so that if the single case is removed, generate BR to the default directly.
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
---|---|---|
141 | We have a case of ~70% that we would like to peel in our internal test. I lower a few percent as the buffering. | |
9846 | Also changed the parameter for better clarity. | |
9864 | I was concerning that this would create two jumps for the range. But | |
9865 | you are right! changed it to "<" | |
9879 | Containing the peeled switch statement. I added a comment here. | |
9923 | My first try was to update SwitchMBB directly, but it did not work. It seems that we need SwitchMBB as the placehold to decide if we defer the case handling (i.e visitSwitchCase() or push_back to SwitchCases.) | |
10001 | We replace the unreachable default with most popular destination. This actually exposed by an existing test case. |
Starting to look really good.
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
---|---|---|
9881 | nit: "If there is only case is peeled." sentence looks weird. | |
9890 | Actually, if there's only one cluster, maybe we shouldn't be peeling. I mean, it doesn't make much sense right? And maybe that would help downstream code that could then assume that there are still cases left after peeling. | |
test/CodeGen/Generic/switch-lower-peel-top-case.ll | ||
13 | This could probably be simplified a bit. If %n were an i32 instad of i16 you don't need the sext. You could drop local_unnamed_addr. The case labels look pretty random; maybe chose cases with more purpose, and rename the target blocks to match. It would also be good to test the case where two adjacent cases are merged into a cluster and peeled. |
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
---|---|---|
9881 | No need with the new patch. | |
9890 | Ah. Right. Current logic handles it well. I will have a check that bailout earlier. | |
10001 | When the DefaultMBB is changed (replaced with the most popular case), the branch probability not longer consistent. If I scale the original default case branch probability, it could create a insane probability (greater than 100%). | |
test/CodeGen/Generic/switch-lower-peel-top-case.ll | ||
13 | This is extracted from a real program. But I agree that it should be simplified by removing the unrelated details. I will change. | |
13 | Sorry I might misread your earlier comments. Do you mean I should change the case value? |
test/CodeGen/Generic/switch-lower-peel-top-case.ll | ||
---|---|---|
13 | Yes. "The case labels look pretty random; maybe chose cases with more purpose, and rename the target blocks to match. It would also be good to test the case where two adjacent cases are merged into a cluster and peeled." |
test/CodeGen/Generic/switch-lower-peel-top-case.ll | ||
---|---|---|
72 | Actually, I think this test would pass even without switch peeling. |
Added more switch cases to make the lowering generate binary search compares.
Thanks to Hans's suggestion.
I also moved the test case to from test/CodeGenGeneric/ test/CodeGen/X86/ as it now checks the IR for x86.
Perhaps the option name could make it more clear that this is about switch peeling. Maybe "-switch-peel-threshold"?