When there's a tie between partitionings of jump tables, consider also cases that result in no jump tables, but in one or a few cases. The motivation is that many contemporary processors perform case switches fairly quickly.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
llvm/include/llvm/Target/TargetLowering.h | ||
---|---|---|
1032 ↗ | (On Diff #73347) | Having both this and getMinimumJumpTableEntries seems pretty confusing. |
1040 ↗ | (On Diff #73347) | I'll come back with more comments, but wanted to mention this one especially. First, the name seems backwards. But the functionality of this seems very confusing. What does generating fewer jump tables "all else being equal" mean? Why not just turn off jump tables in that case? |
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
8550 ↗ | (On Diff #73347) | I really don't like this. The purpose of this algorithm is to find jump tables. If you don't want to find jump tables, don't run it. If you think it's generating too large or sparse jump tables, tweak those parameters. |
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
---|---|---|
8550 ↗ | (On Diff #73347) | I'd like to have the option to choose more of fewer jump tables when the number of partitions is the same. It's not clear to me why only one possibility was cast in the code originally, but it doesn't seem to me to be always the best choice. |
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
---|---|---|
8550 ↗ | (On Diff #73347) |
It's because the whole purpose of the code is to find jump tables. Consider the @optimal_jump_table2 test in test/CodeGen/X86/switch.ll. The cases in the switch can be partitioned as {0,1,2,9},{14,15} or {0,1,2},{9,14,15}. The code ensures that the partitioning with the jump table is chosen. If you think that's the wrong lowering for your target, maybe increasing the minimum jump table entries is what you want. |
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
---|---|---|
8550 ↗ | (On Diff #73347) | Consider that function jt1 in test/CodeGen/AArch64/max-jump-table.ll ends up with partitions {0..7}, {8..12},{13..16} with -max-jump-table=8, whereas arguably having its jump table partitioned as {0..7},{8..15} and an extra link in the if-then-else chain would be better performing. I find the hard wired choice of always favoring more jump tables arbitrary. The algorithm works rather well as is if favoring fewer jump tables. |
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
---|---|---|
8550 ↗ | (On Diff #73347) | It's not arbitrary; it makes the algorithm work, as shown by the example in my previous comment. Your example is interesting, though. I guess the question is, how do we teach the algorithm that {0..7},{8..15},{16} is better than {0..7},{8..12},{13..16}? The "pick the partitioning with more jump tables" rule is there to break ties in situations like my previous example: {0,1,2,9},{14,15} vs {0,1,2},{9,14,15}. However, a partition with a single case (cluster), {16} in your example is clearly as good as a jump table -- in fact it's better. To model this, instead of having NumTables[i], we could have a PartitioningScore[i]. In addition to scoring 1 when the partition has >= MinJumpTableEntries, we could score 2 for a single-element partition (and perhaps also 1 for a two-element partition). What do you think? |
llvm/include/llvm/Target/TargetLowering.h | ||
---|---|---|
1030 ↗ | (On Diff #74682) | Is deleting this function body intentional? Dito for the "set" method below. |
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
8484 ↗ | (On Diff #74682) | The comment needs to say what the vector index means. How about: "ParitionsScore[i]: Score of partitions for Clusters[i..N-1]. The score is used to break ties when choosing between two partitionings resulting in the same number of partitions." Note that this needs to be the sum of the scores for the individual partitions. I've commented about this below. |
8490 ↗ | (On Diff #74682) | This base case means putting Clusters[N-1] into a partition of its own, so the score should be that of a one-element partition. |
8498 ↗ | (On Diff #74682) | Same here: this case means putting Clusters[i] into a partition of its own. The total score should be that of a one-element partition plus PartitionScore[i + 1]. |
8511 ↗ | (On Diff #74682) | Hmm, I wonder if we should start simple and just check for the jump table and one-element cases. (That name isn't great either -- SmallJumpTableEntries sounds like it's the number of elements in a small jump table, but that's not the case.) |
8513 ↗ | (On Diff #74682) | Hmm, the computation of this score is subtle. A one-element partition gets a higher score than a jump table, but only because it's also <= SmallJumpTableEntries. I'd prefer to spell things out here with an if-else chain and symbolic constants for the scores instead. Also, we need the score for the whole partitioning, not just this partition, so I think this needs to be something like Score = ThisScore + (j == N - 1 ? 0 : PartitionScore[j + 1]); |
llvm/include/llvm/Target/TargetLowering.h | ||
---|---|---|
1030 ↗ | (On Diff #74682) | Yes, since I moved them to lib/CodeGen/TargetLoweringBase.cpp as I created the cl::opt min-jump-table-entries to allow easy experimentation with this value. |
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
8484 ↗ | (On Diff #74682) | Accumulating the scores seemed to distort the relative merits of the partitions. I got more consistent results by isolating the score to just the individual partition. |
8490 ↗ | (On Diff #74682) | Right. |
8511 ↗ | (On Diff #74682) | It seemed to me that, if a target maintainer changes min-jump-table-entries, it's an important metric. It made sense to use a threshold that'd be related to this metric. |
8513 ↗ | (On Diff #74682) | I'll spell out in a comment what is intended. But I'm still debating with myself if the scoring should be cumulative or not. |
8550 ↗ | (On Diff #73347) | I really like your idea. I'll try to translate it into code and run some tests before updating the diff. Watch this space. |
Apologies for the slow reply.
I think this is getting close.. some comments inline.
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
---|---|---|
8485 ↗ | (On Diff #75081) | I would call these scores consistently instead of weights. |
8488 ↗ | (On Diff #75081) | I'd drop "Enum" from the name, just "PartitionScore" sounds like a good name to me. |
8489 ↗ | (On Diff #75081) | Hmm, "None" doesn't seem like a great name. Maybe "Default" is better? This would be used for partitions that are too small to be a table, but not small enough to get a better score. I'm not sure what to name that though.. How about something like: NoTable = 0 |
8498 ↗ | (On Diff #75081) | Since this is a one-element partition, shouldn't it be PartitionScoreEnum::One? |
8506 ↗ | (On Diff #75081) | I think this should be "+ PartitionScoreEnum::One" instead of "+ 1" since it's the score for a single-element partition we want to add. |
8509 ↗ | (On Diff #75081) | Why change > to >= here? |
8524 ↗ | (On Diff #75081) | I think it should be 0 instead of PartitionScoreEnum::None here. The "j == N - 1" is used to check whether the current partition is the last one, which means there is no score to add from later partitions. |
8528 ↗ | (On Diff #75081) | It would be clearer to just have another else if-bransh instead of combining two cases into one here. |
It might also be worth splitting this patch into two: one for this new way of breaking ties between equal number of partitions, and another one for the minimum jump table entries change.
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
---|---|---|
8489 ↗ | (On Diff #75081) | Sorry, I must have slipped while typing :-) I meant FewCases and Table would be = 1, and SingleCase = 2. |
8509 ↗ | (On Diff #75081) | But the code is considering it above, under "Baseline: Put Clusters[i] into a partition on its own." |
Thanks! I think the functionality is really good, and this is almost ready to go in. I only have a few stylistic nits left.
Maybe the patch title could be more descriptive, because we're not just reducing the number of jump tables here, we're specifically making the code smarter about how to break ties when partitioning a switch in search for jump tables. So maybe: "Switch lowering: break ties smarter when partitioning switch for jump tables"
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
---|---|---|
8482 ↗ | (On Diff #75484) | Since this isn't just the score of one partition, but all partitions starting with the i'th cluster, the name should reflect that, e.g. PartitionsScore or PartitioningScore. |
8485 ↗ | (On Diff #75484) | The wording here doesn't seem very clear. What does "Score relative to generating a jump table to score partitions." mean? We've already commented on how this works for the PartitionsScore vector above, so maybe we can just refer to that: "For PartitionsScore, a small number of comparisons is considered as good as a jump table and ... ". |
8524 ↗ | (On Diff #75484) | I'd suggest checking IsOne first, then IsSmall and IsTable last to avoid the "IsSmall && !IsOne" situation below. Also, I'm not sure how much the bool variables buy us here; it might be just as clear to just put the conditions in the if statements directly. Up to you. |
llvm/lib/CodeGen/TargetLoweringBase.cpp | ||
48 ↗ | (On Diff #75484) | This seems unrelated so should be done in another patch. |
Looks good to me!
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
---|---|---|
8482 ↗ | (On Diff #75742) | Nit: the comment needs an update to match the variable name. |