This is an archive of the discontinued LLVM Phabricator instance.

Switch lowering: improve partitioning of jump tables
ClosedPublic

Authored by evandro on Oct 3 2016, 2:54 PM.

Details

Summary

When there's a tie between partitionings of jump tables, consider also cases that result in no jump tables, but in one or a few cases. The motivation is that many contemporary processors perform case switches fairly quickly.

Diff Detail

Repository
rL LLVM

Event Timeline

evandro updated this revision to Diff 73347.Oct 3 2016, 2:54 PM
evandro retitled this revision from to Add support to tune jump tables.
evandro updated this object.
evandro added reviewers: hans, t.p.northover.
evandro set the repository for this revision to rL LLVM.
evandro added subscribers: spop, llvm-commits.
hans added inline comments.Oct 3 2016, 3:48 PM
llvm/include/llvm/Target/TargetLowering.h
1032 ↗(On Diff #73347)

Having both this and getMinimumJumpTableEntries seems pretty confusing.

1040 ↗(On Diff #73347)

I'll come back with more comments, but wanted to mention this one especially. First, the name seems backwards.

But the functionality of this seems very confusing. What does generating fewer jump tables "all else being equal" mean? Why not just turn off jump tables in that case?

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8550 ↗(On Diff #73347)

I really don't like this. The purpose of this algorithm is to find jump tables. If you don't want to find jump tables, don't run it. If you think it's generating too large or sparse jump tables, tweak those parameters.

evandro marked an inline comment as done.Oct 3 2016, 4:14 PM
evandro added inline comments.
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8550 ↗(On Diff #73347)

I'd like to have the option to choose more of fewer jump tables when the number of partitions is the same. It's not clear to me why only one possibility was cast in the code originally, but it doesn't seem to me to be always the best choice.

hans added inline comments.Oct 10 2016, 1:59 AM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8550 ↗(On Diff #73347)

It's not clear to me why only one possibility was cast in the code originally, but it doesn't seem to me to be always the best choice.

It's because the whole purpose of the code is to find jump tables.

Consider the @optimal_jump_table2 test in test/CodeGen/X86/switch.ll. The cases in the switch can be partitioned as {0,1,2,9},{14,15} or {0,1,2},{9,14,15}. The code ensures that the partitioning with the jump table is chosen.

If you think that's the wrong lowering for your target, maybe increasing the minimum jump table entries is what you want.

evandro added inline comments.Oct 11 2016, 9:22 AM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8550 ↗(On Diff #73347)

Consider that function jt1 in test/CodeGen/AArch64/max-jump-table.ll ends up with partitions {0..7}, {8..12},{13..16} with -max-jump-table=8, whereas arguably having its jump table partitioned as {0..7},{8..15} and an extra link in the if-then-else chain would be better performing.

I find the hard wired choice of always favoring more jump tables arbitrary. The algorithm works rather well as is if favoring fewer jump tables.

hans added inline comments.Oct 12 2016, 1:47 AM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8550 ↗(On Diff #73347)

It's not arbitrary; it makes the algorithm work, as shown by the example in my previous comment.

Your example is interesting, though. I guess the question is, how do we teach the algorithm that {0..7},{8..15},{16} is better than {0..7},{8..12},{13..16}?

The "pick the partitioning with more jump tables" rule is there to break ties in situations like my previous example: {0,1,2,9},{14,15} vs {0,1,2},{9,14,15}.

However, a partition with a single case (cluster), {16} in your example is clearly as good as a jump table -- in fact it's better.

To model this, instead of having NumTables[i], we could have a PartitioningScore[i]. In addition to scoring 1 when the partition has >= MinJumpTableEntries, we could score 2 for a single-element partition (and perhaps also 1 for a two-element partition).

What do you think?

evandro updated this revision to Diff 74682.Oct 14 2016, 7:29 AM
evandro retitled this revision from Add support to tune jump tables to Moderate the number of jump tables.
evandro updated this object.

Revamp the patch to add the heuristic proposed by @hans.

hans added inline comments.Oct 17 2016, 3:46 PM
llvm/include/llvm/Target/TargetLowering.h
1030 ↗(On Diff #74682)

Is deleting this function body intentional? Dito for the "set" method below.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8484 ↗(On Diff #74682)

The comment needs to say what the vector index means.

How about: "ParitionsScore[i]: Score of partitions for Clusters[i..N-1]. The score is used to break ties when choosing between two partitionings resulting in the same number of partitions."

Note that this needs to be the sum of the scores for the individual partitions. I've commented about this below.

8490 ↗(On Diff #74682)

This base case means putting Clusters[N-1] into a partition of its own, so the score should be that of a one-element partition.

8498 ↗(On Diff #74682)

Same here: this case means putting Clusters[i] into a partition of its own. The total score should be that of a one-element partition plus PartitionScore[i + 1].

8511 ↗(On Diff #74682)

Hmm, I wonder if we should start simple and just check for the jump table and one-element cases. (That name isn't great either -- SmallJumpTableEntries sounds like it's the number of elements in a small jump table, but that's not the case.)

8513 ↗(On Diff #74682)

Hmm, the computation of this score is subtle. A one-element partition gets a higher score than a jump table, but only because it's also <= SmallJumpTableEntries.

I'd prefer to spell things out here with an if-else chain and symbolic constants for the scores instead.

Also, we need the score for the whole partitioning, not just this partition, so I think this needs to be something like Score = ThisScore + (j == N - 1 ? 0 : PartitionScore[j + 1]);

evandro marked 2 inline comments as done.Oct 18 2016, 8:17 AM
evandro added inline comments.
llvm/include/llvm/Target/TargetLowering.h
1030 ↗(On Diff #74682)

Yes, since I moved them to lib/CodeGen/TargetLoweringBase.cpp as I created the cl::opt min-jump-table-entries to allow easy experimentation with this value.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8484 ↗(On Diff #74682)

Accumulating the scores seemed to distort the relative merits of the partitions. I got more consistent results by isolating the score to just the individual partition.

8490 ↗(On Diff #74682)

Right.

8511 ↗(On Diff #74682)

It seemed to me that, if a target maintainer changes min-jump-table-entries, it's an important metric. It made sense to use a threshold that'd be related to this metric.

8513 ↗(On Diff #74682)

I'll spell out in a comment what is intended.

But I'm still debating with myself if the scoring should be cumulative or not.

8550 ↗(On Diff #73347)

I really like your idea. I'll try to translate it into code and run some tests before updating the diff.

Watch this space.

evandro marked 8 inline comments as done.Oct 18 2016, 8:21 AM
evandro updated this revision to Diff 75081.Oct 18 2016, 3:53 PM
evandro marked 3 inline comments as done.
hans edited edge metadata.Oct 21 2016, 8:39 AM

Apologies for the slow reply.

I think this is getting close.. some comments inline.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8485 ↗(On Diff #75081)

I would call these scores consistently instead of weights.

8488 ↗(On Diff #75081)

I'd drop "Enum" from the name, just "PartitionScore" sounds like a good name to me.

8489 ↗(On Diff #75081)

Hmm, "None" doesn't seem like a great name. Maybe "Default" is better?

This would be used for partitions that are too small to be a table, but not small enough to get a better score. I'm not sure what to name that though..

How about something like:

NoTable = 0
FewCases = 1
Table = 2
SingleCase = 2

8498 ↗(On Diff #75081)

Since this is a one-element partition, shouldn't it be PartitionScoreEnum::One?

8506 ↗(On Diff #75081)

I think this should be "+ PartitionScoreEnum::One" instead of "+ 1" since it's the score for a single-element partition we want to add.

8509 ↗(On Diff #75081)

Why change > to >= here?

8524 ↗(On Diff #75081)

I think it should be 0 instead of PartitionScoreEnum::None here.

The "j == N - 1" is used to check whether the current partition is the last one, which means there is no score to add from later partitions.

8528 ↗(On Diff #75081)

It would be clearer to just have another else if-bransh instead of combining two cases into one here.

hans added a comment.Oct 21 2016, 8:44 AM

It might also be worth splitting this patch into two: one for this new way of breaking ties between equal number of partitions, and another one for the minimum jump table entries change.

evandro marked 8 inline comments as done.Oct 21 2016, 12:54 PM
evandro added inline comments.
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8489 ↗(On Diff #75081)

OK, but my understanding is that a few cases are as good as a table.

8509 ↗(On Diff #75081)

Otherwise single case partitions will not be considered.

hans added inline comments.Oct 21 2016, 12:59 PM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8489 ↗(On Diff #75081)

Sorry, I must have slipped while typing :-) I meant FewCases and Table would be = 1, and SingleCase = 2.

8509 ↗(On Diff #75081)

But the code is considering it above, under "Baseline: Put Clusters[i] into a partition on its own."

evandro marked 8 inline comments as done.Oct 21 2016, 2:08 PM
evandro updated this revision to Diff 75484.Oct 21 2016, 2:14 PM
evandro edited edge metadata.
hans added a comment.Oct 25 2016, 9:42 AM

Thanks! I think the functionality is really good, and this is almost ready to go in. I only have a few stylistic nits left.

Maybe the patch title could be more descriptive, because we're not just reducing the number of jump tables here, we're specifically making the code smarter about how to break ties when partitioning a switch in search for jump tables. So maybe: "Switch lowering: break ties smarter when partitioning switch for jump tables"

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8482 ↗(On Diff #75484)

Since this isn't just the score of one partition, but all partitions starting with the i'th cluster, the name should reflect that, e.g. PartitionsScore or PartitioningScore.

8485 ↗(On Diff #75484)

The wording here doesn't seem very clear. What does "Score relative to generating a jump table to score partitions." mean?

We've already commented on how this works for the PartitionsScore vector above, so maybe we can just refer to that:

"For PartitionsScore, a small number of comparisons is considered as good as a jump table and ... ".

8524 ↗(On Diff #75484)

I'd suggest checking IsOne first, then IsSmall and IsTable last to avoid the "IsSmall && !IsOne" situation below.

Also, I'm not sure how much the bool variables buy us here; it might be just as clear to just put the conditions in the if statements directly. Up to you.

llvm/lib/CodeGen/TargetLoweringBase.cpp
48 ↗(On Diff #75484)

This seems unrelated so should be done in another patch.

evandro marked 4 inline comments as done.Oct 25 2016, 11:24 AM
evandro updated this revision to Diff 75742.Oct 25 2016, 11:26 AM
evandro retitled this revision from Moderate the number of jump tables to Switch lowering: improve partitioning of jump tables.
evandro updated this object.
hans accepted this revision.Oct 25 2016, 11:34 AM
hans edited edge metadata.

Looks good to me!

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8482 ↗(On Diff #75742)

Nit: the comment needs an update to match the variable name.

This revision is now accepted and ready to land.Oct 25 2016, 11:34 AM
evandro marked an inline comment as done.Oct 25 2016, 11:59 AM
This revision was automatically updated to reflect the committed changes.