During the lowering of a switch that would result in the generation of a jump table, a range check is performed before indexing into the jump table, for the switch value being outside the jump table range and a conditional branch is inserted to jump to the default block. In case the default block is unreachable, this conditional jump can be omitted. This patch implements omitting this conditional branch for unreachable defaults.
Details
- Reviewers
- bogner - hans - bkramer 
- Commits
- rG2a0f2c5ef333: [CodeGen] Omit range checks from jump tables when lowering switches with…
 rL355490: [CodeGen] Omit range checks from jump tables when lowering switches with…
 rG6025fa8e3007: [CodeGen] Omit range checks from jump tables when lowering switches with…
 rL355483: [CodeGen] Omit range checks from jump tables when lowering switches with…
Diff Detail
Event Timeline
I don't know enough about switch lowering to review this, so adding some more potential reviewers.
But there are at least a couple of obvious changes needed before this can proceed:
- The patch must include tests that show the difference with this patch.
- The indentation and spacing are clearly off (use clang-format?), and variable naming doesn't conform with: http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly
I don't think storing the MaxSwitchValue in the switch instructions itself is the right approach. Generally LLVM doesn't store analysis results in the instructions.
I also don't think the right way to do this is necessarily to change the jump table lowering itself, instead I'd suggest doing it at the IR level:
if we detect that the value range for the switch variable is small enough that adding a few more cases to the switch would cover the range completely, add those cases and mark the default bb unreachable. Then let the lowering code deal with exploiting the unreachable default bb. There is room for improvement there, see "FIXME: Exploit unreachable default more aggressively" in visitSwitch() but we should do that anyway.
What do you think?
Thanks for the comments. I will take care of these issues when I resubmit the patch after the changes suggested by Hans.
Thanks for your comments Hans.
I will move the storing of the MaxSwitchValue to the metadata of the instruction. I think that is the right place to store these things.
However, I beg to differ on the second point. If we do the changes in the IR by adding the requisite case statements, it will definitely generate the switch table that we expect to generate. However, when we mark the default branch as unreachable, the lowering code will redirect the unreachable default to the most popular switch block. And that would mean that the conditional branch for the default case would still be generated. Unless we have a way to completely snuff out the default case, which I believe is difficult since there are other parts of the lowering code that have a dependency on the handling of the default case, I do not see how we can do away with the conditional branch for the default case. Did I miss something there?
Also, when you point to the FIXME in the visitSwitch() do you mean that we can completely do away with an unreachable default case? I feel theoretically we should be able to. If there is an unreachable default, then we should be able to jump to the end of the switch block. But even that would mean a conditional jump to the end of the switch block that would handle the default case.
I feel that the only way we can handle this is during lowering. That decision about whether to completely skip the branch for the default case is only possible at that level. Even if we do the creation of the expanded jump table by adding new case statements to the IR, this one thing probably has to be done at the time of lowering only. Would you have other suggestions to handle this aspect?
If we expand the switch at the IR Level, the pass that does the transformation could just depend on whatever Analysis pass is necessary to determine the value range, and we wouldn't have to store it anywhere.
However, I beg to differ on the second point. If we do the changes in the IR by adding the requisite case statements, it will definitely generate the switch table that we expect to generate. However, when we mark the default branch as unreachable, the lowering code will redirect the unreachable default to the most popular switch block. And that would mean that the conditional branch for the default case would still be generated.
Right, we would have to address the FIXME first as discussed below.
Also, when you point to the FIXME in the visitSwitch() do you mean that we can completely do away with an unreachable default case? I feel theoretically we should be able to. If there is an unreachable default, then we should be able to jump to the end of the switch block. But even that would mean a conditional jump to the end of the switch block that would handle the default case.
No, the idea would be that if the default is unreachable, we can just omit the range check from the jump table completely.
SwitchToLookupTable in the SimplifyCFG pass already does the equivalent transformation: for switches used to select between constant values, it generates lookup tables, and if the default is unreachable it will omit the range check for the table completely.
I feel that the only way we can handle this is during lowering. That decision about whether to completely skip the branch for the default case is only possible at that level. Even if we do the creation of the expanded jump table by adding new case statements to the IR, this one thing probably has to be done at the time of lowering only. Would you have other suggestions to handle this aspect?
My suggestion is that the decision about whether to skip the branch for the default case should be driven by what the IR looks like: if the default basic block is unreachable, the branch can be skipped. I think this would be a clean design.
Thanks Hans for the detailed explanation. I was about to go ahead and implement your suggestion but I noticed one problem here.
If the default basic block is unreachable and we completely omit the conditional branch in all such cases, then if the switch value at runtime turns out to be not one of the cases given in the switch block, the execution would reach the vectored branch to the jump table which would fail to reach anywhere. So we need to omit the conditional branch only in such cases of the unreachable default block where the switch value is known to have a specific range of values. That means that we still need to carry that information till the lowering phase as my current implementation is doing right now. How we do that is the question.
If the default basic block is unreachable, that implies that one of the other cases must always be taken. Otherwise, the default wouldn't be unreachable after all.
For example:
int func(int x) {
    switch (x) {
    case 0: return f();
    case 1: return g();
    case 2: return h();
    case 3: return f();
    case 4: return h();
    }
}The default is unreachable here because it would "fall off" the end of the function without returning, which is undefined behaviour. So the switch lowering should be able to remove the range check for the jump table.
In the case above, if there were to be more statements beyond the switch block, isn't the control supposed to reach the end of the switch block and continue with the rest of the statements if the value of x doesn't match any of the cases?
Actually, I think, the behaviour in such cases is undefined, isn't it? And that's what is being exploited when the lowering is redirecting the unreachable default to the most popular destination. Another approach could have been to direct it to the end of the switch block.
On second thoughts, I realize that the default may not be unreachable in the case that I mentioned since the block beyond the switch block would then be treated as the default. Am I right?
Yes, if there are statements after the block, that would be treated as the default and it would not be unreachable.
Thanks Hans for the clarification. I will go ahead and implement your suggestion. Once done, I will re-post the patch for review.
Thanks Hans for the clarification. I will go ahead and implement your suggestion. Once done, I will re-post the patch for review.
Sounds great.
It probably makes sense to do separate patches for removing bounds check for unreachable default, and for adding the extra cases when the set of possible switch values is small.
In this update I have broken down the original patch into two parts and this is the first part, which handles the unreachable default. This part omits the generation of the conditional branch for checking the switch expression against the maximum value of the switch cases and jumping to the default block, if the default is unreachable.
It also removes the implementation that was redirecting the unreachable default to the most popular destination.
The next patch will handle the remaining part of the original patch which is to add a case statement for each known default value and direct it to the default block.
Sorry for the slow response.
This is looking promising. I have a few comments, and also this needs a good test.
| lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
|---|---|---|
| 2168 | Nit: I would call it OmitRangeCheck instead. And maybe this flag could be part of the JumpTableHeader instead, so we don't need to pass it around here? | |
| 10077 | Instead of checking for unreachable default here, I think we should check for it earlier, already when the WorkItems are created probably. | |
Never mind. I appreciate the effort on your part.
Should I be adding a test to the "test" directory or the "unittests" directory? Or is there somewhere else that I should add the test to?
test/CodeGen/X86/ e.g.; or some other arch.
*Ideally* please do use llvm/utils/update_llc_test_checks.py, and put the original tests
(i.e. without this patch) in a separate review, so this review shows how the tests *change*.
Obviously $ ninja check-llvm-codegen should pass.
I have made the changes that you mentioned.
I still need to add the test. I will do so once I have got the code right. In the meantime, could you please check if this looks fine for a checkin?
This is looking pretty good, so I'd suggest start writing tests. For testing, take a look at test/CodeGen/X86/switch.ll
I think we could also exploit unreachable default for the other lowering strategies (bit-tests, straight comparisons and binary search tree), but that can be done in later patches.
| lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
|---|---|---|
| 9570 | There seems to be an extra space between "Default" and "&&". Also, why do we need the SI->getNumCases() check? | |
Anyone have any thoughts about potential security implications for this? Normally, I wouldn't really care about assuming code never has undefined behavior, but an indirect jump to an arbitrary address is much easier to exploit than other sorts of undefined behavior.
It's an interesting question. An indirect jump like this does seem like a pretty powerful gadget.
On the other hand, if we really know that the default is unreachable, it should be safe, such as when the switch cases cover all possible inputs.
The scary case is really when default is unreachable because of UB, the typical case being a function ending without returning a value. But Clang is pretty good about warning about that, so I'm not too concerned.
The earlier code had a problem in that if we converted every switch that had an unreachable default, to omit the check for the default values beyond the maximum switch case value given in the code, then there were some corner cases where the code generated was semantically incorrect and resulted in runtime failures. These failures are happening while compiling LLVM code itself.
So I have added a boolean in the SwitchInst class to mark switches that have been widened, so that we omit the branch only for those switches. I need to explore the reasons for the behaviour that was observed when we were blindly removing the branch for all switches that had an unreachable default and will post a fix in a later patch.
Please review this and the other two patches 52707 and another one with the test case for this patch.
We need to figure out what those failures were. If the unreachable default was indeed reachable, that's a big problem.
So I have added a boolean in the SwitchInst class to mark switches that have been widened, so that we omit the branch only for those switches.
LLVM doesn't put this kind of optimization metadata in the IR. The right thing to do is to figure out why removing the branch for unreachable default didn't work, and fix it.
If you can provide more details about what didn't work, maybe I can help investigate. (Though I'm about to go on holiday soon, so probably not until January.)
There are three files that when compiled with this patch, generate wrong code, viz., AArch64LoadStoreOptimizer.cpp, AArch64InstrInfo.cpp and AArch64ConditionalCompares.cpp. Out of these we tried to isolate the problem with the last one. I figured out that if the functions SSACCmpConv::findConvertibleCompare() and SSACCmpConv::convert() are compiled without this patch, the code works fine. So the problem surfaces with these two routines only. There are a few switch cases in those two routines but I couldn't see anything exceptional with those except for a call to builtin_unreachable() in the default case for two of the switches and a [[clang::fallthrough]] in another. In all these three cases, I was unable to figure out how they could possibly break our assumptions. Does the builtin_unreachable() have any special semantic that we are not handling?
Does the error show with the regular lit tests, or do you have some internal test that fails?
My first guess would be that one of the "unreachable" defaults aren't actually unreachable for some input. But then they should trap in an asserts-enabled build..
@hans 
Is there a way to attach a pre-processed file to this review without affecting the files that are being reviewed?
Phabricator has an "upload file" function... or you can just send an email with an attachment to llvm-commits.
No, this shows up in an internal test. I figured out that the code actually has calls to llvm_unreachable() with an error message, which in a non-debug build, calls builtin_unreachable(). In a debug build, it would have printed a message. I'm not sure how to deal with this. Do you think it is safe to assume that such a behaviour is expected and the test must fail because the inputs are not properly handled or do you think we should handle the unreachable defaults with caution (read conservatively)? I read up some of the articles on the net on the builtin_unreachable() behaviour. They mention that the program should exit, albeit with an error. However, with our way of handling the unreachable defaults, the program crashes with a segmentation fault.
My take is that we should have a way to mark such unreachable defaults that are caused by a call to builtin_unreachable() and not omit the branch in those cases and allow the system dependent implementation of unreachable_default() to handle the manner in which, the program must exit.
There are three files that when compiled with this patch, generate wrong code, viz., AArch64LoadStoreOptimizer.cpp, AArch64InstrInfo.cpp and AArch64ConditionalCompares.cpp. Out of these we tried to isolate the problem with the last one. I figured out that if the functions SSACCmpConv::findConvertibleCompare() and SSACCmpConv::convert() are compiled without this patch, the code works fine. So the problem surfaces with these two routines only. There are a few switch cases in those two routines but I couldn't see anything exceptional with those except for a call to builtin_unreachable() in the default case for two of the switches and a [[clang::fallthrough]] in another. In all these three cases, I was unable to figure out how they could possibly break our assumptions. Does the builtin_unreachable() have any special semantic that we are not handling?
Does the error show with the regular lit tests, or do you have some internal test that fails?
My first guess would be that one of the "unreachable" defaults aren't actually unreachable for some input. But then they should trap in an asserts-enabled build..
No, this shows up in an internal test. I figured out that the code actually has calls to llvm_unreachable() with an error message, which in a non-debug build, calls __builtin_unreachable(). In a debug build, it would have printed a message.
Sounds like that's the bug that needs to be fixed, then.
I'm not sure how to deal with this. Do you think it is safe to assume that such a behaviour is expected and the test must fail because the inputs are not properly handled or do you think we should handle the unreachable defaults with caution (read conservatively)? I read up some of the articles on the net on the __builtin_unreachable() behaviour. They mention that the program should exit, albeit with an error. However, with our way of handling the unreachable defaults, the program crashes with a segmentation fault.
I'm not sure what articles you read, but the documented behaviour for __builtin_unreachable() is that if it's reached, the program has undefined behaviour, the point being that the compiler may assume it's really unreachable.
My take is that we should have a way to mark such unreachable defaults that are caused by a call to builtin_unreachable() and not omit the branch in those cases and allow the system dependent implementation of unreachable_default() to handle the manner in which, the program must exit.
No, unreachable means unreachable and the compiler should treat it as such. There's nothing that says the program needs to exit, either with a segfault or anything else, when hitting __builtin_unreachable().
That settles the matter. Thanks. I will go ahead and modify the patch without the "Widened" field in the IR and post it in a few hours again. If everything else is fine, I would like to upstream it this week before you go on vacation.
Sounds good.
And please include a test.
The one in D55742 doesn't really show the change we're discussing here, i.e. how unreachable defaults affect switch lowering.
Reversed the recording of the switch widening in the IR. Now we solely rely on the default being made unreachable.
The code itself looks good.
Please update the change description to something like "Omit range checks from jump tables when lowering switches with unreachable default", and please move the test into this patch.
| lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h | ||
|---|---|---|
| 286 | Maybe ORC instead of O? O is very similar to 0. | |
Updated the patch with test case. Incorporated comments by other reviewers. Simplified the test case.
| lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h | ||
|---|---|---|
| 286 | Done. | |
Thanks! Almost there.
| lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
|---|---|---|
| 9569 | nit: The indent is deeper than what LLVM code normally uses. I think it should be bool UnreachableDefault = 
    isa<UnreachableInst>(SI->getDefaultDest()->getFirstNonPHIOrDbg()); | |
| 9570 | This comment was never addressed. | |
| test/CodeGen/AArch64/switch-unreachable-default.ll | ||
| 54 | nit: you could drop the "; preds" comment too. | |
Resubmitting the patch with the corrections suggested earlier.
Sorry I missed that comment. I don't think we need that check (SI->getNumCases() > 0). There was some particular case that probably prompted me to add that check. But on second thoughts, that check is not required.
I have also corrected the other minor issues suggested by you. Resubmitting the patch.
Looks good to me. Do you have commit rights, or do you need someone to commit it for you?
(The change description is still a little out of date, but that can be fixed when committing, if not before.)
Thanks Hans for all your efforts. I have updated the description. Feel free to edit it if you think it can be made more concise.
I have never committed before. So I do not have commit rights. It would be great if you could kindly commit the patch for me.
Also, could you please review the patch D52707 for a commit? I have made the changes you mentioned before.
There were two tests that were failing when I tried to commit. This version fixes one of the tests and for the other test a fix in the original patch is provided.
Kindly review the same and approve.
The fix in the code is at line 2204 of lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
And the fix in the test case is in the file test/CodeGen/X86/switch-jump-table.ll
The patch failed another test because the branch to the unreachable default was omitted. The test cases have been modified. The affected tests are test/CodeGen/X86/pr38743.ll and test/CodeGen/X86/switch-jump-table.ll
ping
This needs another review and acceptance since a small change was made to the original implementation and two tests had to be changed.
Sorry for the long delay. I'm back from vacation now, and I'll commit this tomorrow morning unless anyone else objects.
| test/CodeGen/X86/switch-jump-table.ll | ||
|---|---|---|
| 5–6 | This comment describes the old behavior. After your patch, we don't do the "replace with most popular case label" thing anymore, but just omit the range check, so please update the comment. | |
Never mind. I was not aware that you are on vacation till the 15th. Hence was pinging so frequently. Sorry.
I have updated the comment in the test case. If everything is fine, then I can go ahead and commit this change.
I now have commit access. Just waiting for your go ahead.
ayonam updated this revision to Diff 181000.Thu, Jan 10, 12:55
Comment Actions
The patch failed another test because the branch to the unreachable default was omitted. The test cases have been modified. The >>affected tests are test/CodeGen/X86/pr38743.ll and test/CodeGen/X86/switch-jump-table.ll
I have no clue how the file test/CodeGen/X86/pr38743.ll got left out when I updated the revision on Jan 10. I am updating this revision with that file added.
@hans 
My sincere apologies for this rework.  The only changes are in test/CodeGen/X86/pr38743.ll.  Could you please quickly check if things are fine with that file and accept this patch?
Thanks and regards
Ayonam
Various tests are failing on the bots (http://lab.llvm.org:8011/one_line_per_build). For example: http://lab.llvm.org:8011/builders/clang-atom-d525-fedora-rel/builds/20256
Also please reference the review in the commit message when committing next time, by including this line: "Differential revision: https://reviews.llvm.org/D52002". That way the code review gets an update about the commit.
Thanks. I added the line "Review ID: D52002". Missed the "Differential revision" part. Will take care from the next time.
Thanks! In general I suggest watching the bots carefully, and aggressively revert to green if something goes wrong.
Thanks. Actually this is my first checkin to any community software. Learning the ropes. Hope not to repeat in future.
Don't worry to much about it. Its just that once a bot is red it will not send additional notifications for new failures, which increase the chance that failures start adding up. I personally also do an asan+assert and a msan+assert build before pushing something, but its up to you if you want to do this (there are sanitizer bots which will catch it otherwise).
Maybe ORC instead of O?
O is very similar to 0.