This patch is for optimizing a switch statement where the switch expression can be analyzed to determine the maximum value it can take. In such cases, the default statement generates a conditional branch comparing the switch value with the highest value of case statement given. This branch can be avoided by adding the missing case statements between the highest value of case statement given and the maximum value the switch expression can take and then redirecting all these new case statements to the default block. Thereafter, the default block can be marked as unreachable, which would remove the conditional branch while lowering the switch statement. This part is dealt with in the Revision D52002 that is also up for review.
Diff Detail
Event Timeline
Hans,
What do those yellow lines on the side of the diff denote? Am I supposed to fix something there? Does that denote some kind of non-compliance to the coding standards that were detected?
The yellow bars are just Phabricator noting copy-pasted code; if you hover over the bar, you can get more details from the tooltip. You might not have to do anything about it in general, but in this case I think it's worth refactoring.
lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
4866 | Can this shift overflow? |
Thanks. I will refactor the second block of code since the first block has one additional operation of storing the PHIs in a vector.
As for the other comment in the code:
countMaxTrailingOnes will not exceed the number of bits in the switch expression type (local variable ValueWidth above). But then the '1' should probably be '1LL' and MaxSwitchValue should be of type 'long long' to ensure that all possible value widths are taken care of.
lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
4866 | countMaxTrailingOnes will not exceed the number of bits in the switch expression type (local variable ValueWidth above). But then the '1' should probably be '1LL' and MaxSwitchValue should be of type 'long long' to ensure that all possible value widths are taken care of. |
lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
4866 | The LLVM IR "switch" instruction allows the operand to be any width. (And even in C, you can write a switch using __uint128_t.) You probably want to use APInt here. And actually, I'm not sure why you're shifting here in the first place; the maximum possible value is exactly "~Known.Zero". |
lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
4866 | Yes, I didn't factor in the __uint128_t type. I will change the code to use the ~Known.Zero value for the Max Switch Value. Using the APInt type would be safer. I will make the necessary changes. |
I have implemented the review comments given earlier. The updated patch is attached herewith. Kindly review the same.
Thanks
Thanks for doing this! I have some comments, and as was pointed out, this needs tests.
lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
4851 | The way I think about this is that it "widens" the switch to cover all input values. So maybe call the function something like that? | |
4866 | I think the following code would be easier to follow if you do these "known value is not negative etc." checks and then return false if they don't pass. Same thing for the unreachable default check: if that's true, there's no point in continuing. | |
4877 | Instead of using floating point for this, how about something like "NumCases * 100 / MaxSwitchValue >= 70"? | |
4880 | I think we need to care about the holes too, because we're going to mark the current default unreachable.. | |
4891 | What's happening here? Why do we need to find a constant value on the phi node? We'll need to update any phi node in the default block when adding the new case, but there's no need for it to be constant. | |
4908 | I don't think this is enough. I think we need to fill the "holes" too. | |
4920 | This needs a comment explaining that the new default is unreachable because the switch now has cases for all possible inputs. | |
4928 | Let's add a statistic variable to count how often this transformation happens (see e.g. NumLinearMaps). |
lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
4891 | It should just be a Value and not a Constant. I had a test case that had a constant on the PHI node and hence I had coded thus. I tried changing this to a Value and realized that there isn't a method that gives me all the Values that are created in a given Case block. There is a method that returns all constants that are created in a Case block. Would you know of any other method that would return me the Values? Or else I will have to write one of my own. |
The earlier patch had a block of code that was commented out. This patch removes that block of code. All other changes asked for in the earlier review comments are there. Kindly review the same.
Thanks
lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
4891 | What I'm saying is there should be no need to compute the resulting value at phi nodes at all. That stuff is used when replacing switches with a table of constant values, it does not apply here. |
lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
4891 | We need to get all the PHI nodes resulting from this switch and add an incoming edge to all those PHI nodes for every case that we add to this switch. Else the block containing the PHIs will have more predecessors than the number of incoming edges to the PHIs. I'm trying to do essentially that. Could you please suggest a better way to achieve this? |
lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
4891 | Yes, the phi nodes in DefaultBlock need to be updated, but this is not the way to do it. I think something like AddPredecessorToBlock(DefaultBlock, SI->getParent(), SI->getParent()) when adding the cases below should work. |
lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
4891 | Thanks Hans for this tip. I'll update the code and resubmit the patch. |
The earlier code had a problem in that if we converted every switch that had an unreachable default, to omit the check for the default values beyond the maximum switch case value given in the code, then there were some corner cases where the code generated was semantically incorrect and resulted in runtime failures. These failures are happening while compiling LLVM code itself.
So I have added a boolean in the SwitchInst class to mark switches that have been widened, so that we omit the branch only for those switches. I need to explore the reasons for the behaviour that was observed when we were blindly removing the branch for all switches that had an unreachable default and will post a fix in a later patch.
Please review this and the other two patches 52002 and another one with the test case for this patch.
Reversed the recording of the switch widening in the IR. Now we rely solely on the default being made unreachable.
A gentle reminder to the reviewers for review comments on this please. If there are no concerns, then can we move this to the accepted state please?
Thanks.
Could you please review this patch after the changes I made? If things are fine, then can this be accepted for commit?
ping
Sorry to bother you again. Will it be possible to review this change and accept it please?
Thanks and regards
Ayonam
Sorry for the slow response time. I'm back from vacation now.
lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
4850 | The function description is still a bit confusing. I understand what you mean by "remap the known default values into individual case statements", but requires some thinking to understand. I would suggest rephrasing it as "Possibly widen the switch by adding more cases so that all possible input values are covered, making the default case unreachable." | |
4860 | Since computeKnownBits takes some work to do, I would move the check and return for unreachable default right to the start of the function so we don't do any unnecessary work in that case. | |
4862 | Why do we only care about non-negative values? And more importantly, why the check for minimum number of leading zeros and maximum trailing ones... just reading that, it sounds like it's always true? Is there any situation when it's not true? | |
4883 | Why does MaxSwitchValue have to be greater than the number of cases? I'm not sure I'm following the logic here actually. Also, could we have cases that are greater than MaxSwitchValue, or will some other pass have removed them? |
The function description is still a bit confusing. I understand what you mean by "remap the known default values into individual case statements", but requires some thinking to understand.
I would suggest rephrasing it as "Possibly widen the switch by adding more cases so that all possible input values are covered, making the default case unreachable."