This is an archive of the discontinued LLVM Phabricator instance.

Switch optimization in IR for known maximum switch values
Needs ReviewPublic

Authored by ayonam on Sep 30 2018, 11:26 PM.

Details

Summary

This patch is for optimizing a switch statement where the switch expression can be analyzed to determine the maximum value it can take. In such cases, the default statement generates a conditional branch comparing the switch value with the highest value of case statement given. This branch can be avoided by adding the missing case statements between the highest value of case statement given and the maximum value the switch expression can take and then redirecting all these new case statements to the default block. Thereafter, the default block can be marked as unreachable, which would remove the conditional branch while lowering the switch statement. This part is dealt with in the Revision D52002 that is also up for review.

Diff Detail

Event Timeline

ayonam created this revision.Sep 30 2018, 11:26 PM

Hans,

What do those yellow lines on the side of the diff denote? Am I supposed to fix something there? Does that denote some kind of non-compliance to the coding standards that were detected?

The yellow bars are just Phabricator noting copy-pasted code; if you hover over the bar, you can get more details from the tooltip. You might not have to do anything about it in general, but in this case I think it's worth refactoring.

lib/Transforms/Utils/SimplifyCFG.cpp
4866

Can this shift overflow?

ayonam added a comment.Oct 2 2018, 3:43 AM

The yellow bars are just Phabricator noting copy-pasted code; if you hover over the bar, you can get more details from the tooltip. You might not have to do anything about it in general, but in this case I think it's worth refactoring.

Thanks. I will refactor the second block of code since the first block has one additional operation of storing the PHIs in a vector.

As for the other comment in the code:

countMaxTrailingOnes will not exceed the number of bits in the switch expression type (local variable ValueWidth above). But then the '1' should probably be '1LL' and MaxSwitchValue should be of type 'long long' to ensure that all possible value widths are taken care of.

lib/Transforms/Utils/SimplifyCFG.cpp
4866

countMaxTrailingOnes will not exceed the number of bits in the switch expression type (local variable ValueWidth above). But then the '1' should probably be '1LL' and MaxSwitchValue should be of type 'long long' to ensure that all possible value widths are taken care of.

efriedma added inline comments.Oct 2 2018, 11:30 AM
lib/Transforms/Utils/SimplifyCFG.cpp
4866

The LLVM IR "switch" instruction allows the operand to be any width. (And even in C, you can write a switch using __uint128_t.) You probably want to use APInt here.

And actually, I'm not sure why you're shifting here in the first place; the maximum possible value is exactly "~Known.Zero".

ayonam added inline comments.Oct 2 2018, 2:30 PM
lib/Transforms/Utils/SimplifyCFG.cpp
4866

Yes, I didn't factor in the __uint128_t type. I will change the code to use the ~Known.Zero value for the Max Switch Value.

Using the APInt type would be safer. I will make the necessary changes.

ayonam updated this revision to Diff 168977.Oct 10 2018, 4:10 AM

I have implemented the review comments given earlier. The updated patch is attached herewith. Kindly review the same.

Thanks

hans added a comment.Oct 10 2018, 7:27 AM

Thanks for doing this! I have some comments, and as was pointed out, this needs tests.

lib/Transforms/Utils/SimplifyCFG.cpp
4851

The way I think about this is that it "widens" the switch to cover all input values. So maybe call the function something like that?

4866

I think the following code would be easier to follow if you do these "known value is not negative etc." checks and then return false if they don't pass. Same thing for the unreachable default check: if that's true, there's no point in continuing.

4877

Instead of using floating point for this, how about something like "NumCases * 100 / MaxSwitchValue >= 70"?

4880

I think we need to care about the holes too, because we're going to mark the current default unreachable..

4891

What's happening here? Why do we need to find a constant value on the phi node?

We'll need to update any phi node in the default block when adding the new case, but there's no need for it to be constant.

4908

I don't think this is enough. I think we need to fill the "holes" too.

4920

This needs a comment explaining that the new default is unreachable because the switch now has cases for all possible inputs.

4928

Let's add a statistic variable to count how often this transformation happens (see e.g. NumLinearMaps).

ayonam marked 9 inline comments as done.Oct 21 2018, 2:48 PM
ayonam added inline comments.
lib/Transforms/Utils/SimplifyCFG.cpp
4891

It should just be a Value and not a Constant. I had a test case that had a constant on the PHI node and hence I had coded thus.

I tried changing this to a Value and realized that there isn't a method that gives me all the Values that are created in a given Case block. There is a method that returns all constants that are created in a Case block. Would you know of any other method that would return me the Values? Or else I will have to write one of my own.

ayonam updated this revision to Diff 170357.Oct 21 2018, 2:52 PM

I have updated the patch with the changes mentioned by Hans. Please review the same.

ayonam updated this revision to Diff 170358.Oct 21 2018, 2:56 PM

The earlier patch had a block of code that was commented out. This patch removes that block of code. All other changes asked for in the earlier review comments are there. Kindly review the same.

Thanks

lattner resigned from this revision.Oct 21 2018, 8:52 PM
hans added inline comments.Oct 23 2018, 4:49 AM
lib/Transforms/Utils/SimplifyCFG.cpp
4891

What I'm saying is there should be no need to compute the resulting value at phi nodes at all. That stuff is used when replacing switches with a table of constant values, it does not apply here.

ayonam added inline comments.Oct 23 2018, 5:05 AM
lib/Transforms/Utils/SimplifyCFG.cpp
4891

We need to get all the PHI nodes resulting from this switch and add an incoming edge to all those PHI nodes for every case that we add to this switch. Else the block containing the PHIs will have more predecessors than the number of incoming edges to the PHIs. I'm trying to do essentially that. Could you please suggest a better way to achieve this?

hans added inline comments.Oct 23 2018, 5:20 AM
lib/Transforms/Utils/SimplifyCFG.cpp
4891

Yes, the phi nodes in DefaultBlock need to be updated, but this is not the way to do it. I think something like

AddPredecessorToBlock(DefaultBlock, SI->getParent(), SI->getParent())

when adding the cases below should work.

ayonam added inline comments.Oct 23 2018, 5:30 AM
lib/Transforms/Utils/SimplifyCFG.cpp
4891

Thanks Hans for this tip. I'll update the code and resubmit the patch.

ayonam updated this revision to Diff 178397.Dec 16 2018, 7:36 AM

@hans

The earlier code had a problem in that if we converted every switch that had an unreachable default, to omit the check for the default values beyond the maximum switch case value given in the code, then there were some corner cases where the code generated was semantically incorrect and resulted in runtime failures. These failures are happening while compiling LLVM code itself.

So I have added a boolean in the SwitchInst class to mark switches that have been widened, so that we omit the branch only for those switches. I need to explore the reasons for the behaviour that was observed when we were blindly removing the branch for all switches that had an unreachable default and will post a fix in a later patch.

Please review this and the other two patches 52002 and another one with the test case for this patch.

...

Please review this and the other two patches 52002 and another one with the test case for this patch.

Both the D52002 and this one, still do not have any tests..

...

Please review this and the other two patches 52002 and another one with the test case for this patch.

Both the D52002 and this one, still do not have any tests..

I have added the test case for review under D55742. Please review the same. Thanks.

ayonam updated this revision to Diff 178951.Dec 19 2018, 1:33 PM

Reversed the recording of the switch widening in the IR. Now we rely solely on the default being made unreachable.

ayonam updated this revision to Diff 179144.Dec 20 2018, 1:40 PM

Moved the test case from D55742 to this patch. Simplified the test case.

A gentle reminder to the reviewers for review comments on this please.

Thanks.

ayonam added a comment.EditedJan 3 2019, 9:34 AM

A gentle reminder to the reviewers for review comments on this please. If there are no concerns, then can we move this to the accepted state please?

Thanks.

dmgreen added a subscriber: dmgreen.Jan 5 2019, 6:29 AM

@hans

Could you please review this patch after the changes I made? If things are fine, then can this be accepted for commit?

ping

@hans

Sorry to bother you again. Will it be possible to review this change and accept it please?

Thanks and regards
Ayonam

hans added a comment.Jan 14 2019, 8:42 AM

Sorry for the slow response time. I'm back from vacation now.

lib/Transforms/Utils/SimplifyCFG.cpp
4850

The function description is still a bit confusing. I understand what you mean by "remap the known default values into individual case statements", but requires some thinking to understand.

I would suggest rephrasing it as "Possibly widen the switch by adding more cases so that all possible input values are covered, making the default case unreachable."

4860

Since computeKnownBits takes some work to do, I would move the check and return for unreachable default right to the start of the function so we don't do any unnecessary work in that case.

4862

Why do we only care about non-negative values?

And more importantly, why the check for minimum number of leading zeros and maximum trailing ones... just reading that, it sounds like it's always true? Is there any situation when it's not true?

4883

Why does MaxSwitchValue have to be greater than the number of cases?

I'm not sure I'm following the logic here actually. Also, could we have cases that are greater than MaxSwitchValue, or will some other pass have removed them?