This is an archive of the discontinued LLVM Phabricator instance.

Switch to Select optimisations
Needs ReviewPublic

Authored by kariddi on Sep 29 2014, 9:20 AM.
This revision needs review, but there are no reviewers specified.

Details

Reviewers
None
Summary

Hello,
here are a bunch of optimisations to add to SimplifyCFG try to convert Switches, when possible and convenient, to selects.

Initially the optimization tries to determine if the switch instruction is just a value select operation. If it is it checks if the the value selection happens between only two values (if more than two values are selected then probably switch table is better).
If the check is successful it tries three approaches to convert Switches into selects.

The first is very simple and just checks if only two cases are present and converts the switch into a simple select.
If there are more than two cases it then tries to check if these cases are grouped in ranges of values that are not overlapped, like in:
switch(a) {

case 0...100:
  return 10;
case 101...200:
  return 20;

}
if this is the case it converts the switch into a comparison with a pivot + select.
As a third approach it tires to check if there is a bit pattern with which it is possible to distinguish between two possible groups of switch cases like in:
a &= 0x3
switch (a) {

case 0:
case 2:
  return 10;
case 1:
case 3:
  return 20;

}

where the mask 0x1 can distinguish between the two groups (if "a" masked with 0x1 is 1 then we are in the second case, otherwise the first).
If this succeeds it substitutes the switch with an "and" + compare with 0 + select.

Please help review :)

Diff Detail

Event Timeline

kariddi updated this revision to Diff 14170.Sep 29 2014, 9:20 AM
kariddi retitled this revision from to Switch to Select optimisations.
kariddi updated this object.
kariddi edited the test plan for this revision. (Show Details)
kariddi added subscribers: kariddi, Unknown Object (MLST).
majnemer added inline comments.
lib/Transforms/Utils/SimplifyCFG.cpp
3445–3450

Can you run clang-format on this?

3456

LLVM convention is to have capitalized values. Pointers and references should lean to the right.

3458

for (SwitchInst::CaseIt I : SI->cases() is a little more concise.

3479–3480

I feel like this would be easier to follow if you used a lambda and captured Result.

3745

This would be more concise as if (SelectValue)

3754–3755

This looks wrong if the bitwidth of Bits exceeds an i64, consider using APInt::getAllOnesValue or APInt::getMaxValue.

kariddi updated this revision to Diff 14177.Sep 29 2014, 10:34 AM

Updates to address David's suggestions

kariddi added inline comments.Sep 29 2014, 10:36 AM
lib/Transforms/Utils/SimplifyCFG.cpp
3445–3450

Done.

3456

Removed because of using the Lambda syntax you suggested

3458

Done

3479–3480

Done.

3745

Done

3754–3755

Done, thanks for this :)

kariddi updated this revision to Diff 14178.Sep 29 2014, 10:44 AM

Whoops , I forgot to add the files with git before creating the diff :P

This is the revision that addresses David's suggestions.

hans added a subscriber: hans.Sep 29 2014, 1:19 PM
hans added inline comments.
lib/Transforms/Utils/SimplifyCFG.cpp
3464

Why is this check needed? Doesn't GetCaseResults already check that the case leads to the common destination unconditionally?

3473

If I understand correctly, UniqueResults is a mapping from result to vector of cases. Could we use a DenseMap instead and save some code here?

3485

I guess the appending happened above, so maybe just "find the PHI"?

3491

nit: a period at the end of comments is preferred.

3497

This doesn't seem to take into account that there could be other instructions in the default destination before the unreachable terminator.

3504

again, doesn't GetCaseResults already check that the case leads to CommonDest?

3544

i assume the icmp should be with 10 here

3557

i would add an assert about ResultVector.size()

3606

Could these comparisons be done with sgt instead? I'd find that easier to read for code that checks whether a is bigger than b.

3624

I haven't had time to look at this one yet.

Do you think the bit pattern variant could be broken out to a separate patch?

3705

I don't think the "or more" case is supported, though?

kariddi added inline comments.Sep 29 2014, 2:47 PM
lib/Transforms/Utils/SimplifyCFG.cpp
3464

Yeah, you are right. Thanks for this , makes it much cleaner!

3473

The main problem of DenseMap is that it is unordered. We need to enforce the same order every run if we want repeatability of output. Using a vector here makes it easier and probably faster, because the typical size of the vector here is in that spot where actually linear scanning a vector is usually faster.

3485

Done

3491

Done

3497

Good spot! :) I added a check to also be sure the basic block contains only one instruction.

3504

Yep

3544

Done :)

3557

Done.

3606

I have no preference on the matter, so I guess its ok :)

3624

Mmm, this is a rather important optimization for the use case I'm trying to cover. Do you think it is really necessary to put it in a separate patch later on?

3705

Yeah, we support just one PHI node here :)

kariddi updated this revision to Diff 14189.Sep 29 2014, 2:49 PM

I addressed all Hans comments with the exception of using a DenseMap instead of a SmallVector for the results and splitting the bit patterns in a separate patch for now.

hans added inline comments.Sep 29 2014, 6:23 PM
lib/Transforms/Utils/SimplifyCFG.cpp
3473

That's a good point. I wish the code was simpler here, though. It seems like what we want is really just

UniqueResults[Results[0]].push_back(CaseVal);

Maybe if we break out the code to a helper function, "addUniqueResult" or something?

3494

Instead of checking the size, I would probably just grab for the first instruction and check if that's an UnreachableInst:

UnreachableInst *UI = dyn_cast<UnreachableInst>(DefaultDest->getFirstNonPHIOrDbg());

Actually, maybe this can just be baked into the if below:

if (!DefaultResult && !isa<UnreachableInst>(DefaultDest->getFirstNonPHIOrDbg())
3624

It's just that a bigger patch takes longer to review. I haven't looked at this part yet, but will try to do so as soon as I can.

kariddi added inline comments.Sep 30 2014, 3:11 AM
lib/Transforms/Utils/SimplifyCFG.cpp
3473

Sound like a good idea!

3494

Done!

3624

Thank you very much for your help hans :)

kariddi updated this revision to Diff 14207.Sep 30 2014, 3:12 AM

Implemented the latest Hans's suggestions

hans added a comment.Sep 30 2014, 2:54 PM

I think I understand the "bit pattern switch" a little better now, but it's not easy to understand from the code. I've tried to make some suggestions to make it more clear.

lib/Transforms/Utils/SimplifyCFG.cpp
80

If we're going to expose these typedef's throughout the whole file, they need less generic names.

3442

I would just take the "Constant *Result" as a parameter.

It would be nice to have a comment explaining what the function does, and that would also be a good place to the mapping that UniqueResults is.

3454

I know I'm old-school, but with less fancy C++, I think the code would become more clear:

for (auto &I : UniqueResults) {
  if (I.first == Result) {
    I.second.push_back(CaseVal)
    return;
  }
}
UniqueResults.push_back(std::make_pair(Result, SmallVector<ConstantInt*, 4>(1, CaseVal)));
3505

It would be nice with a comment about what "known ones and zeroes" mean. Maybe here, or when they're computed further down in the function.

Actually, maybe the parameters need better names. I commented about this further down.

3598

I think I commented on these before. It's just that with a name such as "first is bigger", I'd find it natural if the code to say that too, i.e.

FirstIsBigger = MinCaseFirst->getValue().sgt(MaxCaseSecond->getValue());
3651

These are the same values that were computed in the if statements above. Rather than repurposing these variables, I'd hoist this to above the if statement and try to give them better name and comment (it's not obvious to me what "unambigous patterns" mean).

Maybe (but good names are hard):

APInt FirstRangeSelectBits = FirstRangeKnownOneBits & SecondRangeKnownZeroBits;
APInt SecondRangeSelectBits = SecondRangeKnownOneBits & FirstRangeKnownZeroBits;
if (!DefaultCanTrigger && (FirstRangeSelectBits || SecondRangeSelectBits) {
  ...
}

Actually, since we don't do anything interesting if DefaultCanTrigger is true, we could just bail out early for that case - near the top:

if (DefaultCanTrigger)
  return nullptr;
3723

Since this ins't a bitmask, but a width, I would call it BitWidth, or CondWidth or something.

3725

Just to make the code easier to follow, I'd spell out "Compute known bits of the Condition." in a comment.

3728

I'd use the terminaly "known" instead of fixed or stuck. I'd also prefix the variable with Num since it's a count, not mask - something like:

const unsigend NumKnownBits = (KnownCondZeroBits | KnownCondOneBits).countPopulation();
3730

I'd just declare this when it's first used below.

3745

I find these names a little bit cryptic. What do you think about something like FirstRangeKnownZeroBits. Though it's a bit longer, it makes the contents clearer.

Also, since these declarators are all on separate lines anyway, I'd just write them out as one declaration per line.

3749

it's not clear to me what a "hit" is

kariddi added inline comments.Oct 1 2014, 3:37 AM
lib/Transforms/Utils/SimplifyCFG.cpp
3442

Done.

3454

I think you are right :) Old school seems cleaner here!

3505

Done.

3598

Done :)

3651

Fixed :)

3723

Done

3725

Done

3728

Done

3730

Done

3745

Done

3749

I changed the name to something that is a little bit more similar to what it actually is

kariddi updated this revision to Diff 14271.Oct 1 2014, 3:38 AM

Updated to comply with Hans's latest comments.

Thanks for all the effort put in reviewing this!

hans added inline comments.Oct 1 2014, 5:45 PM
lib/Transforms/Utils/SimplifyCFG.cpp
79

Thanks! These names are better. A comment explaining what the components of the pair represent would also be great.

3441

How about: "add CaseVal to the list of cases that generate Result".

3479

nit: "be sure it to be" ?

3480

It does occur in LLVM code, but I think the more common style is to not do explicit comparisons against nullptr, but rather do

if (!PHI) {

This applies throughout the patch and I think will simplify some lines.

3495

Just check for nullptr with !DefaultResult, and drop the extra parentheses.

3507

Thanks! I think the expanded comment and new variable names helps the readability of this code a lot.

3516

I would break this comment up into three comments and move them down to where the respective computation occurs in the for loop, i.e.

// Compute number of cases matching the known bits of the condition.
(code)

// Update known bits of case values.
(code)

// Compute min and max case value.
(code)
3548

Make ResultVector const-ref.

3562

Actually, I'd just do

if (DefaultResult)

here.

3596

Make ResultVector const-ref.

3640

Make ResultVector const-ref.

3670

the outer if-statement has already checked that either FirstResultGroupSelectBits or Second...Bits are non-zero, so this could just be an else branch, possibly with an assert.

3701

Can you explain (ideally by adding a comment) why we're not removing this successor?

3716

I would not declare this until it's first used, which seems to be after the InitializeUniqueCases call below.

3724

Since this is only used in the if statement below, I'm not sure it's worth a separate variable. I'd just check UniqueResults.size() in the if statement directly.

3733

I'd s/Zero/Zeroes/ and s/One/Ones/ in the second variable name.

3752

No point declaring the variable until it's used.

test/Transforms/SimplifyCFG/UnreachableEliminate.ll
51

Doesn't this change the test to not check the problem in the original bug? If you change one of the results that the switch go to to not be constant, that should defeat the switch-to-select transform.

test/Transforms/SimplifyCFG/switch-to-select-bitpattern.ll
4

might as well spell out "def" as "default" here and below to simplify for the reader

19

i'd put this comment down by the llvm function it represents

35

I'd drop the #0. Also a comment saying what's being tested, i.e. that this switch can't be transformed to select because it has a default.

66

i'd drop the #0

68

i think we should check more explicitly what's in the select.

96

would be nice to have tests for both the default with unreachable instruction, and the clever case where we know enough bits to figure out the default case can never happen.

test/Transforms/SimplifyCFG/switch-to-select-range.ll
4

again i'd drop the #0 here and below

6

a comment about what the test is intended to test, i.e. why this can't become a select, would be good.

37

we should check the select explicitly

test/Transforms/SimplifyCFG/switch-to-select-two-case.ll
14

again, i'd put the comment by the llvm function definition

27

we should check exactly what's being selected and compared here

kariddi added inline comments.Oct 2 2014, 5:23 AM
lib/Transforms/Utils/SimplifyCFG.cpp
79

Added

3441

Done!

3479

Modified :P

3480

I see. Sorry, I'm very used to the other way or writing it (because I somehow like it more because is more explicit) so I always forget of this rule :P

Fixed.

3495

Done

3516

Done.
I also kept the common outside of the loop, so that it works like an overview and the inner comments specify where that happens.

3548

Done

3562

Hmm, I prefer here changing DefaultCanTrigger to :

DefaultCanTrigger = DefaultResult;

and then use DefaultCanTrigger in the if, because that is used everywhere else we need to check if the default can trigger, so seems a little bit more coherent in my opinion. (Makes more clear to the reader what we want to test for)
What do you think?

3596

Done

3640

Done.

3670

Good catch!

3701

I think this is actually a bug ... the check should be with Destination.
I also added a fix for the PHI node.

3716

I think ResultType is not used anymore now.
Removed!

3724

Done

3733

Done.

3752

Legacy declaration :|
Done.

test/Transforms/SimplifyCFG/UnreachableEliminate.ll
51

Fixed!

test/Transforms/SimplifyCFG/switch-to-select-bitpattern.ll
4

Ok.

19

Good idea.

35

Done

66

Done

68

Done

96

There is a problem actually with that.

If the default block is explicitly unreachable then the Unreachable case elimination (an optimization run earlier) substitutes the default with one of the cases.
This might move one of the case group results from the UniqueResult vector to the DefaultResult.
The UniqueResult vector reduces its size from 2 to 1 and the optimization bails out.

Actually there could be a way maybe to improve it to cover also this case ...
In the case we only have one result in UniqueResult then the SelectBitmask is just the KnownOnes for that group.
If we can determine that the cases in the result group are all and the only possible case values for the switch condition that have Ones in that position then we can use the KnownOnes as the bit mask.

So , in this example:
int foo3_explicit_without_default(int a) {

a &= 0x6;
switch (a) {
  case 0:
    return 1;
  case 2:
    return 2;
  case 4:
    return 1;
  case 6:
    return 2;
}
__builtin_unreachable();

}

We have two groups, one generating the value "2" and one generating the value "1". The default is unreachable.

after optimization becomes:

define i32 @foo3_explicit_without_default(i32 %a) nounwind readnone ssp uwtable {

%1 = and i32 %a, 6
switch i32 %1, label %4 [
  i32 6, label %3
  i32 2, label %2
]

; <label>:2 ; preds = %0

br label %4

; <label>:3 ; preds = %0

br label %4

; <label>:4 ; preds = %0, %3, %2

%.0 = phi i32 [ 2, %3 ], [ 2, %2 ], [ 1, %0 ]
ret i32 %.0

}

Which is not optimizable, because the group generating "1" has been moved to the default.
But , because:

  • KnownOnes for the cases generating "2" is 0x4, and
  • the possible bit patterns for the switch condition are only 0x0, 0x2, 0x4, 0x6 and
  • The cases in the group generating "2" are all and the only valid cases that match the KnownOnes mask for "2".

Then that mask can be used to differentiate between the default and the group generating "2".

Should I modify the algorithm to match this case in this patch? Or is it better to do it separately? (It would involve changing everything to support UniqueResults.size() < 2).

test/Transforms/SimplifyCFG/switch-to-select-range.ll
4

Done

6

Done

37

Done

test/Transforms/SimplifyCFG/switch-to-select-two-case.ll
14

Done

27

Done

kariddi updated this revision to Diff 14320.Oct 2 2014, 5:24 AM

Latest corrections applied

hans added inline comments.Oct 2 2014, 2:00 PM
lib/Transforms/Utils/SimplifyCFG.cpp
3523

Isn't it the number of cases covered by the "pattern", rather than the other way around?

3562

That's fine.

3709

It's not clear from just reading the function what Destination is. It might be clearer to just use PHI->gerParent().

test/Transforms/SimplifyCFG/switch-to-select-bitpattern.ll
96

Sounds like that should be done separately.

Can we just add a test for the case the default case is not explicitly unreachable, but we figure it out based on known bits of the condition and number of cases?

Actually, couldn't the unreachable case elimination have that logic instead of our transform?

I'on travel right now , so I will have to wait until I land before being able to produce the patch to address the latest comments, but I wanted to answer in advance ;-)

lib/Transforms/Utils/SimplifyCFG.cpp
3523

Ouch, my brain got reversed while writing this comment :-D

3709

I'll do that

test/Transforms/SimplifyCFG/switch-to-select-bitpattern.ll
96

I'll do that. What do you mean with the unreachable optimization to have the logic though? Do you mean calling the ConvertToSelectBitpattern within it? If that is the case I immagine we'll have to duplicate some code from SwitchToSelect into it. Also, there might be some optimizations that run afterwards that might help even more in the success of the optimization, but I don't have the code in front of me right now, so, I'll wait to confirm it.

kariddi added inline comments.Oct 3 2014, 1:01 PM
test/Transforms/SimplifyCFG/switch-to-select-bitpattern.ll
96

Let me know if the test you mean is the one in the latest revision.
The only difference with the explicit default case is that the result in the switch for the default is undef.

kariddi updated this revision to Diff 14399.Oct 3 2014, 1:07 PM

Addresses latest suggestions

hans added a comment.Oct 3 2014, 2:44 PM

The code for the first optimization, ConvertTwoCaseSwitch() looks good to me. I suggest you break that out to a separate patch so we can get that landed and make the rest of the code easier to review.

test/Transforms/SimplifyCFG/switch-to-select-bitpattern.ll
96

I just mean that checking known bits of the condition comparing that against the cases and figuring out if the default case if unreachable, seems like a more general optimization. Instead of just doing it in this case, where the switch is selecting between values, we should do it for all switches.

This way, we optimize more switches, and the code for the switch-to-select- transformation becomes simpler.

Seems like a good idea, to split out ConvertTwoCaseSwitch() out if that helps out committing all the initial infrastructure.

I'll make another patch with that and add you as a reviewer :)