This is an archive of the discontinued LLVM Phabricator instance.

[SimplifyCFG] Try to fold switch with single result value and power-of-2 cases to mask+select
ClosedPublic

Authored by bcl5980 on Mar 25 2022, 8:31 AM.

Details

Summary

When switch with 2^n cases go to one result, check if the 2^n cases can be covered by n bit masks.
If yes we can use "and condition, ~mask" to simplify the switch

case 0 2 4 6 -> and condition, -7
https://alive2.llvm.org/ce/z/jjH_0N

case 0 2 8 10 -> and condition, -11
https://alive2.llvm.org/ce/z/K7E-2V

case 2 4 8 12 -> and (sub condition, 2), -11
https://alive2.llvm.org/ce/z/CrxbYg

Fix one case of https://github.com/llvm/llvm-project/issues/39957

Diff Detail

Event Timeline

bcl5980 created this revision.Mar 25 2022, 8:31 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 25 2022, 8:31 AM
bcl5980 requested review of this revision.Mar 25 2022, 8:31 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 25 2022, 8:31 AM
bcl5980 added a comment.EditedMar 25 2022, 8:34 AM

I have no too much confidence about this change, so this should be a scatch without detail comments and the magic number is also a rough value.
So can someone help to give me suggestion it is worth or not?

https://godbolt.org/z/zPhnrexca
This change should get result similar to MSVC except ztfos4

bcl5980 edited the summary of this revision. (Show Details)Mar 27 2022, 9:31 PM
bcl5980 updated this revision to Diff 418824.Mar 29 2022, 2:40 AM

fix the bitmask init value wrong
remove the MaxCasesPerResult limitation

bcl5980 updated this revision to Diff 418825.Mar 29 2022, 2:42 AM

update the correct mask0 case

RKSimon added inline comments.Mar 30 2022, 4:01 AM
llvm/test/Transforms/SimplifyCFG/switch-to-select-two-case.ll
70

(style) please can you replace the mask# naming sequence with test names that are more descriptive?

bcl5980 updated this revision to Diff 419101.Mar 30 2022, 4:38 AM

update test name

bcl5980 marked an inline comment as done.Mar 30 2022, 4:38 AM
bcl5980 added a comment.EditedApr 2 2022, 12:12 AM

Ping.
Any suggestions for the patch?

@RKSimon Any suggestions for the patch?

I have not stepped through the logic in "ConvertTwoCaseSwitch" in detail, but this seems to be on the right track.
We need to update function names and code comments for the more general transform now, so we should clean this set of functions up as much as possible before this patch. See if this makes sense:
D123614

I have not stepped through the logic in "ConvertTwoCaseSwitch" in detail, but this seems to be on the right track.
We need to update function names and code comments for the more general transform now, so we should clean this set of functions up as much as possible before this patch. See if this makes sense:
D123614

OK, I will rebase after D123614 land.

bcl5980 updated this revision to Diff 422268.Apr 12 2022, 9:45 AM

rebase code

bcl5980 updated this revision to Diff 422281.Apr 12 2022, 10:19 AM

Update comments

bcl5980 added inline comments.Apr 12 2022, 10:34 AM
llvm/lib/Transforms/Utils/SimplifyCFG.cpp
5646

@spatel I'm a little worry about here. I remove the early out this version. It will cause compile time increase if we have some very large switch with many cases to the same result but not the pattern we can fold. But I have no detail data to show how much extra compile this change will be involved. Do we have some common compile time tests?
Another way to do is enlarge MaxCasesPerResult. The patch first version adjust MaxCasesPerResult to 16. But 16 is also a magic number. Maybe we can add a config for it.
How about your suggestions?

spatel added inline comments.Apr 12 2022, 11:16 AM
llvm/lib/Transforms/Utils/SimplifyCFG.cpp
5646

@nikic has a system that is used to check for compile time changes with benchmarks from the test suite:
https://llvm-compile-time-tracker.com/index.php

So we can run an experiment there.

But I think it is fine to use "16" as a limit for this analysis. You can give it a name and make it a debug option like the many others at the top of this file. For example:

static cl::opt<unsigned> MaxSpeculationDepth(
    "max-speculation-depth", cl::Hidden, cl::init(10),
    cl::desc("Limit maximum recursion depth when calculating costs of "
             "speculatively executed instructions"));
spatel added inline comments.Apr 12 2022, 12:35 PM
llvm/lib/Transforms/Utils/SimplifyCFG.cpp
5646

We can reduce this patch by doing that part independently:
D123625

bcl5980 added inline comments.Apr 13 2022, 1:11 AM
llvm/lib/Transforms/Utils/SimplifyCFG.cpp
5646

Thanks for the help. I will rebase after D123625

bcl5980 updated this revision to Diff 422476.Apr 13 2022, 5:07 AM

rebase main

Let's add the baseline tests as an NFC commit now, so it is easier to see the diffs.

The order of the transforms creates an interesting trade-off, so we need a test like this (and probably even more tests):

define i8 @same_value_two_case(i32 %i) {
entry:
  switch i32 %i, label %default [
  i32 -3, label %end
  i32 5, label %end
  ]

default:
  br label %end

end:
  %t0 = phi i8 [ 42, %default ], [ 3, %entry ], [ 3, %entry ]
  ret i8 %t0
}

This patch creates a difference that survives all the way through codegen - instcombine does not recognize the equivalence between the 2 patterns:
https://alive2.llvm.org/ce/z/mqo87Z

It's not clear to me if there is a universal better form (depends on target?) or even which one is better for IR. To avoid those questions, you can re-order the transforms, so we do not have to answer it in this patch (add a TODO comment though).

Let's add the baseline tests as an NFC commit now, so it is easier to see the diffs.

The order of the transforms creates an interesting trade-off, so we need a test like this (and probably even more tests):

define i8 @same_value_two_case(i32 %i) {
entry:
  switch i32 %i, label %default [
  i32 -3, label %end
  i32 5, label %end
  ]

default:
  br label %end

end:
  %t0 = phi i8 [ 42, %default ], [ 3, %entry ], [ 3, %entry ]
  ret i8 %t0
}

I'm sorry but what's the difference between this test with switch_to_and0 ? For the negative offset?

Let's add the baseline tests as an NFC commit now, so it is easier to see the diffs.

The order of the transforms creates an interesting trade-off, so we need a test like this (and probably even more tests):

define i8 @same_value_two_case(i32 %i) {
entry:
  switch i32 %i, label %default [
  i32 -3, label %end
  i32 5, label %end
  ]

default:
  br label %end

end:
  %t0 = phi i8 [ 42, %default ], [ 3, %entry ], [ 3, %entry ]
  ret i8 %t0
}

I'm sorry but what's the difference between this test with switch_to_and0 ? For the negative offset?

Ah, you're correct - I did not see that test diff. But we might want to include a test with negative offset, so we have coverage for that too.

Let's add the baseline tests as an NFC commit now, so it is easier to see the diffs.

The order of the transforms creates an interesting trade-off, so we need a test like this (and probably even more tests):

define i8 @same_value_two_case(i32 %i) {
entry:
  switch i32 %i, label %default [
  i32 -3, label %end
  i32 5, label %end
  ]

default:
  br label %end

end:
  %t0 = phi i8 [ 42, %default ], [ 3, %entry ], [ 3, %entry ]
  ret i8 %t0
}

I'm sorry but what's the difference between this test with switch_to_and0 ? For the negative offset?

Ah, you're correct - I did not see that test diff. But we might want to include a test with negative offset, so we have coverage for that too.

@spatel I feel so sorry that I check in the baseline test here rGe2d77a160c with a wrong issue number(show be #39957 but I write #54649).
Can I amend the commit message now?

bcl5980 updated this revision to Diff 422511.Apr 13 2022, 7:23 AM

rebase main

@spatel I feel so sorry that I check in the baseline test here rGe2d77a160c with a wrong issue number(show be #39957 but I write #54649).
Can I amend the commit message now?

I do not know if you can amend a commit message after a push (but my git knowledge is not very good).

I don't think it is a big problem, but you could revert and recommit.

Alternatively, you can post a correction on the Phab review thread:
https://reviews.llvm.org/rGe2d77a160c5b8141eca3db1fca6dafd97e78288d
or on github directly?
https://github.com/llvm/llvm-project/commit/e2d77a160c5b8141eca3db1fca6dafd97e78288d

Let's add the baseline tests as an NFC commit now, so it is easier to see the diffs.

The order of the transforms creates an interesting trade-off, so we need a test like this (and probably even more tests):

define i8 @same_value_two_case(i32 %i) {
entry:
  switch i32 %i, label %default [
  i32 -3, label %end
  i32 5, label %end
  ]

default:
  br label %end

end:
  %t0 = phi i8 [ 42, %default ], [ 3, %entry ], [ 3, %entry ]
  ret i8 %t0
}

This patch creates a difference that survives all the way through codegen - instcombine does not recognize the equivalence between the 2 patterns:
https://alive2.llvm.org/ce/z/mqo87Z

It's not clear to me if there is a universal better form (depends on target?) or even which one is better for IR. To avoid those questions, you can re-order the transforms, so we do not have to answer it in this patch (add a TODO comment though).

Try two patterns on current backend, this patch's implementation generate better asm on x86 and aarch64: https://godbolt.org/z/nsoWa7Kqr

Try two patterns on current backend, this patch's implementation generate better asm on x86 and aarch64: https://godbolt.org/z/nsoWa7Kqr

Thanks for checking that. I suspect that a target that has condition-logic instructions (like PowerPC) might be better off with the icmp pattern for some examples, but I agree that we can default to the 'and' pattern here. It is shorter IR for the case with no case minimum offset.
I added some more tests and tried to clean up the existing code a bit more. Please rebase and address minor issues. Then I think this patch will be good to go.

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
5684

I think it would be better to put these examples closer to the code that does the transform.
I moved the existing example comment with:
0ef46dc0f9f3

5688

Remove the TODO - if we are going to use the new code on the 2-case pattern, then it is really a TODO for instcombine and/or codegen to alter it if needed.

5692

Don't move this comment.

5710

I added:

ArrayRef<ConstantInt *> CaseValues = ResultVector[0].second;

to make this more readable. You can use that in the new code.

5714

Use complete sentence/punctuation for code comments:

// Find minimal value.
5720

Add period at end of sentence.

5725–5726

Add period at end of sentence.

Also, I'm not sure what "mutil" means in the patch title. Could this be "Try to fold switch with single result value and power-of-2 cases to mask+select" ?

bcl5980 updated this revision to Diff 422724.Apr 13 2022, 7:42 PM
bcl5980 retitled this revision from [SimplifyCFG] Fold mutil cases to And mask to [SimplifyCFG] Try to fold switch with single result value and power-of-2 cases to mask+select.

rebase and update comments

spatel accepted this revision.Apr 14 2022, 7:44 AM

LGTM - I made a few more minor comments.

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
5711–5713

The bit numbering is not clear to me. "0,4" clears bit 2?
It might help to show the mask as a bit value (for example -5 is "0b1..1011").
Maybe better to spell out the comparison too:
"Cond & 0b1..1011 == 0 ? result : default"

5716

Move this variable declaration/initialization to just above the loop where it is filled in.

5724–5726

Remove unnecessary braces around 1-line loop - { }.

llvm/test/Transforms/SimplifyCFG/switch-to-select-two-case.ll
68–223

It is independent of this patch, but we should put a TODO comment on this test or in the code because we do not produce the optimal code when there is no default:
https://alive2.llvm.org/ce/z/7qxumF

This revision is now accepted and ready to land.Apr 14 2022, 7:44 AM
bcl5980 updated this revision to Diff 422889.Apr 14 2022, 9:11 AM
bcl5980 marked 7 inline comments as done.

update comments

This revision was landed with ongoing or failed builds.Apr 14 2022, 9:17 AM
This revision was automatically updated to reflect the committed changes.

@spatel Thanks for the review. For me write comments and commit message is much harder than code because of my poor English. These comments are really helpful .