This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Optimise shift+and+boolean conversion pattern to simple comparison
ClosedPublic

Authored by bcl5980 on May 28 2022, 1:58 AM.

Details

Summary

if (C1 is pow2) & ((C2 & ~(C1-1)) + C1) is pow2):

((C1 << X) & C2) == 0 -> X >= (Log2(C2+C1) - Log2(C1));

https://alive2.llvm.org/ce/z/EJAl1R

((C1 << X) & C2) != 0 -> X  < (Log2(C2+C1) - Log2(C1));

https://alive2.llvm.org/ce/z/3bVRVz

And remove dead code.

Fix: https://github.com/llvm/llvm-project/issues/56124

Diff Detail

Event Timeline

bcl5980 created this revision.May 28 2022, 1:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 28 2022, 1:58 AM
bcl5980 requested review of this revision.May 28 2022, 1:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 28 2022, 1:58 AM
bcl5980 updated this revision to Diff 432708.EditedMay 28 2022, 2:01 AM

update the test result after code change.
Will land test baseline after test also be reviewed.

This patch is not sufficient to solve the bug in general. The IR for the bug report does not contain any icmp, so we can write alternate forms of the source from that bug report and fail to optimize again (and gcc may fail to canonicalize):
https://godbolt.org/z/n9oaWTa8s

Either we need to convert the IR to cmp+select or make this a codegen optimization.

I'm not sure what impact this will have in IR, but you could try to write transforms like this in instcombine and see if that exposes other problems:
https://alive2.llvm.org/ce/z/jA_tNb

bcl5980 added a comment.EditedMay 28 2022, 9:56 AM

I'm also worry if we should canonicalize shift+and to icmp+select

I'm not sure what impact this will have in IR, but you could try to write transforms like this in instcombine and see if that exposes other problems:
https://alive2.llvm.org/ce/z/jA_tNb

So can we write transforms below to fix this issue?:
https://alive2.llvm.org/ce/z/DiF6Wo
https://alive2.llvm.org/ce/z/nQtvt5

So can we write transforms below to fix this issue?:
https://alive2.llvm.org/ce/z/DiF6Wo
https://alive2.llvm.org/ce/z/nQtvt5

Yes, it is worth trying. After you create those, run tests (ninja check at least and possibly test-suite) and see if it uncovers any regressions. Also, check the assembly output for each of those patterns and verify that most in-tree targets (x86, AArch/ARM, riscv) show improvements.

bcl5980 updated this revision to Diff 432768.May 29 2022, 12:33 AM
bcl5980 edited the summary of this revision. (Show Details)

update shift+lshr+and transform

Codegen looks better in all cases that I checked, so that's good. And I'm assuming there are no existing regressions on tests.

But this is really 3-4 independent patches together now. Please split it up for easier review and less risk.

I'd start with the 'shift+and' transforms: please add tests where the intermediate values have extra uses and commit those with baseline results. Then post those as new reviews with the corresponding Alive2 proofs, so we have the correctness check for each transform.

Allen added a subscriber: Allen.May 29 2022, 9:52 AM

Codegen looks better in all cases that I checked, so that's good. And I'm assuming there are no existing regressions on tests.

But this is really 3-4 independent patches together now. Please split it up for easier review and less risk.

I'd start with the 'shift+and' transforms: please add tests where the intermediate values have extra uses and commit those with baseline results. Then post those as new reviews with the corresponding Alive2 proofs, so we have the correctness check for each transform.

I'm sorry but I have a question about the ninja check. What is difference between pre-merge checks and local ninja check test? I try to test local but a lot of test failures on bolt, clang, lld. There is no IR failure.

bcl5980 updated this revision to Diff 432838.May 29 2022, 11:33 PM
bcl5980 edited the summary of this revision. (Show Details)
  1. remove shift+lshr+and pattern
  2. update second pattern to make it more general

I'm sorry but I have a question about the ninja check. What is difference between pre-merge checks and local ninja check test? I try to test local but a lot of test failures on bolt, clang, lld. There is no IR failure.

The pre-merge checks are probably running more/different tests on a variety of targets.

I'm not sure how reliable those results are. You can see similar failures for seemingly unrelated patches like D126638.

It is not necessary to pass all pre-merge tests before committing a patch - I think it is still experimental. If the failure is real, then bots and their owners will definitely let you know after you commit.

I'm sorry but I have a question about the ninja check. What is difference between pre-merge checks and local ninja check test? I try to test local but a lot of test failures on bolt, clang, lld. There is no IR failure.

The pre-merge checks are probably running more/different tests on a variety of targets.

I'm not sure how reliable those results are. You can see similar failures for seemingly unrelated patches like D126638.

It is not necessary to pass all pre-merge tests before committing a patch - I think it is still experimental. If the failure is real, then bots and their owners will definitely let you know after you commit.

Thanks for the explaination.
One other question for this change itself:
D126617 can fix below two cases also

iff (C1 is pow2) & (C2 is pow2) & (C1 <= C2):
((C1 << X) & C2) == 0 -> X != (Log2(C2) - Log2(C1)); https://alive2.llvm.org/ce/z/SVtcJM
((C1 << X) & C2) != 0 -> X == (Log2(C2) - Log2(C1)); https://alive2.llvm.org/ce/z/eLzTjS

iff (C1 is pow2) & (C2 is pow2):
((C1 >>u X) & C2) == 0 -> X != (Log2(C1) - Log2(C2)); https://alive2.llvm.org/ce/z/9p362A
((C1 >>u X) & C2) != 0 -> X == (Log2(C1) - Log2(C2)); https://alive2.llvm.org/ce/z/FsiU-A

Should I only keep this pattern in this differential?

iff (C1 is pow2) & ((C2 & ~(C1-1)) + C1) is pow2) & (C1 < C2):
((C1 << X) & C2) == 0 -> X >= (Log2(C2+C1) - Log2(C1)); https://alive2.llvm.org/ce/z/JQYFnn
((C1 << X) & C2) != 0 -> X  < (Log2(C2+C1) - Log2(C1)); https://alive2.llvm.org/ce/z/BnyEmk
bcl5980 updated this revision to Diff 434721.Jun 7 2022, 1:19 AM
bcl5980 edited the summary of this revision. (Show Details)

rebase code
remove lshr case as it can handled by rGa0c3c60728ee

bcl5980 updated this revision to Diff 438244.Jun 19 2022, 8:53 PM
bcl5980 edited the summary of this revision. (Show Details)
bcl5980 edited the summary of this revision. (Show Details)
bcl5980 edited the summary of this revision. (Show Details)
bcl5980 edited the summary of this revision. (Show Details)
bcl5980 updated this revision to Diff 438245.Jun 19 2022, 8:55 PM

coding style update

bcl5980 edited the summary of this revision. (Show Details)Jun 19 2022, 9:01 PM
bcl5980 edited the summary of this revision. (Show Details)Jun 19 2022, 9:05 PM
bcl5980 edited the summary of this revision. (Show Details)
bcl5980 updated this revision to Diff 438246.Jun 19 2022, 9:09 PM
bcl5980 edited the summary of this revision. (Show Details)

remove C1 < C2

bcl5980 edited the summary of this revision. (Show Details)Jun 19 2022, 9:11 PM

I added a similar fold here:
0399473de886

It seems like we might be missing some generalization for power-of-2 values, but this patch seems fine. Please add alive2 links to the patch summary (these are in issue #56124?).

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
5546–5547

Is this just a special-case of the next pattern? If so, I think it is better to reduce the patch. We might have slightly less efficiency at run-time, but there's less code to maintain here.

I added a similar fold here:
0399473de886

It seems like we might be missing some generalization for power-of-2 values, but this patch seems fine. Please add alive2 links to the patch summary (these are in issue #56124?).

alive2 link should already in the summary.

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
5546–5547

I think it is not. But we can remove that after rGbfde861.
And we can also remove line 5590's transform after rGa0c3c607.
Do we need to remvoe these on this patch?

bcl5980 edited the summary of this revision. (Show Details)Jun 20 2022, 5:12 AM
bcl5980 edited the summary of this revision. (Show Details)
bcl5980 edited the summary of this revision. (Show Details)Jun 20 2022, 5:15 AM
spatel added inline comments.Jun 22 2022, 5:20 AM
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
5546–5547

Yes, I prefer to remove code if we have other transforms to handle the patterns now. We do not want to keep dead code around - it makes the compiler slower and harder to understand.

bcl5980 updated this revision to Diff 438994.Jun 22 2022, 6:19 AM
bcl5980 edited the summary of this revision. (Show Details)

remove dead code

bcl5980 marked 2 inline comments as done.Jun 22 2022, 6:20 AM
spatel accepted this revision.Jun 22 2022, 12:12 PM

LGTM

This revision is now accepted and ready to land.Jun 22 2022, 12:12 PM