This is an archive of the discontinued LLVM Phabricator instance.

[MachineCSE] Extend the scope of multi block processing.
Needs ReviewPublic

Authored by skatkov on Apr 18 2023, 2:26 AM.

Details

Summary

At the moment CSE consider only one predecessor if it is alone.
This patch allows to search in any number of block if for each
of them there is only one predecessor.

The LookAheadLimit still controls the number of instruction
to proceed.

The motivation case is in llvm/test/CodeGen/X86/cse-two-preds.mir and https://godbolt.org/z/rEj9GPfnY

Diff Detail

Event Timeline

skatkov created this revision.Apr 18 2023, 2:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 18 2023, 2:26 AM
skatkov requested review of this revision.Apr 18 2023, 2:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 18 2023, 2:26 AM
goldstein.w.n added inline comments.Apr 19 2023, 2:29 PM
llvm/lib/CodeGen/MachineCSE.cpp
369

Why reverse iterator?

llvm/test/CodeGen/Thumb2/thumb2-cbnz.ll
9

why change the IR in this test?

llvm/test/CodeGen/X86/ins_subreg_coalesce-3.ll
63

Seems to cause a regression here?

Thank you, for review!

llvm/lib/CodeGen/MachineCSE.cpp
369

This is due to way I filled BB2Visit. It contains all BasicBlocks from MI parent to CSMI parent. If I traverse it in this order it is natural to use reverse iterator.

If you'd like I can re-write it in opposite way (whether we go from CMSI to MI or vise versa nothing is changed).
Use iterator and push back basic blocks from BB2Visit.

llvm/test/CodeGen/Thumb2/thumb2-cbnz.ll
9

The test is artificial, it contains the the same comparison in several basic blocks. To be honest such IR should not be in CodeGen, middle end should simplify it.

The basic block where cbnz is expected actually contains a comparison which is always true (due to dominating check). With this change the corresponding comparison is just eliminated and test fails.

Taking into account the name of test I decided to modify the test to keep the possibility to generate cbnz.

llvm/test/CodeGen/X86/ins_subreg_coalesce-3.ll
63

Indeed. The reason of regression is a difference in behavior of BranchFolding pass.
This problem I've tried to resolve in https://reviews.llvm.org/D148514 but as we saw I need to do more work there.
Briefly, before this change BranchFolding was able to handle this case in tail merging, due to the same tail

test
jcc

in several basic blocks.
With this change some intermittent test instruction has been eliminated and BrachFolding/TailMerging has smaller tail now and cannot handle it.

Taking into account that this test is also contains several identical branches (should be eliminated in middle end), the name of the test says that this test is not about BranchFolding or this branch at all and I plan to do something with BranchFolding optimization, I guess we can accept this regression.

skatkov updated this revision to Diff 515210.Apr 19 2023, 9:30 PM

Re-factor code to avoid usage of reverse iterator.

skatkov updated this revision to Diff 515216.Apr 19 2023, 9:51 PM

cosmetic changes.

skatkov updated this revision to Diff 515218.Apr 19 2023, 10:05 PM

another small change - early bailout.

lkail added a subscriber: lkail.Apr 19 2023, 10:07 PM
goldstein.w.n added inline comments.Apr 25 2023, 10:55 AM
llvm/test/CodeGen/X86/ins_subreg_coalesce-3.ll
63

Given that is causes a regression here, I think D148514 should proceed this patch.

goldstein.w.n added inline comments.Apr 25 2023, 4:25 PM
llvm/test/CodeGen/X86/ins_subreg_coalesce-3.ll
63

Given that is causes a regression here, I think D148514 should proceed this patch.

Can you make them a series?

skatkov added inline comments.Apr 25 2023, 7:55 PM
llvm/test/CodeGen/X86/ins_subreg_coalesce-3.ll
63

Done.

At the moment I do not see simple way to implement https://reviews.llvm.org/D148514, so I put this on hold due to temporary busy with other stuff.