This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Remove redundent instructions generated by combineAddrModes.
ClosedPublic

Authored by Peter on Mar 28 2023, 5:14 AM.

Details

Summary

CodeGenPare may optimize memory access modes.
During such optimization, it might create a new instruction representing combined value.
Later, If the optimization failed, the generated value is not removed and remains a dead instruction.

Normally this won't be a problem as dead code will be eliminated later.
However, in this case (Issue 58538), the generated instruction may trigger an infinite loop.
The infinite loop involves sinkCmpExpression, where it tries to optimize the placeholder generated by us.
(See the test case detailed in the issue)

To fix this, we remove the unnecessary placeholder immediately when we abort the optimization.
AddressingModeCombiner will keep track of the placeholder, and remove it if it is an inserted placeholder and has no uses.
This patch fixes https://github.com/llvm/llvm-project/issues/58538, a test is also included.

Diff Detail

Event Timeline

Peter created this revision.Mar 28 2023, 5:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 28 2023, 5:14 AM
Peter requested review of this revision.Mar 28 2023, 5:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 28 2023, 5:14 AM

I'm a bit confused. In the description you talk about new instructions introduced in AddressingModeCombiner::InsertPlaceholders.
The patch itself does not do anything with then. So the patch deals with the value which is used in combined addressing mode.

Another question, why did you decide to erase this instruction only one place instead of erasing it in all early bail-outs?
Don't you want to add this erasing into destructor of AddressingModeCombiner?

In parallel, don't you want to update sinkCmpExpression to ignore dead instructions?
Additionally, In the code I see

// If we have already inserted a cmp into this block, use it.
CmpInst *&InsertedCmp = InsertedCmps[UserBB];

why we continue sinking icmp while we have already have on in that BB from the bug?

Peter added a comment.Mar 29 2023, 4:54 AM

My apology. I am not an expert in these piece of code, so some places are confusing to me as well. If you have a better way to fix the issue, I am more than happy to follow.

I'm a bit confused. In the description you talk about new instructions introduced in AddressingModeCombiner::InsertPlaceholders.
The patch itself does not do anything with then. So the patch deals with the value which is used in combined addressing mode.

The dead instruction is introduced in findCommon, which returns that instruction as CommonValue.

Another question, why did you decide to erase this instruction only one place instead of erasing it in all early bail-outs?
Don't you want to add this erasing into destructor of AddressingModeCombiner?

I think we can erase such dead instruction on all early bail-outs, but I couldn't find any test cases where leaving them there will cause problems (like hangs we had in the issue).

why we continue sinking icmp while we have already have on in that BB from the bug?

Because AddressingModeCombiner::InsertPlaceholders keeps inserting placeholders (A select instruction) that happens to satisfy sinkCmpExpression's conditions

Peter added a comment.EditedMar 29 2023, 7:36 PM

why we continue sinking icmp while we have already have on in that BB from the bug?

Because AddressingModeCombiner::InsertPlaceholders keeps inserting placeholders (A select instruction) that happens to satisfy sinkCmpExpression's conditions

Another fix that may work is to ask this pass to not optimize dead instructions. For example, add the following:

bool CodeGenPrepare::optimizeInst(Instruction *I, ModifyDT &ModifiedDT) {
...
   if (auto *Cmp = dyn_cast<CmpInst>(I))
-   if (optimizeCmp(Cmp, ModifiedDT))
+   if (Cmp->getNumUses() != 0 && optimizeCmp(Cmp, ModifiedDT))
       return true;

ok, the algorithm working with combined address is a bit complex, so it seems that more or less safe way (at least as I see) to remove dead instruction in case of bail out would be:

  1. Add a destructor to AddressingModeCombiner
  2. Remember CommonValue as you did it
  3. Erase CommonValue as you did it in eraseCommonValue()

otherwise your fix is very narrow.

While you can restrict other optimization as @Peter wrote, but removing dead introduced instruction makes sense anyway.

Thank you for doing it.

Additionally, what I worry about is that the function CodeGenPrepare::optimizeMemoryInst itself remove original instruction.

if (Repl->use_empty()) {
  resetIteratorIfInvalidatedWhileCalling(CurInstIterator->getParent(), [&]() {
    RecursivelyDeleteTriviallyDeadInstructions(
        Repl, TLInfo, nullptr,
        [&](Value *V) { removeAllAssertingVHReferences(V); });
  });
}

I guess it will not be able to remove combineValue but it should be checked by full inspection of algorithm or you can use WeakTrackingVH to keep common value.

Peter updated this revision to Diff 509550.Mar 29 2023, 10:44 PM

Use destructor to remove possibly dead CommonValue

lgtm, please update summary. It does not corresponds to problem and fix.

llvm/lib/CodeGen/CodeGenPrepare.cpp
3661

make it private for a while.

Peter updated this revision to Diff 509552.Mar 29 2023, 10:59 PM

move eraseCommonValueIfDead() as private, add comments for it.

Peter edited the summary of this revision. (Show Details)Mar 29 2023, 11:03 PM
Peter marked an inline comment as done.Mar 29 2023, 11:06 PM

Please consider re-phrase the description to something like:

CodeGenPare may optimize memory access modes.
During such optimization, it might create new instruction representing combined value.
Later, If the optimization failed, the generated value is not removed and remains a dead instruction.

Normally this won't be a problem as dead code will be eliminated later.
However, in this case (Issue 58538), the generated instruction may trigger an infinite loop.
The infinite loop involves sinkCmpExpression, where it tries to optimize the placeholder generated by us.
(See the test case detailed in the issue)

To fix this, we remove the unnecessary placeholder immediately when we abort the optimization.
AddressingModeCombiner will keep track of the placeholder, and remove it if it is an inserted placeholder and has no uses.
This patch fixes https://github.com/llvm/llvm-project/issues/58538, a test is also included.

lgtm.

skatkov accepted this revision.Mar 29 2023, 11:08 PM
This revision is now accepted and ready to land.Mar 29 2023, 11:08 PM
Peter edited the summary of this revision. (Show Details)Mar 29 2023, 11:14 PM