This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Improve kill flag computation and add verification after MI peephole
ClosedPublic

Authored by nemanjai on Sep 1 2022, 4:39 AM.

Details

Summary

The MI Peephole pass has grown to include a large number of transformations over the years. Many of the transformations require re-computation of kill flags but don't do a good job of re-computing them. This causes us to have very common failures when the compiler is built with expensive checks. Over time, we added and augmented a function that is supposed to go and fix up kill flags after each transformation but we keep missing cases.
This patch does the following:

  • Removes the function to re-compute kill flags
  • Adds LiveVariables to compute and maintain kill flags while transforming code
  • Adds re-computation of kill flags for the post-RA peepholes for each block that contains a transformed instruction

Diff Detail

Event Timeline

nemanjai created this revision.Sep 1 2022, 4:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 1 2022, 4:39 AM
nemanjai requested review of this revision.Sep 1 2022, 4:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 1 2022, 4:39 AM
lkail added a subscriber: lkail.Sep 1 2022, 5:50 AM
amyk added a subscriber: amyk.Sep 21 2022, 4:28 PM
amyk added a comment.Sep 21 2022, 4:28 PM

Is it just me or is this patch not clean? It appears there is a LIT failure in CodeGen/PowerPC/sext_elimination.mir and I am getting some test-suite build failures.

Is it just me or is this patch not clean? It appears there is a LIT failure in CodeGen/PowerPC/sext_elimination.mir and I am getting some test-suite build failures.

Interesting. I was not seeing failures, but something may have changed between my revision and yours. I'll have a look.

amyk added a comment.Nov 2 2022, 9:56 AM

Some initial general comments for this patch:

  • The description doesn't exactly clarify how the flags are recomputed for pre/post-RA. Can the description be clarified a bit more?
  • The improvement of the LLVM_DEBUG messages and general code clean up could be a separate patch. Can this stuff be separated out?
amyk added a comment.Nov 17 2022, 7:19 PM

Just had a couple comments I wanted to submit prior to the rebase of this patch.

llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
3714

Nit: unrelated whitespace change.

3732

Isn't this an unrelated change, as well?

llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
502

We're accessing the register both in the cases when we're removing it from RegsToUpdate and when we're re-running LiveVariables.
It might make sense to pull out MO.isReg() into a separate variable.

1941

Nit: unrelated whitespace change.

amyk added a comment.Nov 23 2022, 5:43 PM

A few more submitted comments that I realized I had left on the review.

llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
3719

We should clarify that for MI, we're actually adding both the defs and the uses.

3723

It would be good to have a more descriptive name for UpdateRegs. Perhaps RegsToUpdate.

llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
472

Since you're adding a significant amount of code to a deeply nested block, consider flipping this condition to an early exit to reduce the nesting.

nemanjai updated this revision to Diff 479511.Dec 1 2022, 7:52 PM

Address review comments

  • Commit the addition of LLVM_DEBUG messages as a separate NFC patch
  • Provide some helper functions
  • Remove unrelated changes
  • Some general cleanup
nemanjai edited the summary of this revision. (Show Details)Jan 25 2023, 8:13 AM
nemanjai added inline comments.Jan 25 2023, 8:28 AM
llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
20

s/in after/after

lei added a subscriber: lei.Jan 26 2023, 7:40 AM
lei added inline comments.
llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
4854–4855

this should be removed as well?

llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
133

nit: I knnow you have doc above, but it would be good if we can add a note here about valid usage of this new function. When it should be used etc..

153

nit: seems prev code list all required first and then preserved?

1148

Why was this removed?

1287

nit: add doc as to why we are doing this?

1802

Is this change on purpose to fix a previous bug? Seems unrelated.

amyk added a comment.Jan 26 2023, 10:42 AM

Some additional comments and questions I thought of after our previous review/discussion.

llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
839

Do we need to addRegToUpdate() here?

1049

Do we need to addRegToUpdate() here?

1941

Nit: Not sure if you missed this before, but this change seems unrelated.

llvm/test/CodeGen/PowerPC/convert-ri-addi-to-ri.mir
2

Potentially silly question, but because we're adding livevars here explicitly, does this mean we are not running the live variables pass by default when we are adding live variables to PPCMIPeephole?

kamaub added a subscriber: kamaub.Jan 26 2023, 11:01 AM

Some comments from the llvm-on-power team:

llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
471–472

From the use of the variable, it seems more consistent for this to be a Smallset instead of a Smallvector, additionally, is 2 enough? We check the MI and the DefMI for Regs to update, a bigger set size may be more optimal.

498

Please add a comment explaining why this is needed.

499
if (RegsToUpdate.empty())
  return;

I think this is more appropriate to the coding guidelines.

501
  1. Please DeMorganize this into an early exit condition.
  2. Please add a comment explaining why this needs to be a definition of a register.
kamaub added inline comments.Jan 30 2023, 12:32 AM
llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
899

Should there be a call to recomputeLVForDyingInstr() as well since this ConvReg1 is a reg of RoundInstr which is being eraseFromParent() on line 847 and cannot be marked ToErase inside the lambda?

1790

We add this compare's operands to the RegsToUpdate unconditionally earlier, and now in this code path it is deleted meaning that we could end up attempting to recompute the LV for its VRegs and could hit errors.
If this is the case I think being inside a function means recomputeLVForDyingInstr() would have to become a static function in order to catch this corner case =?

kamaub added inline comments.Feb 14 2023, 7:33 AM
llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
1259

Do we need to track the registers in this instruction

Overall I think that this patch makes sense to me. A few minor things:

  1. I think that there are some changes that are nice improvements but aren't really part of the kill flags fix. This is already a fairly large patch and removing these little fixes might make the patch clearer and easier to read. They can be made before or after with a (or several) NFC change(s).
  2. I know this is an older patch and I apologize for taking so long to look at it. When I tried to rebase this patch to the Top of main I realized that I had to add a couple more addRegToUpdate in order to get it working. This isn't an issue with the patch it's just that new changes to PPCMIPeephole will require additions or addRegToUpdate.
  3. Is there any way to provide an error or warning for the future when more peephole optimizations are added and for whatever reason the developer forgets to call addRegToUpdate? There may not be a way to do this and that's fine. I'm more curious.
llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
183

If you made this function a private member of PPCMIPeephole then you wouldn't have to pass in RegsToUpdate since that SmallSet is already a private member of PPCMIPeephole. It would just save constantly having to repeat the first parameter that is always the same.

Of course if you did change this you would also have to make convertUnprimedAccPHIs a private member which I'm not sure is worth it... But anyway, maybe this is work for the future.

1196

nit:
Adding ToErase here is a good idea but maybe it should be a separate cleanup patch.
I don't think it has anything to do with the kill flag computation.

1288

Don't we guarantee that RegsToUpdate has only virtual registers?

static void addRegToUpdateWithLine(SmallSet<Register, 16> &RegsToUpdate,
                                   Register Reg, int Line) {
  if (!Register::isVirtualRegister(Reg))
    return;
...

Maybe I'm missing something but I feel like we don't need to check && Register::isVirtualRegister(Reg).

What you may be able to do is add an assert to make sure that what I'm saying is actually true all the time.

1802

I'm also curious about this. I can't figure out why this else was added.
Should Simplified only be true if !IsPartiallyRedudant?

llvm/test/CodeGen/PowerPC/peephole-phi-acc.mir
780

This looks like a fix for this test case that isn't related to the kill flags fix. Maybe a separate NFC patch?

stefanp added inline comments.Mar 29 2023, 10:40 AM
llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
1196

Nope. I was wrong. You do need to add this here as part of this patch.
Please disregard my previous comment.

1211

Note:
Checked that what would happen if we ended up with OrigOp1Reg == PPC::NoRegister here and it should be fine.

Register::isVirtualRegister(PPC::NoRegister) returns false and so we cannot add PPC::NoRegister to the list.

1941

nit: Whitespace

llvm/lib/Target/PowerPC/PPCPreEmitPeephole.cpp
20

nit:
Unrelated cleanup.

amyk added a comment.Apr 14 2023, 8:43 AM

Realized I had another unsubmitted comment. I looked again and I do not have any other comments/questions for now aside from the ones that the others have asked about.

llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
1781–1788

Why do we not need to add addRegToUpdate() here when we finally eliminate the compare instruction?

nemanjai marked 29 inline comments as done.Apr 28 2023, 7:01 AM
nemanjai added inline comments.
llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
183

I think it is not unreasonable to make these two functions members. Their functionality is part of the pass so it makes sense for them to be members.
The alternative solution would be to make RegsToUpdate a static member and access it from the free functions. However, that has the very undesirable effect of making the pass not thread-safe which matters at least for LTO builds.

502

The call to MO.getReg() in the if statement is safe because of short circuiting of the call to MO.isReg().

839

I think so. Thanks for catching this.

899

No, because ConvReg1 is not defined by RoundInstr.

1049

Actually no because we have actually added a new definition of the register that MI originally defined in the exact same place. The original MI will be deleted and the live range of the register it defined doesn't change.

1148

I don't remember. I have put it back now.

1259

This is currently broken, so we can't really perform this optimization safely.

1288

Oh, good point. The check is redundant.

1781–1788

Because we've added all the register operands of CMPI1 and CMPI2 starting on line 1663.

1790

Actually, we don't need to because there is only one user of the register it defines and that user is BI2 which just had that use replaced so the only definition from CMPI2 is going to be a dead register. We don't try to recompute live variables for dead registers. I've added a comment to explain this.

1802

Hmm, I don't actually remember, but it definitely looks wrong. Removed it.

llvm/test/CodeGen/PowerPC/convert-ri-addi-to-ri.mir
2

Very good catch. This was a remnant of the initial implementation that simply added the livevars to the pipeline. I have removed it now.

nemanjai updated this revision to Diff 517915.Apr 28 2023, 7:27 AM
nemanjai marked 10 inline comments as done.

Rebase on top of some new additions to the pass and address numerous review comments.

stefanp added inline comments.May 12 2023, 10:55 AM
llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
388

nit: spacing

441

We are adding RegMBB.first but what about RegMBB.second and AccReg? This is a new instruction so I assume that all of these registers now may have a different live range.

1943

There are two instructions that are erased here and I don't think all of the required registers are added to the update list.
I managed to get a reduced test case that I think shows this issue:

define hidden void @testCaller() local_unnamed_addr align 2 {
entry:
  br label %exit

exit: ; preds = %entry
  br label %cond.end.i.i

cond.end.i.i:                                     ; preds = %if.end116, %exit
  %CurrentState.0566 = phi i32 [ %CurrentState.2, %if.end116 ], [ -1, %exit ]
  %0 = load i32, ptr poison, align 8
  br label %while.body5.i

while.cond12.preheader.i:                         ; preds = %while.body5.i
  br i1 poison, label %if.end116, label %for.cond99.preheader

while.body5.i:                                    ; preds = %while.body5.i, %cond.end.i.i
  %Test.012.i = phi i32 [ 0, %cond.end.i.i ], [ %dec10.i, %while.body5.i ]
  %dec10.i = add nsw i32 %Test.012.i, -1
  %cmp4.i = icmp slt i32 0, %dec10.i
  br i1 %cmp4.i, label %while.body5.i, label %while.cond12.preheader.i

for.cond99.preheader:                             ; preds = %while.cond12.preheader.i
  %1 = load ptr, ptr poison, align 8
  %conv103 = sext i32 %0 to i64
  %arrayidx.i426 = getelementptr inbounds i32, ptr %1, i64 %conv103
  store i32 0, ptr %arrayidx.i426, align 4
  store i32 %CurrentState.0566, ptr poison, align 8
  br label %if.end116

if.end116:                                        ; preds = %for.cond99.preheader, %while.cond12.preheader.i
  %CurrentState.2 = phi i32 [ %0, %while.cond12.preheader.i ], [ poison, %for.cond99.preheader ]
  call fastcc void @callee()
  br label %cond.end.i.i
}
declare dso_local fastcc void @callee() unnamed_addr align 2

Compile with:

llc --mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr9 -debug-only=ppc-mi-peepholes bugpoint-reduced-simplified.ll

I see the following output:

Combining pair:   %2:g8rc = EXTSW_32_64 %1:gprc
  %12:g8rc = RLDICR killed %2:g8rc, 2, 61
TO:   %12:g8rc = EXTSWSLI_32_64 %1:gprc, 2
Adding register: 1 on line 1944 for re-computation of kill flags
Deleting instruction:   %12:g8rc = RLDICR killed %2:g8rc, 2, 61
...
*** Bad machine code: LiveVariables: Block should not be in AliveBlocks ***
- function:    testCaller
- basic block: %bb.2 while.cond12.preheader.i (0x119473888)
Virtual register %2 is not needed live through the block.

*** Bad machine code: LiveVariables: Block should not be in AliveBlocks ***
- function:    testCaller
- basic block: %bb.3 while.body5.i (0x119473988)
Virtual register %2 is not needed live through the block.
LLVM ERROR: Found 2 machine code errors.
amyk added inline comments.Jun 1 2023, 7:07 AM
llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
4814

A question I had: Does this function also require a call to recomputeLivenessFlags after PostRA?

llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
134

Typo:

nemanjai added inline comments.Jul 5 2023, 6:50 AM
llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
4814

We have an early exit on line 4757 for post-ra.

llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
388

Weird. I could have sworn I ran clang-format on this. I'll fix it. Thanks for noticing.

441

RegMBB.second is a MachineBasicBlock so we can't add it to the list of registers. The newly created registers won't be created with kill flags, so we don't need to recompute them and the result reg for the first PHI (Dst) is still defined in the exact same place, so it shouldn't need its kill flags recomputed.

1943

Ah, I see. Good catch. I think I'll just add the result register of SrcMI to the set and leave the SrcMI alone (DCE is run immediately after this pass to remove dead instructions).

nemanjai updated this revision to Diff 537357.Jul 5 2023, 8:05 AM

Address comments

  • Run clang-format
  • Resolve issue with shift-extend peephole
stefanp accepted this revision as: stefanp.Jul 18 2023, 12:13 PM

LGTM.
Thank you for cleaning up this pass!

This revision is now accepted and ready to land.Jul 18 2023, 12:13 PM
arsenm added a subscriber: arsenm.Jul 18 2023, 12:39 PM

Kill flags should be removed. You should switch to reverse liveness which does not rely on accurate kill flags

arsenm added inline comments.Jul 18 2023, 12:40 PM
llvm/lib/Target/PowerPC/PPCPreEmitPeephole.cpp
425

This should use enterBasicBlockEnd and walk backwards through the block

Kill flags should be removed. You should switch to reverse liveness which does not rely on accurate kill flags

I'm not sure I follow. Are you suggesting that we remove all kill flags in this pass? We're really only fixing them up to appease the verifier rather than because anything in the PPC back end relies on the flags.

nemanjai added inline comments.Sep 21 2023, 10:11 AM
llvm/lib/Target/PowerPC/PPCPreEmitPeephole.cpp
425

This will be done in a follow-up patch.