This is an archive of the discontinued LLVM Phabricator instance.

[MCP] Invalidate copy for super register in copy source
ClosedPublic

Authored by jrbyrnes on Aug 9 2023, 4:40 PM.

Details

Reviewers
bzEq
arsenm
lkail
Summary

We must also track the super sources of a copy, otherwise we introduce a sort of subtle bug.

Consider:

  1. DEF r0:r1
  2. USE r1
  3. r6:r9 = COPY r10:r13
  4. r14:15 = COPY r0:r1
  5. USE r6

6.. r1:4 = COPY r6:9

BackwardCopyPropagateBlock processes the instructions from bottom up. After processing 6., we will have propagatable copy for r1-r4 and r6-r9. After 5., we invalidate and erase the propagatble copy for r1-r4 and r6 but not for r7-r9.

The issue is that when processing 3., data structures still say we have valid copies for dest regs r7-r9 (from 6.). The corresponding defs for these registers in 6. are r1:r4, which we mark as registers to invalidate. When invalidating, we find the copy that corresponds to r1 is 4. (this was added when processing 4.), and we say that r1 now maps to unpropagatable copies. Thus, when we process 2., we do not have a valid copy, but when we process 1. we do -- because the mapped copy for subregister r0 was never invalidated.

The net result is to propagate the copy from 4. to 1., and replace DEF r0:r1 with DEF r14:r15. Then, we have a use before def in 2.

The main issue is that we have an inconsitent state between which def regs and which src regs are valid. When processing 5., we mark all the defs in 6. as invalid, but only the subreg use as invalid. Either we must only invalidate the individual subreg for both uses and defs, or the super register for both.

Diff Detail

Unit TestsFailed

Event Timeline

jrbyrnes created this revision.Aug 9 2023, 4:40 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 9 2023, 4:40 PM
jrbyrnes requested review of this revision.Aug 9 2023, 4:40 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 9 2023, 4:40 PM
jrbyrnes edited the summary of this revision. (Show Details)Aug 9 2023, 4:44 PM
arsenm added inline comments.Aug 9 2023, 5:06 PM
llvm/lib/CodeGen/MachineCopyPropagation.cpp
191–192

Can we track all of this in regunits? Following registers is way more confusing

209

I don't see how SrcRegs is used any differently. Should this just use a more normal looking liveness scan where there's only one set and each def removes the live uses?

llvm/test/CodeGen/AMDGPU/mcp-use-before-def.mir
28–29

Leftover?

lkail added a subscriber: lkail.EditedAug 9 2023, 7:20 PM
5: USE r6
6: r1:4 = COPY r6:9

Hi @jrbyrnes , do you know why MCP fails to invalidate r7:r9 at label 5? In my view, when we are invalidating r6 at label 5, we should also find the copy involving def or use of r6 which is label 6, and invalidate the regunits of both src operands(r6:r9) and dest operands(r1:r4), which is what invalidateRegister doing.

          RegsToInvalidate.insert(
              CopyOperands->Destination->getReg().asMCReg());
          RegsToInvalidate.insert(CopyOperands->Source->getReg().asMCReg());
...
    for (MCRegister InvalidReg : RegsToInvalidate)
      for (MCRegUnit Unit : TRI.regunits(InvalidReg))
        Copies.erase(Unit);
5: USE r6
6: r1:4 = COPY r6:9

Hi @jrbyrnes , do you know why MCP fails to invalidate r7:r9 at label 5? In my view, when we are invalidating r6 at label 5, we should also find the copy involving def or use of r6 which is label 6, and invalidate the regunits of both src operands(r6:r9) and dest operands(r1:r4), which is what invalidateRegister doing.

          RegsToInvalidate.insert(
              CopyOperands->Destination->getReg().asMCReg());
          RegsToInvalidate.insert(CopyOperands->Source->getReg().asMCReg());
...
    for (MCRegister InvalidReg : RegsToInvalidate)
      for (MCRegUnit Unit : TRI.regunits(InvalidReg))
        Copies.erase(Unit);

We only insert the MI corresponding to the copies for tracking the defs

for (MCRegUnit Unit : TRI.regunits(Def))
  Copies[Unit] = {MI, nullptr, {}, {}, true};

// Remember source that's copied to Def. Once it's clobbered, then
// it's no longer available for copy propagation.
for (MCRegUnit Unit : TRI.regunits(Src)) {
  auto I = Copies.insert({Unit, {nullptr, nullptr, {}, {}, false}});
  auto &Copy = I.first->second;
  if (!is_contained(Copy.DefRegs, Def))
    Copy.DefRegs.push_back(Def);
  if (!is_contained(Copy.SrcRegs, Src))
    Copy.SrcRegs.push_back(Src);
  Copy.LastSeenUseInCopy = MI;
}

Then, when calling invalidateRegister for the r6 in label 5, it is unable to find an MI in the CopyTracker. Thus

if (MachineInstr *MI = I->second.MI) {
  std::optional<DestSourcePair> CopyOperands =
      isCopyInstr(*MI, TII, UseCopyInstr);
  assert(CopyOperands && "Expect copy");

  RegsToInvalidate.insert(
      CopyOperands->Destination->getReg().asMCReg());
  RegsToInvalidate.insert(CopyOperands->Source->getReg().asMCReg());
}

is never reached. I don't claim to understand why it was designed this way, but, from the perspective of this example, it seems that we should also be tracking this.

lkail added a comment.EditedAug 9 2023, 7:50 PM

We only insert the MI corresponding to the copies for tracking the defs

I see. I think the semantic of Copy[Unit].MI is the COPY defines Unit, forward propagation should be relying on this semantic.

Instead, Can Copy[Unit].LastSeenUseInCopy, which has the semantic of the most recent COPY tracked uses the Unit, help this case? In invalidateRegister,

if (MachineInstr *MI = I->second.MI) {...}
if (MachineInstr *MI = I->second.LastSeenUseInCopy) {...}
jrbyrnes added inline comments.Aug 10 2023, 10:25 AM
llvm/lib/CodeGen/MachineCopyPropagation.cpp
209

I don't see how SrcRegs is used any differently.

MCP currently works s.t. if a source or dest of some MI partially overlaps the source or dest of an active, then we say that the copy is no longer active across all its def regunits. In other words, we don't allow partial copy propagation. This is what Copy.DefRegs is used for, if we come across an MI with regunit that partially overlaps with some previous copy, we query the map to get the Copy structure, then we invalidate it for all the RegUnit in Def where Def is just the dest register in the actual copy.

However, we don't do the same for the src RegUnits of the copy. Enabling such is the intention behind Copy.SrcRegs. As is, if an MI has registers (B) which partially overlaps the source of a Copy (A), a subsequent MI will find that Copy is still active if it has registers that are in the difference A - B.

In short, Copy.SrcRegs holds the Source Reg of the copy, so that when we mark a copy as invalid, we can do so across all regunits in its source.

Should this just use a more normal looking liveness scan where there's only one set and each def removes the live uses

The tracker for backwardCopyProp really needs to track 3 things:

  1. "Activeness" of the uses of the copies. This is really just liveness, however any partial clobbering should say the full set of RegUnits involved in the use is inactive.
  2. Validity of propagation --
    1. Uses of copy uses - if we want to propagate a copy backwards, then we will replace DEF A; B = COPY A with -> DEF B. But, if there are any uses of A between the DEF and the COPY then the propagation is not longer valid.
    2. Uses of copy defs - If we have done the above propagation, and there is a user of B between the propagation site and the COPY, then we will have clobbered.

I think we can identify 1 and 2A with a tracker that does something that looks like a backwards liveness scan, however, I don't think we can capture 2B.

jrbyrnes updated this revision to Diff 549160.Aug 10 2023, 1:52 PM
jrbyrnes marked 2 inline comments as done.

Use LastSeenUseInCopy + use regunits in more places.

llvm/lib/CodeGen/MachineCopyPropagation.cpp
191–192

There are many places that use register MCRegister equality / isSubRegisterEq on CopyOperands->Destination / Source. In order to prove equality for regunits, we would need to iteratively do set equality which seems like it would make things more complicated. Maybe I've missed something?

lkail added inline comments.Aug 10 2023, 5:38 PM
llvm/lib/CodeGen/MachineCopyPropagation.cpp
153–154
diff
@@ -137,18 +137,20 @@ public:
     // and invalidate all of them.
     SmallSet<MCRegister, 8> RegsToInvalidate;
     RegsToInvalidate.insert(Reg);
+    auto invalidate = [&](MachineInstr *MI) {
+      std::optional<DestSourcePair> CopyOperands =
+          isCopyInstr(*MI, TII, UseCopyInstr);
+      assert(CopyOperands && "Expect copy");
+      RegsToInvalidate.insert(CopyOperands->Destination->getReg().asMCReg());
+      RegsToInvalidate.insert(CopyOperands->Source->getReg().asMCReg());
+    };
     for (MCRegUnit Unit : TRI.regunits(Reg)) {
       auto I = Copies.find(Unit);
       if (I != Copies.end()) {
-        if (MachineInstr *MI = I->second.MI) {
-          std::optional<DestSourcePair> CopyOperands =
-              isCopyInstr(*MI, TII, UseCopyInstr);
-          assert(CopyOperands && "Expect copy");
-
-          RegsToInvalidate.insert(
-              CopyOperands->Destination->getReg().asMCReg());
-          RegsToInvalidate.insert(CopyOperands->Source->getReg().asMCReg());
-        }
+        if (MachineInstr *MI = I->second.MI)
+          invalidate(MI);
+        if (MachineInstr *MI = I->second.LastSeenUseInCopy)
+          invalidate(MI);
         RegsToInvalidate.insert(I->second.DefRegs.begin(),
                                 I->second.DefRegs.end());
       }

We should check both of I->second.MI and I->second.LastSeenUseInCopy.

191–192

I agree with that we should track at regunit basis. Maybe we can do that in another NFC patch.

jrbyrnes updated this revision to Diff 549218.Aug 10 2023, 6:24 PM
jrbyrnes marked an inline comment as done.

Always try to invalidate both the LastSeenUseInCopy and CopyInfo->MI.

With regard to the regunits conversion --- Should we keep the partial work in this patch, or save it all for a separate patch?

lkail added a comment.Aug 10 2023, 6:26 PM

Should we keep the partial work in this patch, or save it all for a separate patch?

We'd better use another NFC patch to track at regunit basis.

jrbyrnes updated this revision to Diff 549219.Aug 10 2023, 6:30 PM

Delete partial conversion to regunits

jrbyrnes updated this revision to Diff 549221.Aug 10 2023, 6:35 PM

Convert RegsToInvalidate back to MCRegister as well.

lkail accepted this revision.Aug 10 2023, 6:36 PM

LGTM with nit.

llvm/lib/CodeGen/MachineCopyPropagation.cpp
139

nit: RegsToInvalidate -> RegUnitsToInvalidate.

This revision is now accepted and ready to land.Aug 10 2023, 6:36 PM
jrbyrnes updated this revision to Diff 549223.Aug 10 2023, 6:44 PM

RegUnitsToInvalidate