This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Avoid adding duplicate implicit operands when expanding pseudo insts.
ClosedPublic

Authored by wwei on Sep 1 2021, 9:19 AM.

Details

Summary

When expanding pseudo insts, in order to create a new machine instr, we use BuildMI, which will add implicit operands by default.
And transferImpOps will also copy implicit operands from old ones. Finally, duplicate implicit operands are added to the same inst.
Sometimes this can cause correctness issues. Like below inst,

renamable $w18 = nsw SUBSWrr renamable $w30, renamable $w14, implicit-def dead $nzcv

After expanding, it will become

$w18 = SUBSWrs renamable $w13, renamable $w14, 0, implicit-def $nzcv, implicit-def dead $nzcv

A redundant implicit-def $nzcv is added, but the dead flag is missing.

Diff Detail

Event Timeline

wwei created this revision.Sep 1 2021, 9:19 AM
wwei requested review of this revision.Sep 1 2021, 9:19 AM

Do we need to call transferImpOps here?
Or can we fix it directly to not add duplicate operands?

wwei added a comment.Sep 2 2021, 2:28 AM

Yes, we need to call transferImpOps here, because it will copy the implict operand include the flag--like dead flag.
But BuildMI will ignore the flag, so the changes in this MR will use CreateMachineInstr(.... /*NoImplicit=*/true) to avoid adding duplicate implict operands
I will try to explain why there may have correct issues in postra scheduling later.
The test case is updated in : https://bugs.llvm.org/show_bug.cgi?id=51711

Do we need to call transferImpOps here?
Or can we fix it directly to not add duplicate operands?

wwei updated this revision to Diff 370252.Sep 2 2021, 6:34 AM
wwei added a comment.Sep 2 2021, 6:43 AM

The detailed explanation is:
Before post-ra scheduling, initial order of instructions:

A:    ... implicit-def dead $nzcv
B:    ... implicit-def $nzcv, implicit-def dead $nzcv
C:    ... implicit-def $nzcv
D:    ... implicit killed $nzcv    (use)

In the right way, the relationship should look like this after building the sched graph.

A     B                      
 \   /                   
   C   
   |             
   D

The deps: A->C(output dep), B->C(output dep), C->D(true dep)

Unfortunately, when the dependency graph is built, the graph we actually get :

A     B                      
     /                   
   C   
   |             
   D

The deps: A(no any deps with B/C/D) , B->C(output dep), C->D(true dep), so after post-ra scheduling, we may get a wrong result:
B C A D, A will break the true dependency(C->D)

buildSchedGraph function will walk the list of instructions, from bottom moving up.
First, it will build dependency for D and C, C->D(true dep). Then for B, it will have B->C(output dep), we can see the code in addPhysRegDeps,

// Clear previous uses and defs of this register and its subergisters.
for (MCSubRegIterator SubReg(Reg, TRI, true); SubReg.isValid(); ++SubReg) {
  if (Uses.contains(*SubReg))
    Uses.eraseAll(*SubReg);
  if (!MO.isDead())
    Defs.eraseAll(*SubReg);
}

because B has an implicit-def $nzcv without dead, Defs list will clear all the previous define insts(C will be cleared). So when checking the dependence for A, it can‘t create output deps with C.
Finally, we can make sure the root cause is, inst B having a redundant implicit-def $nzcv, which is added by expanding pseudo pass.

foad added a subscriber: foad.Sep 3 2021, 2:29 AM

I guess addPhysRegDeps is not prepared to see two defs of the same physical register with inconsistent "dead" flags. Is it even valid MIR to have two defs of the same phys reg?

dmgreen accepted this revision.Sep 3 2021, 6:21 AM

Yes, we need to call transferImpOps here, because it will copy the implict operand include the flag--like dead flag.

The kill/dead flags only need to be conservatively correct, as far as I understand. So we can remove them and let another pass re-calculate them as needed. Having them more correct is usually a good thing though.

I guess addPhysRegDeps is not prepared to see two defs of the same physical register with inconsistent "dead" flags. Is it even valid MIR to have two defs of the same phys reg?

I would think it was invalid, yeah. At least in this case there doesn't seem to be much reason to keep both around.

It feels a little odd to me to bypass the BuildMI call so that we can copy the implicit defs from another instruction. As opposed to using/updating the ones that are added for this instruction. But from what I can tell they should be fine for all the instructions handled here, and I don't see anywhere else using transferImpOps that should have the same issues.

LGTM. Thanks for the fix.

This revision is now accepted and ready to land.Sep 3 2021, 6:21 AM
foad added a comment.Sep 3 2021, 6:42 AM

I tried to fix a similar issue in generic code in D100939 but got some push-back.

Matt added a subscriber: Matt.Sep 3 2021, 8:45 AM