This is an archive of the discontinued LLVM Phabricator instance.

Insert IMPLICIT_DEFS for undef uses in tail merging
ClosedPublic

Authored by MatzeB on Aug 22 2017, 3:58 PM.

Details

Summary

Tail merging can convert an undef use into a normal one when creating a
common tail. Doing so can make the register live out from a block which
previously contained the undef use. To keep the liveness up-to-date,
insert IMPLICIT_DEFs in such blocks when necessary.

To enable this patch the computeLiveIns() function which used to
compute live-ins for a block and set them immediately is split into new
functions:

  • computeLiveIns() just computes the live-ins in a LivePhysRegs set.
  • addLiveIns() applies the live-ins to a block live-in list.
  • computeAndAddLiveIns() is a convenience function combining the other two functions and behaving like computeLiveIns() before this patch.

Based on a patch by Krzysztof Parzyszek <kparzysz@codeaurora.org>

Diff Detail

Repository
rL LLVM

Event Timeline

MatzeB created this revision.Aug 22 2017, 3:58 PM
kparzysz added inline comments.Aug 23 2017, 10:01 AM
lib/CodeGen/BranchFolding.cpp
379

This needs to calculate liveouts with the tail removed.

This testcase still fails:

# RUN: llc -march=hexagon -run-pass branch-folder %s -o - 
---
name: fred
tracksRegLiveness: true

body: |
  bb.0:
    successors: %bb.1, %bb.2
      J2_jumpt undef %p0, %bb.2, implicit-def %pc
      J2_jump %bb.1, implicit-def %pc

  bb.1:
    successors: %bb.3
      %r1 = A2_tfrsi 1
      PS_storerhabs 0, undef %r0
      %r0 = A2_tfrsi 1
      J2_jump %bb.3, implicit-def %pc

  bb.2:
    successors: %bb.3
      %r0 = L2_loadruh_io undef %r1, 0
      PS_storerhabs 0, killed %r0
      %r0 = A2_tfrsi 1
      J2_jump %bb.3, implicit-def %pc

  bb.3:
      PS_jmpret killed %r31, implicit undef %r0, implicit-def %pc
...

The output is

body:             |
  bb.0:
    successors: %bb.1(0x40000000), %bb.2(0x40000000)
  
    J2_jumpt undef %p0, %bb.2, implicit-def %pc
  
  bb.1:
    successors: %bb.3(0x80000000)
  
    %r1 = A2_tfrsi 1
    J2_jump %bb.3, implicit-def %pc
  
  bb.2:
    successors: %bb.3(0x80000000)
  
    %r0 = L2_loadruh_io undef %r1, 0
  
  bb.3:
    liveins: %r0
  
    PS_storerhabs 0, killed %r0
    %r0 = A2_tfrsi 1
    PS_jmpret killed %r31, implicit undef %r0, implicit-def %pc

...

The path bb.0 -> bb.1 -> bb.3 does not define r0, but r0 is a live-in to bb.3.

kparzysz added inline comments.Aug 23 2017, 10:17 AM
lib/CodeGen/BranchFolding.cpp
379

In the original testcase, one of the blocks became the common tail. In this testcase, r1=... was added to that block to force a fresh block to be created for the tail. Then r0=... was added to both source blocks, but at the end so that it would be a part of the common tail. Because of how the live-outs in the code are calculated, this would prevent implicit defs from being added.

I had to change the order of the blocks to trigger the problem, since the traversal order for the predecessors may avoid exposing it.

kparzysz edited edge metadata.Aug 31 2017, 1:28 PM

Are there any new developments?

Are there any new developments?

I should be able to get another stab at this today or tomorrow.

MatzeB updated this revision to Diff 113913.Sep 5 2017, 2:29 PM

Rework branchfolding code to behave like this:

  • Make sure all the undef flags are merged
  • Compute new live-ins to NewDest (but do not set them yet)
  • Insert IMP_DEFS into NewDest predecessors where LiveOuts mismatch the new live-ins
  • Set new live-in of NewDest
  • For all other blocks with common tail:
    • Compute liveness at the position where we will cut the common tail and place the branch.
    • Insert IMP_DEFS as necessary where liveness mismatches live-ins of NewDest
    • Replace common tail with branch
kparzysz accepted this revision.Sep 6 2017, 9:18 AM

This is great. Thanks!

This revision is now accepted and ready to land.Sep 6 2017, 9:18 AM
This revision was automatically updated to reflect the committed changes.

Hi Matthias,

I have stumbled upon a problem related to this change while working on an out-of-tree target.
It reproduces with Hexagon too.

Is the insertion of the IMPLICIT_DEF legal if one of the two terminators of %bb.0 uses $r0 ?
(Here J2_jumpr uses $r0)

Consider this case:

---
name: func0
tracksRegLiveness: true

body: |
  bb.0:
    liveins: $r0, $r31
    successors: %bb.2
      J2_jumpt undef $p0, %bb.2, implicit-def $pc
      J2_jumpr killed $r0, implicit-def $pc

  bb.1:
    liveins: $r31
    successors: %bb.3
      $r0 = L2_loadruh_io undef $r1, 0
      PS_storerhabs 0, killed $r0
      J2_jump %bb.3, implicit-def $pc

  bb.2:
    liveins: $r31
    successors: %bb.3
      PS_storerhabs 0, undef $r0
      J2_jump %bb.3, implicit-def $pc

  bb.3:
    liveins: $r31
      PS_jmpret killed $r31, implicit-def $pc
...
---

Running llc -march=hexagon -run-pass branch-folder gives:

body:             |
  bb.0:
    successors: %bb.1(0x80000000)
    liveins: $r0, $r31

    $r0 = IMPLICIT_DEF
    J2_jumpt undef $p0, %bb.1, implicit-def $pc
    J2_jumpr killed $r0, implicit-def $pc

  bb.1:
    liveins: $r0

    PS_storerhabs 0, $r0
    PS_jmpret killed $r31, implicit-def $pc
...

Thus making $r0 undefined in bb.0, when originally it was not.

Thanks,
Alberto

I have the same question with @alberto-magni.

Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptOct 20 2022, 6:41 PM
Herald added subscribers: modimo, wenlei. · View Herald Transcript

Uh this is an old commit... Yes Albertos example does not look quite right, though I'm not sure how we would get there. The code does seem to compute liveness before inserting implicit defs (variants of LiveRegs.addLiveOuts(*Pred); LiveRegs.stepBackward() ... if (LiveRegs.available(...)) continue; BuildMI(IMPLICIT_DEF)) I would expect that when using stepBackward() over the jump instruction that uses r2 that it is marked as live and no IMPLICIT_DEF is inserted. If you want to debug this, you should probably start checking the computed liveness information there...

circYuan added a comment.EditedOct 25 2022, 2:26 AM

Hi @MatzeB, thank you for replying for this very the long yeras ago commit!!
AFAIK, this patch computes the live-ins of the common tail block, and compute the live-out of the common tail predecessor block.
If any of the predecessor live-outs mismatches the common tail live-ins, we insert IMPLICIT_DEF before the first terminator instruction, so the predecessor can have that register as a live-out, is it correct?
One thing I'm confuse is that in function mergeCommonTails, LiveRegs doesn't call stepBackward, it would only be called in computeLiveIns which is used to compute the live-ins of the common tail block. The predecessor block only calls addLiveOuts which add the pristine registers and all successors old live-ins to LiveRegs, but function replaceTailWithBranchTo reverse-iterates the MI and call stepBackward. Does mergeCommonTails also need to call stepBackward to update register to LiveRegs?
Since the IMPLICIT_DEF is inserted from the function mergeCommonTails in my case.
Many Thanks!!!