Page MenuHomePhabricator

Insert IMPLICIT_DEFS for undef uses in tail merging
ClosedPublic

Authored by MatzeB on Aug 22 2017, 3:58 PM.

Details

Summary

Tail merging can convert an undef use into a normal one when creating a
common tail. Doing so can make the register live out from a block which
previously contained the undef use. To keep the liveness up-to-date,
insert IMPLICIT_DEFs in such blocks when necessary.

To enable this patch the computeLiveIns() function which used to
compute live-ins for a block and set them immediately is split into new
functions:

  • computeLiveIns() just computes the live-ins in a LivePhysRegs set.
  • addLiveIns() applies the live-ins to a block live-in list.
  • computeAndAddLiveIns() is a convenience function combining the other two functions and behaving like computeLiveIns() before this patch.

Based on a patch by Krzysztof Parzyszek <kparzysz@codeaurora.org>

Diff Detail

Repository
rL LLVM

Event Timeline

MatzeB created this revision.Aug 22 2017, 3:58 PM
kparzysz added inline comments.Aug 23 2017, 10:01 AM
lib/CodeGen/BranchFolding.cpp
379 ↗(On Diff #112247)

This needs to calculate liveouts with the tail removed.

This testcase still fails:

# RUN: llc -march=hexagon -run-pass branch-folder %s -o - 
---
name: fred
tracksRegLiveness: true

body: |
  bb.0:
    successors: %bb.1, %bb.2
      J2_jumpt undef %p0, %bb.2, implicit-def %pc
      J2_jump %bb.1, implicit-def %pc

  bb.1:
    successors: %bb.3
      %r1 = A2_tfrsi 1
      PS_storerhabs 0, undef %r0
      %r0 = A2_tfrsi 1
      J2_jump %bb.3, implicit-def %pc

  bb.2:
    successors: %bb.3
      %r0 = L2_loadruh_io undef %r1, 0
      PS_storerhabs 0, killed %r0
      %r0 = A2_tfrsi 1
      J2_jump %bb.3, implicit-def %pc

  bb.3:
      PS_jmpret killed %r31, implicit undef %r0, implicit-def %pc
...

The output is

body:             |
  bb.0:
    successors: %bb.1(0x40000000), %bb.2(0x40000000)
  
    J2_jumpt undef %p0, %bb.2, implicit-def %pc
  
  bb.1:
    successors: %bb.3(0x80000000)
  
    %r1 = A2_tfrsi 1
    J2_jump %bb.3, implicit-def %pc
  
  bb.2:
    successors: %bb.3(0x80000000)
  
    %r0 = L2_loadruh_io undef %r1, 0
  
  bb.3:
    liveins: %r0
  
    PS_storerhabs 0, killed %r0
    %r0 = A2_tfrsi 1
    PS_jmpret killed %r31, implicit undef %r0, implicit-def %pc

...

The path bb.0 -> bb.1 -> bb.3 does not define r0, but r0 is a live-in to bb.3.

kparzysz added inline comments.Aug 23 2017, 10:17 AM
lib/CodeGen/BranchFolding.cpp
379 ↗(On Diff #112247)

In the original testcase, one of the blocks became the common tail. In this testcase, r1=... was added to that block to force a fresh block to be created for the tail. Then r0=... was added to both source blocks, but at the end so that it would be a part of the common tail. Because of how the live-outs in the code are calculated, this would prevent implicit defs from being added.

I had to change the order of the blocks to trigger the problem, since the traversal order for the predecessors may avoid exposing it.

kparzysz edited edge metadata.Aug 31 2017, 1:28 PM

Are there any new developments?

Are there any new developments?

I should be able to get another stab at this today or tomorrow.

MatzeB updated this revision to Diff 113913.Sep 5 2017, 2:29 PM

Rework branchfolding code to behave like this:

  • Make sure all the undef flags are merged
  • Compute new live-ins to NewDest (but do not set them yet)
  • Insert IMP_DEFS into NewDest predecessors where LiveOuts mismatch the new live-ins
  • Set new live-in of NewDest
  • For all other blocks with common tail:
    • Compute liveness at the position where we will cut the common tail and place the branch.
    • Insert IMP_DEFS as necessary where liveness mismatches live-ins of NewDest
    • Replace common tail with branch
kparzysz accepted this revision.Sep 6 2017, 9:18 AM

This is great. Thanks!

This revision is now accepted and ready to land.Sep 6 2017, 9:18 AM
This revision was automatically updated to reflect the committed changes.

Hi Matthias,

I have stumbled upon a problem related to this change while working on an out-of-tree target.
It reproduces with Hexagon too.

Is the insertion of the IMPLICIT_DEF legal if one of the two terminators of %bb.0 uses $r0 ?
(Here J2_jumpr uses $r0)

Consider this case:

---
name: func0
tracksRegLiveness: true

body: |
  bb.0:
    liveins: $r0, $r31
    successors: %bb.2
      J2_jumpt undef $p0, %bb.2, implicit-def $pc
      J2_jumpr killed $r0, implicit-def $pc

  bb.1:
    liveins: $r31
    successors: %bb.3
      $r0 = L2_loadruh_io undef $r1, 0
      PS_storerhabs 0, killed $r0
      J2_jump %bb.3, implicit-def $pc

  bb.2:
    liveins: $r31
    successors: %bb.3
      PS_storerhabs 0, undef $r0
      J2_jump %bb.3, implicit-def $pc

  bb.3:
    liveins: $r31
      PS_jmpret killed $r31, implicit-def $pc
...
---

Running llc -march=hexagon -run-pass branch-folder gives:

body:             |
  bb.0:
    successors: %bb.1(0x80000000)
    liveins: $r0, $r31

    $r0 = IMPLICIT_DEF
    J2_jumpt undef $p0, %bb.1, implicit-def $pc
    J2_jumpr killed $r0, implicit-def $pc

  bb.1:
    liveins: $r0

    PS_storerhabs 0, $r0
    PS_jmpret killed $r31, implicit-def $pc
...

Thus making $r0 undefined in bb.0, when originally it was not.

Thanks,
Alberto