This is an archive of the discontinued LLVM Phabricator instance.

[SplitKit] In addDeadDef tolerate parent range that defines more lanes
ClosedPublic

Authored by foad on Sep 21 2020, 5:14 AM.

Details

Summary

Following on from D87757 "[SplitKit] Only copy live lanes", in
SplitEditor::addDeadDef, when we're checking whether the parent live
interval has a subrange defining the same lanes, tolerate the case
where the parent subrange defines a superset of the lanes. This can
happen when the child subrange comes from SplitEditor::buildCopy
decomposing a partial copy into a sequence of subreg copies that cover
the required lanes.

Diff Detail

Event Timeline

foad created this revision.Sep 21 2020, 5:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 21 2020, 5:14 AM
foad requested review of this revision.Sep 21 2020, 5:14 AM
foad added a comment.EditedSep 21 2020, 5:28 AM

In the test case, RAGreedy::tryLocalSplit splits %72 at 7376r. First we have this:

%72 [1072r,7376r:0)[7376r,7440r:1)  0@1072r 1@7376r L000000000000000C [1072r,1072d:0)[7376r,7440r:1)  0@1072r 1@7376r L00000000000000F3 [1072r,7440r:0)  0@1072r weight:5.969267e-04
...
1072B	  %72:sgpr_128 = COPY %71:sgpr_128
...
7376B	  %72.sub1:sgpr_128 = COPY %912:sreg_32_xm0_xexec
7440B	  %558:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM %72:sgpr_128, 0, 0, 0 :: (dereferenceable invariant load 4)

Then SplitEditor::enterIntvBefore creates %914 and inserts the bundled copies (annoyingly the SlotIndexes get renumbered at this point):

%72 [1072r,7388r:0)[7388r,7452r:1)  0@1072r 1@7388r L000000000000000C [1072r,1072d:0)[7388r,7452r:1)  0@1072r 1@7388r L00000000000000F3 [1072r,7452r:0)  0@1072r weight:5.969267e-04
%914 EMPTY  0@7380r L00000000000000F0 [7380r,7380d:0)  0@7380r L0000000000000003 [7380r,7380d:0)  0@7380r L000000000000000C EMPTY weight:0.000000e+00
...
1072B	  %72:sgpr_128 = COPY %71:sgpr_128
...
7380B	  undef %914.sub2_sub3:sgpr_128 = COPY %72.sub2_sub3:sgpr_128 {
	    internal %914.sub0:sgpr_128 = COPY %72.sub0:sgpr_128
7388B	  }
  %72.sub1:sgpr_128 = COPY %912:sreg_32_xm0_xexec
7452B	  %558:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM %72:sgpr_128, 0, 0, 0 :: (dereferenceable invariant load 4)

Then SplitEditor::finish tries to add dead defs for %914 LF0 and L03 but fails to find matching lane masks in the original live ranges for %72 -- which just had a single lane mask LF3.

As an alternative fix, I wonder if something should have spotted that the live ranges for %914:

%914 EMPTY  0@7380r L00000000000000F0 [7380r,7380d:0)  0@7380r L0000000000000003 [7380r,7380d:0)  0@7380r L000000000000000C EMPTY weight:0.000000e+00

should be simplified to:

%914 EMPTY  0@7380r L00000000000000F3 [7380r,7380d:0)  0@7380r L000000000000000C EMPTY weight:0.000000e+00

since the subranges for LF0 and L03 are identical.

I don't think it's safe to add dead defs to the extra lanes. You could split the superset into the matching part and the remaining part, then only operate on the matching one.

foad added a comment.Sep 24 2020, 1:48 AM

I don't think it's safe to add dead defs to the extra lanes. You could split the superset into the matching part and the remaining part, then only operate on the matching one.

I don't understand. We're not adding dead defs for any "extra" lanes. The parent had a subrange with lane mask LF3. The child has two subranges with lane masks LF0 and L03, so together they cover exactly the same lanes.

kparzysz accepted this revision.Sep 24 2020, 7:15 AM

You're right, it goes in the other direction. Looks ok to me.

This revision is now accepted and ready to land.Sep 24 2020, 7:15 AM
foad added a subscriber: vpykhtin.Sep 25 2020, 3:25 AM