Following on from D87757 "[SplitKit] Only copy live lanes", in
SplitEditor::addDeadDef, when we're checking whether the parent live
interval has a subrange defining the same lanes, tolerate the case
where the parent subrange defines a superset of the lanes. This can
happen when the child subrange comes from SplitEditor::buildCopy
decomposing a partial copy into a sequence of subreg copies that cover
the required lanes.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
In the test case, RAGreedy::tryLocalSplit splits %72 at 7376r. First we have this:
%72 [1072r,7376r:0)[7376r,7440r:1) 0@1072r 1@7376r L000000000000000C [1072r,1072d:0)[7376r,7440r:1) 0@1072r 1@7376r L00000000000000F3 [1072r,7440r:0) 0@1072r weight:5.969267e-04 ... 1072B %72:sgpr_128 = COPY %71:sgpr_128 ... 7376B %72.sub1:sgpr_128 = COPY %912:sreg_32_xm0_xexec 7440B %558:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM %72:sgpr_128, 0, 0, 0 :: (dereferenceable invariant load 4)
Then SplitEditor::enterIntvBefore creates %914 and inserts the bundled copies (annoyingly the SlotIndexes get renumbered at this point):
%72 [1072r,7388r:0)[7388r,7452r:1) 0@1072r 1@7388r L000000000000000C [1072r,1072d:0)[7388r,7452r:1) 0@1072r 1@7388r L00000000000000F3 [1072r,7452r:0) 0@1072r weight:5.969267e-04 %914 EMPTY 0@7380r L00000000000000F0 [7380r,7380d:0) 0@7380r L0000000000000003 [7380r,7380d:0) 0@7380r L000000000000000C EMPTY weight:0.000000e+00 ... 1072B %72:sgpr_128 = COPY %71:sgpr_128 ... 7380B undef %914.sub2_sub3:sgpr_128 = COPY %72.sub2_sub3:sgpr_128 { internal %914.sub0:sgpr_128 = COPY %72.sub0:sgpr_128 7388B } %72.sub1:sgpr_128 = COPY %912:sreg_32_xm0_xexec 7452B %558:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM %72:sgpr_128, 0, 0, 0 :: (dereferenceable invariant load 4)
Then SplitEditor::finish tries to add dead defs for %914 LF0 and L03 but fails to find matching lane masks in the original live ranges for %72 -- which just had a single lane mask LF3.
As an alternative fix, I wonder if something should have spotted that the live ranges for %914:
%914 EMPTY 0@7380r L00000000000000F0 [7380r,7380d:0) 0@7380r L0000000000000003 [7380r,7380d:0) 0@7380r L000000000000000C EMPTY weight:0.000000e+00
should be simplified to:
%914 EMPTY 0@7380r L00000000000000F3 [7380r,7380d:0) 0@7380r L000000000000000C EMPTY weight:0.000000e+00
since the subranges for LF0 and L03 are identical.
I don't think it's safe to add dead defs to the extra lanes. You could split the superset into the matching part and the remaining part, then only operate on the matching one.
I don't understand. We're not adding dead defs for any "extra" lanes. The parent had a subrange with lane mask LF3. The child has two subranges with lane masks LF0 and L03, so together they cover exactly the same lanes.