This is an archive of the discontinued LLVM Phabricator instance.

Make sure all subranges are define during re-materialization of constants in RegCoalescer
ClosedPublic

Authored by kariddi on Jan 28 2016, 1:03 PM.

Details

Reviewers
MatzeB
Summary

Hello,

In our target we sometimes load multiple constants with the same instruction into registers that have disjoint subregisters and then use the subregisters separately.

In certain cases during constant re-materialization in the register coalescer the one of these multiple constants load instructions can be selected to rematerialize only one of the constants.

The situation is something like this:

vreg1 = LOAD CONSTANTS 5, 8 ; Loading both 5 and 8 in different subregs
; Copying only part of the register here, but the rest is undef.
vreg2:sub_16bit<def, read-undef> = COPY vreg1:sub_16bit
==>
; Materialize all the constants but only using one eventually.
vreg2 = LOAD_CONSTANTS 5, 8

In the above case we had an undefined registers that gets partially defined by the copy of one of the two constants in VREG1.
The re-materialization substitutes the copy with the load instructions defining the whole register now.
This is fine, because the register was undefined, but if this happens we need to update the sub-register liveness to keep track that there is a dead definition of the lanes that weren't loaded previously.

Diff Detail

Event Timeline

kariddi updated this revision to Diff 46308.Jan 28 2016, 1:03 PM
kariddi retitled this revision from to Make sure all subranges are define during re-materialization of constants in RegCoalescer .
kariddi updated this object.
kariddi added a reviewer: MatzeB.
kariddi set the repository for this revision to rL LLVM.
kariddi added a subscriber: llvm-commits.
arsenm added a subscriber: arsenm.Jan 28 2016, 2:21 PM
arsenm added inline comments.
lib/CodeGen/MachineVerifier.cpp
1718–1719

Random line break

Can't make it trigger with the provided LL files sadly :-/ A test case for a public backend would have been very welcome!

lib/CodeGen/MachineVerifier.cpp
1718–1719

Thanks!

MatzeB edited edge metadata.Jan 29 2016, 7:54 PM

Thanks for working on this.

The MachineVerifier enhancement could be split into a separate commit and is good to go with the issues below addressed. It could probably be extended to handle use operands as well but that can be done in a separate commit.

I will need more time to review the register coalescer code. I'm not convinced yet that the unrelated subregisters that lack the liveness segments are always free, if they are not then we just cannot do the rematerialisation.

lib/CodeGen/MachineVerifier.cpp
1721
  • MRI->def_end() is shorter and looks more uniform.
  • I'd welcome a MachineRegisterInfo::def_iterator instead of the auto because a reader cannot easily deduce the type here.

Alternatively you could use this:

for (const MachineOperand &MO : make_range(MRI->def_begin(Reg), MRI->def_end())) { ... }
1727

LaneBitmask LiveMask = 0;

1728

LiveInterval::SubRange instead of auto is friendlier for the reader.

1734–1737

I think you can make this test more powerful if you leave out the if (MO.getSubReg() == 0) above and instead do something like this here:

LaneBitmask MOMask = TRI->getSubRegIndexLaneMask(MO.getSubReg());
if ( (LiveMask & MOMask) != MOMask) { report(...); }
lib/CodeGen/RegisterCoalescer.cpp
1016

I think you can leave out this check, the DstInt.hasSubRanges() can only happen if this is enabled anyway.

1020

LaneBitmask MaxMask = ...

1021

see above.

1022–1024

Do not use {} here (llvm coding convention)

1030

I think this should be

SlotIndex DefIndex = CurrIdx.getRegSlot(NewMI->getOperand(0).isEarlyClobber());
SR->createDeadDef(DefIndex, ...);

(even though I don't think any of the GPU targets with subregisters use early clobbers).

Thanks Mathias for all the comments!

I'm working on the new patch with your suggestions addressed.

The final commit would be split in two (fix + verifier).

lib/CodeGen/MachineVerifier.cpp
1734–1737

Actually this doesn't seem to be equivalent to what was there before. If getSubReg() returns 0 I get the mask ~0U from getSubRegIndexLaneMask

Would that work if I made that:

LaneBitmask MOMask = TRI->getSubRegIndexLaneMask(MO.getSubReg()) & MaxMask;
if ((LiveMask & MOMask) != MOMask) {
  report("Lane mask defined but no sub range reports its definition",
         MF);
}
kariddi updated this revision to Diff 46687.Feb 2 2016, 12:02 PM
kariddi edited edge metadata.
kariddi removed rL LLVM as the repository for this revision.

Updated patch with comments from Mathias.

I know he is working on additional verifier changes, so those could be split away from this patch

kariddi set the repository for this revision to rL LLVM.Feb 2 2016, 12:06 PM
MatzeB accepted this revision.Feb 2 2016, 1:20 PM
MatzeB edited edge metadata.

I've looked at the rematerialization code am convinced now that we can safely rematerialize a full vreg def even if we only use parts of it later.

So this LGTM with the comments below addressed.

lib/CodeGen/RegisterCoalescer.cpp
1021

You should move the DefIndex definition (from below) up and use it everywhere where you currently use CurrIdx.getRegSlot().

1027–1028

LIS->getVNInfoAllocator() is repeated 3 times and could be pulled into a variable.

1030

This is a 2nd createDeadDef here, you only need one (at DefIndex)!

This revision is now accepted and ready to land.Feb 2 2016, 1:20 PM

Thanks Mathias :-)

lib/CodeGen/RegisterCoalescer.cpp
1030

Ouch, copy pasting

kariddi closed this revision.Feb 2 2016, 4:31 PM

Committed revision 259614.