This is an archive of the discontinued LLVM Phabricator instance.

[ARM][LowOverheadLoops] Update liveness info
ClosedPublic

Authored by samparker on Jan 3 2020, 1:49 AM.

Details

Summary

After expanding the pseudo instructions, update the liveness info. We do this in a post-order traversal of the loop, including its exit blocks and preheader(s). This has been extracted from D71837.

Diff Detail

Event Timeline

samparker created this revision.Jan 3 2020, 1:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 3 2020, 1:49 AM

Looks interesting. Thanks for splitting it out, it makes it a lot clearer.

llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
63

I believe there is already a PostOrderTraversal, but it deals with whole functions?

Maybe call this PostOrderTraversalForLoop, to distinguish it a little?

77

within the block -> within the loop ?

llvm/test/CodeGen/Thumb2/LowOverheadLoops/disjoint-vcmp.mir
130–131

Is this r7 really dead?

llvm/test/CodeGen/Thumb2/LowOverheadLoops/mov-lr-terminator.mir
111

Is r4 really dead here? It's "returned" (kind of) right?

llvm/test/CodeGen/Thumb2/LowOverheadLoops/while.mir
4–5

It can be good to update these checks in separate NFC patched so the diffs are clearly visible here.

samparker marked 2 inline comments as done.Jan 3 2020, 4:59 AM
samparker added inline comments.
llvm/test/CodeGen/Thumb2/LowOverheadLoops/disjoint-vcmp.mir
130–131

Yeah, this is a weird one, it definitely doesn't look dead. I'll look more into CFI_INSTRUCTION.

llvm/test/CodeGen/Thumb2/LowOverheadLoops/mov-lr-terminator.mir
111

I reasoned that it's dead as far as liveness goes for the function... I'll have a look to see if there's some special handling or hooks for return values.

samparker updated this revision to Diff 236055.Jan 3 2020, 6:58 AM
  • Rebased.
  • Renamed class to PostOrderLoopTraversal.
samparker marked an inline comment as done.Jan 3 2020, 8:58 AM
samparker added inline comments.
llvm/test/CodeGen/Thumb2/LowOverheadLoops/disjoint-vcmp.mir
130–131

So this looks okay to me, just because the input code labels tPUSH as killing its operands even though the CFI pseudo instructions use them. LiveRegs works on MachineInstrs, whereas CFI is some pseudo MC instruction which doesn't even contain MC operands.

samparker updated this revision to Diff 236311.Jan 6 2020, 1:39 AM

MachineBasicBlock has now been modified to iterate through the block to look for a return instruction. This then allows LivePhysRegs to add the callee saved regs for blocks where the return isn't the last instruction, such as when we have a POP in an IT block. This currently has the unwanted effect of meaning we can't remove all the instructions from the loop iteration count, see mve-tail-data-types where a callee saved register is used by the moves, but I think this should be resolved in D71837.

dmgreen added inline comments.Jan 6 2020, 2:20 AM
llvm/include/llvm/CodeGen/MachineBasicBlock.h
577 ↗(On Diff #236311)

This doesn't match the description above.

isReturnBlock seems to be used in a number of other places too. Would they all be OK with this change?

The one in addLiveOuts seems to mark the CSR's as live out of the block, with isn't really true. They are live out of POP the instruction. The live out's of the block are not modified. Can we modify recomputeLivenessFlags to consider isReturn instructions instead?

llvm/test/CodeGen/Thumb2/LowOverheadLoops/disjoint-vcmp.mir
130–131

Perhaps put another way, what does an instruction with a dead def and no side-effects mean? Does it mean we can remove it? What about the frame-setup flag? Does that mean we can't remove it?

The recomputeLivenessFlags seems to already be used in ifcvt. Does this already happen in the same way there? Looking at tADDrSPi, it might have hasSideEffects set, because it otherwise causes miscompiles. Which might suggest that this isn't correct, but that it's a known issue at least.

The remaining instruction aren't actually removed in the next patch... because IsSafeToRemove isn't used for the setup code yet as it needs to operate over multiple blocks. This is something we're probably just going to need to live with for now.

samparker marked an inline comment as done.Jan 6 2020, 2:31 AM
samparker added inline comments.
llvm/include/llvm/CodeGen/MachineBasicBlock.h
577 ↗(On Diff #236311)

The rest of code base seems ok with the change, but yes, trying it in LivePhysRegs sounds better, I'll give it a go.

samparker updated this revision to Diff 236318.Jan 6 2020, 2:51 AM

Modified recomputeLivenessFlags to look at return instructions instead.

dmgreen added inline comments.Jan 7 2020, 9:33 AM
llvm/test/CodeGen/Thumb2/LowOverheadLoops/disjoint-vcmp.mir
130–131

I went looking in IfCvt to see if I could get this to produce the same thing (it's the other place that uses this function, and does seem to run it on entry blocks after frame lowering). I couldn't see any examples of it marking registers as dead though.

Then I tried regenerating this function (from the IR above) and it was quite different. I think mostly because the -frame-pointer option has changed. With -frame-pointer=all the code is more similar, but the PUSH above is saving r7 (which kind of makes sense, it's pushed two lines up).

Any chance this file might have been modified to the point that it is not really valid any more?

samparker marked an inline comment as done.Jan 7 2020, 11:47 PM
samparker added inline comments.
llvm/test/CodeGen/Thumb2/LowOverheadLoops/disjoint-vcmp.mir
130–131

I would say that an instruction with dead defs and no side-effects can be removed, so there's definitely an issue with the tADDrSPi... but I also wondered the same thing after not being able to re-generate the test too.

samparker updated this revision to Diff 236771.Jan 8 2020, 12:15 AM

Regenerated the disjoint vcmp test.

dmgreen accepted this revision.Jan 8 2020, 12:46 AM

LGTM then. Thanks.

This revision is now accepted and ready to land.Jan 8 2020, 12:46 AM
This revision was automatically updated to reflect the committed changes.
samparker reopened this revision.Jan 9 2020, 5:01 AM

I had to revert this due to buildbot failures. It seems the print order of liveins is non-deterministic... D72441 hopefully addresses this.

This revision is now accepted and ready to land.Jan 9 2020, 5:01 AM
This revision was automatically updated to reflect the committed changes.