After expanding the pseudo instructions, update the liveness info. We do this in a post-order traversal of the loop, including its exit blocks and preheader(s). This has been extracted from D71837.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Looks interesting. Thanks for splitting it out, it makes it a lot clearer.
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp | ||
---|---|---|
63 | I believe there is already a PostOrderTraversal, but it deals with whole functions? Maybe call this PostOrderTraversalForLoop, to distinguish it a little? | |
77 | within the block -> within the loop ? | |
llvm/test/CodeGen/Thumb2/LowOverheadLoops/disjoint-vcmp.mir | ||
130–131 | Is this r7 really dead? | |
llvm/test/CodeGen/Thumb2/LowOverheadLoops/mov-lr-terminator.mir | ||
111 | Is r4 really dead here? It's "returned" (kind of) right? | |
llvm/test/CodeGen/Thumb2/LowOverheadLoops/while.mir | ||
4–5 | It can be good to update these checks in separate NFC patched so the diffs are clearly visible here. |
llvm/test/CodeGen/Thumb2/LowOverheadLoops/disjoint-vcmp.mir | ||
---|---|---|
130–131 | Yeah, this is a weird one, it definitely doesn't look dead. I'll look more into CFI_INSTRUCTION. | |
llvm/test/CodeGen/Thumb2/LowOverheadLoops/mov-lr-terminator.mir | ||
111 | I reasoned that it's dead as far as liveness goes for the function... I'll have a look to see if there's some special handling or hooks for return values. |
llvm/test/CodeGen/Thumb2/LowOverheadLoops/disjoint-vcmp.mir | ||
---|---|---|
130–131 | So this looks okay to me, just because the input code labels tPUSH as killing its operands even though the CFI pseudo instructions use them. LiveRegs works on MachineInstrs, whereas CFI is some pseudo MC instruction which doesn't even contain MC operands. |
MachineBasicBlock has now been modified to iterate through the block to look for a return instruction. This then allows LivePhysRegs to add the callee saved regs for blocks where the return isn't the last instruction, such as when we have a POP in an IT block. This currently has the unwanted effect of meaning we can't remove all the instructions from the loop iteration count, see mve-tail-data-types where a callee saved register is used by the moves, but I think this should be resolved in D71837.
llvm/include/llvm/CodeGen/MachineBasicBlock.h | ||
---|---|---|
577 ↗ | (On Diff #236311) | This doesn't match the description above. isReturnBlock seems to be used in a number of other places too. Would they all be OK with this change? The one in addLiveOuts seems to mark the CSR's as live out of the block, with isn't really true. They are live out of POP the instruction. The live out's of the block are not modified. Can we modify recomputeLivenessFlags to consider isReturn instructions instead? |
llvm/test/CodeGen/Thumb2/LowOverheadLoops/disjoint-vcmp.mir | ||
130–131 | Perhaps put another way, what does an instruction with a dead def and no side-effects mean? Does it mean we can remove it? What about the frame-setup flag? Does that mean we can't remove it? The recomputeLivenessFlags seems to already be used in ifcvt. Does this already happen in the same way there? Looking at tADDrSPi, it might have hasSideEffects set, because it otherwise causes miscompiles. Which might suggest that this isn't correct, but that it's a known issue at least. |
The remaining instruction aren't actually removed in the next patch... because IsSafeToRemove isn't used for the setup code yet as it needs to operate over multiple blocks. This is something we're probably just going to need to live with for now.
llvm/include/llvm/CodeGen/MachineBasicBlock.h | ||
---|---|---|
577 ↗ | (On Diff #236311) | The rest of code base seems ok with the change, but yes, trying it in LivePhysRegs sounds better, I'll give it a go. |
llvm/test/CodeGen/Thumb2/LowOverheadLoops/disjoint-vcmp.mir | ||
---|---|---|
130–131 | I went looking in IfCvt to see if I could get this to produce the same thing (it's the other place that uses this function, and does seem to run it on entry blocks after frame lowering). I couldn't see any examples of it marking registers as dead though. Then I tried regenerating this function (from the IR above) and it was quite different. I think mostly because the -frame-pointer option has changed. With -frame-pointer=all the code is more similar, but the PUSH above is saving r7 (which kind of makes sense, it's pushed two lines up). Any chance this file might have been modified to the point that it is not really valid any more? |
llvm/test/CodeGen/Thumb2/LowOverheadLoops/disjoint-vcmp.mir | ||
---|---|---|
130–131 | I would say that an instruction with dead defs and no side-effects can be removed, so there's definitely an issue with the tADDrSPi... but I also wondered the same thing after not being able to re-generate the test too. |
I had to revert this due to buildbot failures. It seems the print order of liveins is non-deterministic... D72441 hopefully addresses this.
I believe there is already a PostOrderTraversal, but it deals with whole functions?
Maybe call this PostOrderTraversalForLoop, to distinguish it a little?