Check that no Q-regs are live out of the loop, unless the instruction within the loop is predicated on the vctp.
Details
Diff Detail
Event Timeline
Please ignore the RDA and BaseInstrInfo changes, I've just copied them from the parent.
Removed most of the code and going for the basic approach first: don't transform when any Q regs are live in the exit blocks.
Just a few nits/questions inlined.
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp | ||
---|---|---|
677 | Nit: perhaps good to mention here for completeness that MVE instructions that e.g. accumulate in GPR registers don't need checking, because tail-predication doesn't impact these values? | |
682 | Just checking: do we have test with only live Q registers in exit blocks that are not defined in the loop? That would perhaps be a bit strange, but why not... | |
llvm/test/CodeGen/Thumb2/LowOverheadLoops/wrong-vctp-operand-liveout.mir | ||
190 | q0 is live out/in, so the change in this patch will prevent TP. Just curious how that is related to the use of r2. I mean, that makes sense, the test is good to have, but is not the result of this patch? |
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp | ||
---|---|---|
682 | Probably not... and now looking at this again, there's nothing here checking that MI is actually producing the liveout value anyway... | |
llvm/test/CodeGen/Thumb2/LowOverheadLoops/wrong-vctp-operand-liveout.mir | ||
190 | Right, this test is to cover the next step when we try to transform these loops. Would be nice to just get it in so I don't forget about it again! |
Hi Sam,
just a comment inlined
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp | ||
---|---|---|
683 | Why not returning false here instead of populating a SmallVector and checking for its emptiness later ? |
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp | ||
---|---|---|
683 | Indeed... I think this needs a rewrite :) |
Rewritten so that we check all the live-out regs during tail predication validation. I've added a couple of extra tests too to check that we allow predicated liveout instructions as well as having live out regs that aren't defined by instructions within the loop body.
A few more comments while I am looking in more detail at the tests now.
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp | ||
---|---|---|
519 | nit: for (const auto &RegMask ..) | |
537 | Looks like we don't need this loop (and insert instructions in LiveMIs), the findFirstVPTPredOperandIdx() can be done directly in the loop above. |
LGTM
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp | ||
---|---|---|
537 | Ok, ignore this one; I understand this is a prep step for a follow up patch. |
nit: for (const auto &RegMask ..)