While validating live-out values, record instructions that look like a reduction. This will comprise of a vector op (for now only vadd), a vorr (vmov) which store the previous value of vadd and then a vpsel in the exit block which is predicated upon a vctp. This vctp will combine the last two iterations using the vmov and vadd into a vector which can then be consumed by a vaddv.
Once we have determined that it's safe to perform tail-predication, we need to change this sequence of instructions so that the predication doesn't produce incorrect code. This involves changing the register allocation of the vadd so it updates itself and the predication on the final iteration will not update the falsely predicated lanes. This mimics what the vmov, vctp and vpsel do and so we then don't need any of those instructions.
Details
- Reviewers
SjoerdMeijer dmgreen - Commits
- rG3ee580d0176f: [ARM][LowOverheadLoops] Handle reductions
Diff Detail
Event Timeline
Big patch: this is just a first scan of the code, and a first round of nits. Now going to look again, to let things sink in.
llvm/include/llvm/CodeGen/ReachingDefAnalysis.h | ||
---|---|---|
191–192 | you renamed this Defs to Incoming... | |
193 | So rename this one too? | |
271 | and here too? | |
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp | ||
205 | Can you comment what these members are? I agree that most are self-explanatory, but I am e.g. interested in Init, and if it e.g. can be null (there is a check in the fixup function), and what the meaning is of that. | |
574 | nit: perhaps an assert that VPSEL is a vpsel would be good. | |
594 | nit, I think nicer to read is: MO.getReg() == 0 -> !MO.getReg().isValid() | |
602 | Nit: just a bit shorter would be: for (auto &MO : MI->uses()) { if (MO.isImm() && MO.getImm() == Imm) return true; return false; | |
650 | nit: was just curious why we expect the first item in the set to be the vpsel. Can we rely on that with a set? | |
664 | Could this be a good candidate for a helper function in ARMBaseInstrInfo.h? | |
681 | I guess you mean VMOV can be an alias for VORR, which you're checking here? | |
708 | Can or should this not be checked much earlier? |
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp | ||
---|---|---|
205 | 'Init' is the possible instruction that maybe initialising our result register, such as a mov #0, but we won't necessarily have an instruction doing this. | |
650 | Because the set only has one member if we get here. | |
664 | I'm not sure how relevant it is to the rest of the backend, but it would be more readable here as a local helper - especially once we add more supported opcodes. | |
681 | Yes, I don't think we actually have a MVE VMOV instruction. | |
708 | Yes, at some point in an unknown patch. I've moved it into one of the legality helpers. |
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp | ||
---|---|---|
684 | typo: that | |
808 | Perhaps it's good to add an comment here or in the description of the algorithm on line 733 - 753 that reductions need special treatment as they define values that are not used by predicated instructions inside the loop. | |
1336 | Do we have a test case with more than 1 reduction? |
- Rebased after adding tests in reductions.ll.
- Re-added support for VADDi8 and VADDi16.
- Added some extra comments and TODOs.
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp | ||
---|---|---|
693 | The vectorizer doesn't seem to want to produce vector float reductions loops, so I've left them out on testing. |
you renamed this Defs to Incoming...