Page MenuHomePhabricator

[ARM][LowOverheadLoops] Handle reductions

Authored by samparker on Mar 3 2020, 8:34 AM.



While validating live-out values, record instructions that look like a reduction. This will comprise of a vector op (for now only vadd), a vorr (vmov) which store the previous value of vadd and then a vpsel in the exit block which is predicated upon a vctp. This vctp will combine the last two iterations using the vmov and vadd into a vector which can then be consumed by a vaddv.
Once we have determined that it's safe to perform tail-predication, we need to change this sequence of instructions so that the predication doesn't produce incorrect code. This involves changing the register allocation of the vadd so it updates itself and the predication on the final iteration will not update the falsely predicated lanes. This mimics what the vmov, vctp and vpsel do and so we then don't need any of those instructions.

Diff Detail

Event Timeline

samparker created this revision.Mar 3 2020, 8:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 3 2020, 8:34 AM

Big patch: this is just a first scan of the code, and a first round of nits. Now going to look again, to let things sink in.


you renamed this Defs to Incoming...


So rename this one too?


and here too?


Can you comment what these members are? I agree that most are self-explanatory, but I am e.g. interested in Init, and if it e.g. can be null (there is a check in the fixup function), and what the meaning is of that.


nit: perhaps an assert that VPSEL is a vpsel would be good.


nit, I think nicer to read is:

MO.getReg() == 0



Nit: just a bit shorter would be:

for (auto &MO : MI->uses()) {
  if (MO.isImm() && MO.getImm() == Imm)
     return true;
return false;

nit: was just curious why we expect the first item in the set to be the vpsel. Can we rely on that with a set?


Could this be a good candidate for a helper function in ARMBaseInstrInfo.h?


I guess you mean VMOV can be an alias for VORR, which you're checking here?


Can or should this not be checked much earlier?

samparker marked 5 inline comments as done.Mar 16 2020, 8:28 AM
samparker added inline comments.

'Init' is the possible instruction that maybe initialising our result register, such as a mov #0, but we won't necessarily have an instruction doing this.


Because the set only has one member if we get here.


I'm not sure how relevant it is to the rest of the backend, but it would be more readable here as a local helper - especially once we add more supported opcodes.


Yes, I don't think we actually have a MVE VMOV instruction.


Yes, at some point in an unknown patch. I've moved it into one of the legality helpers.

samparker updated this revision to Diff 274072.Jun 29 2020, 5:33 AM

Rebased, which has made checking the VPSEL predicate a much more simple task.

SjoerdMeijer added inline comments.Jun 29 2020, 8:26 AM

typo: that


Perhaps it's good to add an comment here or in the description of the algorithm on line 733 - 753 that reductions need special treatment as they define values that are not used by predicated instructions inside the loop.


Do we have a test case with more than 1 reduction?

samparker updated this revision to Diff 274438.Jun 30 2020, 5:55 AM
  • Rebased after adding tests in reductions.ll.
  • Re-added support for VADDi8 and VADDi16.
  • Added some extra comments and TODOs.
SjoerdMeijer accepted this revision.Jun 30 2020, 8:39 AM

Thanks, nice optimisation.


is it not const anymore?


Do we have a (negative) test case with a float reduction?


nit: perhaps move this comment down a bit...


...and check this earlier.

This revision is now accepted and ready to land.Jun 30 2020, 8:39 AM
samparker marked an inline comment as done.Jul 1 2020, 12:24 AM
samparker added inline comments.

The vectorizer doesn't seem to want to produce vector float reductions loops, so I've left them out on testing.

This revision was automatically updated to reflect the committed changes.