This is an archive of the discontinued LLVM Phabricator instance.

[ARM][LowOverheadLoops] Handle reductions
ClosedPublic

Authored by samparker on Mar 3 2020, 8:34 AM.

Details

Summary

While validating live-out values, record instructions that look like a reduction. This will comprise of a vector op (for now only vadd), a vorr (vmov) which store the previous value of vadd and then a vpsel in the exit block which is predicated upon a vctp. This vctp will combine the last two iterations using the vmov and vadd into a vector which can then be consumed by a vaddv.
Once we have determined that it's safe to perform tail-predication, we need to change this sequence of instructions so that the predication doesn't produce incorrect code. This involves changing the register allocation of the vadd so it updates itself and the predication on the final iteration will not update the falsely predicated lanes. This mimics what the vmov, vctp and vpsel do and so we then don't need any of those instructions.

Diff Detail

Event Timeline

samparker created this revision.Mar 3 2020, 8:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 3 2020, 8:34 AM

Big patch: this is just a first scan of the code, and a first round of nits. Now going to look again, to let things sink in.

llvm/include/llvm/CodeGen/ReachingDefAnalysis.h
155

you renamed this Defs to Incoming...

157

So rename this one too?

233

and here too?

llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
179

Can you comment what these members are? I agree that most are self-explanatory, but I am e.g. interested in Init, and if it e.g. can be null (there is a check in the fixup function), and what the meaning is of that.

554

nit: perhaps an assert that VPSEL is a vpsel would be good.

574

nit, I think nicer to read is:

MO.getReg() == 0

->

!MO.getReg().isValid()
582

Nit: just a bit shorter would be:

for (auto &MO : MI->uses()) {
  if (MO.isImm() && MO.getImm() == Imm)
     return true;
return false;
630

nit: was just curious why we expect the first item in the set to be the vpsel. Can we rely on that with a set?

644

Could this be a good candidate for a helper function in ARMBaseInstrInfo.h?

661

I guess you mean VMOV can be an alias for VORR, which you're checking here?

688

Can or should this not be checked much earlier?

samparker marked 5 inline comments as done.Mar 16 2020, 8:28 AM
samparker added inline comments.
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
179

'Init' is the possible instruction that maybe initialising our result register, such as a mov #0, but we won't necessarily have an instruction doing this.

630

Because the set only has one member if we get here.

644

I'm not sure how relevant it is to the rest of the backend, but it would be more readable here as a local helper - especially once we add more supported opcodes.

661

Yes, I don't think we actually have a MVE VMOV instruction.

688

Yes, at some point in an unknown patch. I've moved it into one of the legality helpers.

samparker updated this revision to Diff 274072.Jun 29 2020, 5:33 AM

Rebased, which has made checking the VPSEL predicate a much more simple task.

SjoerdMeijer added inline comments.Jun 29 2020, 8:26 AM
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
768

typo: that

803

Perhaps it's good to add an comment here or in the description of the algorithm on line 733 - 753 that reductions need special treatment as they define values that are not used by predicated instructions inside the loop.

1282

Do we have a test case with more than 1 reduction?

samparker updated this revision to Diff 274438.Jun 30 2020, 5:55 AM
  • Rebased after adding tests in reductions.ll.
  • Re-added support for VADDi8 and VADDi16.
  • Added some extra comments and TODOs.
SjoerdMeijer accepted this revision.Jun 30 2020, 8:39 AM

Thanks, nice optimisation.

llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
250–254

is it not const anymore?

777

Do we have a (negative) test case with a float reduction?

821

nit: perhaps move this comment down a bit...

823

...and check this earlier.

This revision is now accepted and ready to land.Jun 30 2020, 8:39 AM
samparker marked an inline comment as done.Jul 1 2020, 12:24 AM
samparker added inline comments.
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
777

The vectorizer doesn't seem to want to produce vector float reductions loops, so I've left them out on testing.

This revision was automatically updated to reflect the committed changes.