This is an archive of the discontinued LLVM Phabricator instance.

[ARM][MVE] Tail-Predication: rematerialise iteration count in exit blocks
ClosedPublic

Authored by SjoerdMeijer on Jan 14 2020, 9:01 AM.

Details

Summary

This patch uses helper function rewriteLoopExitValues that is refactored in D72602 to rematerialise the iteration count in exit blocks, so that we can clean-up loop update expressions inside the hardware-loops later in ARMLowOverheadLoops.

Diff Detail

Event Timeline

SjoerdMeijer created this revision.Jan 14 2020, 9:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 14 2020, 9:01 AM

I had not realised that this required LCSSA form. That will likely cause many more changed than you really want. It's probably best not to create LCSSA form here without destroying it afterwards. Do you know if LoopSimpleForm is needed too? It looks from the code that it shouldn't (it can handle multiple exits).

You might be better off only creating LCSSA form for the loops that really need it. Otherwise I imagine a lot of tests change. There is formLCSSA and formLCSSARecursively. The first I one is probably fine for individual loops. I vaguely remember something that will go through and remove the one operand phis too, but I'm not sure what that is called (other than instcombine).

llvm/lib/Target/ARM/MVETailPredication.cpp
593

It's probably better to be explicit about what is needed and preserved here. I don't think LoopSimpleForm would be needed, and you can probably get away with not requiring LCSSA either.

I vaguely remember something that will go through and remove the one operand phis too, but I'm not sure what that is called (other than instcombine).

I'm hoping that this close to codegen, the phis won't be a problem..?

llvm/lib/Target/ARM/MVETailPredication.cpp
61

Doesn't look like these need to be members.

samparker added inline comments.Jan 15 2020, 12:32 AM
llvm/lib/Target/ARM/MVETailPredication.cpp
426

And we should be able to just insert a phi here, with the vctp operand as the operand to the phi and the vctp then using the phi.

Thanks for sharing your thoughts!

So I had also not realised that this requires LCSSA, and for a bit more context: in my first experiment, I was running the whole IndVarSimplify pass directly after MVETailPredicationPass, which takes care of bringing it in LCSSA form as this is a required pre-condition for a few things. The more focused approach, only using rewriteLoopExitValues() as a helper which is what this patch is about, and not running IndVarSimplify, this LCSSA assert was triggered.

I will now look into this, and thanks for your suggestions: if it is just calling formLCSSA() that would be ideal, but if that is a bit overkill I will just insert the Phi.

SjoerdMeijer retitled this revision from [ARM][MVE] WIP: Tail-Predication: rematerialise iteration count in exit blocks to [ARM][MVE] Tail-Predication: rematerialise iteration count in exit blocks.
SjoerdMeijer edited the summary of this revision. (Show Details)

I have removed WIP and TODO from the patch subject/description, as this revision now demonstrates I think what this patch should be doing:

  • I have removed LoopPass from this, as this indeed causes a few more passes to run, which we don't need, and causes (unrelated) changes elsewhere. Now, the only test changes are in the LowOverheadLoop tests, which is what we want.
  • I use formLCSSA() to transform the loop into LCSSA form if it isn't.

This gives you a change to already look at the codegen changes and differences, while I will probably need to add some more opt tail-predication tests for this. The codegen changes demonstrate the benefit from moving loop update instruction out of the loop, at the cost of setting up the count in the preheader.

Good to see those looks finally loosing those subs and movs!

llvm/lib/Target/ARM/MVETailPredication.cpp
142

Do we know if the loop entering the pass are in LCSSA already? If so, it seems a little excessive to call this when we only have one instruction to fix up...

219

I'm pretty sure you only want to do this when you've cloned the vctp in an exit block.

llvm/test/CodeGen/Thumb2/LowOverheadLoops/mve-tail-data-types.ll
983–984

These changes seem unnecessary.

dmgreen added inline comments.Jan 15 2020, 8:46 AM
llvm/lib/Target/ARM/MVETailPredication.cpp
142

Do you need to test this? Will formLCSSA either just work or not by calling it?

I just noticed an EXPENSIVE_CHECKS in formLCSSA that checks if the subloops are already in LCSSA form. Maybe use formLCSSARecursively instead?

145–146

Space after the ","

148

Does formLCSSA return false on failure, or nothing changed?

I think this should cover it now, and address the comments:

  • RematerializeIterCount() is only called when we've cloned the VCTP instruction in an exit block, this makes the unrelated test case go away.
  • RematerializeIterCount() has become really simple with just a call to formLCSSARecursively() (and of course rewriteLoopExitValues()) because when there's nothing to do, formLCSSARecursively is a cheap operation, so we indeed don't need to check first if the loop is in LCSSA form.
  • One of the motivating examples for this work and optimisation is a reduction in vector-reduce-mve-tail.ll. I decided to use `formLCSSARecursively, instead of manually inserting a phi, because there are a few values live-out, the ones that feed the final reduction. We need to insert 3 Phi nodes, thus I don't think it is worth optimising and calling formLCSSARecursively is best.
  • This is tested by exiting tests. For example, the reductions vector-reduce-mve-tail.ll will bring it in LCSSA form first, and is testing that. Simpler tests elsewhere check when this doesn't need to happen first.
samparker accepted this revision.Jan 17 2020, 1:18 AM

LGTM, but please address nits before committing.

llvm/lib/Target/ARM/MVETailPredication.cpp
137

Nit: L is already a member, so it doesn't need to be passed.

582

Nit: Just return a bool from Cleanup instead of passing an extra arg.

This revision is now accepted and ready to land.Jan 17 2020, 1:18 AM

Seems good to me, if Sam is happy with the tail predication parts.

llvm/lib/Target/ARM/MVETailPredication.cpp
68–69

This can be in the same block as the variables above.

Thanks guys, will address this before committing.

This revision was automatically updated to reflect the committed changes.