This is an archive of the discontinued LLVM Phabricator instance.

[ARM][MVE] Tail predicate in the presence of vcmp
ClosedPublic

Authored by samparker on Dec 6 2019, 3:32 AM.

Details

Summary

Record the discovered VPT blocks while checking for validity and, for now, only handle blocks that begin with VPST and not VPT. We're now allowing more than one instruction to define vpr, but each block must somehow be predicated using the vctp. This leaves us with several scenarios which need fixing up:

  1. A VPT block with is only predicated by the vctp and has no internal vpr defs.
  2. A VPT block which is only predicated by the vctp but has an internal vpr def.
  3. A VPT block which is predicated upon the vctp as well as another vpr def.
  4. A VPT block which is not predicated upon a vctp, but contains it and all instructions within the block are predicated upon in.

The changes needed are, for:

  1. The easy one, just remove the vpst and unpredicate the instructions in the block.
  2. Remove the vpst and unpredicate the instructions up to the internal vpr def. Need insert a new vpst to predicate the remaining instructions.
  3. No nothing.
  4. The vctp will be inside a vpt and the instruction will be removed, so adjust the size of the mask on the vpst.

Diff Detail

Event Timeline

samparker created this revision.Dec 6 2019, 3:32 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 6 2019, 3:32 AM
samparker updated this revision to Diff 233368.Dec 11 2019, 7:22 AM
samparker edited the summary of this revision. (Show Details)

Big ol' rewrite. We now have a VPTBlock class to track predication.

Yep, that's a lot of changes... :-)
I haven't read or digested everything yet, so this a first round of nits.

llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
51

Can you add some comments what a VPTBlock is? For example, is it exactly the same as in the ArmARM, or slightly different? Are there any restrictions on the number of instructions in this block, so probably good to assert that in addInst?

70

I think this one can benefit from some comments too. I.e., some comments on how a non-uniform predicate looks like in this context would be good.

152

nit: getVPTBlocks?

382

typo: predicatd

463

This function is doing a lot of things. Can this first if-else if be rewritten (using some helpers) to something along the lines of this:

if (!SetVCTP(MI) || IsVPT(MI))
  return false;
else if (shouldCreateNewVPTBlock(MI))
  return true;
474

And this block is doing adding the instruction to the block, so can this e.g. be addInstToVPTBlock()?

478

typo: it's

496

Ignore this comment if it's not really possible, but I was wondering if there's a really better/clearer way to do this.
I see that you want to say here that it's fine if VPR is the last operand, because that means it's predicated, and it is a 'real' use if it is not a def and not the last operand..

Can you perhaps simply not iterate over the last one:

i < MI->getNumOperands() - 1

and then simply have:

if (MO.isDef())
    CurrentPredicate.insert(MI);
else if (MO.isUse())
    ...
samparker updated this revision to Diff 233575.Dec 12 2019, 4:14 AM
  • Addressed comments.
  • Added a fix and a test for checking that all predicates are actually chained together.

One more nit about adding comments. This is a nice summary / description:

This leaves us with several scenarios which need fixing up:

  1. A VPT block with is only predicated by the vctp and has no internal vpr defs.
  2. A VPT block which is only predicated by the vctp but has an internal vpr def.
  3. A VPT block which is predicated upon the vctp as well as another vpr def.
  4. A VPT block which is not predicated upon a vctp, but contains it and all instructions within the block are predicated upon in.

Would be good to add that as a comment to RecordVPTBlocks ? Or add this where most appropiate...

llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
269

In D71426, we came to the conclusion that these sort of helper functions should live ARMBaseInstrInfo :-)

llvm/lib/Target/ARM/Utils/ARMBaseInfo.h
95

Not important, you'd have noticed anyway, but in D71426 I've moved this to here already.

samparker updated this revision to Diff 234307.Dec 17 2019, 7:59 AM

Rebased and added overview comment to conversion method.

SjoerdMeijer accepted this revision.Dec 18 2019, 1:08 AM

Cheers, nice one, LGTM

llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
774

This is a big if-statement, with some more nested ifs in it.
Indentation can be reduced by merging the ifs like this, which also makes the 4 different cases more obvious that you've added as a comment above, it almost directly maps onto that:

if (Block.HasNonUniformPredicate() && isVCTP(Divergent->MI))
..
else if (Block.HasNonUniformPredicate() && Block.IsOnlyPredicatedOn(LoLoop.VCTP))
..
else if (Block.IsOnlyPredicatedOn(LoLoop.VCTP))
..
This revision is now accepted and ready to land.Dec 18 2019, 1:08 AM
This revision was automatically updated to reflect the committed changes.