This is an archive of the discontinued LLVM Phabricator instance.

[VP] Implementing expansion pass for VP load and store.
ClosedPublic

Authored by loralb on Sep 10 2021, 1:58 AM.

Details

Summary

Added function to the ExpandVectorPredication pass to handle VP loads and stores.

Diff Detail

Event Timeline

hussainjk created this revision.Sep 10 2021, 1:58 AM
hussainjk requested review of this revision.Sep 10 2021, 1:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 10 2021, 1:58 AM
hussainjk edited the summary of this revision. (Show Details)Sep 10 2021, 2:00 AM
hussainjk added reviewers: bmahjour, nemanjai.
hussainjk edited the summary of this revision. (Show Details)

Fixing errors that were introduced from rebasing.

hussainjk updated this revision to Diff 372046.Sep 10 2021, 6:52 PM

More fixes to the length-only scalarization, and added tests using the Power8 target for this scalarization.

frasercrmck added a project: Restricted Project.
frasercrmck added a subscriber: frasercrmck.

Adding some extra reviewers to help keep on top of this

This needs "generic" testing e.g. like those in test/CodeGen/Generic/expand-vp.ll

llvm/lib/CodeGen/ExpandVectorPredication.cpp
531

I wonder if the addition of this method should be in a follow-up patch? That is, first get the "basic" support in using masked.* intrinsics in and consider this an enhancement/optimization for certain targets like PowerPC?

simoll added inline comments.Jan 5 2022, 5:48 AM
llvm/lib/CodeGen/ExpandVectorPredication.cpp
518

We should pick the expansion strategy based on the target's preferences. Either extend VPLegalizationStrategy to allow for strategy selection or add a new function to TTI that returns the expansion method (enum).

531

I agree. Better to add the default expansion strategy first, then extend TTI to allow for strategy selection and the cascading scheme in a followup.

bmahjour added inline comments.Jan 5 2022, 7:31 AM
llvm/lib/CodeGen/ExpandVectorPredication.cpp
531

This scheme is meant for any target where masked load/stores are not supported in hardware and all the active lanes are packed on the left (ie EVL only). I have a hard time imagining a target that wouldn't benefit from this scheme. I think that's why Hussain added this as part of the default expansion.

simoll added inline comments.Jan 7 2022, 10:29 AM
llvm/lib/CodeGen/ExpandVectorPredication.cpp
531

If there are two expansion schemes, targets should be able to choose between them through TTI. AVX.* supports masked load/store well - PPC may benefit more from the piecewise expansion scheme.

bmahjour added inline comments.Jan 7 2022, 11:07 AM
llvm/lib/CodeGen/ExpandVectorPredication.cpp
531

The only other expansion scheme is a fail-safe that applies to legalization of masked load/stores in general (not just the EVL cases), so it's less optimal regardless of the architecture. I neither see the need nor strongly object to adding a TTI query for this, but if we add one I think the default should be the cascading scheme.

bmahjour added inline comments.Jan 7 2022, 11:08 AM
llvm/lib/CodeGen/ExpandVectorPredication.cpp
531

(and again we are talking about targets where masked/load stores are not supported in hardware, so I guess AVX doesn't apply).

I'm still in favor of splitting up the patch into the default expansion (which can be the cascading loads) and a second one for masked.load expansion.

llvm/lib/CodeGen/ExpandVectorPredication.cpp
475

I had overlooked this before. You are checking whether masked.load is supported, so my argument for selecting the expansion scheme with TTI is moot.

simoll added inline comments.Jan 11 2022, 2:10 AM
llvm/lib/CodeGen/ExpandVectorPredication.cpp
482

TTI.isLegalMaskedStore?

simoll added inline comments.Jan 11 2022, 2:15 AM
llvm/lib/CodeGen/ExpandVectorPredication.cpp
518

load/store unfolding does not work for scalable vectors this way. We should default to the memory intrinsic path for scalable types.

craig.topper added inline comments.Jan 14 2022, 12:51 PM
llvm/lib/CodeGen/ExpandVectorPredication.cpp
396

VPIntrinsic is a subclass of Instruction, we shouldn't need an explicit cast.

398

Don't use auto here.

408

Why not llvm_unreachable?

466

Doesn't IRBuilder's constructor also set the debug location?

469

Don't use auto.

473

Don't use auto

478

Drop else after return

485

Drop else after return

492

Does't IRBuilder's constructor set the debug location?

515

llvm_unreachable

520

Drop else after return and drop the curly braces for the if body

536

Shouldn't be needed

545

Use StringRef

549

llvm_unreachable

559

Capitalize

576

break should be inside the curly braces

582

Same here

595

Drop parentheses around the statement

601

Using StringRef for Prefix should eliminate the need for Twine constructor here

loralb added a subscriber: loralb.Feb 25 2022, 6:53 AM

hello @hussainjk @bmahjour, are you still working on this? Otherwise, I may take it over on your behalf.

hello @hussainjk @bmahjour, are you still working on this? Otherwise, I may take it over on your behalf.

Thanks @loralb. Please go ahead.

loralb commandeered this revision.Feb 28 2022, 8:39 AM
loralb added a reviewer: hussainjk.
loralb updated this revision to Diff 411818.Feb 28 2022, 8:47 AM
loralb retitled this revision from Implementing expansion pass for VP load and store. to [VP] Implementing expansion pass for VP load and store..
loralb edited the summary of this revision. (Show Details)

Changelog:

  • Remove VP gather/scatter references (to be added in a follow-up patch)
  • Following the discussion in the comments, remove expandPredicationInUnfoldedLoadStore() function (to be added in a follow-up patch)
  • Apply suggestions in comments
  • Add tests
loralb added inline comments.Feb 28 2022, 8:51 AM
llvm/lib/CodeGen/ExpandVectorPredication.cpp
515

I changed the behaviour of the defalut case to reflect what was done before this patch, but I am not sure which one is the right approach: what do you think is best?

frasercrmck added inline comments.Mar 1 2022, 4:40 AM
llvm/lib/CodeGen/ExpandVectorPredication.cpp
166

unpredicated seems misleading here, since we're using masked.load and masked.store: those are predicated in a sense.

406

nit, but I don't think you need these braces around the switch statements. NewStore/NewLoad are defined in their own scope.

408

I'd prefer something like /*IsVolatile*/ false

llvm/test/CodeGen/Generic/expand-vp-load-store.ll
16

We should be testing the IsUnmasked path here too.

loralb updated this revision to Diff 412168.Mar 1 2022, 11:07 AM

Changelog:

  • Address comments

N.B.: the IsUnmasked == true path in expandPredicationInMemoryIntrinsic() is not reachable, as also shown by the tests. How should this be handled in the code? Do we still handle this case or we add something like assert(!isAllTrueMask(MaskParam)) instead of defining the IsUnmasked boloean?

loralb marked 4 inline comments as done.Mar 1 2022, 11:07 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 1 2022, 12:15 PM

Changelog:

  • Address comments

N.B.: the IsUnmasked == true path in expandPredicationInMemoryIntrinsic() is not reachable, as also shown by the tests. How should this be handled in the code? Do we still handle this case or we add something like assert(!isAllTrueMask(MaskParam)) instead of defining the IsUnmasked boloean?

Is that because you're using scalable vectors and the isAllTrueMask is expecting a ConstantVector? To my mind we should either:

  • use a better true-mask check (surely there's one we can reuse instead of implementing our own)
  • add tests for fixed vectors
  • use constantexpr all-ones masks in the scalable-vector tests

The last is a bit of a hack, but the other two sound reasonable. Adding tests for fixed vectors sounds like a good idea for this patch anyway, and it'd give us coverage of this code path. The first one should be done, but done in a separate patch. Just add a FIXME in the scalable-vector tests for now?

loralb updated this revision to Diff 418582.Mar 28 2022, 8:04 AM

Changelog:

  • Improve isAllTrueMask() (for scalable vectors only)
  • Add and update tests

I did not manage to find any ready to use function in place of isAllTrueMask(), so I added this PatternMatch approach found in other places of the codebase. Maybe we can unify its behaviour in a later patch?

loralb updated this revision to Diff 419085.Mar 30 2022, 2:53 AM

Changelog:

  • Use getSplatValue() instead of PatternMatch in isAllTrueMask()
  • Update tests
frasercrmck added inline comments.Mar 30 2022, 4:46 AM
llvm/lib/CodeGen/ExpandVectorPredication.cpp
90

I don't know if it matters whether this is isOneValue or isAllOnesValue? In practice for i1 masks it's the same, but somehow the latter sounds more appropriate to me.

llvm/test/CodeGen/Generic/expand-vp-load-store.ll
35

Shouldn't we see regular load here?

79

Same here: regular store?

loralb added inline comments.Mar 30 2022, 5:50 AM
llvm/test/CodeGen/Generic/expand-vp-load-store.ll
35

If I understand this correctly, since the evl value is unknown, using here a regular load means we may try to load elements that we should not, hence why we need the masked version

79

This should work like the load above

loralb updated this revision to Diff 419124.Mar 30 2022, 6:04 AM

Changelog:

  • Use isAllOnesValue() instead of isOneValue()
simoll accepted this revision.Jun 2 2022, 2:24 AM

LGTM

This revision is now accepted and ready to land.Jun 2 2022, 2:24 AM

Following the discussion in the comments, remove expandPredicationInUnfoldedLoadStore() function (to be added in a follow-up patch)

Is there a follow up patch for this yet?

Following the discussion in the comments, remove expandPredicationInUnfoldedLoadStore() function (to be added in a follow-up patch)

Is there a follow up patch for this yet?

hello @bmahjour, there is no follow-up patch yet and right now it is not in my short-term plan; I may be able to keep working on it later on. There wouldn't be any problem though if someone is interested on taking this over.

loralb updated this revision to Diff 444639.Jul 14 2022, 6:45 AM

Changelog:

  • Rebase
This revision was automatically updated to reflect the committed changes.