This is an archive of the discontinued LLVM Phabricator instance.

getVPMemoryOpCost interface
ClosedPublic

Authored by bmahjour on Sep 7 2021, 9:30 PM.

Details

Summary

Added TTI queries for the cost of a VP Memory operation, and added DataType and Alignment to the hasActiveVectorLength() interface.

Diff Detail

Event Timeline

hussainjk requested review of this revision.Sep 7 2021, 9:30 PM
hussainjk created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptSep 7 2021, 9:30 PM
bmahjour commandeered this revision.Sep 10 2021, 1:57 PM
bmahjour added a reviewer: hussainjk.

Taking over from @hussainjk as he is going to grad school.

Matt added a subscriber: Matt.Sep 25 2021, 12:36 PM

Is this still needed?

RolandF added inline comments.Oct 15 2021, 12:15 PM
llvm/include/llvm/Analysis/TargetTransformInfo.h
1121

I prefer the downstream name getVariableLengthMemoryOpCost. I'm not sure
what VP stands for here and it seems like VP is already used in the context of VPlan. At least explain in the comment.

Is this still needed?

Yes, this patch establishes the target interface that is implemented by D109417.

bmahjour added inline comments.Oct 18 2021, 8:26 AM
llvm/include/llvm/Analysis/TargetTransformInfo.h
1121

VP stands for Vector Predication and is the name that the community has used for all the predicated intrinsics. See https://llvm.org/docs/LangRef.html#vector-predication-intrinsics

RolandF added a comment.EditedOct 18 2021, 11:23 AM

Is this still needed?

Yes, this patch establishes the target interface that is implemented by D109417.

Yes, I realize. Are you intending to go ahead with that patch? It has been sitting for a while. I would imagine that is useful info for other reviewers.

llvm/include/llvm/Analysis/TargetTransformInfo.h
1121

Well, I followed your link. There are no "vp" memory ops. The loads and stores are named "masked" operations.

RolandF added inline comments.Oct 18 2021, 12:33 PM
llvm/include/llvm/Analysis/TargetTransformInfo.h
1121

The target implementation you propose depends on having an Instruction parameter passed in. If you pass in an Instruction you could actually figure out from the dynamic cast that this is a vp intrinsic. The existing getMemoryOpCost allows passing in an Instruction, so it seems like you could figure this out without a new interface.

simoll added inline comments.Oct 19 2021, 1:48 AM
llvm/include/llvm/Analysis/TargetTransformInfo.h
1121

The LLVM LangRef wording for vp.load/store/gather/scatter are missing. The intrinsics themselves are upstream though.

Is this still needed?

Yes, this patch establishes the target interface that is implemented by D109417.

Yes, I realize. Are you intending to go ahead with that patch? It has been sitting for a while. I would imagine that is useful info for other reviewers.

The following patches are related and still in flight and we intend to go ahead with them all. There has been a delay due to Hussain leaving, but we definitely want to resume the work soon and finish it. If you are interested in helping commandeer some of them please let me know.

https://reviews.llvm.org/D109416 - Cost model infrastructure for VPMemory ops
https://reviews.llvm.org/D109417 - Cost model for VPMemory ops
https://reviews.llvm.org/D109584 - Expansion of VP load/store to scalar loads/stores
https://reviews.llvm.org/D109377 - Type legalization for VP memory SelectionDAG nodes
https://reviews.llvm.org/D109379 - PPC lowering for VP memory SD nodes

bmahjour added inline comments.Oct 19 2021, 9:05 AM
llvm/include/llvm/Analysis/TargetTransformInfo.h
1121

There is precedence for this, for example we have separate interfaces for getMaskedMemoryOpCost and getGatherScatterOpCost, so we try to follow that in this patch.
I think the instruction being passed to getMemoryOpCost is meant to be optional, and it may not even be a vp intrinsic call. For example we could be querying the cost of turning a scalar load into a vector predicated load intrinsic, in which case the instruction passed to getMemoryOpCost would be a scalar load so we cannot do a dyn_cast on it to figure out if we want a vp intrinsic cost or otherwise.

bmahjour updated this revision to Diff 381394.Oct 21 2021, 2:29 PM
bmahjour marked 4 inline comments as done.
bmahjour edited the summary of this revision. (Show Details)

added DataType and Alignment to the hasActiveVectorLength() interface.

RolandF added inline comments.Oct 28 2021, 11:58 AM
llvm/include/llvm/Analysis/TargetTransformInfo.h
1121

Now that the instruction pointer is truly optional, having a separate call actually provides value and I like this patch a lot better.

RolandF accepted this revision.Nov 1 2021, 8:07 AM

LGTM

This revision is now accepted and ready to land.Nov 1 2021, 8:07 AM
simoll added inline comments.Nov 24 2021, 11:53 PM
llvm/include/llvm/Analysis/TargetTransformInfo.h
1385

This is really just a getter that tells you whether using VP intrinsics makes sense on this target at all. It shouldn't be modified this way.

Couldn't you use the cost function to tell you whether a combination of %evl and type for memory accesses makes sense?

llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
1341 ↗(On Diff #381394)

This patch should only be for TTI. You are using the default impl here anyway - target-specific changes should go in a separate patch.

bmahjour marked an inline comment as done.Nov 25 2021, 12:52 PM
bmahjour added inline comments.
llvm/include/llvm/Analysis/TargetTransformInfo.h
1385

Whether the target supports EVL in hardware may depend on opcode and data type. For example, Power9/10 only support load and stores. Other targets may have dependency on data type or alignment. For these reasons I think we should pass opcode and datatype/alignment as input to this function. I just realized I had forgotten to add opcode to this interface. I'll do that.

Having a function that can tell us whether an opcode/type combination can be efficiently vector predicated (without having to model the cost) is helpful in LoopVectorizer as well. For example when deciding how to vectorize we don't have a VPIntrinsic yet, so we cannot use getVPLegalizationStrategy. Having something similar to isLegalMaskedLoad would be useful, and I was hoping we use hasActiveVectorLength() for that purpose (per previous conversation in D99750#inline-949716). Is there another interface I should we looking at?

llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
1341 ↗(On Diff #381394)

This is just a stub, but I'll move it to D109417 for cleaner separation.

bmahjour updated this revision to Diff 390111.Nov 26 2021, 12:58 PM
bmahjour marked an inline comment as done.Dec 2 2021, 8:50 AM

@simoll I've addressed your comments above, please let me know if you have any further comments. I plan to commit this soon. Thanks!

This revision was automatically updated to reflect the committed changes.