This is an archive of the discontinued LLVM Phabricator instance.

Implementation of intrinsic and SDNode definitions for VP load, store, gather, scatter.
ClosedPublic

Authored by hussainjk on Mar 25 2021, 10:56 AM.

Details

Summary

This patch adds intrinsic definitions and SDNodes for predicated load/store/gather/scatter, based on the work done in D57504.

Diff Detail

Event Timeline

hussainjk created this revision.Mar 25 2021, 10:56 AM
hussainjk requested review of this revision.Mar 25 2021, 10:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 25 2021, 10:56 AM
vkmr added a comment.Mar 26 2021, 10:22 AM

Thanks Hussain for this patch!
For now I have a couple of minor comments, I will take a deeper look later.
Also, I don't quite understand yet how VP handles (or plans to handle) specifying alignment parameter. Does it need VPBuilder (not to be confused with VPlan VPBuilder!) for that? Would it make sense to add the Alignment parameter to VP load/store/scatter/gather intrinsics (much like their masked counterparts) for now, unless there is already a way to specify that?

llvm/include/llvm/IR/Intrinsics.td
1372

Better to use DefaultAttrsIntrinsic? (Ditto for other load, gather and scatter)

1397

minor style nit: A blank line above TODO. Perhaps reposition it to above the scope where it applies.

Thanks Hussain for this patch!
For now I have a couple of minor comments, I will take a deeper look later.
Also, I don't quite understand yet how VP handles (or plans to handle) specifying alignment parameter. Does it need VPBuilder (not to be confused with VPlan VPBuilder!) for that? Would it make sense to add the Alignment parameter to VP load/store/scatter/gather intrinsics (much like their masked counterparts) for now, unless there is already a way to specify that?

We should use the align(N) attribute on pointers to convey the alignment. That's what we do in the LLVM for SX-Aurora stack (https://github.com/sx-aurora-dev/llvm-project/blob/683f41073689262a8a493cce7caba13bbccea56c/llvm/test/CodeGen/VE/Packed/vp_load_packed.ll#L80) . I'd say that the alignment parameter in the pre-existing masked.load/store simply predates that attribute.
And yes, setting alignment is something that the VPBuilder or what's-it-called-after-bikeshedding would take care of for you.

simoll added a project: Restricted Project.Mar 29 2021, 6:05 AM
simoll added inline comments.Mar 29 2021, 7:53 AM
llvm/unittests/IR/VPIntrinsicTest.cpp
34–54

There are no mask or vlen attributes in upstream.

Can we split this up into a patch with only the intrinsics and a second patch with SelectionDAG?

hussainjk updated this revision to Diff 334705.Apr 1 2021, 8:09 AM

Reverted some SDNode bits.

hussainjk updated this revision to Diff 334719.Apr 1 2021, 8:55 AM

Various code review fixes.

craig.topper added inline comments.
llvm/include/llvm/CodeGen/SelectionDAGNodes.h
1369–1370

Can we override clang-format here? The previous code had a more consistent structure.

llvm/include/llvm/IR/IntrinsicInst.h
421

It kind of feels like maybe there should be a VPMemIntrinsic and a VPStoreIntrinsic subclasses that contain these methods? Then you can use the more common isa, cast, dyn_cast and not have Optional return values.

llvm/lib/IR/IntrinsicInst.cpp
347

Doesn't the getValue() below already assert if there is no value?

Just some minor comments. I think Craig's suggestion of mem subclasses sounds like it's worth investigating.

llvm/include/llvm/CodeGen/SelectionDAGNodes.h
1369–1370

Alternatively, I suppose, it could be a switch statement (aside from the last two)? Presumably that'd be optimized well enough by compilers?

llvm/include/llvm/IR/IntrinsicInst.h
428

Inconsistent documentation style, e.g. return the and return The, and use of commas vs forward slashes.

Also, is clang-tidy correct here? No uppercase letter on static methods?

llvm/include/llvm/IR/Intrinsics.td
1377

How some the loads are IntrReadMem but the stores aren't IntrWriteMem?

llvm/include/llvm/IR/VPIntrinsics.def
92

typo: cannonical -> canonical

llvm/lib/IR/IntrinsicInst.cpp
353

Purely a matter of style but this is slightly shorter and (I think) a little clearer:

if (auto PtrParamOpt = GetMemoryPointerParamPos(getIntrinsicID()))
  return getArgOperand(PtrParamOpt.getValue());
return nullptr;
Matt added a subscriber: Matt.May 4 2021, 6:54 AM
hussainjk updated this revision to Diff 345873.May 17 2021, 7:28 AM

Addressing review comments.

hussainjk added inline comments.May 18 2021, 9:53 AM
llvm/include/llvm/CodeGen/SelectionDAGNodes.h
1369–1370

It seems like this style is more idiomatic in the surrounding code, so leaving it as-is for now (with spacing fixed).

llvm/include/llvm/IR/IntrinsicInst.h
421

There will be separate classes for the corresponding SDNodes at least, in the followup patch. A bunch of the code there also assumes you can generically call these methods, so I think it would be good to leave it like this for now and revisit it in the followup patch to minimize things breaking.

craig.topper added inline comments.May 18 2021, 10:00 AM
llvm/include/llvm/IR/IntrinsicInst.h
428

Should this be getMemoryPointerParamPos as clang-format says?

432

Same here

Addressing review comments.

I had a thought about alignment. Do we have a plan on how to deal with unaligned accesses? The other non-VP masked intrinsics are scalarized by the ScalarizeMaskedMemIntrin pass and regular loads and stores are handled by -- I believe -- LegalizeDAG.

This doesn't apply for scalable vectors, where presumably the onus is on the author/compiler transform not to emit them in the first place. But then for that we have TTI methods like isLegalMaskedStore and friends.

I had a thought about alignment. Do we have a plan on how to deal with unaligned accesses?

Not yet

The other non-VP masked intrinsics are scalarized by the ScalarizeMaskedMemIntrin pass and regular loads and stores are handled by -- I believe -- LegalizeDAG.

Maybe vp can piggyback on ScalarizeMaskedMemIntrin for unaligned access expansion.

This doesn't apply for scalable vectors, where presumably the onus is on the author/compiler transform not to emit them in the first place. But then for that we have TTI methods like isLegalMaskedStore and friends.

ScalarizeMaskedMemIntrin could use llvm.experimental.vector.extract to do the same for scalable vectors...

I had a thought about alignment. Do we have a plan on how to deal with unaligned accesses?

Not yet

The other non-VP masked intrinsics are scalarized by the ScalarizeMaskedMemIntrin pass and regular loads and stores are handled by -- I believe -- LegalizeDAG.

Maybe vp can piggyback on ScalarizeMaskedMemIntrin for unaligned access expansion.

It already will piggyback on ScalarizeMaskedMemIntrin for scalarization, so that should be straightforward to put into the followup VP-mem-expansion patch.

I am ready to LGTM this. Any blockers?

craig.topper added inline comments.Jun 7 2021, 3:32 PM
llvm/include/llvm/CodeGen/SelectionDAGNodes.h
1369–1370

Was this idiomatic comment referring to my comment or the switch suggestion?

If we don't do a switch I'd like to see it back the way it was originally so that each opcode is on one line. We can put clang-format off/on around it to suppress the clang format warning.

llvm/include/llvm/IR/IntrinsicInst.h
428

This should return unsigned to match the recent change to GetVectorLengthParamPos and GetMaskParamPos.

432

unsigned

llvm/lib/IR/IntrinsicInst.cpp
346

unsigned

hussainjk updated this revision to Diff 350650.Jun 8 2021, 10:22 AM

addressing review comments

hussainjk added inline comments.Jun 8 2021, 10:24 AM
llvm/include/llvm/CodeGen/SelectionDAGNodes.h
1369–1370

Regarding the switch, but I've gone ahead and changed it to a switch statement.

craig.topper added inline comments.Jun 9 2021, 7:45 AM
llvm/include/llvm/CodeGen/SelectionDAGNodes.h
1395

Extra space after case here

LGTM other than the last few nits.

llvm/lib/IR/IntrinsicInst.cpp
348

Is the explicit this-> required?

LGTM too. Other than the last comments from Fraser and I.

hussainjk updated this revision to Diff 351204.Jun 10 2021, 9:50 AM

addressing review comments

I believe I've addressed everything now.

This revision is now accepted and ready to land.Jun 10 2021, 11:15 AM

I don't have commit access. Can someone commit this please?

simoll accepted this revision.Jun 10 2021, 11:18 PM

Sure. I will commit this.

simoll requested changes to this revision.Jun 11 2021, 12:20 AM

This needs to be rebased onto recent upstream.
In order to pass all unittest then, this patch needs an implementation in VPIntrinsic::getDeclarationForParams.

This revision now requires changes to proceed.Jun 11 2021, 12:20 AM

Just so I know for the VP reduction intrinsics, should we not be adding LangRef documentation at this stage too, given we're adding new intrinsics? How about support in the ExpandVectorPredication pass? Should that come now or can it wait?

Just so I know for the VP reduction intrinsics, should we not be adding LangRef documentation at this stage too, given we're adding new intrinsics? How about support in the ExpandVectorPredication pass? Should that come now or can it wait?

Ideally, yes, you want to have the documentation with the intrinsics and their expansion in one patch.
If that grows too large, you split off the expansion for another patch.
For this patch, since we are kind of stalling on memory intrinsics to become available upstream, it'd be ok for me to just have the intrinsics and save everything else for followup patches. The intrinsics are documented in the reference implementation anyway.

Just so I know for the VP reduction intrinsics, should we not be adding LangRef documentation at this stage too, given we're adding new intrinsics? How about support in the ExpandVectorPredication pass? Should that come now or can it wait?

Ideally, yes, you want to have the documentation with the intrinsics and their expansion in one patch.
If that grows too large, you split off the expansion for another patch.
For this patch, since we are kind of stalling on memory intrinsics to become available upstream, it'd be ok for me to just have the intrinsics and save everything else for followup patches. The intrinsics are documented in the reference implementation anyway.

Got you, thanks for the info. I'd definitely prefer to have it all done in one for my own patches but I'm not going to block this one or anything.

I rebased, fixed some failing unit tests, and extended VPIntrinsic::getDeclarationForParams to handle memory intrins. Not sure if I got arc to properly pick everything up.

Apart from a style nit, LGTM. Will run some local tests before accepting. Thx!

llvm/unittests/IR/VPIntrinsicTest.cpp
249

No braces

One more thing: title & summary do not reflect what's in the patch anymore. Please update those too.

hussainjk updated this revision to Diff 355562.Jun 30 2021, 8:40 AM

addressing review comments

One more thing: title & summary do not reflect what's in the patch anymore. Please update those too.

Ahh, can you clarify?

simoll accepted this revision.Jul 1 2021, 2:30 AM

One more thing: title & summary do not reflect what's in the patch anymore. Please update those too.

Ahh, can you clarify?

My bad. I had overlooked that the VPIntrinsics.def implicitly instantiates the SDNodes. (I thought for a bit there were no SDNodes in this patch anymore). You're all good :)

This revision is now accepted and ready to land.Jul 1 2021, 2:30 AM
This revision was landed with ongoing or failed builds.Jul 1 2021, 4:35 AM
This revision was automatically updated to reflect the committed changes.