Page MenuHomePhabricator

implementation of sdag support for VP memory intrinsics
ClosedPublic

Authored by hussainjk on Jul 12 2021, 10:31 PM.

Details

Summary

Followup to D99355: SDAG support for vector-predicated load/store/gather/scatter.

Diff Detail

Event Timeline

hussainjk created this revision.Jul 12 2021, 10:31 PM
hussainjk requested review of this revision.Jul 12 2021, 10:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 12 2021, 10:31 PM

The existing masked/gather/scatter SDNodes have seen some changes since the VP nodes were basically forked from them. I am not familiar with the indexing in masked load, for example. Maybe some of the RISC-V V folks can chime in whether any of the recent masked-sdnode updates should be applied to the vp nodes too.

llvm/include/llvm/CodeGen/SelectionDAGNodes.h
2386

Since expansion/compression is unsupported I don't think we need the is.*Load/Store functions.

simoll added a project: Restricted Project.Jul 20 2021, 8:25 AM

The existing masked/gather/scatter SDNodes have seen some changes since the VP nodes were basically forked from them. I am not familiar with the indexing in masked load, for example. Maybe some of the RISC-V V folks can chime in whether any of the recent masked-sdnode updates should be applied to the vp nodes too.

I feel that it would be best if the SDNodes matched the non-VP ones as closely as possible. RISC-V doesn't have use for ISD::MemIndexedMode but some target may have one. I'm thinking even expanding loads and truncating stores should be an option. The only complexity is having to legalize it, but we can hopefully piggy-back on the logic for non-VP nodes.

This patch also needs to extend SelectionDAGBuilder to custom-build the "mem" nodes from the intrinsics. Otherwise we won't be creating the SDNodes correctly, e.g. there'll be an invalid MachineMemOperand.

This patch also needs to extend SelectionDAGBuilder to custom-build the "mem" nodes from the intrinsics. Otherwise we won't be creating the SDNodes correctly, e.g. there'll be an invalid MachineMemOperand.

As far as I can tell, this is being done correctly through SelectionDAGBuilder::visitVectorPredicationIntrinsic. Is this not the case?

This patch also needs to extend SelectionDAGBuilder to custom-build the "mem" nodes from the intrinsics. Otherwise we won't be creating the SDNodes correctly, e.g. there'll be an invalid MachineMemOperand.

As far as I can tell, this is being done correctly through SelectionDAGBuilder::visitVectorPredicationIntrinsic. Is this not the case?

No I don't believe that's the case. There's quite a bit that needs to change for these nodes: we need an in/out chain and we need to create a MachineMemOperand.

Yesterday I tried to build on top of this patch to support the VP nodes in RISC-V but quickly hit segfaults in AddNodeIDCustom.

I believe this change in the reference patch shows the kind of think we need.

Addressing review comments: updating sdnodes with indexing, expanding, compressing support; bringing in line with load/store sdnodes.

hussainjk updated this revision to Diff 363804.Aug 3 2021, 10:20 AM

Addressing review comments: extending SDBuilder::VisitVectorPredicationIntrinsic.

frasercrmck added inline comments.Aug 4 2021, 7:42 AM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
7508–7523

I think we're missing VP_GATHER and VP_SCATTER here?

hussainjk updated this revision to Diff 365640.Aug 10 2021, 6:43 PM

Addressing reviews: adding some missing cases for predicated gather/scatter.

frasercrmck added inline comments.Aug 11 2021, 3:55 AM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
7534

I don't think you can use PtrOperand to initialize the MachineMemOperand for VP_GATHER because it's not a pointer - it's a vector of pointers.

The regular llvm.masked.gather intrinsic creates a MachinePointerInfo using only the address space.

7576

Same concern about MachinePointerInfo as with VP_GATHER.

Addressing review comments.

Sorry that my review feedback is piecemeal. What I'm doing is running the following test (declares omitted for brevity) with RISC-V. I think we want to see "cannot select" errors for each function, but I'm currently only getting that with vp.load and fixed-length vp.gather. The rest assert.

; RUN: llc -mtriple=riscv64 -mattr=+experimental-v -riscv-v-vector-bits-min=128 < %s

define <vscale x 2 x i32> @vp_gather_nxv2i32(<vscale x 2 x i32*> %ptrs, <vscale x 2 x i1> %mask, i32 zeroext %evl) {
  %ld = call <vscale x 2 x i32> @llvm.vp.gather.nxv2i32.p0nxv2i32(<vscale x 2 x i32*> %ptrs, <vscale x 2 x i1> %mask, i32 %evl)
  ret <vscale x 2 x i32> %ld
}

define <4 x i32> @vp_gather_v4i32(<4 x i32*> %ptrs, <4 x i1> %mask, i32 zeroext %evl) {
  %ld = call <4 x i32> @llvm.vp.gather.v4i32.p0v4i32(<4 x i32*> %ptrs, <4 x i1> %mask, i32 %evl)
  ret <4 x i32> %ld
}

define <vscale x 2 x i32> @vp_load_nxv2i32(<vscale x 2 x i32>* %ptr, <vscale x 2 x i1> %mask, i32 zeroext %evl) {
  %ld = call <vscale x 2 x i32> @llvm.vp.load.nxv2i32.p0nxv2i32(<vscale x 2 x i32>* %ptr, <vscale x 2 x i1> %mask, i32 %evl)
  ret <vscale x 2 x i32> %ld
}

define <4 x i32> @vp_load_v4i32(<4 x i32>* %ptr, <4 x i1> %mask, i32 zeroext %evl) {
  %ld = call <4 x i32> @llvm.vp.load.v4i32.p0v4i32(<4 x i32>* %ptr, <4 x i1> %mask, i32 %evl)
  ret <4 x i32> %ld
}

define void @vp_store_nxv2i32(<vscale x 2 x i32> %val, <vscale x 2 x i32>* %ptr, <vscale x 2 x i1> %mask, i32 zeroext %evl) {
  call void @llvm.vp.store.nxv2i32.p0nxv2i32(<vscale x 2 x i32> %val, <vscale x 2 x i32>* %ptr, <vscale x 2 x i1> %mask, i32 %evl)
  ret void
}

define void @vp_store_v4i32(<4 x i32> %val, <4 x i32>* %ptr, <4 x i1> %mask, i32 zeroext %evl) {
  call void @llvm.vp.store.v4i32.p0v4i32(<4 x i32> %val, <4 x i32>* %ptr, <4 x i1> %mask, i32 %evl)
  ret void
}

define void @vp_scatter_v4i32(<4 x i32> %val, <4 x i32*> %ptrs, <4 x i1> %mask, i32 zeroext %evl) {
  call void @llvm.vp.scatter.v4i32.p0v4i32(<4 x i32> %val, <4 x i32*> %ptrs, <4 x i1> %mask, i32 %evl)
  ret void
}

define void @vp_scatter_nxv2i32(<vscale x 2 x i32> %val, <vscale x 2 x i32*> %ptrs, <vscale x 2 x i1> %mask, i32 zeroext %evl) {
  call void @llvm.vp.scatter.nxv2i32.p0nxv2i32(<vscale x 2 x i32> %val, <vscale x 2 x i32*> %ptrs, <vscale x 2 x i1> %mask, i32 %evl)
  ret void
}
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
7969

This should be getVectorElementCount so it doesn't crash on scalable vectors

8007

getVectorElementCount for scalable vectors.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
7524

nit: empty line

7580

This is an out-of-bounds access because vp.store/vp.gather return void and so ValueVTs is empty.

Addressing review comments.

@simoll @frasercrmck All the mentioned tests are now passing, and I don't think there are any other outstanding issues. Are we good to accept and merge?

The visitVectorPredicationIntrinsic is pretty crammed now. Probably a good idea to extract the SDNode buildling for load/store/scatter/gather into their own functions as done for regular (masked) load/store/gather/scatter.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
7544–7565

This is the same code as ..

7592–7613

.. this code. Can we de-duplicate and move this into its own function?

frasercrmck added inline comments.Aug 19 2021, 5:46 AM
llvm/include/llvm/CodeGen/SelectionDAGNodes.h
2349

nit: inconsistent capitalization, plus VLen is maybe better known at EVL.

Also it doesn't have to be i32. It may be widened to a target-specific value, e.g.,

MVT EVLParamVT = TLI.getVPExplicitVectorLengthTy();

Not sure how best to describe that in the comments.

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
7667

I think all of these VLen params may be better as EVL since that's how they're known in the rest of the code. Any thoughts, @simoll?

simoll added inline comments.Aug 19 2021, 7:42 AM
llvm/include/llvm/CodeGen/SelectionDAGNodes.h
2349

Not sure how best to describe that in the comments.

// The type of EVL is TLI.getVPExplicitVectorLengthTy() ?

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
7667

+1 Let's stick to EVL for consistency.

Addressing review comments.

frasercrmck added inline comments.Mon, Aug 30, 7:25 AM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
7521

I think this should be Opcode == ISD::VP_SCATTER?

frasercrmck accepted this revision.Mon, Aug 30, 7:27 AM

LGTM other than that last issue, I think.

This revision is now accepted and ready to land.Mon, Aug 30, 7:27 AM

I don't think the commit message title needs to mention the fact it's a "Followup to D99355" - that can be in the body.

Addressing review comment.

hussainjk retitled this revision from Followup to D99355: implementation of sdag support for vp load/store/gather/scatter. to implementation of sdag support for vp load/store/gather/scatter..Mon, Aug 30, 11:25 AM
hussainjk edited the summary of this revision. (Show Details)
hussainjk retitled this revision from implementation of sdag support for vp load/store/gather/scatter. to implementation of sdag support for VP memory intrinsics.
hussainjk edited the summary of this revision. (Show Details)

@simoll Do you mind merging this one in as well?

This revision was landed with ongoing or failed builds.Tue, Aug 31, 8:04 AM
This revision was automatically updated to reflect the committed changes.