This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Support scalable-vector masked gather operations
ClosedPublic

Authored by frasercrmck on Feb 8 2021, 6:51 AM.

Details

Summary

This patch supports the masked gather intrinsics in RVV.

The RVV indexed load/store instructions only support the "unsigned unscaled"
addressing mode; indices are implicitly zero-extended or truncated to XLEN and
are treated as byte offsets. This ISA supports the intrinsics directly, but not
the majority of various forms of the MGATHER SDNode that LLVM combines to. Any
signed or scaled indexing is extended to the XLEN value type and scaled
accordingly. This is done during DAG combining as widening the index types to
XLEN may produce illegal vectors that require splitting, e.g.
nxv16i8->nxv16i64.

Support for scalable-vector CONCAT_VECTORS was added to avoid spilling via the
stack when lowering split legalized index operands.

Diff Detail

Event Timeline

frasercrmck created this revision.Feb 8 2021, 6:51 AM
frasercrmck requested review of this revision.Feb 8 2021, 6:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 8 2021, 6:51 AM
frasercrmck added inline comments.Feb 8 2021, 7:38 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2800

I'll add tests in case LLVM tries to create "ext loads" out of these. We'd have to undo that if that were the case.

llvm/lib/Target/RISCV/RISCVInstrInfoVSDPatterns.td
92 ↗(On Diff #322098)

We can improve codegen for when the mask is known "true".

420 ↗(On Diff #322098)

This is (helpfully) copy/pasted from VPseudos.td. Is there a way to create a "list" (e.g. of val/idx tuples) there which we reuse here?

  • optimize for known true masks
  • remove accidental CHECKs
  • test for extloads
craig.topper added inline comments.Feb 11 2021, 1:03 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2823

Is it possible that IndexVT has 32-bit elements on a 64-bit target and that the IndexVT is already LMUL==8 such that this SIGN_EXTEND/ZERO_EXTEND would produce an illegal LMUL==16 type?

2833

Assert that this is a power of 2. I'm not completely sure that's guaranteed, but maybe it is for the types we expect to see.

frasercrmck marked an inline comment as done.
  • rebase
  • assert on a power-of-two scale
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2823

Yes, it's possible, and similarly on a 32-bit target with an illegal 8- or 16-bit index type, but I think that implies the original intrinsic would have an illegal pointer type, like <vscale x 16 x i32*>. So in that sense it's ideally something LLVM would help legalize (split) for us, but sadly it folds the illegal intrinsic to one with legal types, e.g.: i32* base, <vscale x 16 x i32> idxs so it doesn't see that until it asks us to custom-lower.

Am I right in thinking we have to custom-legalize this ourselves, duplicating logic from LegalizeVectorTypes?

2833

Good idea.

frasercrmck added inline comments.Feb 12 2021, 6:48 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2823

It should be possible to do a DAGCombine pre-legalization which creates the illegal XLEN index types and passes it back to go through the vanilla legalization, if you think that's a valid route.

Either way it's going to require support for scalable-vector insert/extract subvector. In the worst case we'll have to expand nxv64i64 (from nxv64i8) into 8 x nxv8i64.

craig.topper added inline comments.Feb 12 2021, 9:22 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2823

Would the DAG combine end up fighting with the combine that uses shouldRemoveExtendFromGSIndex?

frasercrmck added inline comments.Feb 15 2021, 1:15 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2823

It does, so you have to turn off that hook. I don't think that's the worst thing for our gathers since only UNSIGNED UNSCALED can be represented as index types smaller than XLEN, and I couldn't find anything that would generate that.

  • rebase
  • legalize addressing mode in pre-legalize DAGCombine
  • lower to intrinsics to limit pattern bloat
  • test vectors that need splitting
  • account for frame layout changes
This revision is now accepted and ready to land.Mar 17 2021, 11:33 AM
frasercrmck edited the summary of this revision. (Show Details)Mar 18 2021, 2:22 AM