This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Lower fixed length interleaved accesses via vssegN/vlsegN

Authored by luke on Mar 1 2023, 10:33 AM.



This enables the interleaved access pass on O1 and above, and causes
interleaving/deinterleaving shuffles of fixed length vectors with
stores/loads to be lowered into vssegN/vlsegN.

We need to be careful and make sure that we only lower vsseg/vlseg
whenever we know the fixed vector type will fit within the minimum vlen,
and that the interleaving factor is supported for the given LMUL.

Diff Detail

Event Timeline

luke created this revision.Mar 1 2023, 10:33 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 1 2023, 10:33 AM
luke requested review of this revision.Mar 1 2023, 10:33 AM
luke added inline comments.Mar 1 2023, 10:40 AM

The idea here is that opaque pointer types will be lowered to i32 or i64 depending on the xlen, but the subtarget might only have ve32x when on rv64: Therefore ptr isn't a legal element type.
I originally intended to split this out into a separate change, but it doesn't actually affect any test cases since they seem to be covered by isTypeLegal, which is used in the gather/scatter lowering pass.


The existence of the actual pseudo is checked here because we need to make sure we only lower when this condition from the spec is met:

The EMUL setting must be such that EMUL * NFIELDS ≤ 8, otherwise the instruction encoding is reserved.

It's hairy having to extract the value type, but this seems to be what the gather/scatter does here too.

Also, because we reject illegal types with !isTypeLegal(getValueType(DL, VTy)), we don't support segmented loads/stores of v3i32 etc.
Ideally we could still use them here and have them widened to v4i32 etc. later, but it's blocked on us needing the MVT to check for the existence of the pseudo.

luke updated this revision to Diff 502083.Mar 3 2023, 2:39 AM

Add missing includes

craig.topper added inline comments.Mar 3 2023, 10:44 AM

Save the result of getValueType from the earlier call.


Would it be sufficient to just check LMul * Factor is <= 8 instead of looking up the pseudos?


static const?


static const

luke updated this revision to Diff 502266.Mar 3 2023, 2:43 PM

Address review comments

luke updated this revision to Diff 502269.Mar 3 2023, 2:57 PM

Remove unneeded include

reames added inline comments.Mar 6 2023, 8:55 AM

Ignoring fractional LMUL here is wrong. mf8 should count as LMUL1 here I believe. As you have it written here, you have it treated as LMUL8.

Also, please add a test for this case.


Style: return LMUL * Factor <= 8.


We could lower this as a wide load, plus a set of SEW=32 shuffles. I think the codegen here is distinctly sub-optimal. I'm not advocating for improving it now, but you might want to adjust the comment to be a bit more future proof.

p.s. This case is a really good catch.

1 ↗(On Diff #502269)

The naming on this test file doesn't make any sense to me.

If I'm gathering the intent of these tests correctly, they can be rewritten using wider types using V. I'd recommend doing so, and merging them into the base fixed-vectors-interleaved-access.ll test. If you think they really do need the zve32x, maybe into the test file named with that?


Why only riscv32 here?


Can you add a few more test variants here? I realize it's a bit duplicated with the transform test and the intrinsic codegen tests, but having the end to end codegen demonstrated for at least a few of the common factors seems worthwhile.


I'm concerned about the loss of test coverage here. I think it makes sense to add a runline which explicitly turns off interleave load/store formation. Or even just do so unconditionally for this file.

luke marked 3 inline comments as done.Mar 7 2023, 6:00 AM
luke updated this revision to Diff 503267.Mar 8 2023, 1:58 AM
luke marked an inline comment as done.

Address review comments

luke updated this revision to Diff 503268.Mar 8 2023, 1:59 AM

Address review comments

luke marked 6 inline comments as done.Mar 8 2023, 2:02 AM
luke updated this revision to Diff 503441.Mar 8 2023, 10:34 AM

Update test case with codegen changes from rebase

luke updated this revision to Diff 506621.Mar 20 2023, 9:28 AM

Rebase and update tests

luke updated this revision to Diff 506653.Mar 20 2023, 11:03 AM

Remove unused argument in isLegalInterleavedAccessType and make it public, since it can be used by TTI cost model hooks later

reames retitled this revision from [RISCV] Lower interleaved accesses to [RISCV] Lower fixed length interleaved accesses via vssegN/vlsegN.Mar 22 2023, 7:47 AM
craig.topper added inline comments.Mar 24 2023, 5:10 PM

You can return true for Fractional. This division can only produce 1 or 0. Factor is 2-8 and LMUL is 2, 4, or 8.


8->10 to over the maximum operands: 8 vectors + pointer + VL


Merge these 3 ifs.

luke updated this revision to Diff 508551.Mar 27 2023, 3:01 AM

Address review comments

This revision is now accepted and ready to land.Mar 31 2023, 9:45 PM