This is an archive of the discontinued LLVM Phabricator instance.

[SVE] Custom ISel for fixed length extract/insert_subvector.
ClosedPublic

Authored by paulwalker-arm on Jun 30 2020, 5:58 AM.

Details

Summary

We use extact_subvector and insert_subvector to "cast" between
fixed length and scalable vectors. This patch adds custom c++
based ISel for the following cases:

fixed_vector = ISD::EXTRACT_SUBVECTOR scalable_vector, 0
scalable_vector = ISD::INSERT_SUBVECTOR undef(scalable_vector), fixed_vector, 0

Which result in either EXTRACT_SUBREG/INSERT_SUBREG for NEON sized
vectors or COPY_TO_REGCLASS otherwise.

Diff Detail

Event Timeline

paulwalker-arm created this revision.Jun 30 2020, 5:58 AM
Herald added a project: Restricted Project. · View Herald Transcript
efriedma added inline comments.Jun 30 2020, 12:31 PM
llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
3244

I'm not sure I understand the issue here. Is the problem just that for a pattern, you need to write the type of the result? I don't think there's any problem with writing a pattern involving a type that isn't always legal; if the type isn't legal, it just won't match.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
8697

The comment here is really terse, and I'm not sure this is handling all the cases we need to.

In particular, I'm not sure this correctly handles types like nxv2f32.

paulwalker-arm marked an inline comment as done.Jun 30 2020, 1:50 PM
paulwalker-arm added inline comments.
llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
3244

I believe the issue is the runtime nature of the >128bit fixed length vectors means they are not mapped to any register class, which prevents pattern based matching.

We did investigate using hwmodes but it didn't prove to be a viable solution.

efriedma added inline comments.Jun 30 2020, 2:16 PM
llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
3244

I believe the issue is the runtime nature of the >128bit fixed length vectors means they are not mapped to any register class

You could change that in AArch64RegisterInfo.td, if you wanted to. I don't think that would cause any issues; it's okay if some of the types in the list aren't legal for all subtargets.

paulwalker-arm marked an inline comment as not done.Jun 30 2020, 2:50 PM
paulwalker-arm added inline comments.
llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
3244

Thanks for the tip, I'll take look and see how it works out.

paulwalker-arm marked 3 inline comments as done.Jul 1 2020, 6:04 AM
paulwalker-arm added inline comments.
llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
3244

I tried adding the extra fixed length vector MVTs to ZPRClass but it didn't really work out.

Firstly I needed to fix up a bunch of patterns due to "Could not infer all types in pattern" build errors. This might be down to ZPRClass being sized as 128, which looks especially weird once it starts to contain a lot of MVTs that are known bigger than 128. Ultimately though the failing patterns are easily fixed but I guess potentially burdensome since 99.9% of the patterns shouldn't need to care about fixed length SVE.

After this the extract_subvector patterns still do not match. I've tracked this down to SDTSubVecExtract's usage of SDTCisSubVecOfVec which is not strictly true given the newly expanded definition of extract_subvector. I cannot just update SDTCisSubVecOfVec because it's used by other operations where we don't want to allow a mixture of fixed length and scalable vectors (e.g. concat_vector).

I can add a new variant of SDTCisSubVecOfVec for use by extract_subvector and insert_subvector, but given the minimal usage I'm not sure it's worth it. What do you think?

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
8697

I've create a helper function to detect unpacked vector types and used it here so that only fixed length vector extracts from packed scalable types are considered legal.

paulwalker-arm marked an inline comment as done.

Updated selection comments to fully explain the rational. Updated lowering code to be more caution in what is considered legal.

efriedma added inline comments.Jul 1 2020, 12:30 PM
llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
3244

I guess it's fine to leave as-is, in that case.

3385

There isn't any corresponding code for INSERT_SUBVECTOR in AArch64ISelLowering?

paulwalker-arm marked an inline comment as done.Jul 1 2020, 1:18 PM
paulwalker-arm added inline comments.
llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
3385

Unlike EXTRACT_SUBVECTOR, AArch64ISelLowering didn't have any custom lowering for INSERT_SUBVECTOR for any vector types. For this reason I assumed the defaults were fine. My new usage when lowering fixed length to SVE only uses the legal variants so doesn't introduce any new requirements.

efriedma added inline comments.Jul 1 2020, 2:15 PM
llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
3366

Do we need the isPackedVectorType check here as well?

3385

In theory, you need exactly the same code.

Realistically, I'm not sure target-independent code will actually ever create an insert_subvector with a non-zero index. I briefly tried grepping through the relevant code, and couldn't find anything. I guess we could ignore the issue for now.

paulwalker-arm marked 2 inline comments as done.Jul 1 2020, 3:45 PM
paulwalker-arm added inline comments.
llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
3366

The code in LowerEXTRACT_SUBVECTOR should preclude that case, but I can add some asserts.

Added an implementation for LowerINSERT_SUBVECTOR along with more asserts.

paulwalker-arm marked 2 inline comments as done.Jul 2 2020, 8:09 AM
paulwalker-arm added inline comments.
llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
3244

Thanks. I'll circle back round to this after we've enough functionality for fixed length to be usable.

3385

To be consistent I've provided an implementation. We're not really in a position to properly test it, although it defaults to expand for all the cases where tests don't exist.

cameron.mcinally marked an inline comment as done.Jul 6 2020, 10:11 AM
cameron.mcinally added inline comments.
llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
3244

No objections, but I do like Eli's idea for the long term. This would be a step in the right direction for passing fixed-width arguments. Having a restricted IR for fixed-width isn't ideal.

llvm/test/CodeGen/AArch64/sve-fixed-length-subvector.ll
26

Nit: could probably mark the loads/stores volatile to avoid the branch.

paulwalker-arm marked an inline comment as done.Jul 6 2020, 4:12 PM
paulwalker-arm added inline comments.
llvm/test/CodeGen/AArch64/sve-fixed-length-subvector.ll
26

I'm not sure how this helps. The reason for the branch is to force a block boundary to ensure the extract_subvector resulting from lowering the load is not combined with the insert_subvector that's created when lowering the store.

llvm/test/CodeGen/AArch64/sve-fixed-length-subvector.ll
26

Ok, that makes sense. Let me ask the opposite though -- if the load and store are volatile, will the fixed-width lowering honor the volatile?

x = load volatile *p
y = fneg x
x = load volatile *p
store *p, x

Unlikely a problem considering the loads made it to the backend, but would be good to confirm.

paulwalker-arm marked an inline comment as done.Jul 7 2020, 9:51 AM
paulwalker-arm added inline comments.
llvm/test/CodeGen/AArch64/sve-fixed-length-subvector.ll
26

The resulting masked memory operation takes the same MachineMemOperand as the original fixed length operation, so the volatile flag will be maintained.

This revision is now accepted and ready to land.Jul 7 2020, 12:28 PM
cameron.mcinally accepted this revision.Jul 7 2020, 12:41 PM
This revision was automatically updated to reflect the committed changes.