This is an archive of the discontinued LLVM Phabricator instance.

[AArch64-SVE]: force using SVE in streaming mode to lower arithmetic and logical fixed-width vector ops.
ClosedPublic

Authored by hassnaa-arm on Oct 5 2022, 3:05 PM.

Details

Summary

Force using SVE in streaming mode to lower arithmetic and logical fixed-width vector ops.

Diff Detail

Event Timeline

hassnaa-arm created this revision.Oct 5 2022, 3:05 PM
hassnaa-arm requested review of this revision.Oct 5 2022, 3:06 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 5 2022, 3:06 PM
Matt added a subscriber: Matt.Oct 5 2022, 7:55 PM

Hi @hassnaa-arm, could you rename the title to something that describes the patch a little more? I think something like

[AArch64][SVE]: Force the use of SVE to lower fixed-width arithmetic ops in streaming mode

would be a bit clearer. What do you think?

Hi @hassnaa-arm, it looks like this patch is based off D133433. Can you add that as a parent revision so it's obvious to the reviewer please? You can do this by clicking on "Edit Related Revisions" -> "Edit Parent Revisions" at the top-right corner of the page. Thanks!

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
4470

Hi @hassnaa-arm, this change doesn't look right. I would expect it to break some tests? When we're not in streaming mode we also want to override NEON for 64-bit element types. Can you put the OverrideNEON flag back in, perhaps something like

// If SVE is available then i64 vector multiplications can also be made legal.
bool OverrideNEON = VT == MVT::v2i64 || VT == MVT::v1i64 || Subtarget->forceSVEInStreamingMode();
llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-arith.ll
92

Again, this is illegal in streaming mode.

1588

This looks like a NEON instruction - can you investigate where this is coming from?

fix lowerMul to override NEON for v2i64 and v1i64 even if SVE is not forced

hassnaa-arm marked an inline comment as done.Oct 6 2022, 3:53 AM
hassnaa-arm retitled this revision from [AArch64-SVE]: force using SVE in streaming mode to [AArch64-SVE]: force using SVE in streaming mode to lower arithmetic and logical fixed-width vector ops..Oct 6 2022, 4:02 AM

Thanks for this @hassnaa-arm! I had some comments about how to tidy up the tests a bit. I also think some there are some load/store test changes that shouldn't be part of this patch.

llvm/test/CodeGen/AArch64/sve-fixed-length-masked-stores.ll
420 ↗(On Diff #465695)

nit: whitespace

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-ext-loads.ll
8 ↗(On Diff #465695)

I don't think these changes should be part of this patch, since it's not changing loads and stores?

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-arith.ll
13

Could you also add a test for an illegal NEON type too, i.e. <4 x i8> or <2 x i16>?

92

Please ignore this comment! stp q0, q1 is legal - my mistake!

473

Again, could you add at least one illegal type - <4 x i8> or <2 x i16>?

1410

Can you add an illegal NEON type such as <2 x i16>?

1588

Please ignore this - my mistake!

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-div.ll
299

This still has the vscale_range(16,0) attribute. Can you remove it and recreate the CHECK lines please?

1063

Again, this still has the vscale_range(16,0) attribute. Can you remove it and regenerate the CHECK lines?

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-mulh.ll
17

Can you add a test for an illegal type such as <4 x i8> too?

218

Wow, this code surely gets an award for being so impressively bad?!

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll
472

I think that you can remove the tests greater than 512 bits, i.e. <128 x i8>. If the tests already work for <64 x i8> they are likely to work for anything larger too.

774

Again, maybe remove this test since I'm not sure what extra value it gives us?

1608

Again, maybe remove this test?

1775

Again, maybe remove this test?

2240

Again, maybe remove this test?

2335

Again, maybe remove this test?

2652

Again, maybe remove this test?

2747

Again, maybe remove this test?

david-arm added inline comments.Oct 6 2022, 7:33 AM
llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll
3384

Again, maybe remove this test?

3686

Again, maybe remove this test?

4520

Again, maybe remove this test?

4687

Again, maybe remove this test?

5152

Again, maybe remove this test?

5247

Again, maybe remove this test?

5564

Again, maybe remove this test?

5659

Again, maybe remove this test?

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-loads.ll
2 ↗(On Diff #465695)

Not part of this patch?

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-masked-store.ll
2 ↗(On Diff #465695)

Not part of this patch?

revert changes related to load/store as they are not related to this patch

hassnaa-arm marked 20 inline comments as done.

add some illegal NEON types test cases and remove unnecessary tests

hassnaa-arm added inline comments.Oct 6 2022, 12:01 PM
llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-ext-loads.ll
8 ↗(On Diff #465695)

I'm sorry, it's by mistake.
I will correct it.

This looks a lot better now thanks @hassnaa-arm and the test files are much smaller too. I spotted two issues in a couple of tests where we are using illegal NEON instructions, e.g. bic. Would you be able to investigate these please?

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-div.ll
977

This is a NEON vector instruction - this is definitely illegal in streaming mode. Can you try to find out why this is being inserted please?

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-mulh.ll
1212

Again, these bic instructions are illegal in streaming mode.

Fix invalid bic instruction in streaming mode.
Fix invalid bic instruction that was generated during 'and' combining by converting the fixed-length vector to scalable one to combine SVEAnd instead of and.

Remove unrelated changes

Remove unrelated changes

Remove unrelated changes

Update by latest changes of parent patch

Update by changes of parent patch

Update by parent patch

sdesmalen added inline comments.Oct 18 2022, 5:32 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
15852

nit: In LLVM the style is to start local variables with an upper-case, i.e. ScalableLHS.

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-arith.ll
66–93

I think this test can be removed, because you've already covered the "twice as wide" case (32 x i8) which ensures we don't emit any other instructions not valid in streaming mode. The "four times as wide' should already be covered by sve-fixed-length-int-arith.ll.

151

This test can be removed for the same reason as mentioned above.

224

This test can be removed for the same reason as mentioned above.

297

This test can be removed for the same reason as mentioned above.

(same for all other 4 x as wide instances in the remainder of this file and other files in this patch)

hassnaa-arm marked 5 inline comments as done.Oct 18 2022, 7:34 AM

Remove not needed test cases

paulwalker-arm added inline comments.Oct 19 2022, 5:47 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
15849–15855

This looks odd. We shouldn't really be doing lower within DAGCombine. What happens if you just exit the combine for the invalid case?

That said, I can see functions like tryAdvSIMDModImm32() are used in other part of codegen so I'm wondering if the prevention logic is best place within such functions so all use cases are covered.

22578

When we hit a similar issue with LowerToPredicatedOp() we decide to drop the calls to useSVEForFixedLengthVectorVT() in favour or just using VT.isFixedLengthVector() && isTypeLegal(VT). Would the same work in your case?

hassnaa-arm added inline comments.Oct 19 2022, 8:33 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
22578

Sorry, I don't understand.
you mean dropping the call for useSVEForFixedLengthVectorVT(...) ?
or you mean using use SVEForFixedLengthVectorVT(VT) without passing the ovrrideNEON parameter ?

paulwalker-arm added inline comments.Oct 19 2022, 8:42 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
22578

The former, so you can drop the call to useSVEForFixedLengthVectorVT() and instead have assert(VT.isFixedLengthVector() && isTypeLegal(VT) && .... By this point we should be working with only legal types and there's no harm in handling any of them.

hassnaa-arm added inline comments.Oct 19 2022, 9:26 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
22578

why do you suggest that instead of using useSVEForFixedLengthVectorVT() ?
and why do you suggest it for LowerToPredicatedOp() only not also other lowering functions ?

paulwalker-arm added inline comments.Oct 19 2022, 9:47 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
22578

useSVEForFixedLengthVectorVT() is a semi-complex function that exists to choose which path to take during code generation and is thusly used to determine how to lower an ISD::ADD for example.

However, by calling LowerToPredicatedOp() you've already made that decision and so you only need to detect scenarios that would result in broken code generation. For the case of LowerToPredicatedOp() this just means ensuring the input is a legal fixed length vector.

hassnaa-arm marked 4 inline comments as done.Oct 20 2022, 4:26 AM

Remove step of converting to scalable vector that was added within DAGCombine (performAndCombine)

Update by latest changes of parent patch.

Update by parent patch

Update by parent patch

Update by parent patch

Hi @hassnaa-arm, I think this patch is very close to being ready! However, do you know why the test file sve-streaming-fixed-length-int-shifts.ll was deleted?

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
1396

I remember in one of your previous patches that @sdesmalen mentioned you shouldn't need to add v1i64 as it should be treated as a scalar. What happens if you remove it? I imagine your v1i64 tests might just generate scalar code?

Update by parent patch

Hi @hassnaa-arm, I think this patch is very close to being ready! However, do you know why the test file sve-streaming-fixed-length-int-shifts.ll was deleted?

It was a fault while rebasing the parent patch to this patch.
In the parent patch, that deleted file was replaced by sve-streaming-mode-fixed-length-int-shifts.ll

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
1396

Yes, at that patch, there were no tests needing custom-lowering for v1i64.

But in this patch, the test file of sve-streaming-mode-fixed-length-int-log.ll
has invalid instructions for the test cases of :

define <1 x i64> @and_v1i64(<1 x i64> %op1, <1 x i64> %op2)
define <1 x i64> @xor_v1i64(<1 x i64> %op1, <1 x i64> %op2)
hassnaa-arm marked an inline comment as done and an inline comment as not done.Nov 1 2022, 7:42 AM
paulwalker-arm accepted this revision.Nov 1 2022, 6:07 PM
paulwalker-arm added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
1396

I'll also add that in general you don't want to revert such integer types to scalar because that'll cause GPR-VPR transfers that can be expensive, perhaps even more so when it comes to streaming mode. You can also see that within LowerMUL we have special handling for v1i64 to keep it in the vector unit.

This revision is now accepted and ready to land.Nov 1 2022, 6:07 PM
sdesmalen added inline comments.Nov 2 2022, 1:38 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
1396

Does that mean we need coverage as well for v1i32? (and perhaps also v1i8, v1i16). If so, I wonder if that might warrant a separate patch rather than support the odd case here?

paulwalker-arm added inline comments.Nov 2 2022, 6:52 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
1396

Discussed offline but for completeness the answer is no. The other MVTs you list are not type legal (only 64/128-bit vectors are legal for NEON) and so they'll not make it into operation legalisation.

This revision was landed with ongoing or failed builds.Nov 10 2022, 4:38 AM
This revision was automatically updated to reflect the committed changes.