This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][CodeGen] Always use SVE (when enabled) to lower 64-bit vector multiplies
ClosedPublic

Authored by david-arm on Feb 2 2022, 8:28 AM.

Details

Summary

This patch adds custom lowering support for ISD::MUL with v1i64 and v2i64
types when SVE is enabled, regardless of the minimum SVE vector length. We
do this because NEON simply does not have 64-bit vector multiplies, so we
want to take advantage of these instructions in SVE.

I've updated the 128-bit min SVE vector bits tests here:

CodeGen/AArch64/sve-fixed-length-int-arith.ll
CodeGen/AArch64/sve-fixed-length-int-mulh.ll
CodeGen/AArch64/sve-fixed-length-int-rem.ll

Diff Detail

Event Timeline

david-arm created this revision.Feb 2 2022, 8:28 AM
david-arm requested review of this revision.Feb 2 2022, 8:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 2 2022, 8:28 AM
sdesmalen added inline comments.Feb 3 2022, 2:00 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3951–3952

I find this quite confusing, because I would expect useSVEForFixedLengthVectorVT to return true when OverrideNEON is specified and SVE is enabled.

Personally I'd rather see you reintroducing the enum like you originally added in https://reviews.llvm.org/D117871?id=401931#inline-1127609 so that we can progressively migrate away from the OnlyIfSVEGreaterThan128Bits and replace this with Always, at which point we can do away with the function altogether. But at least it will be easier to search for cases to fix.

@paulwalker-arm do you have any strong opinions here?

llvm/test/CodeGen/AArch64/sve-fixed-length-int-mulh.ll
214

These check lines seem unnecessary because the output is the same. I wonder if they can be removed, or otherwise have a CHECK-NEON prefix, where the first RUN lines has --check-prefixes=CHECK,CHECK-NEON and the latter has --check-prefixes=CHECK-NEON,VBITS_EQ_128 ?

paulwalker-arm added inline comments.Feb 3 2022, 5:22 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3951–3952

I'm still against having such an enum. Originally useSVEForFixedLengthVectorVT served the specific purpose of returning true when a VT required SVE, or rather SVE with a known minimum length, in order to be considered legal. It gets used during operation legalisation as a way to categorise types because isScalableVector()/isFixedLengthVector() is not sufficient and listing the VTs explicitly doesn't work because of the variable nature of the minimum SVE vector length.

The addition of OverrideNEON has always been a bit of a hack, because it gives the impression useSVEForFixedLengthVectorVT's result depends on some underlying operation, which is does not. I accepted this because it did provide a little more protection during initial bring up. After D117871 that protection has gone, which is fine because it had outlived its usefulness. To my mind this now means useSVEForFixedLengthVectorVT no longer has any need for its OverrideNEON parameter and thus it can be completely removed.

That said, useSVEForFixedLengthVectorVT is convenient because as you say, it makes the code easier to read and easier to search for the places where SVE is used for fixed length vectors. So my counter proposal is to keep OverrideNEON as a bool but change how it's used within useSVEForFixedLengthVectorVT to match https://reviews.llvm.org/D117871?id=401931#inline-1127609 (i.e. so it's checked before the Subtarget->useSVEForFixedLengthVectors ban hammer. The extra restriction I'm going to impose is that the setOperationAction calls should be protected by the same check used when lowering the operations.

If this is agreeable, please can you make the useSVEForFixedLengthVectors change in isolation as an NFC patch? Note that you'll need to make a similar change to LowerToScalableOp as you did for LowerToPredicatedOp, which is fine given the functions have very similar jobs and then there's isVectorLoadExtDesirable which is going to get a little messier as it's now the only case that only wants to override NEON when wide vectors are enabled. We can probably clean this up later if we decide to use SVE for NEON loads and stores as well.

paulwalker-arm added inline comments.Feb 3 2022, 5:42 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3951–3952

I fear I may have missed a key piece of information regarding my NFC patch suggestion. The part of the useSVEForFixedLengthVectorVT functionality you want to change over time is the way the NEON override is controlled via a runtime switch. So my suggestion is to mark those places explicitly using useSVEForFixedLengthVectors. That way, over time we'll just remove those calls.

david-arm added inline comments.Feb 3 2022, 8:05 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3951–3952

Hi @paulwalker-arm, sorry I've read your comments but I'll be honest I'm still not really sure what you're asking me to do here? Do you still want me to create another patch that changes useSVEForFixedLengthVectorVT to always use SVE if OverrideNEON is true? That will be a large patch because it causes lots of test failures.

Or are you asking me to simply use useSVEForFixedLengthVectors instead of useSVEForFixedLengthVectorVT in this patch?

david-arm added inline comments.Feb 3 2022, 8:12 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3951–3952

Ah, maybe what you mean is remove the call to useSVEForFixedLengthVectors from useSVEForFixedLengthVectorVT, then add it explicitly in the place where we call useSVEForFixedLengthVectorVT, i.e.

Transform:

if (ST->useSVEForFixedLengthVectorVT())
  LowerToPredicatedOp();

to

if (ST->useSVEForFixedLengthVectors() &&  ST->useSVEForFixedLengthVectorVT())
  LowerToPredicatedOp();
david-arm updated this revision to Diff 405676.Feb 3 2022, 9:07 AM
paulwalker-arm added inline comments.Feb 3 2022, 5:23 PM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3951–3952

Sorry for the confusion Dave. I've created D118957 to show what I meant. If this works for you then feel free to either take the changes into D118917 or just rebase onto D118957 and I'll land my patch once accepted.

david-arm updated this revision to Diff 405885.Feb 4 2022, 12:58 AM
paulwalker-arm accepted this revision.Feb 7 2022, 9:43 AM

For sure the test CHECK line structure will need to be revisited but that's likely better done once some of the interdependencies like MULH are finished and we can be sure the code quality for >=128bit vectors is consistently good across all target vector lengths.

This revision is now accepted and ready to land.Feb 7 2022, 9:43 AM
This revision was landed with ongoing or failed builds.Feb 8 2022, 7:38 AM
This revision was automatically updated to reflect the committed changes.