This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SVE] Custom lowering of floating-point reductions
ClosedPublic

Authored by c-rhodes on Apr 23 2020, 9:04 AM.

Details

Summary

This patch implements custom floating-point reduction ISD nodes that
have vector results, which are used to lower the following intrinsics:

  • llvm.aarch64.sve.fadda
  • llvm.aarch64.sve.faddv
  • llvm.aarch64.sve.fmaxv
  • llvm.aarch64.sve.fmaxnmv
  • llvm.aarch64.sve.fminv
  • llvm.aarch64.sve.fminnmv

SVE reduction instructions keep their result within a vector register,
with all other bits set to zero.

Changes in this patch were implemented by Paul Walker and Sander de
Smalen.

Diff Detail

Event Timeline

c-rhodes created this revision.Apr 23 2020, 9:04 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 23 2020, 9:04 AM
c-rhodes updated this revision to Diff 259845.Apr 24 2020, 3:12 AM

Add context.

sdesmalen added inline comments.Apr 24 2020, 10:37 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
8417

getSizeInBits().getKnownMinSize()

8418

Should this return SDValue() because VT is not yet legal?

8421

getVectorNumElements will issue a warning for scalable vectors, use getElementCount().Min

11298

nit: this is only used once on line 11309, you can inline the variable.

11315

same here.

llvm/test/CodeGen/AArch64/sve-intrinsics-fp-reduce.ll
1

Why is this change required?

c-rhodes updated this revision to Diff 260272.Apr 27 2020, 4:02 AM
  • Address Sander's comments
c-rhodes marked 5 inline comments as done.Apr 27 2020, 4:04 AM
c-rhodes added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
8418

if VT weren't legal it should be caught by the check above and SDValue() returned.

llvm/test/CodeGen/AArch64/sve-intrinsics-fp-reduce.ll
1

To remove // kill: def $d0 killed $d0 killed $z0 from the output.

Can you split the extractelement support into a separate patch?

llvm/include/llvm/IR/IntrinsicsAArch64.td
904–905

Please commit this separately.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
893

Is there some reason to mark EXTRACT_VECTOR_ELT "Custom" when the type isn't legal?

8430

Is there some reason we should do this transform as part of legalization, as opposed to just writing a few isel patterns?

llvm/utils/TableGen/CodeGenDAGPatterns.cpp
628

Do you mean <vscale x m x i32>?

631

Please don't use getVectorNumElements() on scalable vectors.

c-rhodes added inline comments.Apr 28 2020, 7:51 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
893

In the context of this patch no, but I guess it will make sense when legalizing vector_extract_elt. I've changed this to be legal only for legal FP types now that ISEL patterns are being used as you suggested.

8430

No good reason that I'm aware of, I've replaced this with the following patterns:

def : Pat<(vector_extract (nxv8f16 ZPR:$Zs), (i64 0)),
          (f16 (EXTRACT_SUBREG (v8f16 (EXTRACT_SUBREG ZPR:$Zs, zsub)), hsub))>;
def : Pat<(vector_extract (nxv4f32 ZPR:$Zs), (i64 0)),
          (f32 (EXTRACT_SUBREG (v4f32 (EXTRACT_SUBREG ZPR:$Zs, zsub)), ssub))>;
def : Pat<(vector_extract (nxv2f64 ZPR:$Zs), (i64 0)),
          (f64 (EXTRACT_SUBREG (v2f64 (EXTRACT_SUBREG ZPR:$Zs, zsub)), dsub))>;
llvm/utils/TableGen/CodeGenDAGPatterns.cpp
628

This change is no longer needed since removing these patterns:

def : Pat<(v8f16 (extract_subvector ZPR:$Zs, (i64 0))),
          (EXTRACT_SUBREG ZPR:$Zs, zsub)>;
def : Pat<(v4f32 (extract_subvector ZPR:$Zs, (i64 0))),
          (EXTRACT_SUBREG ZPR:$Zs, zsub)>;
def : Pat<(v2f64 (extract_subvector ZPR:$Zs, (i64 0))),
          (EXTRACT_SUBREG ZPR:$Zs, zsub)>;

which were no longer needed after replacing the LowerEXTRACT_VECTOR_ELT with ISEL patterns as you suggested.

c-rhodes updated this revision to Diff 260634.Apr 28 2020, 7:53 AM

Changes:

  • Replace LowerEXTRACT_VECTOR_ELT changes with ISEL patterns.
  • Remove scalable vector -> fixed-width extract_subvector patterns which are no longer needed with above patterns.
  • Remove TableGen change which is no longer needed after removing the above patterns.
c-rhodes marked an inline comment as done.
c-rhodes added inline comments.
llvm/include/llvm/IR/IntrinsicsAArch64.td
904–905

Moved to D79010.

sdesmalen added inline comments.Apr 28 2020, 1:43 PM
llvm/lib/Target/AArch64/AArch64InstrFormats.td
8439–8440

Given the change to TableGen is no longer needed for this patch, I guess the changes to these patterns are no longer needed either?

We'll still need the TableGen changes and the`extract_subvector` patterns at a later point, but I don't think there's currently a way to test that yet (unless we allow using shufflevector to extract a fixed-width vector from a scalable vector).

efriedma added inline comments.Apr 28 2020, 3:18 PM
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
336

Maybe worth explaining why you need two EXTRACT_SUBREG, as opposed to just one.

It would be nice to handle non-zero indexes, but I guess that can wait for a followup.

c-rhodes updated this revision to Diff 260892.Apr 29 2020, 5:45 AM

Changes:

  • Removed patterns from AArch64InstrInfo.td and AArch64InstrFormats.td that are no longer used.
  • Added comment clarifying FP reduction patterns.
  • Moved remaining IntrinsicsAArch64.td intrinsics to D79010.
c-rhodes marked 2 inline comments as done.Apr 29 2020, 5:47 AM
efriedma accepted this revision.Apr 29 2020, 1:23 PM

LGTM with one minor comment.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
897

I think "Legal" is the default, so this line doesn't do anything; .

This revision is now accepted and ready to land.Apr 29 2020, 1:23 PM