This is an archive of the discontinued LLVM Phabricator instance.

[SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics
ClosedPublic

Authored by kmclaughlin on Apr 21 2020, 9:03 AM.

Details

Summary

This patch maps IR operations for sdiv & udiv to the
@llvm.aarch64.sve.[s|u]div intrinsics.

A ptrue must be created during lowering as the div instructions
have only a predicated form.

Patch contains changes by Andrzej Warzynski.

Diff Detail

Event Timeline

kmclaughlin created this revision.Apr 21 2020, 9:03 AM
Herald added a project: Restricted Project. · View Herald Transcript

I'd prefer to handle legalization in a separate patch from handling legal sdiv/udiv operations, so we actually have some context to discuss the legalization strategy.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
7670

If we're going to support these operations, we might as well just add isel patterns; that's what we've been doing for other arithmetic operations.

sdesmalen added inline comments.Apr 21 2020, 12:44 PM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
7670

Just to provide a bit of context to this approach:

For unpredicated ISD nodes for which there is no predicated instruction, the predicate needs to be generated. For scalable vectors this will be a ptrue all, but for fixed-width vectors may take some other predicate such as VL8 for fixed 8 elements.

Rather than creating new predicated AArch64 ISD nodes for each operation such as AArch64ISD::UDIV_PRED, the idea is to reuse the intrinsic layer we already added to support the ACLE - which are predicated and for which we already have the patterns - and map directly onto those.

By doing the expansion in ISelLowering, the patterns stay simple and we can generalise getPtrue method so that it generates the right predicate for any scalable/fixed vector size as done in D71760 avoiding the need to write multiple patterns for different vector lengths.

This patch was meant as the proof of concept of that idea (as discussed in the sync-up call of Apr 2).

llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll
28

This test should use CHECK-DAG instead of CHECK-NEXT, as the sdiv instructions are independent. (same for some of the other tests)

efriedma added inline comments.Apr 21 2020, 1:37 PM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
7670

Using INTRINSIC_WO_CHAIN is a little annoying; it's hard to read in DAG dumps, and it gives weird error messages if we fail in selection. But there aren't really any other immediate downsides I can think of, vs. doing it the other way (converting the intrinsic to AArch64ISD::UDIV_PRED).

Long-term, we're going to have a target-independent ISD::UDIV_PRED. We probably want to start using those nodes at some point, to get target-independent optimizations. Not sure if that impacts what we want to do right now.

sdesmalen added inline comments.Apr 23 2020, 3:32 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
7670

I agree that using INTRINSIC_WO_CHAIN will be a bit annoying for more complicated patterns. The reuse of the intrinsics was merely for convenience because we already have the patterns, and was not a critical part of the design. It shouldn't be a big effort to create AArch64ISD nodes and use these for the intrinsics as well.

If we use AArch64-specific nodes we can implement what's needed now for SVE codegen and I expect we can easily migrate to target-independent nodes when they get added.

  • Removed changes to handle legalisation from this patch (this will be included in a follow up)
  • Added AArch64ISD nodes for SDIV_PRED & UDIV_PRED
  • Changed LowerDIV to use the new ISD nodes rather than lowering to SVE intrinsics
  • Update tests to use CHECK-DAG
efriedma accepted this revision.Apr 23 2020, 12:35 PM

LGTM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
11354

Whitespace.

This revision is now accepted and ready to land.Apr 23 2020, 12:35 PM
This revision was automatically updated to reflect the committed changes.
kmclaughlin marked an inline comment as done.

Thank you both for your comments on this patch, @efriedma & @sdesmalen!