This is an archive of the discontinued LLVM Phabricator instance.

[POC][SVE] Allow code generation for fixed length vectorised loops [Patch 1/2].
Abandoned · Public

Authored by paulwalker-arm on Dec 20 2019, 6:29 AM.

Details

Reviewers
rengolin
efriedma
Summary

No expectations for review at this stage unless you are super keen.

This is a proof of concept patch to show how SVE can be used to code generate fixed length vectors. It covers the minimum number of SVE instructions (ld1, st1, uzp, uunpklo) required to get most workloads to run.

The general idea is to make all fixed length vector types that fit within a user-specified size legal, and to custom lower all fixed length vector operations to scalable vector operations that use a suitably created predicate. After legalisation there should be no vector operations that operate on fixed length vectors beyond insert/extract_subvector and various extends and truncates. The reason for special-casing these operations is to maximise the ability of DAG combine to remove them. Those still around at isel get custom selected within ISelDAGToDAG.
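
As a rough illustration of the idea above (not code from the patch), the following standalone C++ sketch models a fixed length add carried out on a wider "scalable" register under a predicate that enables only the leading lanes. The names `ScalableReg`, `fitsInRegister`, `makePredicate` and `predicatedAdd` are hypothetical, and the merging behaviour for inactive lanes is an assumption chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model, not LLVM code: a scalable register is a run of 32-bit
// lanes whose count depends on the hardware vector length.
struct ScalableReg {
    std::vector<int32_t> lanes;
};

// A fixed length type is treated as legal here when it fits within the
// user-specified register size (in bits).
bool fitsInRegister(std::size_t numElts, std::size_t eltBits, std::size_t maxBits) {
    return numElts * eltBits <= maxBits;
}

// Build a predicate with only the first numElts lanes active.
std::vector<bool> makePredicate(std::size_t totalLanes, std::size_t numElts) {
    std::vector<bool> pred(totalLanes, false);
    for (std::size_t i = 0; i < numElts && i < totalLanes; ++i) {
        pred[i] = true;
    }
    return pred;
}

// Predicated add: active lanes receive a + b, inactive lanes keep a's value.
ScalableReg predicatedAdd(const std::vector<bool>& pred,
                          const ScalableReg& a,
                          const ScalableReg& b) {
    assert(pred.size() == a.lanes.size() && a.lanes.size() == b.lanes.size());
    ScalableReg out = a;
    for (std::size_t i = 0; i < pred.size(); ++i) {
        if (pred[i]) {
            out.lanes[i] = a.lanes[i] + b.lanes[i];
        }
    }
    return out;
}
```

In this model, a <4 x i32> add on a 256-bit register would use `makePredicate(8, 4)`, so only the low four lanes are written by the operation.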

Event Timeline

paulwalker-arm created this revision. Dec 20 2019, 6:29 AM
Herald added a reviewer: efriedma.
Herald added a project: Restricted Project.
paulwalker-arm edited the summary of this revision. Dec 20 2019, 6:31 AM

I've spent a little time considering alternatives. There are basically a few possibilities here:

  1. Say the types aren't legal, and convert the types as part of type legalization.
  2. Say the types are legal, but the operations aren't, and custom-legalize (or teach LegalizeDAG to legalize) all the operations.
  3. Say the types and patterns are legal, and add a bunch of extra patterns to match the instructions.

All of these options have rough edges, but type legalization seems more natural to me. That would let you rewrite all the types, so you aren't stuck with "fake" types for values that are live across basic blocks. It would also let you take more advantage of existing type legalization infrastructure: you don't have to have a custom legalization handler for every SVE operation, and you probably get better handling of cases where the fixed->scalable transform still leaves some illegal types. And the code would be naturally shared with other targets.

You haven't really demonstrated how you plan to handle i1 vectors. In particular, <8 x i1> might be <vscale x 8 x i1> or <vscale x 4 x i1> depending on how it's generated/used. You can probably make it work with your current approach; I guess you end up inserting predicate pack/unpack operations in Select(), and we should be able to eliminate most of the redundant operations? Not completely sure how that works out. (If an i1 vector is live across basic blocks, the generated code gets really messy, but that's not really specific to SVE.)
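
To make the ambiguity concrete, here is a small standalone sketch (not LLVM code). It assumes the usual SVE layout in which a predicate register holds one bit per byte of vector data, so the scalable predicate type paired with a data vector depends on the data element width. The helper name `predicateMinLanes` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Minimum SVE register size in bits; real registers are vscale multiples of it.
const std::size_t kGranuleBits = 128;

// Returns N such that <vscale x N x i1> is the predicate type that pairs with
// data elements of the given width (an assumption of this sketch).
std::size_t predicateMinLanes(std::size_t eltBits) {
    assert(eltBits == 8 || eltBits == 16 || eltBits == 32 || eltBits == 64);
    return kGranuleBits / eltBits;
}
```

Under this assumption, an <8 x i1> produced by comparing <8 x i16> values would pair with <vscale x 8 x i1> (`predicateMinLanes(16)`), while one produced by comparing <8 x i32> values would pair with <vscale x 4 x i1> (`predicateMinLanes(32)`), which is why moving between the two would need pack/unpack operations.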

llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
Line 3139 (On Diff #234877)

If we're going to use this approach, we probably want to custom legalize TRUNCATE and ANY_EXTEND, and leave only the INSERT_SUBVECTOR/EXTRACT_SUBVECTOR operations to be handled in Select().

fhahn added a subscriber: fhahn. Dec 20 2019, 1:17 PM
vkmr added a subscriber: vkmr. Apr 2 2020, 7:17 AM
dancgr added a subscriber: dancgr. May 28 2020, 10:39 AM
paulwalker-arm removed reviewers: rengolin, efriedma.
paulwalker-arm added a subscriber: efriedma.

Rebasing to reflect that the majority of the functionality is now in master. What remains is likely to be abandoned in favour of function attributes, but it's here for those who want to experiment.

Herald added a reviewer: efriedma.
Herald added a project: Restricted Project.
paulwalker-arm planned changes to this revision. Jul 13 2020, 10:18 AM
paulwalker-arm planned changes to this revision. Jul 20 2020, 6:10 AM
paulwalker-arm abandoned this revision. Sep 2 2020, 4:20 AM

The intention of this patch is now complete. All work is available in master with the exception of the hook into -msve-vector-bits, which is not necessarily the direction we'll use once function attributes are available.