This is an archive of the discontinued LLVM Phabricator instance.

[ARM,MVE] Add intrinsics for gather/scatter load/stores.
ClosedPublic

Authored by simon_tatham on Nov 4 2019, 1:50 AM.

Details

Summary

This patch adds two new families of intrinsics, both of which are
memory accesses taking a vector of locations to load from / store to.

The vldrq_gather_base / vstrq_scatter_base intrinsics take a vector of
base addresses, and an immediate offset to be added consistently to
each one. vldrq_gather_offset / vstrq_scatter_offset take a scalar
base address, and a vector of offsets to add to it. The
'shifted_offset' variants also multiply each offset by the element
size type, so that the vector is effectively of array indices.

At the IR level, these operations are represented by a single set of
four IR intrinsics: {gather,scatter} × {base,offset}. The other
details (signed/unsigned, shift, and memory element size as opposed to
vector element size) are all specified by IR intrinsic polymorphism
and immediate operands, because that made the selection job easier
than making a huge family of similarly named intrinsics.

I considered using the standard IR representations such as
llvm.masked.gather, but they're not a good fit. In order to use
llvm.masked.gather to represent a gather_offset load with element size
smaller than a pointer, you'd have to expand the <8 x i16> vector of
offsets into an <8 x i16*> vector of pointers, which would be split up
during legalization, so you'd spend most of your time undoing the mess
it had made. Also, ISel support for llvm.masked.gather would be easy
enough in a trivial way (you can expand it into a gather-base load
with a zero immediate offset), but instruction-selecting lots of
fiddly idioms back into all the _other_ MVE load instructions would be
much more work. So I think dedicated IR intrinsics are the more
sensible approach, at least for the moment.

On the clang tablegen side, I've added two new features to the
Tablegen source accepted by MveEmitter: a 'CopyKind' type node for
defining a type that varies with the parameter type (it lets you ask
for an unsigned integer type of the same width as the parameter), and
an 'unsignedflag' value node for passing an immediate IR operand which
is 0 for a signed integer type or 1 for an unsigned one. That lets me
write each kind of intrinsic just once and get all its subtypes and
immediate arguments generated automatically.

Also I've tweaked the handling of pointer-typed values in the code
generation part of MveEmitter: they're generated as Address rather
than Value (i.e. including an alignment) so that they can be given to
the ordinary IR load and store operations, but I'd omitted the code to
convert them back to Value when they're going to be used as an
argument to an IR intrinsic.

On the MC side, I've enhanced MVEVectorVTInfo so that it can tell you
not only the full assembly-language suffix for a given vector type
(like 's32' or 'u16') but also the numeric-only one used by store
instructions (just '32' or '16').

Diff Detail

Event Timeline

simon_tatham created this revision.Nov 4 2019, 1:50 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptNov 4 2019, 1:50 AM

Incorporated the Address→Value conversion system that was accidentally
in D69790 instead of this patch.

simon_tatham edited the summary of this revision. (Show Details)Nov 4 2019, 4:45 AM
dmgreen added inline comments.Nov 4 2019, 10:39 AM
clang/include/clang/Basic/arm_mve_defs.td
175

Should the t and u be s and k?

clang/test/CodeGen/arm-mve-intrinsics/scatter-gather.c
3

-DPOLYMORPHIC line

llvm/include/llvm/IR/IntrinsicsARM.td
815–819

This looks useful.

llvm/lib/Target/ARM/ARMInstrMVE.td
308

Very nice. Like it.

4655

Is UnsignedFlag used here, in the scatters?

simoll added a subscriber: simoll.Nov 4 2019, 11:00 AM
simon_tatham marked 3 inline comments as done.Nov 5 2019, 1:18 AM
simon_tatham added inline comments.
clang/include/clang/Basic/arm_mve_defs.td
175

Note to self: never change your mind about the variable names half way through writing a comment...

clang/test/CodeGen/arm-mve-intrinsics/scatter-gather.c
3

D'oh, that's what I get for copy-pasting the header from the one test file that doesn't have that extra line.

llvm/include/llvm/IR/IntrinsicsARM.td
815–819

Eventually we'll probably need another one alongside it, for the many intrinsics that add an inactive parameter as well as a predicate mask. But for the moment I'm just adding what I need to use right now.

Fixed variable names in comment for CopyKind; added -DPOLYMORPHIC test run (and it still passes); removed pointless use of UnsignedFlag.

Looks good as far as I can see.

clang/include/clang/Basic/arm_mve.td
93

Is it worth giving a name to VecOf<Unsigned<Scalar>>?

Added defs UScalar and UVector for the unsigned versions of Scalar and Vector. I agree they're likely to come in useful again (in the max/min absolute value intrinsics, for example), and also, they tidy up this batch quite a lot.

While I'm at it, reformatted all the scatter-gather tablegen to fit in 80 columns and be halfway legible. NFC.

dmgreen accepted this revision.Nov 5 2019, 9:38 AM

Nice. LGTM.

This revision is now accepted and ready to land.Nov 5 2019, 9:38 AM