This is an archive of the discontinued LLVM Phabricator instance.

[ARM,MVE] Add intrinsics for contiguous load/stores.
Closed · Public

Authored by simon_tatham on Nov 11 2019, 9:18 AM.

Details

Summary

This patch adds the ACLE intrinsics for all the MVE load and store
instructions not already handled by D69791. These ones don't need new
IR intrinsics, because they can be implemented in terms of standard
LLVM IR constructions.

Some of the load and store instructions access less than 128 bits of
memory, sign/zero extending each value to a wider vector lane on load
or truncating it on store. These are represented in IR by a load of a
shorter vector followed by a zext/sext, and conversely, a trunc
followed by a short store. Existing ISel patterns already recognize
those combinations and turn them into the right MVE instructions.
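
As a rough illustration of that representation, here is a scalar C++ model of a widening load (short load, then sext) and a narrowing store (trunc, then short store). It is only a sketch: the type and function names are hypothetical, they are not the ACLE intrinsics themselves, and the 8-bit to 16-bit lane widths are just one assumed case.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical scalar model; these names are illustrative, not ACLE intrinsics.
using I16x8 = std::array<std::int16_t, 8>;  // a full 128-bit vector

// Models: %v = load <8 x i8>  followed by  %w = sext <8 x i8> %v to <8 x i16>
inline I16x8 widening_load_s8_to_s16(const std::int8_t *base) {
    std::array<std::int8_t, 8> narrow;                 // only 64 bits of memory
    std::memcpy(narrow.data(), base, narrow.size());   // the short load
    I16x8 wide;
    for (std::size_t i = 0; i < wide.size(); ++i)
        wide[i] = static_cast<std::int16_t>(narrow[i]);  // sign-extend each lane
    return wide;
}

// Models: %t = trunc <8 x i16> %w to <8 x i8>  followed by  store <8 x i8> %t
inline void narrowing_store_s16_to_s8(std::int8_t *base, const I16x8 &wide) {
    std::array<std::uint8_t, 8> narrow;
    for (std::size_t i = 0; i < narrow.size(); ++i)
        narrow[i] = static_cast<std::uint8_t>(wide[i]);  // keep the low 8 bits
    std::memcpy(base, narrow.data(), narrow.size());     // the short store
}

// Small usage check: a round trip preserves values that fit in 8 bits.
inline void widening_round_trip_example() {
    const std::int8_t src[8] = {-1, 2, -128, 127, 0, 5, -6, 7};
    std::int8_t dst[8] = {};
    narrowing_store_s16_to_s8(dst, widening_load_s8_to_s16(src));
    assert(std::memcmp(src, dst, sizeof(src)) == 0);
}
```

The point of the model is that neither direction needs a dedicated IR intrinsic: the extension or truncation is an ordinary per-lane conversion next to an ordinary load or store of the shorter vector.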

The predicated forms of all these instructions are represented in the
same way, except that the ordinary load/store operation is replaced
with the existing intrinsics @llvm.masked.{load,store}. These are
currently only code-generated as predicated MVE load/store
instructions if you give LLVM the -enable-arm-maskedldst option; so
I've done that in the LLVM codegen test. When we make that the
default, that option can be removed.
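
For readers unfamiliar with the masked intrinsics, the following scalar C++ sketch models the per-lane behaviour of @llvm.masked.load and @llvm.masked.store on a four-lane vector. The names are hypothetical, the predicate is modelled as one bool per lane, and the all-zero passthru in the usage check is just an assumed choice by the caller.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of the masked load/store semantics.
using I32x4 = std::array<std::int32_t, 4>;
using Mask4 = std::array<bool, 4>;  // one predicate bit per lane (assumption)

// Enabled lanes are read from memory; disabled lanes take the passthru value
// and their memory is never touched.
inline I32x4 masked_load(const std::int32_t *base, const Mask4 &mask,
                         const I32x4 &passthru) {
    I32x4 result = passthru;
    for (std::size_t i = 0; i < result.size(); ++i)
        if (mask[i])
            result[i] = base[i];
    return result;
}

// Only enabled lanes are written; memory behind disabled lanes is unchanged.
inline void masked_store(std::int32_t *base, const I32x4 &value,
                         const Mask4 &mask) {
    for (std::size_t i = 0; i < value.size(); ++i)
        if (mask[i])
            base[i] = value[i];
}

// Small usage check with an assumed all-zero passthru.
inline void masked_example() {
    const std::int32_t src[4] = {10, 20, 30, 40};
    const I32x4 r = masked_load(src, Mask4{true, false, true, false},
                                I32x4{0, 0, 0, 0});
    assert(r[0] == 10 && r[1] == 0 && r[2] == 30 && r[3] == 0);
}
```

A predicated widening load then composes the two models: a masked load of the shorter vector followed by the same per-lane extension as in the unpredicated case.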

In the Tablegen backend, I've had to add a handful of extra support
features:

  • We need to be able to make clang::Address objects out of a pointer and an alignment (previously we only needed these when the user passed us an existing one).
  • We can now specify vector types that aren't 128 bits wide (for use in those intermediate values in IR). The parametrized type system can make one starting from two existing vector types, using the lane count of one and the element type of the other.
  • I've added support for code generation of pointer casts, and for specifying LLVM types as operands to IRBuilder operations (for zext and sext, though I think they'll come in useful again).
  • Not every IR construction operation now needs to be specified as Builder.CreateFoo: some don't involve a Builder at all, and one passes the Builder as a parameter to a tiny static helper function in CGBuiltin.cpp.
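
The second bullet's type combinator can be pictured with a small C++ analogy. This is not the Tablegen code from the patch, and the alias name MixedVector is hypothetical; it only shows the idea of taking the lane count from one vector type and the element type from another to obtain a vector that is not 128 bits wide.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <type_traits>

// Hypothetical analogy: build a vector type from the lane count of LanesFrom
// and the element type of ElemFrom.
template <typename LanesFrom, typename ElemFrom>
using MixedVector = std::array<typename ElemFrom::value_type,
                               std::tuple_size<LanesFrom>::value>;

using I16x8 = std::array<std::int16_t, 8>;   // 128-bit vector, 8 lanes
using I8x16 = std::array<std::int8_t, 16>;   // 128-bit vector, 16 lanes

// 8 lanes (from I16x8) of int8_t (from I8x16): a 64-bit intermediate vector.
using I8x8 = MixedVector<I16x8, I8x16>;

static_assert(std::is_same<I8x8, std::array<std::int8_t, 8>>::value,
              "lane count comes from the first type, element type from the second");

// Small usage check: the derived type has 8 lanes.
inline void mixed_vector_example() {
    assert(std::tuple_size<I8x8>::value == 8);
}
```

In this analogy, I8x8 plays the role of the intermediate value that a widening load extends into an I16x8.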

Event Timeline

simon_tatham created this revision. Nov 11 2019, 9:18 AM
Herald added projects: Restricted Project, Restricted Project. Nov 11 2019, 9:18 AM

Minor revision to the Tablegen changes to support different kinds of IR construction function: now the differing function-call prefixes are set up in arm_mve_defs.td, instead of in MveEmitter.cpp. That fits better with further changes I'm making in that area.

Very nice

Just to check, we don't have to care about big endian here? It just works OK because the rest of LLVM handles it OK? (I'm not sure if a vld1 is different to a vldr in big endian, for example).

Yes, vld1 has the same semantics as vldrw_*32 or vldrh_*16 or vldrb_*8. It's just a convenience alias that makes polymorphism easier – if I remember rightly the intended use case was people writing MVE intrinsics inside C++ templates.

dmgreen accepted this revision. Nov 13 2019, 3:58 AM

OK. vldr and vld1 work differently for Neon under BE, if I'm remembering correctly.

LGTM then.

clang/utils/TableGen/MveEmitter.cpp
475

Maybe update this comment?

This revision is now accepted and ready to land. Nov 13 2019, 3:58 AM
This revision was automatically updated to reflect the committed changes.