This adds some initial example IR intrinsics for MVE instructions that
deliver multiple output values, and hence have to be instruction-selected
by custom C++ code instead of TableGen patterns.
I've added the writeback gather load instructions (taking a vector of
base addresses and a single common offset, returning a vector of
loaded values and an updated vector of base addresses); one example
from the long shift family (taking and returning a 64-bit value in two
GPRs); and the VADC instruction (which propagates a carry bit from
each vector-lane addition to the next, taking an input carry flag in
FPSCR and outputting the final one in FPSCR as well).
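
To make the carry chain concrete, here is a minimal reference model of the
VADC behaviour described above. It is plain C++ and not part of the patch;
it assumes four 32-bit lanes in a 128-bit vector, and the names VadcResult
and vadc_model are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type: four 32-bit lanes plus the final carry flag.
struct VadcResult {
  std::array<uint32_t, 4> lanes;
  bool carry_out;
};

// Reference model only (not LLVM code): add lane by lane, feeding the
// carry out of each lane into the addition for the next lane.
VadcResult vadc_model(const std::array<uint32_t, 4> &a,
                      const std::array<uint32_t, 4> &b, bool carry_in) {
  VadcResult r{};
  bool carry = carry_in;
  for (std::size_t i = 0; i < a.size(); ++i) {
    uint64_t sum = uint64_t(a[i]) + uint64_t(b[i]) + (carry ? 1u : 0u);
    assert(sum <= 0x1FFFFFFFFull); // two 32-bit values plus carry fit in 33 bits
    r.lanes[i] = uint32_t(sum);    // low 32 bits become the lane result
    carry = (sum >> 32) != 0;      // bit 32 is the carry into the next lane
  }
  r.carry_out = carry; // this is the value that ends up in the FPSCR carry flag
  return r;
}
```

In effect the four lanes behave like one 128-bit addition, which is why the
instruction needs both a carry input and a carry output.
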
To support the VPT-predicated forms of these instructions, I've
written some helper functions to add the cluster of MVE predicate
operands to the end of a MachineInstr. AddMVEPredicateToOps is used
when the instruction actually is predicated (so it takes a predicate
mask argument), and AddEmptyMVEPredicateToOps is for when the
instruction is unpredicated (so it fills in $noreg for the mask). Each
comes in two forms: one suitable for vpred_n, and one for vpred_r,
which takes the extra 'inactive' parameter.
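
The shape of these four helpers can be sketched with a toy operand list.
This is not the LLVM code: the operand list is modelled as strings, the
function names are hypothetical, and two details are assumptions rather
than things stated above: that the cluster starts with a predication-code
operand ("then" or "none"), and that the unpredicated vpred_r form still
appends some placeholder for 'inactive'.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an instruction's operand list.
using Ops = std::vector<std::string>;

// Predicated, vpred_n: predication code (assumed), then the mask.
void addPredicate(Ops &ops, const std::string &mask) {
  ops.push_back("then");
  ops.push_back(mask);
}

// Predicated, vpred_r: as above, plus the extra 'inactive' operand.
void addPredicate(Ops &ops, const std::string &mask,
                  const std::string &inactive) {
  addPredicate(ops, mask);
  ops.push_back(inactive);
}

// Unpredicated, vpred_n: $noreg stands in for the mask.
void addEmptyPredicate(Ops &ops) {
  ops.push_back("none");
  ops.push_back("$noreg");
}

// Unpredicated, vpred_r: placeholder for 'inactive' (assumed).
void addEmptyPredicate(Ops &ops, const std::string &inactivePlaceholder) {
  addEmptyPredicate(ops);
  ops.push_back(inactivePlaceholder);
}
```

Either way the predicate operands always land at the end of the list, so the
selection code can build the instruction's own operands first and then call
whichever helper matches the intrinsic.
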
For VADC, the representation of the carry flag in the IR intrinsic is
a word intended to be moved directly to and from FPSCR_nzcvqc, i.e.
with the carry flag in bit 29 of the word. (The user-facing ACLE
intrinsic will want it to be in bit 0, but I'll do that on the clang
side.)
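
For reference, the conversion between the two carry layouts is a single
shift and mask. The sketch below only illustrates the arithmetic; the
function names are hypothetical, and clearing every bit other than the
carry is an assumption about what the clang side would do.

```cpp
#include <cassert>
#include <cstdint>

// Position of the carry flag in the word moved to/from FPSCR_nzcvqc.
constexpr unsigned kFpscrCarryBit = 29;

// User-facing layout (carry in bit 0) -> IR intrinsic layout (bit 29).
constexpr uint32_t carryToFpscrWord(uint32_t userCarry) {
  return (userCarry & 1u) << kFpscrCarryBit;
}

// IR intrinsic layout (bit 29) -> user-facing layout (carry in bit 0).
constexpr uint32_t carryFromFpscrWord(uint32_t fpscrWord) {
  return (fpscrWord >> kFpscrCarryBit) & 1u;
}

// Compile-time checks of the two directions.
static_assert(carryToFpscrWord(1u) == 0x20000000u, "carry lands in bit 29");
static_assert(carryFromFpscrWord(0x20000000u) == 1u, "carry read from bit 29");
```
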
Is the immediate restricted to the range ±128? If so, do we need to diagnose out-of-range values here, or does the front end already diagnose them?