This patch converts the llvm.memset intrinsic into a Tail Predicated
(TP) hardware loop for targets that support the Arm M-profile
Vector Extension (MVE).
The conversion handles both constant and non-constant size
arguments to llvm.memset.
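
For readers unfamiliar with tail predication, here is a rough scalar
model in C of what a tail-predicated memset loop computes. It is only
a sketch: the name tp_memset_model and the LANES constant are made up
for illustration, it assumes 16 byte lanes per 128-bit MVE vector, and
it does not reproduce the instructions this patch emits.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: one 128-bit MVE vector holds 16 byte-sized lanes. */
enum { LANES = 16 };

/*
 * Scalar model of a tail-predicated memset (hypothetical helper, not
 * code from the patch). Each outer iteration stands for one vector
 * store. The predicate disables lanes past the remaining element
 * count, so no separate scalar tail loop is needed and the size does
 * not have to be a compile-time constant.
 */
static void tp_memset_model(uint8_t *dst, uint8_t value, size_t n)
{
    size_t remaining = n;

    while (remaining > 0) {
        /* Predicate: only the first `active` lanes are enabled. */
        size_t active = remaining < LANES ? remaining : LANES;

        /* Models a single predicated vector store of `value`. */
        for (size_t lane = 0; lane < active; ++lane)
            dst[lane] = value;

        dst += active;
        remaining -= active;
    }
}
```

With n = 35, the model runs three iterations: two with all 16 lanes
active and a final one with only 3 lanes active.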
Depends on D99723
.. for tail predicated loops.