[ARM] Transforming memcpy to Tail predicated Loop
This patch converts llvm.memcpy intrinsic into Tail Predicated
Hardware loops for a target that supports the Arm M-profile
Vector Extension (MVE).
From an implementation point of view, the patch
- adds an ARM specific SDAG Node (to which the llvm.memcpy intrinsic is lowered to, during first phase of ISel)
- adds a corresponding TableGen entry to generate a pseudo instruction, with a custom inserter, on matching the above node.
- Adds a custom inserter function that expands the pseudo instruction into MIR suitable to be (by later passes) into a WLSTP loop.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D99723