[ARM] Introduce t2DoLoopStartTP

Authored by dmgreen on Nov 10 2020, 10:08 AM.


[ARM] Introduce t2DoLoopStartTP

This introduces a new pseudo instruction, almost identical to a
t2DoLoopStart but taking 2 parameters - the original loop iteration
count needed for a low overhead loop, plus the VCTP element count needed
for a DLSTP instruction setting up a tail predicated loop. The idea is
that the instruction holds both values and the backend
ARMLowOverheadLoops pass can pick between the two, depending on whether
it creates a tail predicated loop or falls back to a low overhead loop.

To do that there needs to be something that converts a t2DoLoopStart to
a t2DoLoopStartTP, for which this patch repurposes the
MVEVPTOptimisationsPass as a "tail predication and vpt optimisation"
pass. The extra operand for the t2DoLoopStartTP is chosen based on the
operands of VCTP's in the loop, and the instruction is moved as late in
the block as possible to attempt to increase the likelihood of making
tail predicated loops.

Differential Revision: https://reviews.llvm.org/D90591


dmgreenNov 10 2020, 10:08 AM
Differential Revision
D90591: [ARM] Introduce t2DoLoopStartTP
rG02af11094fe4: [libc++] NFC: Add helper methods to simplify __shared_ptr_emplace