This patch adds a pass called "MVE VPT Optimisations", which does a few optimisations before register allocation.
The goal of this pass is to maximize the size of the VPT blocks created by the MVE VPT Block Insertion pass.
Currently, this pass:
- Replaces VCMPs with VPNOTs when possible.
  - The instruction selector currently rarely generates VPNOTs. Instead, it generates a VCMP with the operands swapped and the condition reversed. This pass spots those VCMPs and transforms them into VPNOTs.
  - Why generate more VPNOTs? So that the MVE VPT Block Insertion pass can use them (and remove them) to create larger, more complex VPT blocks (e.g. TEET, TETE, etc.).
- Replaces uses of old VPR values with VPNOTs when inside a block of predicated instructions.
  - This avoids overlapping lifetimes of different VPR values, reducing the chance that a spill/reload occurs.
  - Why? Spills/reloads of VPR are particularly harmful to the MVE VPT Block Insertion pass: they prevent it from creating large VPT blocks.
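
To illustrate the first transformation, here is a minimal, self-contained sketch of how one might recognise a VCMP that computes the complement of an earlier VCMP. It is a toy model, not the code in this patch: the `Cond` enum, the `VCmp` struct, and the helpers `opposite`, `swapOperands`, and `isComplement` are all hypothetical names, and operands are modelled as plain integer ids rather than machine operands.

```cpp
// Toy model only: none of these types or functions are LLVM's.
enum class Cond { EQ, NE, LT, GE, GT, LE };

// A vector compare "lhs <cc> rhs"; operands are modelled as integer ids.
struct VCmp {
  int lhs;
  int rhs;
  Cond cc;
};

// Logical negation of a condition: !(a < b) is (a >= b), and so on.
inline Cond opposite(Cond c) {
  switch (c) {
  case Cond::EQ: return Cond::NE;
  case Cond::NE: return Cond::EQ;
  case Cond::LT: return Cond::GE;
  case Cond::GE: return Cond::LT;
  case Cond::GT: return Cond::LE;
  case Cond::LE: return Cond::GT;
  }
  return c; // unreachable for valid enum values
}

// Condition that gives the same result once the operands are exchanged:
// (a < b) is the same predicate as (b > a).
inline Cond swapOperands(Cond c) {
  switch (c) {
  case Cond::LT: return Cond::GT;
  case Cond::GT: return Cond::LT;
  case Cond::GE: return Cond::LE;
  case Cond::LE: return Cond::GE;
  default:       return c; // EQ and NE are symmetric
  }
}

// True if `cur` yields the bitwise complement of the predicate produced
// by `prev`, so `cur` could in principle be replaced by a NOT of `prev`.
inline bool isComplement(const VCmp &prev, const VCmp &cur) {
  const Cond negated = opposite(prev.cc);
  // Same operand order, negated condition.
  if (cur.lhs == prev.lhs && cur.rhs == prev.rhs && cur.cc == negated)
    return true;
  // Operands exchanged, negated condition rewritten for the new order.
  return cur.lhs == prev.rhs && cur.rhs == prev.lhs &&
         cur.cc == swapOperands(negated);
}
```

For example, with `prev` being `a < b`, both `a >= b` and `b <= a` are recognised as complements, while `b > a` is not, because it is the same predicate as `prev`. A real pass would presumably also have to confirm that the earlier compare's result is still the value held in VPR at the point of replacement; this sketch does not model that.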
Perhaps put this into the getOptLevel() != CodeGenOpt::None block below, since it is an optimisation?