Backend changes to enable WLS/LE low-overhead loops for Armv8.1-M:
- Use TTI to communicate to the HardwareLoop pass that we should try to generate intrinsics that guard the loop entry, as well as setting the loop trip count.
- Lower the BRCOND that uses said intrinsic to an Arm-specific node: ARMWLS.
- In ISelDAGToDAG, select the node to a new pseudo instruction: t2WhileLoopStart.
- Add support in ArmLowOverheadLoops to handle the new pseudo instruction.
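
For readers unfamiliar with the loop form being targeted, here is a minimal sketch of the control flow that a WLS/LE pair provides: the entry is guarded on the trip count, and the loop end decrements the count and branches back. This is not code from the patch. The function name `sumWithWhileLoopStart` and the variable `lr` (standing in for the link register that holds the remaining iteration count) are hypothetical and exist only for illustration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model, not LLVM code: mimics the control flow of a
// guarded low-overhead loop. `lr` stands in for the register that
// holds the number of iterations still to run.
int64_t sumWithWhileLoopStart(const std::vector<int32_t> &data) {
  int64_t sum = 0;
  const int32_t *p = data.data();
  uint32_t lr = static_cast<uint32_t>(data.size());

  // WLS-like guard: set the trip count and skip the loop if it is zero.
  if (lr != 0) {
    do {
      sum += *p++;
      // LE-like loop end: decrement and branch back while iterations remain.
    } while (--lr != 0);
  }
  return sum;
}
```

The point of the guard is that the body never runs for a zero trip count, so no separate compare-and-branch is needed in front of the loop.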
Initially I was a bit confused by this, the xor in particular. It makes sense, but you're not checking for it here in the patterns?