After the rewrite of this pass (D79175) I missed one thing: the inserted VCTP intrinsic can be cloned to exit blocks if they contain instructions that perform the same operation. However, it turns out that for handling reductions (see D75533) it's actually easier not to have the VCTP in exit blocks. In that case we also rematerialized the trip count in the exit blocks, because that made other dead-code removal easier. But our dead-code removal has become more powerful since then, so I don't think this is needed any more, and therefore this patch also removes RematerializeIterCount.
Not sure if we'll have to revisit this if we're doing, say, a reduction but also have a VCMP in the loop that writes to the VPR, meaning the VPR needs to be spilled and restored for the VPSEL... But we can cross that bridge if we come to it, cheers!