When working with edges entering or leaving a VPBasicBlock, we need to use the first or the last IR block corresponding to that VPBasicBlock, respectively. The existing code is only correct when a VPBasicBlock in the header or latch position corresponds to exactly one BasicBlock. This happens to hold today because the only VPBasicBlock that corresponds to more than one BasicBlock is one for a VPReplicateRecipe, which (as an implementation detail) can't be either a loop header or a latch.
I decided to track the whole set for simplicity. It feels odd not to have a way to map IR blocks to VP blocks and vice versa, so since I was changing this anyway, I figured I'd just track the whole set.
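To make the first/last distinction concrete, here is a minimal standalone sketch, not the patch itself. It assumes each VP block maps to the IR blocks generated for it in emission order. IRBlock, VPBlock, VPToIRMap, getEntryIRBlock and getExitIRBlock are hypothetical stand-ins for illustration, not LLVM APIs.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for llvm::BasicBlock and VPBasicBlock.
struct IRBlock { int Id; };
struct VPBlock { int Id; };

// Assumed layout: each VP block maps to the IR blocks generated for it,
// stored in the order they were emitted.
using VPToIRMap = std::unordered_map<const VPBlock *, std::vector<IRBlock *>>;

// Edges entering a VP block should target the first IR block generated for it.
inline IRBlock *getEntryIRBlock(const VPToIRMap &Map, const VPBlock *VPBB) {
  auto It = Map.find(VPBB);
  assert(It != Map.end() && !It->second.empty() && "VP block has no IR blocks");
  return It->second.front();
}

// Edges leaving a VP block should originate from the last IR block generated for it.
inline IRBlock *getExitIRBlock(const VPToIRMap &Map, const VPBlock *VPBB) {
  auto It = Map.find(VPBB);
  assert(It != Map.end() && !It->second.empty() && "VP block has no IR blocks");
  return It->second.back();
}
```

When a VP block maps to exactly one IR block, both helpers return the same block, which is the case the existing code implicitly relies on.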
Hmm, VPBasicBlock::execute() is called once per VPBB when generating IR, so how could VPBB2IRBB record multiple (overwriting) NewBB's here for the same VPBB (this)? If there are such cases, perhaps it's better to simplify them and retain a single IRBB per VPBB, adding an assert that no overwriting takes place. VPBB's contain only recipes and are free of control flow, which is modeled using multiple VPBB's and VPRegions, so each should fill a single IRBB.
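The suggested alternative, one IRBB per VPBB guarded by an assert, could look roughly like the standalone sketch below. The types and the recordIRBlock helper are hypothetical stand-ins rather than the actual VPlan code, and a std::unordered_map is used here only to keep the example self-contained.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-ins for llvm::BasicBlock and VPBasicBlock.
struct IRBlock { int Id; };
struct VPBlock { int Id; };

// Assumed layout: exactly one IR block per VP block.
using VPToSingleIRMap = std::unordered_map<const VPBlock *, IRBlock *>;

// Record the IR block generated for VPBB, asserting that an earlier entry
// is never overwritten.
inline void recordIRBlock(VPToSingleIRMap &Map, const VPBlock *VPBB,
                          IRBlock *NewBB) {
  bool Inserted = Map.emplace(VPBB, NewBB).second;
  assert(Inserted && "VPBB already has a corresponding IRBB");
  (void)Inserted; // Avoid an unused-variable warning when asserts are disabled.
}
```

With this shape, a second recordIRBlock call for the same VPBB would trip the assert in a debug build instead of silently replacing the earlier mapping.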
A related issue is the A-B-C cases above, which try to reuse the same IRBB for pairs of back-to-back VPBB's (this may best be avoided for clarity and left to a subsequent simplifyCFG to fold instead); but these are multiple VPBB's filling one IRBB, rather than the converse.
Another related issue is the correspondence between original IRBB's and VPBB's during VPlan construction rather than execution. There, multiple replicate-and-predicate recipes that stem from the same IRBB are each assigned a separate VPBB (of a replicating region), as can be seen where the code tries to assign each such VPBB a unique name associated with its original IRBB (VPBBsForBB). This again yields multiple VPBB's per IRBB.