This patch adds a new cost heuristic that allows peeling a single
iteration off multi-exit read-only loops if they contain loop-invariant
loads that dominate the latch. If all non-latch exits are terminated
with unreachable, the invariant loads in the loop are guaranteed to be
dereferenceable after peeling, which enables hoisting/CSE'ing them.
This enables vectorization of loops with certain runtime checks, such as
multiple calls to std::vector::at.
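
As a rough illustration of the kind of loop this targets, here is a minimal
sketch. The function names sum_pairs and sum_pairs_peeled are hypothetical
and do not come from the patch. The second function is a hand-written
source-level approximation of what peeling one iteration looks like, not
the actual output of the pass, and whether a given loop ends up vectorized
depends on the rest of the pipeline.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical example. Each at() call re-reads the vector's internal
// pointers (loads from loop-invariant addresses) to do its bounds check,
// and its failure path throws std::out_of_range, so the loop has extra
// non-latch exits that never return normally.
long sum_pairs(const std::vector<int> &a, const std::vector<int> &b,
               std::size_t n) {
  long sum = 0;
  for (std::size_t i = 0; i < n; ++i)
    sum += a.at(i) + b.at(i);
  return sum;
}

// Hand-peeled equivalent (illustration only). The first iteration runs
// ahead of the loop, so the bounds-check loads have already executed
// once by the time the remaining iterations start.
long sum_pairs_peeled(const std::vector<int> &a, const std::vector<int> &b,
                      std::size_t n) {
  if (n == 0)
    return 0;
  long sum = a.at(0) + b.at(0); // peeled first iteration
  for (std::size_t i = 1; i < n; ++i)
    sum += a.at(i) + b.at(i);
  return sum;
}
```

Both functions compute the same result and throw on the same inputs; the
peeled form only makes explicit that the loads feeding the bounds checks
execute once before the remaining loop is entered.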
This should give a 20-30% improvement in the Geekbench5/HDR score.
Returns... ? Is that supposed to be boolean?