If a block's branches are not analyzable, then moving the block to the top of a loop may not create any new fallthrough edges. In those cases, just leave the loop in a more standard order.
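
As a rough illustration of the check described above, here is a minimal sketch on a toy CFG model. `ToyBlock`, `BranchesAnalyzable`, and `mayGainFallthrough` are hypothetical names made up for this example; they are not the types or functions used by the actual change.

```cpp
#include <vector>

// Hypothetical toy model of a basic block; not LLVM's MachineBasicBlock.
struct ToyBlock {
  bool BranchesAnalyzable; // assumed stand-in for "the target could analyze the terminators"
  std::vector<int> Succs;  // ids of successor blocks
};

// Returns true if placing `Cand` directly above the loop header could turn
// its edge to the header into a fallthrough.
bool mayGainFallthrough(const ToyBlock &Cand, int HeaderId) {
  // If the terminators cannot be analyzed they cannot be rewritten either,
  // so moving the block would not create a new fallthrough edge.
  if (!Cand.BranchesAnalyzable)
    return false;
  for (int Succ : Cand.Succs)
    if (Succ == HeaderId)
      return true;
  return false;
}
```

In this model, a candidate for which `mayGainFallthrough` returns false is simply left where it is, so the loop keeps its more standard order.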
Some SystemZ atomic tests have changed. The new output looks smaller to me, but may or may not be better.
On ARM, this helps low-overhead loops get lowered slightly more reliably.