We have been running tests and benchmarks downstream with tail-predication enabled for some time now, and it behaves as expected: we are not aware of any correctness issues, and it performs better across the board than with tail-predication disabled.
So, if we can get D88086 out of the way, I think it is time to flip the switch.