AArch64 currently uses the default TLI cost setting (expensive) for count-leading-zeros and count-trailing-zeros. I think both should be considered cheap operations (and therefore fair game for speculation) on any AArch64 implementation.
The net result of allowing this speculation for the regression tests in this patch is that we get this code:
  ctlz:
    clz   w0, w0
    ret
  cttz:
    rbit  w8, w0
    clz   w0, w8
    ret
Instead of:
  ctlz:
    cbz   w0, .LBB0_2
    clz   w0, w0
    ret
  .LBB0_2:
    orr   w0, wzr, #0x20
    ret
  cttz:
    cbz   w0, .LBB1_2
    rbit  w8, w0
    clz   w0, w8
    ret
  .LBB1_2:
    orr   w0, wzr, #0x20
    ret
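For readers who want a source-level picture, the following is a minimal C++ sketch of the kind of pattern that could lead to the zero check in the branchy listing. It is an illustration only: the regression tests in the patch are written in LLVM IR, the function names `ctlz32` and `cttz32` are hypothetical, and the code assumes the GCC/Clang builtins `__builtin_clz` and `__builtin_ctz`, whose result is undefined for a zero input. The explicit guard supplies the value 32 for zero, which matches the `orr w0, wzr, #0x20` in the listing.

```cpp
#include <cstdint>

// Hypothetical helpers, not part of the patch.
// __builtin_clz/__builtin_ctz (GCC/Clang) are undefined for x == 0,
// so the guard defines the zero case as the bit width, 32.

unsigned ctlz32(std::uint32_t x) {
    // Count of leading zero bits; 32 when no bit is set.
    return x == 0 ? 32u : static_cast<unsigned>(__builtin_clz(x));
}

unsigned cttz32(std::uint32_t x) {
    // Count of trailing zero bits; 32 when no bit is set.
    return x == 0 ? 32u : static_cast<unsigned>(__builtin_ctz(x));
}
```

Since AArch64 `clz` returns 32 for a zero input, speculating the count makes the guard redundant, which is presumably why the branch disappears entirely in the first listing.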
See D14469 for the larger motivation.