It isn't profitable in general, but it is profitable in cases where we don't need to spill LR and the callee is a function pointer.
We don't actually generate a tail call during isel, because we can't tell at that point whether it will be profitable; instead, we delay the decision to a separate Thumb1TailCallOptimizer pass that runs after isel.
Depends on D49459.