After recent register allocation patches were applied, the compiler fails with a register scavenging error. Branch relaxation runs after register allocation (RA), but RA does not reserve registers for the case where branch relaxation has to emit an indirect branch. As a result, the “did not find scavenging index” assert fires in assignRegToScavengingIndex inside RegScavenger.
This patch estimates, before RA, whether an indirect branch is likely to be needed, and reserves 2 SGPRs if the estimated branch distance exceeds a threshold. The estimate is inherently imprecise: code size and branch distance are not accurately known before register allocation, which is exactly when the registers have to be reserved. We therefore use a reduced-complexity approximation of the distance and expose a tuning factor on the threshold through the -amdgpu-long-branch-factor command-line argument.
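To make the idea concrete, here is a minimal standalone sketch of such a pre-RA estimate. It is not the code in this patch: the `BlockInfo` type, the function names, the per-instruction size, and the direct-branch reach are all assumptions chosen for illustration, and the way the factor is applied (scaling the size estimate) is only one plausible reading of "a tuning factor on the threshold".

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a machine basic block; only the instruction
// count matters for this coarse estimate.
struct BlockInfo {
  uint64_t NumInstrs;
};

// Assumed worst-case encoded size of one instruction, in bytes (illustrative).
constexpr uint64_t AssumedMaxInstrBytes = 8;

// Assumed reach of a direct branch, in bytes (illustrative).
constexpr uint64_t AssumedDirectBranchReach = 1ull << 17;

// Coarse upper bound on function size: every instruction is counted at the
// assumed worst-case size, since real sizes are unknown before RA.
uint64_t estimateFunctionBytes(const std::vector<BlockInfo> &Blocks) {
  uint64_t Total = 0;
  for (const BlockInfo &B : Blocks)
    Total += B.NumInstrs * AssumedMaxInstrBytes;
  return Total;
}

// Returns true if the scaled size estimate exceeds the direct-branch reach,
// i.e. registers for an indirect branch should be reserved. A factor above
// 1.0 makes the check more conservative; below 1.0 makes it less so.
bool mayNeedLongBranch(const std::vector<BlockInfo> &Blocks,
                       double LongBranchFactor) {
  double Scaled =
      static_cast<double>(estimateFunctionBytes(Blocks)) * LongBranchFactor;
  return Scaled > static_cast<double>(AssumedDirectBranchReach);
}
```

In this sketch, a caller would reserve the 2 SGPRs only when `mayNeedLongBranch` returns true, so small functions keep the full register budget.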
If you're going to use FP for this, might as well use double