Addresses https://bugs.llvm.org/show_bug.cgi?id=25191
This patch adds AArch64 support for the __builtin_flt_rounds() intrinsic by implementing custom lowering that computes its value from the FPCR system register. I based this change heavily on the implementation in the ARM backend.
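
For reviewers, here is a minimal sketch of the arithmetic such a lowering would need to perform, written as plain C++ rather than SelectionDAG code. It assumes the rounding mode is read from the FPCR RMode field (bits 23:22) and mapped to the FLT_ROUNDS encoding by adding one and masking to two bits. The helper name fltRoundsFromFpcr is hypothetical and does not appear in the patch.

```cpp
#include <cstdint>

// Hypothetical helper, not LLVM code: models only the bit arithmetic.
//
// FPCR.RMode (bits 23:22):  0b00 nearest, 0b01 toward +inf,
//                           0b10 toward -inf, 0b11 toward zero.
// FLT_ROUNDS values:        0 toward zero, 1 nearest,
//                           2 toward +inf, 3 toward -inf.
//
// Adding 1 to RMode and keeping the low two bits converts one
// encoding into the other.
constexpr int fltRoundsFromFpcr(std::uint64_t fpcr) {
    return static_cast<int>(((fpcr + (1u << 22)) >> 22) & 3u);
}
```

Doing the addition before the shift lets any carry out of the RMode field be discarded by the final mask, so other FPCR bits do not affect the result.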