fp16 and bf16 values can be used in GCC's inline assembly using the "w"
constraint, which means "VFP floating-point registers d0-d31" - fp16 and
bf16 values are stored in S registers (which alias the D registers).
This change ensures that LLVM is compatible with GCC for programs that
use fp16 and the 'w' constraint.