Implement rounding mode support for addxf3/subxf3.
On architectures that implement the support, this reads the corresponding floating-point environment register to apply the correct rounding. On other architectures, it keeps the current behaviour and uses the IEEE-754 default rounding mode (to nearest, ties to even).
ARM32 and AArch64 support is implemented in this change. i386 and AMD64 will be added in a follow-up change.
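To illustrate what "apply the correct rounding" means for the add/sub result, here is a minimal sketch, not the code from this change. It assumes the unrounded significand carries three extra low bits (round, guard, sticky) that have already been split off. The names `round_mode_t`, `RM_*` and `apply_rounding` are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical rounding-mode enum, for illustration only. */
typedef enum {
  RM_TONEAREST,  /* to nearest, ties to even (IEEE-754 default) */
  RM_DOWNWARD,   /* toward negative infinity */
  RM_UPWARD,     /* toward positive infinity */
  RM_TOWARDZERO  /* truncate */
} round_mode_t;

/* sig:      significand with the three extra bits already shifted out
 * rgs:      the discarded round/guard/sticky bits, 0..7
 * negative: nonzero if the result is negative
 * Returns the significand after rounding; a carry out of the top bit
 * would still have to be handled by the caller. */
uint64_t apply_rounding(uint64_t sig, unsigned rgs, int negative,
                        round_mode_t mode) {
  assert(rgs < 8);
  switch (mode) {
  case RM_TONEAREST:
    if (rgs > 4)
      sig++;          /* more than half an ulp: round away from zero */
    else if (rgs == 4)
      sig += sig & 1; /* exactly half: round to the even neighbour */
    break;
  case RM_DOWNWARD:
    if (negative && rgs)
      sig++;          /* inexact negative result grows in magnitude */
    break;
  case RM_UPWARD:
    if (!negative && rgs)
      sig++;          /* inexact positive result grows in magnitude */
    break;
  case RM_TOWARDZERO:
    break;            /* discard the extra bits */
  }
  return sig;
}
```

On an architecture without rounding mode support, the equivalent of always passing `RM_TONEAREST` preserves the current behaviour.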
This needs to be guarded; vmrs is only valid on targets that support VFP. Not sure whether we also need to check that at runtime...
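One possible compile-time guard is sketched below. It assumes the ACLE macro `__ARM_FP`, which is defined when the target has hardware floating point, and falls back to round-to-nearest everywhere else. The function name `get_round_mode` and the `MODE_*` constants are hypothetical. This sketch does no runtime detection, so it does not settle the runtime question raised above.

```c
#include <stdint.h>

/* Hypothetical mode constants, for illustration only. */
enum { MODE_TONEAREST, MODE_UPWARD, MODE_DOWNWARD, MODE_TOWARDZERO };

/* FPSCR.RMode occupies bits [23:22]:
 * 0b00 nearest, 0b01 toward +inf, 0b10 toward -inf, 0b11 toward zero. */
#define RMODE_SHIFT 22
#define RMODE_MASK 0x3u

int get_round_mode(void) {
#if defined(__arm__) && defined(__ARM_FP)
  /* Only emitted when the compiler targets an FPU, so vmrs is valid. */
  uint32_t fpscr;
  __asm__ __volatile__("vmrs %0, fpscr" : "=r"(fpscr));
  switch ((fpscr >> RMODE_SHIFT) & RMODE_MASK) {
  case 0x1:
    return MODE_UPWARD;
  case 0x2:
    return MODE_DOWNWARD;
  case 0x3:
    return MODE_TOWARDZERO;
  default:
    return MODE_TONEAREST;
  }
#else
  /* No VFP (or not ARM32): keep the IEEE-754 default rounding mode. */
  return MODE_TONEAREST;
#endif
}
```

With this shape, a soft-float build never sees the `vmrs` instruction, and its behaviour stays identical to the pre-change code.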