Opcode FLT_ROUNDS_ is lowered as:
- Move FPSCR content to an FP register
- Move lower 32 bits into a GPR
- Adjust the value and return it
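The adjustment in the last step can be sketched roughly as below. This assumes the Power ISA encoding of the FPSCR RN field (0 = to nearest, 1 = toward zero, 2 = toward +infinity, 3 = toward -infinity) and the C `FLT_ROUNDS` values (0 = toward zero, 1 = to nearest, 2 = toward +infinity, 3 = toward -infinity). The name `fltRoundsFromFPSCR` is hypothetical, and the bit trick is one way to express the mapping, not necessarily the exact expression used in the patch.

```cpp
#include <cstdint>

// Hypothetical helper: map the low 32 bits of FPSCR to a FLT_ROUNDS value.
// Assumption: RN is the two least significant bits, encoded as
// 0 = nearest, 1 = toward zero, 2 = toward +inf, 3 = toward -inf.
inline int fltRoundsFromFPSCR(uint32_t fpscrLow) {
  uint32_t rn = fpscrLow & 0x3u;
  // XOR with the inverted high RN bit: swaps 0 and 1, leaves 2 and 3 as-is.
  return static_cast<int>(rn ^ ((~fpscrLow & 0x3u) >> 1));
}
```

Only the two RN bits affect the result, so any other FPSCR status bits in the input are ignored.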
Subtargets without direct-move support store the value to memory and then load it back. The load address needs the +4 adjustment only on big-endian targets, where the lower 32 bits sit at the higher address. This patch fixes the little-endian case and tries the direct-move instruction first.
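To illustrate why the offset depends on endianness, here is a small host-independent sketch that lays out a 64-bit value in a chosen byte order and reads one 32-bit word back from it. The helper name `loadWordFromSlot` and its parameters are hypothetical and only model the store-then-load path; they are not code from the patch.

```cpp
#include <cstdint>

// Hypothetical model of the store-then-load path: write `bits` into an
// 8-byte slot in the given byte order, then read a 32-bit word at `offset`
// using the same byte order.
inline uint32_t loadWordFromSlot(uint64_t bits, bool isLittleEndian,
                                 int offset) {
  unsigned char slot[8];
  for (int i = 0; i < 8; ++i) {
    // Little-endian puts the least significant byte first; big-endian last.
    int shift = isLittleEndian ? 8 * i : 8 * (7 - i);
    slot[i] = static_cast<unsigned char>(bits >> shift);
  }
  uint32_t word = 0;
  for (int i = 0; i < 4; ++i) {
    int shift = isLittleEndian ? 8 * i : 8 * (3 - i);
    word |= static_cast<uint32_t>(slot[offset + i]) << shift;
  }
  return word;
}
```

With `bits = 0x1122334455667788`, the low word `0x55667788` is found at offset 4 in big-endian layout and at offset 0 in little-endian layout; reading offset 4 in little-endian layout yields the high word instead.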
Can we use a generic node to express these semantics, so that this is handled automatically? For example:
x = MFFS
y = bitcast f64 to i64
z = trunc y to i32
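For reference, the value-level meaning of the bitcast and trunc steps can be sketched in C++ as below. The function name `truncBitcast` is hypothetical, and the `double` argument stands in for the MFFS result. Whether the backend's legalization of these generic nodes would then select the direct move or the correctly offset load is the open question here, not something this sketch demonstrates.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the sequence above, applied to an f64 value x.
inline uint32_t truncBitcast(double x) {
  uint64_t y;
  std::memcpy(&y, &x, sizeof y);   // y = bitcast f64 to i64
  return static_cast<uint32_t>(y); // z = trunc y to i32 (keeps the low 32 bits)
}
```

Both operations are defined on the value rather than on a memory layout, so the result is the same regardless of target endianness.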