This approach of materializing 64-bit masks using ADDI+SRLI was suggested a while ago by @lewis-revill [1] but never implemented.
- The ShiftAmount is tweaked;
- In the code generated for the expansion of fcopysign(a, -b), an addi is replaced by an srli. This could be slightly slower on some cores (e.g. picorv32), but it's probably not very concerning: even the E31 has single-cycle shifts, and these soft-float computations are probably not important workloads for microcontrollers.
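
For reviewers who want to sanity-check the idea, here is a minimal standalone sketch of the ADDI+SRLI trick. It is not the code in this patch: `maskShiftAmount` and `emulateAddiSrli` are hypothetical names used only for illustration, and the sketch assumes RV64 and a constant made of leading zeros followed by trailing ones.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not the patch's code): if Val has the form
// 0...01...1 (leading zeros followed by at least one trailing one),
// return the number of leading zeros, which is the shift amount SRLI
// would need. Return -1 if Val is not such a mask.
int maskShiftAmount(uint64_t Val) {
  // A run of trailing ones is 2^k - 1, so Val & (Val + 1) must be 0.
  if (Val == 0 || (Val & (Val + 1)) != 0)
    return -1;
  int LeadingZeros = 0;
  for (uint64_t Bit = 1ULL << 63; (Val & Bit) == 0; Bit >>= 1)
    ++LeadingZeros;
  return LeadingZeros;
}

// Models the two-instruction sequence on RV64:
//   addi rd, zero, -1      ; rd = 0xFFFFFFFFFFFFFFFF
//   srli rd, rd, ShiftAmount
uint64_t emulateAddiSrli(unsigned ShiftAmount) {
  assert(ShiftAmount < 64 && "RV64 SRLI takes a 6-bit shift amount");
  uint64_t Rd = static_cast<uint64_t>(int64_t{-1}); // addi rd, zero, -1
  return Rd >> ShiftAmount;                         // logical right shift
}
```

For example, the mask that clears the sign bit of a double, 0x7FFFFFFFFFFFFFFF, has one leading zero, so under these assumptions it would be materialized as an addi of -1 followed by an srli by 1.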