Floating point positive zero can be selected using fmv.w.x / fmv.d.x / fcvt.d.w and a zero source register.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
GCC seems to use fmv.s.x / fmv.d.x (the former of which now should be called fmv.w.x? ) instead of fcvt, not sure why. A quick Googling returned:
http://gcc.1065356.n8.nabble.com/New-Port-for-RISC-V-td1338959.html
+;; Floating-point constant +0.0, used for FCVT-based moves when FMV is
+;; not available in RV32.
Which would suggest that, at least at some point in the past, there was a reason for preferring FMV over FCVT? Weird. I have to look into that more carefully.
A quick check with the GCC 10 in Fedora Rawhide shows the following results for this small testcase
float zero_float(void) { return 0; } double zero_double(void) { return 0; }
-march=rv32gc -mabi=ilp32d | -march=rv64gc -mabi=lp64d | |
zero_float | fmv.s.x | fmv.s.x |
zero_double | fcvt.d.w | fmv.d.x |
Perhaps these are slightly better than just fcvt everywhere because they avoid the rounding mode required by fcvt.s.w / fcvt.s.l / fcvt.d.l?
I'm happy to use those fmv.{s,d}.x instead.
With the rounding mode being stateless (it's per instruction), I'm don't know what the issue would be with that.
I'm happy to use those fmv.{s,d}.x instead.
I haven't yet had the opportunity to better investigate this. At first glance it seems fine to follow the GCC lead, and it could always be changed later, so if you want to update the patch for that I think I would be fine with approving it. But ideally we wouldn't just cargo cult this, and actually investigate what motivated that choice :-)
Thanks for sharing the table Roger.
Unless you fast-path register zero (or the value itself being zero), a naive implementation is going to have to do a full integer->float conversion, which is more likely to be a multi-cycle operation, regardless of rounding mode. Contrast that with an fmv, which is a special case of fsgnj, but the generic case is still just bit selection and concatenation, so it should always be single-cycle. Having said that, the Rocket schedule (and reading the code) indicates that both are 2-cycle operations, and for the Bluespec Piccolo/Flute cores both are 1-cycle operations, but the GCC schedule for the 7 series SiFive cores claims that fmv will be in the A pipe (address, i.e. loads/stores, but also any FP<->int given that already needs to be present on the load/store paths), and fcvt will be in the B pipe (branches, but also mul/div and and any other FP ops).
Actually I got confused, fmv.x.?and fmv.?.x aren't special cases of fsgnj (that's only true for the FP<->FP fmv's), but their own instructions outright. The rest of what I said still applies though.
Thanks for the detailed info James. So the GCC table is the ideal implementation, and probably what we should adopt, it seems?
LGTM!
llvm/lib/Target/RISCV/RISCVInstrInfoF.td | ||
---|---|---|
397–411 | Make sure the whitespace new line introduced here is consistent with the rest of the file. |
Thanks a lot for the review @luismarques !
llvm/lib/Target/RISCV/RISCVInstrInfoF.td | ||
---|---|---|
397–411 | Sure I'll do. |
Make sure the whitespace new line introduced here is consistent with the rest of the file.