This avoids a crash for scalable vectors and or scalarization for
fixed vectors.
The algorithm is different enough that I don't think it makes sense
to merge with ceil/floor/trunc. Algorithm is adapted from gcc's X86
SSE2 output.
Paths
| Differential D117247
[RISCV] Add inline expansion for vector fround. ClosedPublic Authored by craig.topper on Jan 13 2022, 1:17 PM.
Details Summary This avoids a crash for scalable vectors and or scalarization for The algorithm is different enough that I don't think it makes sense
Diff Detail
Event TimelineHerald added subscribers: VincentWu, luke957, achieveartificialintelligence and 25 others. · View Herald TranscriptJan 13 2022, 1:17 PM
Comment Actions Sorry for dropping the ball on this: floating-point isn't in my comfort zone. It sounds like it should do the right thing to me. I'll try to run this over our downstream (OpenCL) testing and see if it passes.
Comment Actions LGTM, our testing shows a nice improvement with this. Do you think we should backport this to LLVM 14, given it fixes a scalable-vector crash? This revision is now accepted and ready to land.Feb 4 2022, 8:18 AM
This revision was landed with ongoing or failed builds.Feb 4 2022, 9:12 AM Closed by commit rGc83905a30855: [RISCV] Add inline expansion for vector fround. (authored by craig.topper). · Explain Why This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 405994 llvm/lib/Target/RISCV/RISCVISelLowering.cpp
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp.ll
llvm/test/CodeGen/RISCV/rvv/fround-sdnode.ll
|
If I remember right, we have to use 0.49999999(mantissa of all ones) instead of 0.5 to deal with at least one corner case.
If the starting value is already 0.499999(mantissa all ones) then adding 0.5 would produce .9999999 with one extra mantissa bit than can be represented before the fadd rounds. The fadd would round that to 1.0. It would still be 1.0 after the conversion to int and back to fp. But the original 0.4999999(mantissa all ones) should produce 0.0.
By using 0.49999999(mantissa of all ones) adding that to 0.49999999 produces 0.49999999*2 which has the same number of ones in the mantissa just increments the exponent by 1. So the addition doesn't round to 1.0 it stays as 0.999999. The float->int->float conversion will turn that into 0.0.