Page MenuHomePhabricator

[RISCV] Add inline expansion for vector fround.

Authored by craig.topper on Jan 13 2022, 1:17 PM.



This avoids a crash for scalable vectors and or scalarization for
fixed vectors.

The algorithm is different enough that I don't think it makes sense
to merge with ceil/floor/trunc. Algorithm is adapted from gcc's X86
SSE2 output.

Diff Detail

Event Timeline

craig.topper created this revision.Jan 13 2022, 1:17 PM
craig.topper requested review of this revision.Jan 13 2022, 1:17 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 13 2022, 1:17 PM
Herald added a subscriber: MaskRay. · View Herald Transcript
craig.topper added inline comments.Jan 13 2022, 1:27 PM

If I remember right, we have to use 0.49999999(mantissa of all ones) instead of 0.5 to deal with at least one corner case.

If the starting value is already 0.499999(mantissa all ones) then adding 0.5 would produce .9999999 with one extra mantissa bit than can be represented before the fadd rounds. The fadd would round that to 1.0. It would still be 1.0 after the conversion to int and back to fp. But the original 0.4999999(mantissa all ones) should produce 0.0.

By using 0.49999999(mantissa of all ones) adding that to 0.49999999 produces 0.49999999*2 which has the same number of ones in the mantissa just increments the exponent by 1. So the addition doesn't round to 1.0 it stays as 0.999999. The float->int->float conversion will turn that into 0.0.

Sorry for dropping the ball on this: floating-point isn't in my comfort zone. It sounds like it should do the right thing to me. I'll try to run this over our downstream (OpenCL) testing and see if it passes.


Is nextafter(0.5, 0.0) actually 0.4999999...? Should it be -1.0 to go backwards?

frasercrmck accepted this revision.Feb 4 2022, 8:18 AM

LGTM, our testing shows a nice improvement with this.

Do you think we should backport this to LLVM 14, given it fixes a scalable-vector crash?

This revision is now accepted and ready to land.Feb 4 2022, 8:18 AM
craig.topper added inline comments.Feb 4 2022, 8:59 AM

-1.0 would also work. The second parameter is used to determine the direction. Since both 0.0 and -1.0 are less then 0.5, either should work.

This revision was landed with ongoing or failed builds.Feb 4 2022, 9:12 AM
This revision was automatically updated to reflect the committed changes.