Page MenuHomePhabricator

[NVPTX] Fix the codegen for llvm.round.
ClosedPublic

Authored by bixia on Mar 28 2019, 10:16 AM.

Details

Summary

Previously, we translate llvm.round to PTX cvt.rni, which rounds to the
even interger when the source is equidistant between two integers. This
is not correct as llvm.round should round away from zero. This change
replaces llvm.round with a round away from zero implementation through
target specific custom lowering.

Modify a few affected tests to not check for cvt.rni. Instead, we check
for the use of a few constants used in implementing round. We are also
adding CUDA runnable tests to check for the values produced by
llvm.round to test-suites/External/CUDA.

Diff Detail

Repository
rL LLVM

Event Timeline

bixia created this revision.Mar 28 2019, 10:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 28 2019, 10:16 AM
bixia added a reviewer: tra.Mar 28 2019, 10:18 AM

Does XLA:GPU built with this change pass the exhaustive test you added for kRound after we revert the change that made kRound call into libdevice?

tra added inline comments.Mar 28 2019, 3:36 PM
llvm/test/CodeGen/NVPTX/math-intrins.ll
77 ↗(On Diff #192674)

Why do we end up with .rzi (round to nearest integer *in the direction of zero*) here? Wasn't this patch supposed to change rounding to be *away* from zero?

bixia marked an inline comment as done.Mar 28 2019, 5:19 PM

Does XLA:GPU built with this change pass the exhaustive test you added for kRound after we revert the change that made kRound call into libdevice?

Yes, I took off the current workaround and ran exhaustive test.

llvm/test/CodeGen/NVPTX/math-intrins.ll
77 ↗(On Diff #192674)

The emulate implementation is eventually translated to cvt.rzi. I really don't have a good idea on how to modify this test, was also thinking to modify the original CHECK to CHECK-NOT, or delete the test. What do you suggest?

bixia updated this revision to Diff 192871.Mar 29 2019, 11:17 AM

Modify test cases to check for the use of some constants.

bixia updated this revision to Diff 192872.Mar 29 2019, 11:25 AM

Fix commit message.

bixia edited the summary of this revision. (Show Details)Mar 29 2019, 11:29 AM
tra added inline comments.Mar 29 2019, 1:41 PM
llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
2107 ↗(On Diff #192872)

Do we have FP32-related constants defined somewhere in LLVM tree?
It would be easier to understand if we could use 1 << SIGN_BIT_SHIFT here or ((0 + EXP_OFFSET) << EXP_SHIFT) | mantissa below.

2130–2132 ↗(On Diff #192872)

This explains what's going on, but not why. Could you elaborate what's the reason behind double using different algorithm?

bixia updated this revision to Diff 192936.Mar 29 2019, 4:11 PM
bixia edited the summary of this revision. (Show Details)

Rewrite a few magic constants in a different form, hoping this can make it easier to understand what is going on.

tra accepted this revision.Mar 29 2019, 4:16 PM
This revision is now accepted and ready to land.Mar 29 2019, 4:16 PM
Closed by commit rL357407: [NVPTX] Fix the codegen for llvm.round. (authored by bixia1, committed by ). · Explain WhyApr 1 2019, 9:08 AM
This revision was automatically updated to reflect the committed changes.