This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Implement llvm.set.rounding
AcceptedPublic

Authored by arsenm on Jun 19 2023, 3:28 AM.

Details

Reviewers
foad
rampitec
Pierre-vh
cdevadas
jhuber6
b-sumner
sepavloff
Group Reviewers
Restricted Project
Summary

Use a shift of a magic constant and some offseting to convert from
flt_rounds values.

I don't know why the enum defines Dynamic = 7. The standard suggests
-1 is the cannot determine value. If we could start the extended
values at 4 we wouldn't the extra compare sub and select.

Diff Detail

Event Timeline

arsenm created this revision.Jun 19 2023, 3:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 19 2023, 3:28 AM
arsenm requested review of this revision.Jun 19 2023, 3:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 19 2023, 3:28 AM
Herald added a subscriber: wdng. · View Herald Transcript
Pierre-vh added inline comments.Jun 23 2023, 2:11 AM
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
3630–3631

optional nit: can use the constructor form to make it less verbose (same below for the uint64_t)

3635

Can you add a short explanation of how the lowering works, with the bit table?

arsenm added inline comments.Jun 23 2023, 5:27 AM
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
3635

There's more explanation of that in the get_rounding implementation.

I'm wondering if we can just fix the use of value 7 in the generic enum. The standard says -1 is for unknown, and I don't know why FloatingPointMode uses 7 for "dynamic". We wouldn't have to have this offset of 3 if the full range of target reserved values were available

arsenm updated this revision to Diff 539271.Jul 11 2023, 1:29 PM
arsenm edited the summary of this revision. (Show Details)
arsenm added a reviewer: sepavloff.
foad added inline comments.Jul 27 2023, 3:52 AM
llvm/test/CodeGen/AMDGPU/llvm.set.rounding.ll
17–19

Simpler:

s_add_i32 s34, s4, -4
s_min_u32 s34, s4, s34
arsenm added inline comments.Aug 1 2023, 5:37 PM
llvm/test/CodeGen/AMDGPU/llvm.set.rounding.ll
17–19

I would have hoped the combiner would deal with that

foad added inline comments.Aug 2 2023, 12:25 AM
llvm/test/CodeGen/AMDGPU/llvm.set.rounding.ll
17–19

That would be way more likely if you changed the operands of the cmp to match the operands of the select. For the combiner to optimize it the way you've written it, it would need to understand the relationship between s4 and s34.

rampitec accepted this revision.Oct 5 2023, 1:32 PM
This revision is now accepted and ready to land.Oct 5 2023, 1:32 PM