This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Fix non-flushing, pre-gfx9 implementation of fcanonicalize
ClosedPublic

Authored by arsenm on Apr 23 2020, 10:54 AM.

Details

Reviewers
rampitec
Summary

This fixes conformance failures when the library implementations of
fmin/fmax were accidentally not inlined, forcing the assumption of no
flushing on targets where denormals are not enabled by default.

If f32 denormals were enabled pre-gfx9, we would still try to
implement this with v_max_f32. Pre-gfx9, these instructions ignored
the denormal mode and did not flush. Switch to the multiply form,
which should always work in this case.

Now this will always use max to implement canonicalize on
gfx9+. Pre-gfx9, it will depend on the denormal mode and only use max
if flushing isn't enabled. We probably should only use max for f64 though.

For f32/f16 it's a neutral choice (and worse in terms of code size in
1 case for f16), but possibly worse for the compiler since it does add
an extra register use operand. Leave this change for later.
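
As a rough illustration of the selection rule described above, here is a minimal C++ sketch. It is not the code from this revision: the names TargetInfo, CanonicalizeLowering, and selectCanonicalizeLowering are hypothetical and exist only for this example. It models only the stated policy: always max on gfx9+, and pre-gfx9 max only when flushing isn't enabled, otherwise the multiply form.

```cpp
// Hypothetical model of the lowering choice described in the summary.
// None of these identifiers come from the LLVM AMDGPU backend.
enum class CanonicalizeLowering {
  Max,      // canonicalize implemented with a max instruction
  MulByOne  // canonicalize implemented with the multiply form
};

struct TargetInfo {
  bool IsGFX9Plus;       // true for gfx9 and newer subtargets
  bool FlushesDenormals; // true if the FP mode flushes denormals for the type
};

CanonicalizeLowering selectCanonicalizeLowering(const TargetInfo &TI) {
  // gfx9+: the summary states max is always used.
  if (TI.IsGFX9Plus)
    return CanonicalizeLowering::Max;

  // Pre-gfx9: max ignores the denormal mode and does not flush, so it is
  // only chosen when flushing isn't enabled. Otherwise use the multiply form.
  return TI.FlushesDenormals ? CanonicalizeLowering::MulByOne
                             : CanonicalizeLowering::Max;
}
```

Under these assumptions, a pre-gfx9 target with flushing enabled selects MulByOne, and every other combination selects Max.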

Event Timeline

arsenm created this revision. Apr 23 2020, 10:54 AM
arsenm planned changes to this revision. Apr 23 2020, 10:57 AM

Just realized I broke this again

arsenm updated this revision to Diff 259657. Apr 23 2020, 11:47 AM

Apply workaround which was the original goal

rampitec accepted this revision. Apr 23 2020, 12:00 PM
This revision is now accepted and ready to land. Apr 23 2020, 12:00 PM