This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] fcaninicalize optimization for GFX9+
ClosedPublic

Authored by rampitec on Jul 12 2017, 4:22 PM.

Details

Summary

Since GFX9 supports denorm modes for v_min_f32/v_max_f32 that
is possible to further optimize fcanonicalize and remove it
if applied to min/max given their operands are known not to be
an sNaN or that sNaNs are not supported.

Additionally we can remove fcanonicalize if denorms are supported
for the VT and we know that its argument is never a NaN.

Diff Detail

Repository
rL LLVM

Event Timeline

rampitec created this revision.Jul 12 2017, 4:22 PM
arsenm added inline comments.Jul 12 2017, 4:25 PM
lib/Target/AMDGPU/SIISelLowering.cpp
4680 ↗(On Diff #106341)

Doesn't this need to check whether deforms are enabled or not

test/CodeGen/AMDGPU/fcanonicalize-elimination.ll
353 ↗(On Diff #106341)

The not check isn't useful here

rampitec marked an inline comment as done.Jul 12 2017, 4:46 PM
rampitec added inline comments.
lib/Target/AMDGPU/SIISelLowering.cpp
4680 ↗(On Diff #106341)

Such check is not required but could extend the optimization for f16/f64 (where denorms are always enabled and not flushed) and for f32 on pre-GFX9 if denorms are enabled (which we do not do). Consider:

  1. Denorms are supported -> no need to use use canonicalize to care about denorms at all, only care about sNaNs.
  2. Denorms are flushed -> use the above check supportsMinMaxDenormModes() to see if they are flushed correctly.

So the actual condition would be (if_denorms_supported(VT) || ST->supportsMinMaxDenormModes()).

Actually I think it can be a part of bigger optimization, because if denorms are always enabled for a VT we can just ignore the actual opcode of fcanonicalize's operand and just optimize it out completely if signaling NaNs are not supported.

rampitec updated this revision to Diff 106348.Jul 12 2017, 4:47 PM

Removed one check-not in the test.

rampitec added inline comments.Jul 12 2017, 5:12 PM
lib/Target/AMDGPU/SIISelLowering.cpp
4681 ↗(On Diff #106341)

I believe is KnownNeverSNan is too much here, because it actually returns true if just fp-exceptions disabled. We still must support fmin/fmax correct behavior with signaling nans.

rampitec updated this revision to Diff 106355.Jul 12 2017, 5:17 PM
rampitec marked an inline comment as done.

Replaced isKnownNeverSNan with stricter check isKnownNeverNaN to correctly handle snans with fmin/fmax even with disabled exceptions.

rampitec updated this revision to Diff 106476.Jul 13 2017, 11:19 AM
rampitec marked 2 inline comments as done.
rampitec retitled this revision from [AMDGPU] fcaninicalize min/max optimization for GFX9+ to [AMDGPU] fcaninicalize optimization for GFX9+.
rampitec edited the summary of this revision. (Show Details)

Added general optimization for the case if denorms are supported for the VT (so we do not need to flush) and the value is a known never NaN.

artem.tamazov accepted this revision.Jul 13 2017, 3:04 PM

Looks good.

This revision is now accepted and ready to land.Jul 13 2017, 3:04 PM
This revision was automatically updated to reflect the committed changes.