Page MenuHomePhabricator

InstCombine: Handle folding some negated is_fpclass mask test cases

Authored by arsenm on Nov 30 2022, 10:09 AM.



Jay requested handling these in

I realized these cases that turn class into fcmp are likely broken if
input denormals are treated as 0. However, this is so poorly specified
I'm not sure it's really wrong. alive2 says they're wrong for
denormals. Assuming is_fpclass does not read canonical inputs (as
might be implied by not trapping on snans), under DAZ,
is_fpclass(denorm, zero) is false and fcmp denorm, 0 is true.

The LangRef doesn't say anything about this for The
AMDGPU class instruction does not consider the FP mode, and will see
denormal inputs as-is rather than the flush the input.

The only spec I know around denormal behavior is OpenCL's
-cl-denorms-are-zero, which doesn't clarify anything. It merely states
an implementation "can choose not to flush denorms to zero". It
doesn't specify whether this means outputs can be flushed, or if
inputs can be treated as 0s or what operations this applies to.

std::fpclassify does have a FP_SUBNORMAL result type. Under clang's
fcmp based implementation, it will return 0 for a denormal under DAZ.

Diff Detail

Event Timeline

arsenm created this revision.Nov 30 2022, 10:09 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 30 2022, 10:09 AM
arsenm requested review of this revision.Nov 30 2022, 10:09 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 30 2022, 10:09 AM
Herald added a subscriber: wdng. · View Herald Transcript
foad added inline comments.Nov 30 2022, 1:18 PM

That's also equivalent to != 0 - at least for the usual mapping of C-like comparison operators onto fp predicates.

If DAZ mode is semantically "denormals are non-canonical zero", then isfpclass(denormal, Zero) arguably should be true in DAZ mode, so lowering that to fcmp 0.0 would be legal. On the other hand, the actual lowering of an unoptimized isfpclass is likely to rely on bit patterns which would ignore the current dynamic value of DAZ. On the other other hand, there's no clear indication of what isfpclass returns for noncanonical values... and on the other other other hand, leaving the result of isfpclass as unspecified behavior for noncanonical inputs is incredibly unfriendly to users.


They're not equivalent: !(x == 0) is true if x is NaN, and x != 0 is false if x is NaN.

foad added inline comments.Nov 30 2022, 11:40 PM

They are equivalent. The usual mapping is that == means oeq and != means une. See for example F.9.3 in the C spec or

foad added a comment.Dec 1 2022, 2:22 AM

Seems OK. A more thorough handling of the ninf/nnan stuff could go like:

BitsToIgnore = 0;
if (isKnownNeverNaN)
  BitsToIgnore |= fcNan;
if (isKnownNeverInfinity)
  BitsToIgnore |= fcInf;

if ((Mask & ~BitsToIgnore) == 0) { convert to false }
if ((Mask | BitsToIgnore) == fcAllFlags) { convert to true }
if ((Mask & ~BitsToIgnore) == fcNan) { convert to isnan }
if ((Mask | BitsToIgnore) == (~fcNan & fcAllFlags)) { convert to !isnan }
if ((Mask & ~BitsToIgnore) == fcZero) { convert to == 0.0 }
if ((Mask | BitsToIgnore) == (~fcZero & fcAllFlags)) { convert to != 0.0 }
arsenm abandoned this revision.Dec 10 2022, 8:43 AM

Resplitting the patches so the pairs of negated cases are handled together