This is an archive of the discontinued LLVM Phabricator instance.

InstCombine: Fold negations of is_fpclass intrinsics
ClosedPublic

Authored by arsenm on Nov 16 2022, 9:36 PM.

Details

Summary

Can invert the result by inverting the test mask.

Diff Detail

Event Timeline

arsenm created this revision.Nov 16 2022, 9:36 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 16 2022, 9:36 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
arsenm requested review of this revision.Nov 16 2022, 9:36 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 16 2022, 9:36 PM
Herald added a subscriber: wdng. · View Herald Transcript
sepavloff added inline comments.Nov 17 2022, 4:11 AM
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
3715–3717

It must work for IEEE numbers. But for non-IEEE it is possible that a value does not belong to any of the classes known to is_fpclass. For example, type x86_fp80 has so-called unsupported values. Do you think this transformation can be safely applied to such numbers also or it is better to limit it to IEEE numbers only?

arsenm added inline comments.Nov 17 2022, 7:44 AM
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
3715–3717

I don't know anything about x87. How is the intrinsic lowered for it?

Is it just these "pseudo-X" cases? I'd assume those are covered by the non-pseudo tests?

sepavloff accepted this revision.Nov 18 2022, 2:50 AM

LGTM.

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
3715–3717

Yes, these are pseudo numbers. They are mapped to IEEE classes in more complex way, for example glibc recognizes pseudo-infinity as a NaN, and pseudo-denormals as normals.

Nevertheless any such number is mapped to one of IEEE classes and this optimization must work no matter how the mapping is realized.

This revision is now accepted and ready to land.Nov 18 2022, 2:50 AM
jcranmer-intel added inline comments.Dec 2 2022, 2:11 PM
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
3715–3717

From what I can tell, all of the non-standard values for x86_fp80 are mapped to returns true iff is_fpclass checks for both sNaN and qNaN, but not solely one.

I think ~is_fpclass(x, mask) => is_fpclass(x, ~mask) is a valid transformation even for x86_fp80, even with this weird quirk.