This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Remove redundant f{min,max}nm intrinsics.
ClosedPublic

Authored by fhahn on May 9 2022, 7:31 AM.

Details

Summary

The patch extends AArch64TTIImpl::instCombineIntrinsic to simplify
llvm.aarch64.neon.f{min,max}nm(a, a) -> a.

This helps with simplifying code written using the ACLE, e.g.
see https://godbolt.org/z/jYxsoc89c

Diff Detail

Event Timeline

fhahn created this revision.May 9 2022, 7:31 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 9 2022, 7:31 AM
fhahn requested review of this revision.May 9 2022, 7:31 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 9 2022, 7:31 AM
dmgreen accepted this revision.May 10 2022, 3:02 AM

Sounds good to me.

Do we need aarch64_neon_fmaxnm? Or is it equivalent to llvm.fmaxnum? We convert it late in the backend, maybe we should be doing that in the frontend instead, and removing the need for aarch64_neon_fmaxnm entirely. It would allow a lot more optimizations like this without the need to implement them all specifically.

This revision is now accepted and ready to land.May 10 2022, 3:02 AM
fhahn added a subscriber: scanon.May 10 2022, 3:23 AM

Sounds good to me.

Do we need aarch64_neon_fmaxnm? Or is it equivalent to llvm.fmaxnum? We convert it late in the backend, maybe we should be doing that in the frontend instead, and removing the need for aarch64_neon_fmaxnm entirely. It would allow a lot more optimizations like this without the need to implement them all specifically.

I *think* they behave differently with respect to signaling NaNs. IIUC llvm.maxnum always returns quite NaNs:

‘llvm.maxnum.*’ Intrinsic¶

Follows the IEEE-754 semantics for maxNum except for the handling of signaling NaNs. This matches the behavior of libm’s fmax.

If either operand is a NaN, returns the other non-NaN operand. Returns NaN only if both operands are NaN. The returned NaN is always quiet. If the operands compare equal, returns a value that compares equal to both operands. This means that fmax(+/-0.0, +/-0.0) could return either -0.0 or 0.0.

AArch64's FMAXNM's says NaNs are handled according to the IEEE 754-2008 standard., so it seems like it would handle signaling NaNs, but I may be wrong. Perhaps @scanon knows more.

This revision was landed with ongoing or failed builds.May 10 2022, 12:00 PM
This revision was automatically updated to reflect the committed changes.