The patch extends AArch64TTIImpl::instCombineIntrinsic to simplify
llvm.aarch64.neon.f{min,max}nm(a, a) -> a.
This helps with simplifying code written using the ACLE, e.g.
see https://godbolt.org/z/jYxsoc89c
Paths
| Differential D125234
[AArch64] Remove redundant f{min,max}nm intrinsics. ClosedPublic Authored by fhahn on May 9 2022, 7:31 AM.
Details Summary The patch extends AArch64TTIImpl::instCombineIntrinsic to simplify This helps with simplifying code written using the ACLE, e.g.
Diff Detail
Event TimelineComment Actions Sounds good to me. Do we need aarch64_neon_fmaxnm? Or is it equivalent to llvm.fmaxnum? We convert it late in the backend, maybe we should be doing that in the frontend instead, and removing the need for aarch64_neon_fmaxnm entirely. It would allow a lot more optimizations like this without the need to implement them all specifically. This revision is now accepted and ready to land.May 10 2022, 3:02 AM Comment Actions
I *think* they behave differently with respect to signaling NaNs. IIUC llvm.maxnum always returns quite NaNs: ‘llvm.maxnum.*’ Intrinsic¶ Follows the IEEE-754 semantics for maxNum except for the handling of signaling NaNs. This matches the behavior of libm’s fmax. If either operand is a NaN, returns the other non-NaN operand. Returns NaN only if both operands are NaN. The returned NaN is always quiet. If the operands compare equal, returns a value that compares equal to both operands. This means that fmax(+/-0.0, +/-0.0) could return either -0.0 or 0.0. AArch64's FMAXNM's says NaNs are handled according to the IEEE 754-2008 standard., so it seems like it would handle signaling NaNs, but I may be wrong. Perhaps @scanon knows more. This revision was landed with ongoing or failed builds.May 10 2022, 12:00 PM Closed by commit rG17a73992dd8b: [AArch64] Remove redundant f{min,max}nm intrinsics. (authored by fhahn). · Explain Why This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 428457 llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
llvm/test/Transforms/InstCombine/AArch64/neon-min-max-intrinsics.ll
|