This is a follow up of D92940.
We have successfully converted fadd/fmul _mm_reduce_* intrinsics to
llvm.reduction + reassoc flag. We can do the same approach for fmin/fmax
too, i.e. llvm.reduction + nnan flag.
Paths
| Differential D93179
[X86] Convert fmin/fmax _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506) ClosedPublic Authored by pengfei on Dec 13 2020, 8:15 AM.
Details Summary This is a follow up of D92940. We have successfully converted fadd/fmul _mm_reduce_* intrinsics to
Diff Detail
Event TimelineComment Actions Whatever the final decision is, maybe add a doxygen comment explaining the semantics? Comment Actions Yes, it is using maxpd/minpd here. Sorry for missing the nan cases. Comment Actions If we're going by existing behavior/compatibility, gcc/icc use packed ops too: Comment Actions
I agreed. I have filed a bug internally for intrinsic guide. Comment Actions Hi Simon, I found we have the same problem for fadd/fmul. See https://godbolt.org/z/3YKaGx Comment Actions
I'm not surprised - my current plan (after the holidays) is to add doxygen descriptions for all the reduction intrinsics and then update them making it clear what fast-math flags are assumed. Comment Actions
That's great. We are also updating intrinsic guide for such information. Anyway, have a good holiday. Comment Actions @pengfei I'm sorry I haven't gotten back to looking at this yet - it makes sense to create a patch to revert the fadd/fmul reduction changes for trunk/12.x. Comment Actions
Thanks @RKSimon, I had a try and found maybe we don't need to revert it. See D96231.
Comment Actions Address Sanjay's comment.
This revision is now accepted and ready to land.Feb 14 2021, 7:32 AM This revision was landed with ongoing or failed builds.Feb 14 2021, 5:16 PM Closed by commit rG61da20575d6c: [X86] Convert fmin/fmax _mm_reduce_* intrinsics to emit llvm.reduction… (authored by Wang, Pengfei <pengfei.wang@intel.com>). · Explain Why This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 323653 clang/include/clang/Basic/BuiltinsX86.def
clang/lib/CodeGen/CGBuiltin.cpp
clang/lib/Headers/avx512fintrin.h
clang/test/CodeGen/X86/avx512-reduceMinMaxIntrin.c
|
If I understand correctly, there's nothing preventing -0.0 or NaN values in these ops. But if either of those are present, then the result is potentially indeterminate.
For the LLVM intrinsic, we have this text in LangRef:
"The result will always be a number unless all elements of the vector are NaN. For a vector with minimum element magnitude 0.0 and containing both +0.0 and -0.0 elements, the sign of the result is unspecified."