This is a follow-up to D92940.
We have successfully converted the fadd/fmul _mm_reduce_* intrinsics to
llvm.reduction + the reassoc flag. We can take the same approach for fmin/fmax
too, i.e. llvm.reduction + the nnan flag.
Differential D93179
[X86] Convert fmin/fmax _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506)
Authored by pengfei on Dec 13 2020, 8:15 AM.
Event Timeline

Whatever the final decision is, maybe add a doxygen comment explaining the semantics?

Yes, it is using maxpd/minpd here. Sorry for missing the nan cases.

If we're going by existing behavior/compatibility, gcc/icc use packed ops too:

I agreed. I have filed a bug internally for intrinsic guide.

Hi Simon, I found we have the same problem for fadd/fmul. See https://godbolt.org/z/3YKaGx

I'm not surprised - my current plan (after the holidays) is to add doxygen descriptions for all the reduction intrinsics and then update them making it clear what fast-math flags are assumed.

That's great. We are also updating intrinsic guide for such information. Anyway, have a good holiday.

@pengfei I'm sorry I haven't gotten back to looking at this yet - it makes sense to create a patch to revert the fadd/fmul reduction changes for trunk/12.x.

Address Sanjay's comment.