On x86, a simplified F128 maxval ends up calling fmaxl that does not
work properly for F128 arguments. It is probably an LLVM issue, but
we also should not use arith.maxf if NaN or -0.0 operands are possible.
The change is to use cmpf and select. Unfortunately, these arith ops
do not support FastMathFlags currently, so I will have to fix this
sooner or later (depending on how this affects performance).
Details
Details
Diff Detail
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Comment Actions
Just for information: I could not make clang generate a call to fmaxl. I tried the following test:
#include <stdio.h> void test(__float128 *x) { int i; __float128 max = -1000.0; for (i = 0; i < 4; ++i) { max = (x[i] > max) ? x[i] : max; } printf("%e\n", (double)max); } int main() { __float128 c[4] = {1.0, 1.5, 2.0, 0.5}; test(c); }
clang -fno-signed-zeros -fno-honor-nans maxval.c -O2: InstCombine can recognize the select into call nnan nsz fp128 @llvm.maxnum.f128, but then the instruction selection takes a different route and lowers the call back into select using __gttf2 for FP128 comparison:
CALL64pcrel32 target-flags(x86-plt) &__gttf2, <regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...>, implicit $rsp, implicit $ssp, implicit $xmm0, implicit $xmm1, implicit-def $rsp, implicit-def $ssp, implicit-def $eax %6:gr32 = COPY $eax TEST32rr %6:gr32, %6:gr32, implicit-def $eflags %7:vr128 = CMOV_VR128 %5:vr128, %1:vr128, 15, implicit $eflags
I think generating the llvm.maxnum without nnan nsz is never possible with clang, but if this happens the isel will produce fmaxl call.
Comment Actions
Looks OK.
FYI, there are some changes going on in the arith.maxf area.
https://reviews.llvm.org/D137786
https://reviews.llvm.org/D137655
Comment Actions
Thank you for the links, Kiran! I did not see this.
I think it should be okay to use the select in the meantime.