In the special case, X = -1, Y = 0 and Z = 0 that (-(-1))*0+(-0)=0 is not equal to -(-1*0+0)=-0.
So it is a incorrect transform.
Details
- Reviewers
steven.zhang RKSimon spatel qiucf
Diff Detail
- Repository
- rG LLVM Github Monorepo
Time | Test | |
---|---|---|
40 ms | linux > Clang.CodeGen::lto-newpm-pipeline.c Script:
--
: 'RUN: at line 3'; /mnt/disks/ssd0/agent/llvm-project/build/bin/clang -cc1 -internal-isystem /mnt/disks/ssd0/agent/llvm-project/build/lib/clang/12.0.0/include -nostdsysteminc -triple x86_64-unknown-linux-gnu -emit-llvm-bc -o /dev/null -fexperimental-new-pass-manager -fdebug-pass-manager -flto=full -O0 /mnt/disks/ssd0/agent/llvm-project/clang/test/CodeGen/lto-newpm-pipeline.c 2>&1 | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/clang/test/CodeGen/lto-newpm-pipeline.c -check-prefix=CHECK-FULL-O0
| |
380 ms | linux > HWAddressSanitizer-x86_64.TestCases::sizes.cpp Script:
--
: 'RUN: at line 3'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang --driver-mode=g++ -m64 -gline-tables-only -fsanitize=hwaddress -fuse-ld=lld -mcmodel=large -mllvm -hwasan-globals -mllvm -hwasan-use-short-granules -mllvm -hwasan-instrument-landing-pads=0 -mllvm -hwasan-instrument-personality-functions /mnt/disks/ssd0/agent/llvm-project/compiler-rt/test/hwasan/TestCases/sizes.cpp -nostdlib++ -lstdc++ -o /mnt/disks/ssd0/agent/llvm-project/build/projects/compiler-rt/test/hwasan/X86_64/TestCases/Output/sizes.cpp.tmp
| |
720 ms | linux > LLVM.Other::new-pass-manager.ll Script:
--
: 'RUN: at line 8'; /mnt/disks/ssd0/agent/llvm-project/build/bin/opt -disable-output -disable-verify -debug-pass-manager -passes=no-op-module /mnt/disks/ssd0/agent/llvm-project/llvm/test/Other/new-pass-manager.ll 2>&1 | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/llvm/test/Other/new-pass-manager.ll --check-prefix=CHECK-MODULE-PASS
|
Event Timeline
No test changes?
The transform would be legal with no signed zeros fast math flag right? I believe that's checked in getNegatedExpression. But not checked in the X86 specific override.
I think the same issue may also exist in ARMInstrVFP.td. It matches both (fneg (fma x, y, z)) and (fma (fneg x), y, (fneg z)) to the same instruction.
// Match @llvm.fma.* intrinsics // (fneg (fma x, y, z)) -> (vfnma z, x, y) def : Pat<(fneg (fma (f64 DPR:$Dn), (f64 DPR:$Dm), (f64 DPR:$Ddin))), (VFNMAD DPR:$Ddin, DPR:$Dn, DPR:$Dm)>, Requires<[HasVFP4,HasDPVFP]>; def : Pat<(fneg (fma (f32 SPR:$Sn), (f32 SPR:$Sm), (f32 SPR:$Sdin))), (VFNMAS SPR:$Sdin, SPR:$Sn, SPR:$Sm)>, Requires<[HasVFP4]>; def : Pat<(fneg (fma (f16 HPR:$Sn), (f16 HPR:$Sm), (f16 (f16 HPR:$Sdin)))), (VFNMAH (f16 HPR:$Sdin), (f16 HPR:$Sn), (f16 HPR:$Sm))>, Requires<[HasFullFP16]>; // (fma (fneg x), y, (fneg z)) -> (vfnma z, x, y) def : Pat<(f64 (fma (fneg DPR:$Dn), DPR:$Dm, (fneg DPR:$Ddin))), (VFNMAD DPR:$Ddin, DPR:$Dn, DPR:$Dm)>, Requires<[HasVFP4,HasDPVFP]>; def : Pat<(f32 (fma (fneg SPR:$Sn), SPR:$Sm, (fneg SPR:$Sdin))), (VFNMAS SPR:$Sdin, SPR:$Sn, SPR:$Sm)>, Requires<[HasVFP4]>; def : Pat<(f16 (fma (fneg (f16 HPR:$Sn)), (f16 HPR:$Sm), (fneg (f16 HPR:$Sdin)))), (VFNMAH (f16 HPR:$Sdin), (f16 HPR:$Sn), (f16 HPR:$Sm))>, Requires<[HasFullFP16]>;
Yes, it is legal with no signed zeros fast math flag. I see that PowerPC deal with this case specifically.
Should I add the condition with no signed zero to permit this transform instead of deleting it?
I think so. As Craig pointed out, the default implementation of getNegatedExpression will take care of the fast-math flags. You need check the nsz inside X86::getNegatedExpression() when perform some folding that might change the sign bit of zero.
Also, target specific test is missing. The powerpc part test change is unexpected as there is already nsz flags there.
+1. The bug(s) appear to be in target-specific code, so we need a minimal test for x86 (and ARM and possibly others) to go with the code fix. IIUC, something like this:
define double @fneg_fma(double %x, double %y, double %z) { %negx = fneg double %x %negz = fneg double %z %fma = call double @llvm.fma.f64(double %negx, double %y, double %negz) %n = fneg double %fma ret double %n }
Currently, that is transformed to vfmadd213sd, but that's a miscompile. We can simulate that in IR with Alive2:
https://alive2.llvm.org/ce/z/XxwBAJ