This is to avoid performance regressions when the default attribute
behavior is fixed to assume IEEE.
I tested the default on x86_64 Ubuntu, which seems to default to
FTZ/DAZ, but I am guessing for x86 and PS4.
Differential D69979
clang: Guess at some platform FTZ/DAZ default settings
Authored by arsenm on Nov 7 2019, 5:45 PM.
Event Timeline

Comment Actions
I checked Redhat 7.4 that's on the server I'm using for work. And I had a coworker check his Ubuntu 18.04 system with this program. And both systems printed 1f80 as the value of MXCSR which shows FTZ and DAZ are both 0. Are you seeing something different?

    #include <x86intrin.h>
    #include <stdio.h>

    int main() {
      int csr = _mm_getcsr();
      printf("%x\n", csr);
      return 0;
    }

Comment Actions
AFAIK, x86(-64) Linux is IEEE-compliant by default. It's only when compiling with -ffast-math that clang/gcc link in the startup routine to set FTZ/DAZ. So this patch should use that same mechanism to set the denorm mode.
See:
@RKSimon - is it the same on PS4?

Comment Actions
Also, I may have missed some discussions. Does this patch series replace the proposal to add instruction-level FMF for denorms? Ie, did we decide that a function-level attribute is good enough?

Comment Actions
I think this is an orthogonal question. I would still find a ftz flag useful even in the presence of this attribute indicating flushing. For AMDGPU it would be useful with a specific instruction context to allow flushing even when the default mode is set to not flush. For example llvm.fmuladd could be emitted with an ftz flag which would select to an instruction that would ordinarily be illegal if denormals are enabled

Comment Actions
I see the value as 1f80.
However the test program I wrote suggests the default is to flush (and what the comments in bug 34994 suggest?):

    In default FP mode
    neg_subnormal + neg_subnormal: -0x0p+0
    neg_subnormal + neg_zero: -0x0p+0
    sqrtf subnormal: 0x0p+0
    sqrtf neg_subnormal: -0x0p+0
    sqrtf neg_zero: -0x0p+0

    With denormals disabled
    neg_subnormal + neg_subnormal: -0x0p+0
    neg_subnormal + neg_zero: -0x0p+0
    sqrtf subnormal: 0x0p+0
    sqrtf neg_subnormal: -0x0p+0
    sqrtf neg_zero: -0x0p+0

    With denormals enabled
    neg_subnormal + neg_subnormal: -0x1p-126
    neg_subnormal + neg_zero: -0x1p-127
    sqrtf subnormal: 0x1.6a09e6p-64
    sqrtf neg_subnormal: -nan
    sqrtf neg_zero: -0x0p+0

    With daz only
    neg_subnormal + neg_subnormal: -0x0p+0
    neg_subnormal + neg_zero: -0x0p+0
    sqrtf subnormal: 0x0p+0
    sqrtf neg_subnormal: -0x0p+0
    sqrtf neg_zero: -0x0p+0

    With ftz only
    neg_subnormal + neg_subnormal: -0x1p-126
    neg_subnormal + neg_zero: -0x0p+0
    sqrtf subnormal: 0x1.6a09e6p-64
    sqrtf neg_subnormal: -nan
    sqrtf neg_zero: -0x0p+0

Comment Actions
Is the test program attached somewhere?

Comment Actions
Thanks. I tried compiling with gcc (can't trust clang since it doesn't honor #pragma STDC FENV_ACCESS ON?).

    With denormals disabled
    a.out: subnormal_test.cpp:33: void fp32_denorm_test(): Assertion `std::fpclassify(subnormal) == FP_SUBNORMAL' failed.

And if you compile with -ffast-math, it asserts:

    In default FP mode
    a.out: subnormal_test.cpp:33: void fp32_denorm_test(): Assertion `std::fpclassify(subnormal) == FP_SUBNORMAL' failed.

This is what I see compiling Craig's csr tester:

    $ cc -O2 csr.c && ./a.out
    1f80
    $ cc -O2 csr.c -ffast-math && ./a.out
    9fc0

FZ is bit 15 (0x8000) and DAZ is bit 6 (0x0040), so they are clear in default (IEEE) mode and set with -ffast-math.

Comment Actions
DAZ/FTZ seem to be set in crtfastmath.o, so try to reproduce the logic for linking that
Formatting nit - prefer to start with a verb and lower-case: isFastMathRuntimeAvailable() or hasFastMathRuntime().