Added the remaining macros and corresponding regression tests.
rengolin t.p.northover richard.barton.arm rsmith
- rGf5a8e6c5ab13: Implement ACLE 2.0 macros of chapters 6.6 and 6.7 for [ARM] and [Aarch64]…
rL249140: Implement ACLE 2.0 macros of chapters 6.6 and 6.7 for [ARM] and [Aarch64]…
rC249140: Implement ACLE 2.0 macros of chapters 6.6 and 6.7 for [ARM] and [Aarch64]…
|4785 ↗||(On Diff #34029)|
I'm not convinced by the runtime rounding being depending on C99, or why it's not possible in freestanding environments.
|4790 ↗||(On Diff #34029)|
Er, well, the ACLE doc says it wasn't adopted in C11, but that doesn't mean that it was present in all others, just that there is a proposal for C11 which wasn't accepted.
I think this macro needs some other mechanism to be set, and unless Clang/LLVM supports signaling NANs unconditionally, we shouldn't add it.
|5270 ↗||(On Diff #34029)|
same comments as above.
I am uncomfortable with this patch for a number of reasons. These macros seem to me to be defined by the ACLE as describing the behaviour of the combination of library and compiler. For example, the STDC_IEC_599 macro would need some standards compliance from the compiler in terms of optimisations on floating point functions, and from the libraries in terms of the behaviour of the fp functions there. Thus, I don't think it is safe for clang to define this macro for all ARM compilation regardless of which library is being used to link with. Perhaps I have misunderstood that one - so it would be worth running this past Tim directly as the __ARM_FP_FENV_ROUNDING definition comes from his original AArch64 patches from early 2013.
If I am right - then clang would want to adopt an approach similar to GCC's described here https://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html (grep for __GCC_IEC_559) where the compiler sets a macro which presumably is read by a compliant c library header which then sets the ACLE macro.
However, there are multiple bug reports in the area of FP math - see https://llvm.org/bugs/show_bug.cgi?id=8100 - which makes me think that for Clang to really be ACLE compliant, we should not be defining any of these macros apart from __ARM_FP_FAST. Additionally a bit of research on the devs list suggest that LLVM does not support signaling NaNs e.g. http://lists.llvm.org/pipermail/llvm-dev/2014-April/071699.html and http://lists.llvm.org/pipermail/llvm-dev/2014-August/075749.html from last year
I suggest that the patch needs to be resubmitted without the setting of ARM_FP_FENV_ROUNDING, STDC_IEC_559 and SUPPORT_SNAN, and that the setting of ARM_FP_FENV_ROUNDING in the aarch64 target be clarified.
The original patch was following a lot from the AArch64, so I suspect we should first fix the AArch64 side, so that we can follow in a similar way. I don't want to have two different behaviour for ARM and AArch64 regarding macros and ACLE support. But I also don't think we should include AArch64 code with this patch.
Yes - I would be interested in knowing the rationale behind the original AArch64 setting of __ARM_FP_FPENV_ROUNDING.
I also question whether our pre-conditions for __ARM_FP_FAST are sufficient. There are many -f... options relating to maths operations, are these two the only ones that might not preserve the order of operations? I think clang ought to take a conservative stance when setting these flags.
GCC online docs state:
Might allow some programs designed to not be too dependent on IEEE behavior for floating-point to run faster, or die trying. Sets -funsafe-math-optimizations, -ffinite-math-only, and -fno-trapping-math.
Allow optimizations that may be give incorrect results for certain IEEE inputs.
Allow optimizations for floating-point arithmetic that assume that arguments and results are not NaNs or +-Infs.
Allow the compiler to assume that floating-point arithmetic will not generate traps on any inputs. This is useful, for example, when running a program using IEEE "non-stop" floating-point arithmetic.
Does this mean, define _ARM_FP_FAST when all these three flags are set (-funsafe-math-optimizations, -ffinite-math-only, and -fno-trapping-math)? We are currently checking ("-ffast-math" or "-ffinite-math-only").
ACLE docs state:
__ARM_FP_FAST is defined to 1 if floating-point optimizations may occur such that the computed results are different from those prescribed by the order of operations according to the C standard. Examples of such optimizations would be reassociation of expressions to reduce depth, and replacement of a division by constant with multiplication by its reciprocal.
I can't explain the Opts.C99 (fesetround is provided by conforming C++11 implementations too), but I think I was just trying to capture the fact that decent hosted platforms support fesetround but freestanding ones don't (not having any library).
Supporting "C89 but we also have math.h" (or more realistically "C++03 with math.h") wasn't really something that interested me: users should be encouraged to write to standards.
This seems to be specific to the order of operations, which is related to "-funsafe-math-optimizations", but not necessarily the other three. If Clang has, and sets the same flags as GCC docs imply, we should only set the macro for that one flag.
@t.p.northover I think we should not be defining _ARM_FP_FENV_ROUNDING on neither of ARM and AArch64 targets since "-frounding-math" is not available on clang: clang/llvm don't support C99 FP rounding mode pragmas (FENV_ACCESS etc)
@rengolin Clang translates frontend GCC compliant flags to cc1 flags:
- -fno-honor-infinities => -menable-no-infs
- -fno-honor-nans => -menable-no-nans
- AND( -fassociative-math, -freciprocal-math, -fno-signed-zeros, -fno-trapping-math, -fno-math-errno ) => -menable-unsafe-fp-math
Those flags seem to affect an LLVM IR in two ways:
- function attributes (no-infs-fp-math = yes|no, no-nans-fp-math = yes|no, unsafe-fp-math = yes|no)
- fast-math flags used with generated fp-math operators (this doc summarizes the LLVM-IR fast-math flags)
I think we should not be checking Opts.FiniteMathOnly since it only indicates whether optimizations assume the arguments and result to be +/-inf. We should not be checking Opts.FastMath either since it might be the case where -menable-unsafe-fp-math is set but -ffast-math is not set. Then the user would rely on _ARM_FP_FAST being absent but in reality the backend would emit unsafe fp math IR. My patch introduces a new lang option to handle this case.
|1711 ↗||(On Diff #35628)|
Well there is no such flag (-mno-enable-unsafe-fp-math) but probably what you want to say is that the order of the front-end flags matters. In llvm/tools/clang/test/Driver/fast-math.c there are tests checking whether "-menable-unsafe-fp-math" is correctly set depending on the order of the front-end flags. I could modify my tests so that they pass cc1 flags. Then we would be checking if _ARM_FP_FAST gets defined in the presence of "-menable-unsafe-fp-math". Would that be preferable?
In the overall approach, whilst this is the pragmatic approach to get the job done, it clearly takes liberties with the definition of LangOpt as this your new flag is most certainly a CodeGenOpt rather than a LangOpt. I'd like to see what the others say as I don't see a better solution.
|1711 ↗||(On Diff #35628)|
Ok - I did not realise that by this point in the code we know this has been taken care of. If that argument processing is already covered by tests then there is no need to add more and I retract my suggestion.
What do you think about this Renato? Although Alexandros has added a LangOption which is strictly speaking to do with CodeGen rather than Lang, I think this is the only pragmatica way to do this without needing re-engineering work to expose the codegen options to this phase in clang.
Are you and I able to make the call to let Alexandros commit, or do you think we ought to seek guidance from elsewhere? Do you have an idea who we could go to?