Page MenuHomePhabricator

Implement ACLE 2.0 macros of chapters 6.6 and 6.7 for [ARM] and [Aarch64] targets
ClosedPublic

Authored by labrinea on Sep 4 2015, 7:53 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

labrinea updated this revision to Diff 34029.Sep 4 2015, 7:53 AM
labrinea retitled this revision from to Implement ACLE 2.0 macros of chapters 6.6 and 6.7 for [ARM] and [Aarch64] targets.
labrinea updated this object.
labrinea added a subscriber: cfe-commits.
rengolin added inline comments.Sep 9 2015, 6:33 AM
lib/Basic/Targets.cpp
4785 ↗(On Diff #34029)

I'm not convinced by the runtime rounding being depending on C99, or why it's not possible in freestanding environments.

4790 ↗(On Diff #34029)

Er, well, the ACLE doc says it wasn't adopted in C11, but that doesn't mean that it was present in all others, just that there is a proposal for C11 which wasn't accepted.

I think this macro needs some other mechanism to be set, and unless Clang/LLVM supports signaling NANs unconditionally, we shouldn't add it.

5270 ↗(On Diff #34029)

same comments as above.

labrinea added inline comments.Sep 9 2015, 7:35 AM
lib/Basic/Targets.cpp
4785 ↗(On Diff #34029)

I copied the same logic from AArch64.

4786 ↗(On Diff #34029)

Shall we define this unconditionally?

4790 ↗(On Diff #34029)

Maybe we could skip this macro in this patch? @richard.barton.arm

5270 ↗(On Diff #34029)

This was present already.

Adding Richard Smith, as he was the one choosing C99/!freestanding for runtime rounding.

lib/Basic/Targets.cpp
4790 ↗(On Diff #34029)

If this item sparks controversy, we should keep it out. For now, I'm interested in what Tim and Richard(s) have to say.

richard.barton.arm edited edge metadata.EditedSep 17 2015, 5:43 AM

Hi all

I am uncomfortable with this patch for a number of reasons. These macros seem to me to be defined by the ACLE as describing the behaviour of the combination of library and compiler. For example, the STDC_IEC_599 macro would need some standards compliance from the compiler in terms of optimisations on floating point functions, and from the libraries in terms of the behaviour of the fp functions there. Thus, I don't think it is safe for clang to define this macro for all ARM compilation regardless of which library is being used to link with. Perhaps I have misunderstood that one - so it would be worth running this past Tim directly as the __ARM_FP_FENV_ROUNDING definition comes from his original AArch64 patches from early 2013.

If I am right - then clang would want to adopt an approach similar to GCC's described here https://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html (grep for __GCC_IEC_559) where the compiler sets a macro which presumably is read by a compliant c library header which then sets the ACLE macro.

However, there are multiple bug reports in the area of FP math - see https://llvm.org/bugs/show_bug.cgi?id=8100 - which makes me think that for Clang to really be ACLE compliant, we should not be defining any of these macros apart from __ARM_FP_FAST. Additionally a bit of research on the devs list suggest that LLVM does not support signaling NaNs e.g. http://lists.llvm.org/pipermail/llvm-dev/2014-April/071699.html and http://lists.llvm.org/pipermail/llvm-dev/2014-August/075749.html from last year

I suggest that the patch needs to be resubmitted without the setting of ARM_FP_FENV_ROUNDING, STDC_IEC_559 and SUPPORT_SNAN, and that the setting of ARM_FP_FENV_ROUNDING in the aarch64 target be clarified.

labrinea updated this revision to Diff 35223.Sep 21 2015, 3:55 AM
labrinea edited edge metadata.

Patch updated.

@rsmith @t.p.northover could you please give us some feedback on this? Thanks.

Richard,

The original patch was following a lot from the AArch64, so I suspect we should first fix the AArch64 side, so that we can follow in a similar way. I don't want to have two different behaviour for ARM and AArch64 regarding macros and ACLE support. But I also don't think we should include AArch64 code with this patch.

cheers,
--renato

Yes - I would be interested in knowing the rationale behind the original AArch64 setting of __ARM_FP_FPENV_ROUNDING.

I also question whether our pre-conditions for __ARM_FP_FAST are sufficient. There are many -f... options relating to maths operations, are these two the only ones that might not preserve the order of operations? I think clang ought to take a conservative stance when setting these flags.

I think clang ought to take a conservative stance when setting these flags.

Exactly my point. This is one of the few examples where being pragmatic may lead to big headaches in the future. :)

GCC online docs state:

-ffast-math
Might allow some programs designed to not be too dependent on IEEE behavior for floating-point to run faster, or die trying. Sets -funsafe-math-optimizations, -ffinite-math-only, and -fno-trapping-math.

-funsafe-math-optimizations
Allow optimizations that may be give incorrect results for certain IEEE inputs.

-ffinite-math-only
Allow optimizations for floating-point arithmetic that assume that arguments and results are not NaNs or +-Infs.

-fno-trapping-math
Allow the compiler to assume that floating-point arithmetic will not generate traps on any inputs. This is useful, for example, when running a program using IEEE "non-stop" floating-point arithmetic.

Does this mean, define _ARM_FP_FAST when all these three flags are set (-funsafe-math-optimizations, -ffinite-math-only, and -fno-trapping-math)? We are currently checking ("-ffast-math" or "-ffinite-math-only").

ACLE docs state:

__ARM_FP_FAST is defined to 1 if floating-point optimizations may occur such that the computed results are different from those prescribed by the order of operations according to the C standard. Examples of such optimizations would be reassociation of expressions to reduce depth, and replacement of a division by constant with multiplication by its reciprocal.

t.p.northover edited edge metadata.Sep 22 2015, 8:38 AM

I can't explain the Opts.C99 (fesetround is provided by conforming C++11 implementations too), but I think I was just trying to capture the fact that decent hosted platforms support fesetround but freestanding ones don't (not having any library).

Supporting "C89 but we also have math.h" (or more realistically "C++03 with math.h") wasn't really something that interested me: users should be encouraged to write to standards.

Cheers.

Tim.

__ARM_FP_FAST is defined to 1 if floating-point optimizations may occur such that the computed results are different from those prescribed by the order of operations according to the C standard. Examples of such optimizations would be reassociation of expressions to reduce depth, and replacement of a division by constant with multiplication by its reciprocal.

This seems to be specific to the order of operations, which is related to "-funsafe-math-optimizations", but not necessarily the other three. If Clang has, and sets the same flags as GCC docs imply, we should only set the macro for that one flag.

labrinea updated this revision to Diff 35628.Sep 24 2015, 7:53 AM
labrinea edited edge metadata.

@t.p.northover I think we should not be defining _ARM_FP_FENV_ROUNDING on neither of ARM and AArch64 targets since "-frounding-math" is not available on clang: clang/llvm don't support C99 FP rounding mode pragmas (FENV_ACCESS etc)

@rengolin Clang translates frontend GCC compliant flags to cc1 flags:

  • -fno-honor-infinities => -menable-no-infs
  • -fno-honor-nans => -menable-no-nans
  • AND( -fassociative-math, -freciprocal-math, -fno-signed-zeros, -fno-trapping-math, -fno-math-errno ) => -menable-unsafe-fp-math

Those flags seem to affect an LLVM IR in two ways:

  • function attributes (no-infs-fp-math = yes|no, no-nans-fp-math = yes|no, unsafe-fp-math = yes|no)
  • fast-math flags used with generated fp-math operators (this doc summarizes the LLVM-IR fast-math flags)

I think we should not be checking Opts.FiniteMathOnly since it only indicates whether optimizations assume the arguments and result to be +/-inf. We should not be checking Opts.FastMath either since it might be the case where -menable-unsafe-fp-math is set but -ffast-math is not set. Then the user would rely on _ARM_FP_FAST being absent but in reality the backend would emit unsafe fp math IR. My patch introduces a new lang option to handle this case.

lib/Frontend/CompilerInvocation.cpp
1711 ↗(On Diff #35628)

What happens if I do -menable-unsafe-fp-math then -mno-enable-unsafe-fp-math? I don't think you have enough regression tests.

labrinea added inline comments.Sep 24 2015, 8:17 AM
lib/Frontend/CompilerInvocation.cpp
1711 ↗(On Diff #35628)

Well there is no such flag (-mno-enable-unsafe-fp-math) but probably what you want to say is that the order of the front-end flags matters. In llvm/tools/clang/test/Driver/fast-math.c there are tests checking whether "-menable-unsafe-fp-math" is correctly set depending on the order of the front-end flags. I could modify my tests so that they pass cc1 flags. Then we would be checking if _ARM_FP_FAST gets defined in the presence of "-menable-unsafe-fp-math". Would that be preferable?

In the overall approach, whilst this is the pragmatic approach to get the job done, it clearly takes liberties with the definition of LangOpt as this your new flag is most certainly a CodeGenOpt rather than a LangOpt. I'd like to see what the others say as I don't see a better solution.

lib/Frontend/CompilerInvocation.cpp
1711 ↗(On Diff #35628)

Ok - I did not realise that by this point in the code we know this has been taken care of. If that argument processing is already covered by tests then there is no need to add more and I retract my suggestion.

@t.p.northover I think we should not be defining _ARM_FP_FENV_ROUNDING on neither of ARM and AArch64 targets since "-frounding-math" is not available on clang: clang/llvm don't support C99 FP rounding mode pragmas (FENV_ACCESS etc)

@t.p.northover : "Ah, that constant folding is definitely a problem. Changing it sounds reasonable.
Tim."

(attaching Tim's comment since it was missing from the phabricator thread)

What do you think about this Renato? Although Alexandros has added a LangOption which is strictly speaking to do with CodeGen rather than Lang, I think this is the only pragmatica way to do this without needing re-engineering work to expose the codegen options to this phase in clang.

Are you and I able to make the call to let Alexandros commit, or do you think we ought to seek guidance from elsewhere? Do you have an idea who we could go to?

rengolin accepted this revision.Oct 2 2015, 7:23 AM
rengolin edited edge metadata.

Sorry, I've been redirected elsewhere for a few days.

I agree that this is the best option, since that's the meaning of the flag, anyway.

LGTM.

--renato

This revision is now accepted and ready to land.Oct 2 2015, 7:23 AM
This revision was automatically updated to reflect the committed changes.