Implement ACLE 2.0 macros of chapters 6.4 and 6.5 for [ARM] and [Aarch64] targets

Authored by labrinea on Aug 21 2015, 10:08 AM.



Implementation and testing of ACLE 2.0 macros, for details see the above document:
ACLE 2.0

Diff Detail

labrinea retitled this revision from to Implement ACLE 2.0 macros of chapters 6.4 and 6.5 for [ARM] and [Aarch64] targets.Aug 21 2015, 10:08 AM
labrinea updated this object.
labrinea added subscribers: cfe-commits, llvm-commits.

I'm surprised you decided to rearrange the output of the macros. It makes reviewing a lot harder. Is there some special reason to have that in any specific order?


Is it truly impossible for CPUAttr to be empty? Or just coincidental given our tests?

labrinea updated this revision to Diff 32934.Aug 24 2015, 1:20 AM

Reordered macro definitions to make review easier.

labrinea added inline comments.Aug 24 2015, 2:23 AM

According to the ACLE document, current AArch32 NEON implementations do not support double-precision floating-point even when it is present in VFP.

olista01 added inline comments.

The __fp16 type is accepted even if it is not supported by the hardware (library calls are emitted instead), so this can be defined unconditionally. I'm currently working on D12148, which also sets this and _ARM_FP16_ARGS.

labrinea added inline comments.Aug 24 2015, 6:27 AM

Good point. As far as I know there is hardware support for having _fp16 type values as function arguments but not as return values (yet). If so, should _ARM_FP16_ARGS macro be defined or not? (ACLE 2.0: "_ARM_FP16_ARGS is defined to 1 if __fp16 can be used as an argument and result.")

olista01 added inline comments.Aug 24 2015, 6:32 AM

Support for __fp16 args/returns should be independent of hardware support (at least in clang/llvm), as we can always lower to library function calls.

For AArch32: There is currently no support for __fp16 arguments or returns, but D12148 adds both of these, and sets the predefine.

For AArch64: There has been support for __fp16 args and returns for about a year, but we only recently turned the predefine on (D12240/rL245833).

labrinea added inline comments.Aug 24 2015, 6:56 AM

Good. I will update this patch accordingly.

labrinea updated this revision to Diff 33054.EditedAug 25 2015, 1:44 AM

_ARM_FP16_FORMAT_IEEE and _ARM_FP16_ARGS should be defined unconditionally. When hardware does not support it library calls are emitted.

Could I have some feedback please?

Hi Alexandros,

Sorry, I was on holidays. Apart from my comment, everything else looks good.



Isn't there a combination where you'll emit this macro twice?

Also, the target parser code has changed, please make sure it still works with the new version.

labrinea added inline comments.Sep 3 2015, 5:33 AM

This could be addressed as:

bool hasDSP = false;
if (is5EOrAbove && is32Bit && (CPUProfile != "M" || CPUAttr  == "7EM")) {
  Builder.defineMacro("__ARM_FEATURE_DSP", "1");
  hasDSP = true;

bool hasSAT = false;
if ((ArchVersion == 6 && CPUProfile != "M") || ArchVersion > 6 ) {
  Builder.defineMacro("__ARM_FEATURE_SAT", "1");
  hasSAT = true;

if (hasDSP || hasSAT)
  Builder.defineMacro("__ARM_FEATURE_QBIT", "1");
rengolin added inline comments.Sep 3 2015, 5:48 AM

Yup, that looks good.

rengolin accepted this revision.Sep 3 2015, 5:49 AM

With the changes you proposed, LGTM. Thanks!

This revision is now accepted and ready to land.Sep 3 2015, 5:49 AM
This comment was removed by labrinea.
labrinea closed this revision.Oct 9 2015, 3:52 AM

committed as r246768