This is an archive of the discontinued LLVM Phabricator instance.

Allow __fp16 as a function arg or return type for AArch64
ClosedPublic

Authored by olista01 on Jul 10 2014, 2:54 AM.

Details

Reviewers
olista01
Summary

ACLE 2.0 [1] allows __fp16 to be used as a function argument or return type. This enables this for AArch64.

I have not enabled this for 32-bit ARM targets yet, as we are expecting to release an updated version of the 32-bit AAPCS soon, which changes the handling of __fp16 in a non backwards-compatible way.

This also fixes an existing bug that causes clang to not allow homogeneous floating-point aggregates with a base type of __fp16. This is valid for AAPCS64, but not for AAPCS-VFP.

[1] http://infocenter.arm.com/help/topic/com.arm.doc.ihi0053c/IHI0053C_acle_2_0.pdf

Diff Detail

Event Timeline

olista01 updated this revision to Diff 11255.Jul 10 2014, 2:54 AM
olista01 retitled this revision from to Allow __fp16 as a function arg or return type for AArch64.
olista01 updated this object.
olista01 edited the test plan for this revision. (Show Details)
olista01 added a subscriber: Unknown Object (MLST).

Hi Oliver,

I've done some more thinking about this, and posted an RFC to llvmdev and cfe-dev with the direction I'd like to pursue here longer term.

Regardless of the outcome of that more general proposal, I don't think the AArch64 backend is ready to cope with the code this patch produces yet. Allowing "half" as a function argument immediately introduces more possible operations to the IR we have to deal with: bitcast and load/store for one.

For example, all three of these functions exhibit poor behaviour at the moment:

__fp16 varFloat;

short foo(__fp16 in) {
  // __fp16 bitcast lowered to load/store
  return *(short *)∈
}

__fp16 bar(short in) {
  // __fp16 bitcast lowered to load/store
  return *(__fp16 *)∈
}

float baz() {
  // extload would be created in DAG and crash ISel. Crashes clang instead.
  return varFloat;
}

The last one actually crashes in Clang itself, which is worrying, since I was expecting an LLVM backend crash and hadn't spotted anything obviously wrong with this patch. I've not investigated exactly what's wrong there though.

The bitcasts are particularly nasty because, as far as I can see, there *is* no generic way to express a "half <-> i16" bitcast when the latter type is illegal (as it is for us). The best I've come up with this afternoon is EXTRACT_SUBREG or SUBREG_TO_REG, but creating those during ISelLowering is really ugly; if you have any suggestions...

Cheers.

Tim.

I've also just noticed this highly disturbing example:

#include <arm_neon.h>
#if WTF
short varInt;
__fp16 varFloat;

short foo(__fp16 in) {
  return *(short *)&in;
}

__fp16 bar(short in) {
  return *(__fp16 *)&in;
}

float baz() {
  //  return varFloat;
}
#endif


int16x4_t tim(float16x4_t a) {
  return vreinterpret_s16_f16(a);
  //  return vreinterpret_s16_f16(a);
}

The parameter type for "tim" changes for me depending on whether WTF is defined (and if it's "<4 x half>", AArch64 copes badly).

Cheers.

Tim.

olista01 updated this revision to Diff 11981.Jul 29 2014, 8:00 AM

The changing parameter type in your second example was due to clang caching the mapping between clang::Type and llvm::Type. This also hid a bug where HFAs would still be emitted using i16 unless there was a bare __fp16 parameter earlier in the file.

Hi Oliver,

I don't think the right way to handle this is by mapping __fp16 to
different types depending on the context. It's a hack in the front-end
to work around some not particularly difficult backend problems.

If we need __fp16 to map to half, we should make the extra effort so
that it can be a blanket change. It's the right direction to go in
anyway.

Cheers.

Tim.

olista01 updated this revision to Diff 12436.Aug 13 2014, 2:21 AM

This patch now makes half a valid IR type that can be emitted for AArch64. It can be passed to and returned from functions by value, and we use the normal IR instructions to convert between half and float, but we retain the C semantics of always promoting to float before performing any arithmetic operations on them.

olista01 accepted this revision.Sep 29 2014, 9:56 AM
olista01 added a reviewer: olista01.

This was accepted on the list and committed > 1 month ago, but this did not make it into phab.

This revision is now accepted and ready to land.Sep 29 2014, 9:56 AM
olista01 closed this revision.Sep 29 2014, 9:56 AM