This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Pass _Float16 as int or float
ClosedPublic

Authored by SjoerdMeijer on Jan 19 2018, 2:35 PM.

Details

Summary

Pass and return _Float16 as if it were an int or float for ARM, but with the
top 16 bits unspecified, similar to what we already do for __fp16.

We will implement proper half-precision function argument lowering in the ARM backend
soon, but want to use this workaround in the meantime.

Diff Detail

Repository
rL LLVM

Event Timeline

SjoerdMeijer created this revision. Jan 19 2018, 2:35 PM
samparker added a subscriber: samparker. Edited Jan 22 2018, 8:27 AM

Hi Sjoerd,

Seems sensible to me to treat these two types the same way, though I must admit having different half types confuses me... So a few questions for my understanding:

  • What issue are you trying to work around?
  • What would the ideal solution be?
  • Why do we need a workaround instead of implementing the ideal solution?

cheers!

test/CodeGen/arm-float16-arguments.c
4 ↗(On Diff #130694)

Probably worth keeping these in arm-fp16-arguments.c since they're basically the same.

Thanks for reviewing!

We are trying to achieve correct AAPCS parameter passing:

"If the argument is a Half-precision Floating Point Type its size is set to 4 bytes as if it
had been copied to the least significant bits of a 32-bit register and the remaining bits filled
with unspecified values"

and for returning results:

"A Half-precision Floating Point Type is returned in the least significant 16 bits of r0."

Summarising: AAPCS compliance for passing/returning _Float16 values.
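
As a rough illustration of the two quoted rules (not part of the patch), the 32-bit container can be modelled in plain C on raw bit patterns. The helper names `widen_half_bits` and `narrow_half_bits` and the `filler` parameter are hypothetical; `filler` only stands in for whatever happens to be in the unspecified upper bits.

```c
#include <stdint.h>

/* Hypothetical helpers, not taken from the patch: they model the 32-bit
   container the AAPCS describes for a half-precision value. */

/* Caller side: the half's 16-bit pattern goes into the least significant
   bits. `filler` stands in for the unspecified upper 16 bits. */
static uint32_t widen_half_bits(uint16_t half_bits, uint16_t filler) {
    return ((uint32_t)filler << 16) | (uint32_t)half_bits;
}

/* Callee side: only the low 16 bits carry meaning, so the upper half of
   the container is discarded. */
static uint16_t narrow_half_bits(uint32_t container) {
    return (uint16_t)(container & 0xFFFFu);
}
```

Whatever value is used for `filler`, `narrow_half_bits` recovers the same 16-bit pattern, which is the property the callee has to rely on.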

The ideal solution would be to lower this:

_Float16 sub(_Float16 a, _Float16 b) {
  return a + b;
}

to this:

define half @sub(half %a, half %b) local_unnamed_addr {
entry:
  %add = fadd half %a, %b
  ret half %add
}

but with this patch we are generating:

define float @sub(float %a.coerce, float %b.coerce) local_unnamed_addr #0 {
entry:
  %0 = bitcast float %a.coerce to i32
  %tmp.0.extract.trunc = trunc i32 %0 to i16
  %1 = bitcast i16 %tmp.0.extract.trunc to half
  <SNIP>
  %add = fadd half %1, %3
  <SNIP>
}

With this, we pass a float and interpret only its lower 16 bits (the return value
is handled similarly; I've omitted that here).
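
The coercion sequence in the IR above (bitcast float to i32, trunc to i16, bitcast to half) can be sketched in portable C on bit patterns. This is only an illustration under assumptions: the helper names are hypothetical, `float` is assumed to be 32 bits wide, and the caller-side helper zeroes the upper bits purely for convenience, since the ABI leaves them unspecified.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit type, as it is on ARM. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical callee-side helper mirroring the IR: recover the half's
   bit pattern from a float-typed argument. */
static uint16_t half_bits_from_coerced(float coerced) {
    uint32_t word;
    memcpy(&word, &coerced, sizeof word); /* bitcast float -> i32 */
    return (uint16_t)word;                /* trunc i32 -> i16 */
    /* The IR then bitcasts the i16 to half; here we stop at the raw bits. */
}

/* Hypothetical caller-side helper: put the half's bit pattern into the low
   16 bits of a float. The upper bits are zero here only by choice. */
static float coerce_half_bits(uint16_t half_bits) {
    uint32_t word = half_bits;
    float coerced;
    memcpy(&coerced, &word, sizeof coerced); /* bitcast i32 -> float */
    return coerced;
}
```

The float is never used arithmetically; it serves only as a 32-bit carrier for the 16-bit payload.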

Thus, we are working around the problem of legalizing f16 arguments/return values:
we now do this in Clang and thus don't have to do anything at all in the backend.
This is a two-line change that lets us make progress with the Armv8.2-A FP16 tablegen
descriptions and also start testing/using them; adjusting the calling conventions in the backend is
a bit more involved. I will start working on the ideal solution now, and once that is in place,
we can properly pass the half types and remove this workaround.

Moved the tests to the existing file (and fixed a few typos in the tests).

samparker accepted this revision. Jan 23 2018, 1:36 AM

Thanks for the explanation, LGTM, thanks!

This revision is now accepted and ready to land. Jan 23 2018, 1:36 AM
This revision was automatically updated to reflect the committed changes.