
[CodeGen][ARM] Coerce FP16 vectors to integer vectors when needed

Authored by miyuki on Aug 9 2018, 5:03 AM.



On targets that do not support FP16 natively, LLVM currently legalizes
vectors of FP16 values by scalarizing them and promoting to FP32. This
causes problems for the following code:

void foo(int, ...);

typedef __attribute__((neon_vector_type(4))) __fp16 float16x4_t;
void bar(float16x4_t x) {
  foo(42, x);
}

According to the AAPCS (appendix A.2), float16x4_t is a containerized
vector fundamental type, so 'foo' expects the four 16-bit FP values
to be packed into two 32-bit registers. Instead, 'bar' promotes them to
four single-precision values.
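
The packed layout that 'foo' expects can be illustrated in portable C. This is
only a sketch of the data layout, not the code the patch generates: the helper
names pack_half4 and unpack_half4 are hypothetical, the FP16 values are handled
as raw 16-bit patterns, and the lane order (lane 0 in the low half of the first
word) is an assumption that matches a little-endian target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the patch): pack four raw FP16 bit
   patterns into two 32-bit words. Lane 0 goes into the low half of
   words[0]; this lane order is an assumption (little-endian target). */
static void pack_half4(const uint16_t lanes[4], uint32_t words[2]) {
  assert(lanes && words);
  words[0] = (uint32_t)lanes[0] | ((uint32_t)lanes[1] << 16);
  words[1] = (uint32_t)lanes[2] | ((uint32_t)lanes[3] << 16);
}

/* Hypothetical inverse: recover the four 16-bit lanes from the two words. */
static void unpack_half4(const uint32_t words[2], uint16_t lanes[4]) {
  assert(lanes && words);
  lanes[0] = (uint16_t)(words[0] & 0xFFFFu);
  lanes[1] = (uint16_t)(words[0] >> 16);
  lanes[2] = (uint16_t)(words[1] & 0xFFFFu);
  lanes[3] = (uint16_t)(words[1] >> 16);
}
```

For the lanes 1.0, 2.0, 3.0 and 4.0 (FP16 bit patterns 0x3C00, 0x4000, 0x4200
and 0x4400), pack_half4 produces the words 0x40003C00 and 0x44004200, that is,
64 bits in total rather than four separate 32-bit floats.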

Since we already handle scalar FP16 values in the frontend by
bitcasting them to/from integers, this patch adds similar handling for
vector types and homogeneous FP16 vector aggregates.

One existing test required some adjustments because we now generate
more bitcasts (so the patch changes the test to target a machine with
native FP16 support).

Diff Detail

rC Clang

Event Timeline

miyuki created this revision. Aug 9 2018, 5:03 AM

Do we need to check for homogeneous aggregates of half vectors somewhere?

miyuki updated this revision to Diff 160121. Edited Aug 10 2018, 9:04 AM
miyuki edited the summary of this revision.

Fixed handling of homogeneous aggregates of FP16 vectors

efriedma added inline comments. Aug 10 2018, 12:53 PM

Do we need equivalent code in classifyReturnType?

miyuki updated this revision to Diff 160317. Aug 13 2018, 3:18 AM

Handle return of homogeneous aggregates

miyuki marked an inline comment as done. Aug 13 2018, 3:26 AM
javed.absar accepted this revision. Sep 11 2018, 1:53 AM
This revision is now accepted and ready to land. Sep 11 2018, 1:53 AM
This revision was automatically updated to reflect the committed changes.