This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Handle vec4 sitofp and uitofp for half
ClosedPublic

Authored by pirama on Apr 21 2015, 12:33 PM.

Details

Summary

Set operation action for SINT_TO_FP and UINT_TO_FP nodes with v4i32
inputs to allow promotion of v4f16 results.

The conversions from v4i64 to v4f16 do not depend on this patch - v4i64
is split and the conversion gets handled while lowering v2i64. I am
adding a test here for completeness.

Diff Detail

Event Timeline

pirama updated this revision to Diff 24155.Apr 21 2015, 12:33 PM
pirama retitled this revision from to [AArch64] Handle vec4 sitofp and uitofp for half.
pirama updated this object.
pirama edited the test plan for this revision. (Show Details)
pirama added reviewers: aemerson, rengolin, ab, jmolloy, srhines.
pirama added a subscriber: Unknown Object (MLST).

IIUC, this patch shouldn't affect v4i32 to v4f32 or v4f64 operations, but couldn't find any tests for those operations. I confirmed that locally, and can add tests with this patch if requested.

In a test of v4f16 operations (similar to the comprehensive test in http://reviews.llvm.org/D8648), only sitofp and uitofp cause problems, and will be fixed with this patch.

pirama updated this revision to Diff 24250.Apr 22 2015, 12:42 PM
  • Handled i8, i16 conversions
  • Added tests for v8f16 and v16f16 tests.
jmolloy accepted this revision.Apr 22 2015, 12:50 PM
jmolloy edited edge metadata.

Hi,

This looks fine to me, but please wait for Ahmed's LGTM.

Cheers,

James

This revision is now accepted and ready to land.Apr 22 2015, 12:50 PM
ab accepted this revision.Apr 22 2015, 2:17 PM
ab edited edge metadata.

LGTM as well (with a couple of nits), thanks!

lib/Target/AArch64/AArch64ISelLowering.cpp
554–566

Like you did below, another comment for these might be useful, to make it clear v8iN is possible when you do ->v8f16, like v4iN is when you do ->v4f32.

test/CodeGen/AArch64/fp16-v16-instructions.ll
25–32

And this kind of thing is why I abhor the ARM-style NEON syntax, and much prefer Apple's.

Though conversions aren't a very good example, and you can just match the register number rather than the entire thing, like you do for S0 below. Anyway, enough ranting, I'm fine with this ;)

test/CodeGen/AArch64/fp16-v8-instructions.ll
340–341

What about the higher lanes? (another rant: that's why I like over-using -NEXT, it avoids this kind of question entirely).

pirama added inline comments.Apr 22 2015, 2:37 PM
test/CodeGen/AArch64/fp16-v8-instructions.ll
340–341

Good catch. I'll fix this.

FileCheck should support CHECK-DAG-STRICT, or something similar, and reject any line without an entry in the CHECK-DAG-STRICT block. Then we no longer have to compromise between CHECK-NEXT and hardening against schedule variations.

pirama updated this revision to Diff 24259.Apr 22 2015, 2:41 PM
pirama edited edge metadata.

Added comment and fixed checks in a test.

ab added a comment.Apr 22 2015, 5:32 PM

Better indeed, please commit. Thanks!

(Also, when you receive a conditional LGTM like this, and agree with all the proposed changes, I don't think there's need for another review)

This revision was automatically updated to reflect the committed changes.