This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Add nvcast patterns for v4f16 and v8f16
ClosedPublic

Authored by pirama on Apr 22 2015, 10:45 AM.

Details

Summary

Constant stores of f16 vectors can create NvCast nodes from various
operand types to v4f16 or v8f16 depending on patterns in the stored
constants. This patch adds nvcast rules with v4f16 and v8f16 values.

AArch64TargetLowering::LowerBUILD_VECTOR (in AArch64ISelLowering.cpp) has
the details on which constant patterns generate the nvcast nodes.

Event Timeline

pirama updated this revision to Diff 24240. Apr 22 2015, 10:45 AM
pirama retitled this revision to [AArch64] Add nvcast patterns for v4f16 and v8f16.
pirama updated this object.
pirama edited the test plan for this revision.
pirama added reviewers: jmolloy, srhines, ab.
pirama added a subscriber: Unknown Object (MLST).

According to AArch64TargetLowering::LowerBUILD_VECTOR, nvcast nodes from v2f32, v4f32, and v2f64 can also be generated. However, I couldn't construct a constant pattern that produces them.

For completeness, I can add patterns for v2f32 to v4f16, and for v4f32 and v2f64 to v8f16, but I might need some assistance adding tests for them.

jmolloy accepted this revision. Apr 22 2015, 11:05 AM
jmolloy edited edge metadata.

LGTM

This revision is now accepted and ready to land. Apr 22 2015, 11:05 AM
ab accepted this revision. Apr 22 2015, 11:16 AM
ab edited edge metadata.
ab added inline comments.
test/CodeGen/AArch64/fp16-vector-nvcast.ll:6

Note that you can add the nounwind attribute to the function (in a group) to avoid matching those pesky CFI directives.
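
A minimal sketch of how this suggestion could look in an LLVM IR test, assuming the typed-pointer syntax of 2015-era LLVM. The function name `store_v4f16`, the RUN line, and the stored constant are illustrative assumptions rather than the contents of the patch, and the CHECK lines are kept deliberately loose.

```llvm
; RUN: llc < %s -mtriple=aarch64-none-eabi | FileCheck %s

; Hypothetical test function: stores a constant <4 x half> vector.
; The trailing "#0" attaches attribute group 0 (defined below) to the function.
define void @store_v4f16(<4 x half>* %p) #0 {
; CHECK-LABEL: store_v4f16:
; CHECK: ret
  store <4 x half> <half 0xH0000, half 0xH00AB, half 0xH0000, half 0xH00AB>, <4 x half>* %p
  ret void
}

; Attribute group shared by every function that references #0.
attributes #0 = { nounwind }
```

With nounwind set (and no uwtable attribute), llc should not need to emit .cfi_* directives for the function, so stricter checks such as CHECK-NEXT can follow the label without having to match them.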

pirama updated this revision to Diff 24251. Apr 22 2015, 12:54 PM
pirama edited edge metadata.

Cleaned up the test by setting the nounwind attribute on the functions.

This revision was automatically updated to reflect the committed changes.