This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Added FP16 VREV Vector Instrinsic CodeGen support
ClosedPublic

Authored by LukeGeeson on Aug 3 2018, 6:30 AM.

Details

Summary

It adds FP16 vrev codegen support from test/CodeGen/ARM/armv8.2a-fp16-vector-intrinsics.ll

In line with Sjoerd Meijers' bug hunting report: https://bugs.llvm.org/show_bug.cgi?id=38404 and https://reviews.llvm.org/rL338568

Diff Detail

Event Timeline

LukeGeeson created this revision.Aug 3 2018, 6:30 AM

Looks like a straight forward fix, but how about the v4f16 variant?

Ah Yep, for v4f16 I am waiting on the result of https://reviews.llvm.org/D49987

SjoerdMeijer added inline comments.Aug 6 2018, 6:20 AM
lib/Target/ARM/ARMInstrNEON.td
6561

I think we just need

def : Pat<(v4f16 (NEONvrev64 (v4f16 DPR:$Vm))), (VREV64d16 DPR:$Vm)>;

looks as though the vext codegen was added in L339241. Will add just the vrev64 codegen...

-removed vext since it was committed in rL339241
-added vrev64 4xf16 codegen support

LukeGeeson retitled this revision from [ARM] Added Codegen support for vrev64q_f16 8xf16 and vextq_f16 8xf16 vector intrinsics to [ARM] Added Codegen support for vrev64q_f16 vector intrinsics.Aug 10 2018, 3:01 AM
LukeGeeson edited the summary of this revision. (Show Details)
SjoerdMeijer accepted this revision.Aug 10 2018, 3:18 AM

Looks like a straight forward fix to me now.

Apologies for fixing vext, I forgot and was also not expecting you were fixing both vrev and vext in one ticket.

One nit you could think about in e.g. your commit message: it's true this adds support for vrev64q_f16, but the added pattern are generic patterns for vrev on f16 vector with 4 or 8 elements; so something along the lines of "FP16 vrev codegen support" is more generic, and perhaps capturing it better.

This revision is now accepted and ready to land.Aug 10 2018, 3:18 AM
LukeGeeson edited the summary of this revision. (Show Details)Aug 10 2018, 3:53 AM
LukeGeeson retitled this revision from [ARM] Added Codegen support for vrev64q_f16 vector intrinsics to [ARM] Added FP16 VREV Vector Instrinsic CodeGen support.
LukeGeeson closed this revision.Aug 13 2018, 1:39 AM