[aarch64] Add combine patterns for fp16 fmla
Closed · Public

Authored by sebpop on Sep 6 2019, 12:03 PM.

Details

Summary

This patch enables generation of fused multiply-add/subtract instructions operating on fp16.
Tested on aarch64-linux.

There are 7 CHECK-FIXME lines for patterns for which I was not able to create a testcase that exercises the added code paths.
Those 7 patterns mix v[4|8]i16 with v[4|8]fp16 types with the help of a bitcast.
I am not sure how to write a testcase without the bitcast that still covers those combine patterns,
so I would appreciate help rewriting those testcases.
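
For readers unfamiliar with this kind of test, the fused multiply-add generation described above might be exercised along the following lines. This is a minimal sketch and is not taken from the patch: the function name `fmla_v4f16_sketch`, the exact `llc` flags, and the CHECK line are assumptions. The register operands are matched with regular expressions because the actual register assignment is not stated in this review.

```llvm
; RUN: llc < %s -mtriple=aarch64-none-linux-gnu -mattr=+fullfp16 -fp-contract=fast | FileCheck %s

; Hypothetical test, not part of the patch under review.
; A vector fp16 multiply feeding an fp16 add, both marked `fast`,
; is the shape that is expected to be combined into one fmla.
define <4 x half> @fmla_v4f16_sketch(<4 x half> %acc, <4 x half> %a, <4 x half> %b) {
; CHECK-LABEL: fmla_v4f16_sketch:
; CHECK: fmla {{v[0-9]+}}.4h, {{v[0-9]+}}.4h, {{v[0-9]+}}.4h
entry:
  %mul = fmul fast <4 x half> %a, %b
  %add = fadd fast <4 x half> %mul, %acc
  ret <4 x half> %add
}
```

The sketch avoids any bitcast, so it only illustrates the straightforward fp16 case, not the 7 mixed i16/fp16 patterns mentioned above.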

Diff Detail

Repository
rL LLVM

Event Timeline

sebpop created this revision. Sep 6 2019, 12:03 PM
Herald added a project: Restricted Project. Sep 6 2019, 12:03 PM
Herald added a subscriber: hiraditya.
SjoerdMeijer accepted this revision. Sep 6 2019, 1:44 PM

Hi Sebastian, thanks for fixing this.
This looks reasonable to me as an initial commit. This instcombiner part is a real copy-paste mess, but there's enough prior art here that this should be okay for now. I think we should follow up to clean this up, though it's actually not bad to have a reference for now.
A bit of a nit: instead of the CHECK-FIXME lines, perhaps it's better to just match the current output for now and add a FIXME as a comment, so that it is obvious when codegen changes.
And lastly, related to this but something that can be addressed separately: I noticed an LLVM fma intrinsic when I looked into this. I haven't looked into the details yet, but we probably need to support the f16 variant for completeness.
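
The convention suggested in the nit above could look roughly like this. It is only a sketch: the label `hypothetical_bitcast_case` is made up, and the `fmul`/`fadd` mnemonics are placeholders for whatever `llc` currently prints for the affected tests, which this review does not show.

```llvm
; Hypothetical fragment; the label and mnemonics are placeholders.
; FIXME: a single fused fp16 instruction (e.g. fmla) is expected here.
; The CHECK lines below pin the current, unfused output so that the
; test fails, and gets updated, as soon as codegen changes.
; CHECK-LABEL: hypothetical_bitcast_case:
; CHECK: fmul
; CHECK: fadd
```

Unlike a `CHECK-FIXME:` line, which FileCheck does not treat as a directive under its default `CHECK` prefix, these lines are actually checked.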

This revision is now accepted and ready to land. Sep 6 2019, 1:44 PM
This revision was automatically updated to reflect the committed changes.