This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Add FP16 broadcast and transpose costs
Closed · Public

Authored by dmgreen on Mar 14 2023, 5:05 AM.

Details

Summary

FP16 broadcasts and transposes can always use the same instructions as the equivalent i16 vectors, with or without +fullfp16. This patch fills in the missing costs for those cases so that the cost model gets them right.
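As a rough illustration of the idea in the summary, the sketch below models a shuffle cost table in which every f16 vector entry mirrors the i16 entry with the same lane count, and in which the lookup takes no +fullfp16 flag at all. Everything here is hypothetical: the `ShuffleKind`, `VecTy`, `CostEntry`, and `lookupShuffleCost` names are stand-ins rather than LLVM's own types or functions, and the cost values are placeholders, not numbers taken from the patch.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-ins for the shuffle kinds and vector types involved.
// These are NOT LLVM's real enums; they exist only for this sketch.
enum class ShuffleKind { Broadcast, Transpose };
enum class VecTy { v4i16, v8i16, v4f16, v8f16 };

struct CostEntry {
  ShuffleKind Kind;
  VecTy Ty;
  unsigned Cost; // Illustrative placeholder, not a value from the patch.
};

// Each f16 row repeats the cost of the i16 row with the same lane count,
// reflecting that the same instructions can be used for both.
static const CostEntry ShuffleCosts[] = {
    {ShuffleKind::Broadcast, VecTy::v4i16, 1},
    {ShuffleKind::Broadcast, VecTy::v8i16, 1},
    {ShuffleKind::Broadcast, VecTy::v4f16, 1},
    {ShuffleKind::Broadcast, VecTy::v8f16, 1},
    {ShuffleKind::Transpose, VecTy::v4i16, 1},
    {ShuffleKind::Transpose, VecTy::v8i16, 1},
    {ShuffleKind::Transpose, VecTy::v4f16, 1},
    {ShuffleKind::Transpose, VecTy::v8f16, 1},
};

// Linear lookup. Note that no subtarget feature is consulted: the result
// is the same with or without +fullfp16. Returns std::nullopt when the
// table has no matching entry.
inline std::optional<unsigned> lookupShuffleCost(ShuffleKind Kind, VecTy Ty) {
  for (const CostEntry &E : ShuffleCosts)
    if (E.Kind == Kind && E.Ty == Ty)
      return E.Cost;
  return std::nullopt;
}
```

With a table like this, a query such as `lookupShuffleCost(ShuffleKind::Broadcast, VecTy::v8f16)` returns the same value as the corresponding `v8i16` query, which is the property the patch is aiming for.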

Event Timeline

dmgreen created this revision. Mar 14 2023, 5:05 AM
Herald added a project: Restricted Project. Mar 14 2023, 5:05 AM
dmgreen requested review of this revision. Mar 14 2023, 5:05 AM
SjoerdMeijer accepted this revision. Mar 14 2023, 5:20 AM

Looks like a good fix to me.

This revision is now accepted and ready to land. Mar 14 2023, 5:20 AM
This revision was landed with ongoing or failed builds. Mar 14 2023, 2:25 PM
This revision was automatically updated to reflect the committed changes.