FP16 broadcast and transpose shuffles can always use the same instructions as i16 vectors, with or without +fullfp16. This patch adds the corresponding cost entries so that the cost model reports them correctly.
Differential D146035
[AArch64] Add FP16 broadcast and transpose costs
Closed, Public
Authored by dmgreen on Mar 14 2023, 5:05 AM.
Event Timeline
Mar 14 2023, 5:20 AM: This revision is now accepted and ready to land.
Mar 14 2023, 2:25 PM: This revision was landed with ongoing or failed builds.
Closed by commit rG180865a50085: [AArch64] Add FP16 broadcast and transpose costs (authored by dmgreen).
This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 505036
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
llvm/test/Analysis/CostModel/AArch64/shuffle-load.ll
llvm/test/Analysis/CostModel/AArch64/shuffle-transpose.ll
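
To illustrate the summary's point that FP16 broadcast and transpose can reuse the i16 instructions, here is a minimal C++ sketch. It is not the code from this revision and does not use any LLVM API; `Lanes8x16`, `broadcast_lane`, `transpose_even` and `transpose_odd` are hypothetical names chosen for illustration. It models a 128-bit vector as eight raw 16-bit lanes and shows that both shuffles only move lane bit patterns around, so the element type (i16 or half) should not matter and no FP16 arithmetic support is needed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// One 128-bit vector viewed as eight raw 16-bit lanes. Each lane holds a bit
// pattern that may be read as an i16 or as an IEEE half value.
using Lanes8x16 = std::array<std::uint16_t, 8>;

// Broadcast (splat): copy one lane into every lane of the result.
Lanes8x16 broadcast_lane(const Lanes8x16 &v, std::size_t lane) {
    assert(lane < v.size());
    Lanes8x16 out{};
    out.fill(v[lane]);
    return out;
}

// Transpose of the even lanes, i.e. shuffle mask <0, 8, 2, 10, 4, 12, 6, 14>.
Lanes8x16 transpose_even(const Lanes8x16 &a, const Lanes8x16 &b) {
    Lanes8x16 out{};
    for (std::size_t i = 0; i < out.size(); i += 2) {
        out[i] = a[i];
        out[i + 1] = b[i];
    }
    return out;
}

// Transpose of the odd lanes, i.e. shuffle mask <1, 9, 3, 11, 5, 13, 7, 15>.
Lanes8x16 transpose_odd(const Lanes8x16 &a, const Lanes8x16 &b) {
    Lanes8x16 out{};
    for (std::size_t i = 0; i < out.size(); i += 2) {
        out[i] = a[i + 1];
        out[i + 1] = b[i + 1];
    }
    return out;
}
```

None of these functions interprets the lane contents: a lane holding 0x3C00 (the half encoding of 1.0) is moved exactly like the i16 value 0x3C00. This is the intuition behind costing the FP16 shuffles the same as their i16 counterparts.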