This is an archive of the discontinued LLVM Phabricator instance.

[CostModel][AMDGPU] Fix intrinsics costs estimations.
ClosedPublic

Authored by dfukalov on Dec 8 2021, 12:57 PM.

Details

Summary
  1. Fixed cost inconsistency for llvm.fma.vXf16 intrinsics.
  2. Added tests for llvm.sadd.sat, llvm.ssub.sat, llvm.uadd.sat, and llvm.usub.sat intrinsics, since they have special processing in the cost model.
  3. Minor updates and refinements to the intrinsic cost tests.

Event Timeline

dfukalov created this revision. Dec 8 2021, 12:57 PM
dfukalov requested review of this revision. Dec 8 2021, 12:57 PM
Herald added a project: Restricted Project. Dec 8 2021, 12:57 PM
Herald added a subscriber: wdng.
rampitec added inline comments. Dec 8 2021, 1:31 PM
llvm/test/Analysis/CostModel/AMDGPU/fma.ll
85

That has nothing to do with packed f32. It is just because f64 is full rate on this target.

dfukalov added inline comments. Dec 8 2021, 2:14 PM
llvm/test/Analysis/CostModel/AMDGPU/fma.ll
85

The PACKEDF32 and NOPACKEDF32 prefixes only distinguish gfx90a from the (gfx900, gfx1010) group of targets. They are not related to f64 or other types; they are used to minimize the number of -check-prefixes subsets. Please check the RUN: lines: I added one for the gfx1010 target. It has the same costs as gfx900, so I renamed (GFX900|GFX90A) to (NOPACKEDF32|PACKEDF32).

rampitec added inline comments. Dec 8 2021, 2:17 PM
llvm/test/Analysis/CostModel/AMDGPU/fma.ll
85

I understand, but it is misleading.

dfukalov added inline comments. Dec 9 2021, 12:01 PM
llvm/test/Analysis/CostModel/AMDGPU/fma.ll
85

So what names do you suggest for the (gfx90a) and (gfx900|gfx1010) groups in the test?

rampitec added inline comments. Dec 9 2021, 12:31 PM
llvm/test/Analysis/CostModel/AMDGPU/fma.ll
85

This should get its own FASTF64/SLOWF64 check.

dfukalov updated this revision to Diff 393414. Dec 10 2021, 2:27 AM

Updated test as requested.

rampitec added inline comments. Dec 10 2021, 11:15 AM
llvm/test/Analysis/CostModel/AMDGPU/fma.ll
84–98

Fast is slow and slow is fast now.

dfukalov updated this revision to Diff 393578. Dec 10 2021, 1:17 PM

Fixed SLOW/FAST misplacement in the test.

This revision is now accepted and ready to land. Dec 10 2021, 1:53 PM
This revision was automatically updated to reflect the committed changes.
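
For readers unfamiliar with the prefix scheme discussed in the inline comments, the following is a minimal sketch of how a cost-model test might give f64 its own FASTF64/SLOWF64 check, with gfx90a in the fast group and gfx900 and gfx1010 in the slow group, as described in the thread. It is not copied from fma.ll: the opt invocation (new pass manager syntax), the ALL prefix, the function name fma_f64, and the choice to match the cost with a regex rather than a concrete number are all assumptions made for illustration. The actual test pins specific cost values per prefix.

```llvm
; Sketch only. The prefix names FASTF64/SLOWF64 come from the review; the RUN
; lines, the ALL prefix, and the function below are illustrative assumptions.
; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx90a < %s | FileCheck -check-prefixes=ALL,FASTF64 %s
; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx900 < %s | FileCheck -check-prefixes=ALL,SLOWF64 %s
; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx1010 < %s | FileCheck -check-prefixes=ALL,SLOWF64 %s

; ALL-LABEL: 'fma_f64'
; The regex stands in for the real cost; an actual test would state a
; different concrete number under each prefix.
; FASTF64: estimated cost of {{[0-9]+}} for instruction: %f64 = call double @llvm.fma.f64
; SLOWF64: estimated cost of {{[0-9]+}} for instruction: %f64 = call double @llvm.fma.f64
define void @fma_f64() {
  %f64 = call double @llvm.fma.f64(double undef, double undef, double undef)
  ret void
}

declare double @llvm.fma.f64(double, double, double)
```

Naming the prefixes after the property that actually drives the cost (f64 rate) rather than an unrelated one (packed f32 support) keeps each check line self-explanatory, which is the point raised in the review.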