- Fixed cost inconsistency for llvm.fma.vXf16 intrinsics.
- Added tests for llvm.sadd.sat, llvm.ssub.sat, llvm.uadd.sat, llvm.usub.sat intrinsics since they have special processing in the cost model.
- Minor updates and refinements to the intrinsic cost tests.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/test/Analysis/CostModel/AMDGPU/fma.ll | ||
---|---|---|
85 | That has nothing to do with packed f32. It is just because f64 is full rate on this target. |
llvm/test/Analysis/CostModel/AMDGPU/fma.ll | ||
---|---|---|
85 | PACKEDF32 and NOPACKEDF32 are only there to distinguish gfx90a from the (gfx900, gfx1010) group of targets. They are not related to f64 or other types; they are used to minimize the number of -check-prefixes subsets. Please check the RUN: lines, where I added one for the gfx1010 target. It has the same costs as gfx900, so I renamed (GFX900|GFX90A) to (NOPACKEDF32|PACKEDF32). |
llvm/test/Analysis/CostModel/AMDGPU/fma.ll | ||
---|---|---|
85 | I understand, but it is misleading. |
llvm/test/Analysis/CostModel/AMDGPU/fma.ll | ||
---|---|---|
85 | So what names do you suggest for the groups (gfx90a) and (gfx900|gfx1010) in the test? |
llvm/test/Analysis/CostModel/AMDGPU/fma.ll | ||
---|---|---|
85 | This should get its own FASTF64/SLOWF64 check. |
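
A minimal sketch of what separate f64 prefixes could look like, assuming gfx90a is the only full-rate f64 target among the three discussed here. The function names, the `ALL` prefix, and the prefix-to-target assignment are illustrative and not taken from the patch. The cost values are matched with a regex placeholder because the thread does not state them, and the exact `opt` invocation depends on the LLVM version in use.

```llvm
; Each RUN line combines one f32 grouping prefix with one f64 grouping prefix.
; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx90a < %s | FileCheck -check-prefixes=ALL,PACKEDF32,FASTF64 %s
; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx900 < %s | FileCheck -check-prefixes=ALL,NOPACKEDF32,SLOWF64 %s
; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx1010 < %s | FileCheck -check-prefixes=ALL,NOPACKEDF32,SLOWF64 %s

; The f32 vector case is keyed on the packed-f32 grouping only.
; ALL-LABEL: 'fma_v2f32'
; PACKEDF32: estimated cost of {{[0-9]+}} for {{.*}} call <2 x float> @llvm.fma.v2f32
; NOPACKEDF32: estimated cost of {{[0-9]+}} for {{.*}} call <2 x float> @llvm.fma.v2f32
define void @fma_v2f32() {
  %v2f32 = call <2 x float> @llvm.fma.v2f32(<2 x float> undef, <2 x float> undef, <2 x float> undef)
  ret void
}

; The f64 case is keyed on the f64-rate grouping, independent of packed f32.
; ALL-LABEL: 'fma_f64'
; FASTF64: estimated cost of {{[0-9]+}} for {{.*}} call double @llvm.fma.f64
; SLOWF64: estimated cost of {{[0-9]+}} for {{.*}} call double @llvm.fma.f64
define void @fma_f64() {
  %f64 = call double @llvm.fma.f64(double undef, double undef, double undef)
  ret void
}

declare <2 x float> @llvm.fma.v2f32(<2 x float>, <2 x float>, <2 x float>)
declare double @llvm.fma.f64(double, double, double)
```

With this layout the f64 checks no longer depend on a prefix whose name refers to packed f32, which is the source of the confusion raised above.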
llvm/test/Analysis/CostModel/AMDGPU/fma.ll | ||
---|---|---|
84–98 | Fast is slow and slow is fast now. |