This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][CostModel] Add f16, f64 and contract cases to fused costs estimation.
ClosedPublic

Authored by dfukalov on Jul 30 2020, 5:57 PM.

Details

Summary

Add cases of fused fmul+fadd/fsub with f16 and f64 operands to cost model.
Also added operations with contract attribute.

Fixed line endings in test.

Diff Detail

Event Timeline

dfukalov created this revision.Jul 30 2020, 5:57 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 30 2020, 5:57 PM
dfukalov requested review of this revision.Jul 30 2020, 5:57 PM
arsenm added inline comments.Jul 30 2020, 8:21 PM
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
514

I think we don't actually need the hasOneUse case. Only with the conditionally available 2-operand VOP2 forms is there a code size savings by not fusing

517

The !HasFP32Denormals check is low applying to all types?

dfukalov added inline comments.Jul 31 2020, 7:31 AM
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
514

Generally speaking, fused operation cost will be almost the same as FADD/FSUB that is estimated in detail in next case of this switch.
If the FMUL result is used elsewhere than FADD/FSUB that means the FMUL possible will not be eliminated by fusing.

517

Yes, since here we do not estimate FMUL cost, but estimate it will be fused and LLVM_FALLTHROUGH in other cases.

rampitec added inline comments.Jul 31 2020, 10:26 AM
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
517

Yes, since here we do not estimate FMUL cost, but estimate it will be fused and LLVM_FALLTHROUGH in other cases.

It should not affect anything except f32, and you may return Free based on fp32-denormals for any type.

dfukalov updated this revision to Diff 283410.Aug 5 2020, 3:09 PM

Change updated with addressed comments.

rampitec added inline comments.Aug 5 2020, 3:18 PM
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
517–527

Also need to check ST->hasMadMacF32Insts().

519

I think you do not need to check has16BitInsts(). If it does not f16 would be illegal anyway.

rampitec added inline comments.Aug 5 2020, 3:21 PM
llvm/test/Analysis/CostModel/AMDGPU/fused_costs.ll
6

Add a run line with -mcpu=gfx1030 and disabled fp32 denorm support. It does not have f32 mad/mac and shall not be contracted as a result.

dfukalov updated this revision to Diff 283540.Aug 6 2020, 2:35 AM

Check for hasMadMacF32Insts() added.

dfukalov marked 2 inline comments as done.Aug 6 2020, 9:20 AM
dfukalov added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
519

I used it just as it checked in cost estimation for the corresponding FADD/FSUB (below). As I understand, targets without fp16 insts support will not fuse fmul+fadd too. So we should LLVM_FALLTHROUGH for the case.

rampitec accepted this revision.Aug 6 2020, 9:22 AM

LGTM

This revision is now accepted and ready to land.Aug 6 2020, 9:22 AM
This revision was landed with ongoing or failed builds.Aug 6 2020, 11:43 AM
This revision was automatically updated to reflect the committed changes.