This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][CostModel] Add f16, f64 and contract cases to fused costs estimation.
ClosedPublic

Authored by dfukalov on Jul 30 2020, 5:57 PM.

Details

Summary

Add cases of fused fmul+fadd/fsub with f16 and f64 operands to cost model.
Also added operations with contract attribute.

Fixed line endings in test.

Diff Detail

Unit TestsFailed

Event Timeline

dfukalov created this revision.Jul 30 2020, 5:57 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 30 2020, 5:57 PM
dfukalov requested review of this revision.Jul 30 2020, 5:57 PM
arsenm added inline comments.Jul 30 2020, 8:21 PM
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
514

I think we don't actually need the hasOneUse case. Only with the conditionally available 2-operand VOP2 forms is there a code size savings by not fusing

517

The !HasFP32Denormals check is low applying to all types?

dfukalov added inline comments.Jul 31 2020, 7:31 AM
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
514

Generally speaking, fused operation cost will be almost the same as FADD/FSUB that is estimated in detail in next case of this switch.
If the FMUL result is used elsewhere than FADD/FSUB that means the FMUL possible will not be eliminated by fusing.

517

Yes, since here we do not estimate FMUL cost, but estimate it will be fused and LLVM_FALLTHROUGH in other cases.

rampitec added inline comments.Jul 31 2020, 10:26 AM
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
517

Yes, since here we do not estimate FMUL cost, but estimate it will be fused and LLVM_FALLTHROUGH in other cases.

It should not affect anything except f32, and you may return Free based on fp32-denormals for any type.

dfukalov updated this revision to Diff 283410.Aug 5 2020, 3:09 PM

Change updated with addressed comments.

rampitec added inline comments.Aug 5 2020, 3:18 PM
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
517

Also need to check ST->hasMadMacF32Insts().

519

I think you do not need to check has16BitInsts(). If it does not f16 would be illegal anyway.

rampitec added inline comments.Aug 5 2020, 3:21 PM
llvm/test/Analysis/CostModel/AMDGPU/fused_costs.ll
6

Add a run line with -mcpu=gfx1030 and disabled fp32 denorm support. It does not have f32 mad/mac and shall not be contracted as a result.

dfukalov updated this revision to Diff 283540.Aug 6 2020, 2:35 AM

Check for hasMadMacF32Insts() added.

dfukalov marked 2 inline comments as done.Aug 6 2020, 9:20 AM
dfukalov added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
519

I used it just as it checked in cost estimation for the corresponding FADD/FSUB (below). As I understand, targets without fp16 insts support will not fuse fmul+fadd too. So we should LLVM_FALLTHROUGH for the case.

rampitec accepted this revision.Aug 6 2020, 9:22 AM

LGTM

This revision is now accepted and ready to land.Aug 6 2020, 9:22 AM
This revision was landed with ongoing or failed builds.Aug 6 2020, 11:43 AM
This revision was automatically updated to reflect the committed changes.