This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Fix -amdgpu-inline-arg-alloca-cost
ClosedPublic

Authored by rampitec on Mar 10 2021, 10:21 AM.

Details

Summary

Before D94153 this threshold was in a pre-scaled units.
After D94153 inlining threshold multiplier is not applied
to this portion of the threshold anymore. Restore the
threshold by applying the multiplier.

Diff Detail

Event Timeline

rampitec created this revision.Mar 10 2021, 10:21 AM
rampitec requested review of this revision.Mar 10 2021, 10:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 10 2021, 10:21 AM
Herald added a subscriber: wdng. · View Herald Transcript
arsenm added inline comments.Mar 10 2021, 11:55 AM
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
1192 ↗(On Diff #329708)

Should the just adjust for the scale then?

rampitec added inline comments.Mar 10 2021, 11:57 AM
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
1192 ↗(On Diff #329708)

I thought about this, but whenever we will adjust the scale the next time we will have to visit it again.

Do you have any test for the fix?

llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
1192 ↗(On Diff #329708)

Nit: It seems instead of this modification you can just swap two lines

1582:  Threshold *= TTI.getInliningThresholdMultiplier();
1583:  Threshold += TTI.adjustInliningThreshold(&Call);

in InlineCost.cpp so we'll stay with just one place of * getInliningThresholdMultiplier().

Do you have any test for the fix?

These tests tend to be either unreliable or huge. We cannot measure performance with lit tests.

llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
1192 ↗(On Diff #329708)

That would change behavior for all targets.

arsenm added inline comments.Mar 11 2021, 6:07 PM
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
1192 ↗(On Diff #329708)

I thought the point of the multiplier was to just amplify the expense of calls. I don't understand scaling up the cost here

rampitec added inline comments.Mar 11 2021, 6:10 PM
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
1192 ↗(On Diff #329708)

It's more like a uniform target cost multiplier which shall be applied to everything. But probably Daniil is correct and I have to swap two lines in the inliner instead, until noone uses it as is.

rampitec updated this revision to Diff 330270.Mar 12 2021, 9:22 AM
rampitec marked 2 inline comments as done.

Swapped order of operations in the InlineCost itself instead. This is still unused by any other target so it is possible to fix it there, which seems to be a more correct way.

rampitec updated this revision to Diff 330283.Mar 12 2021, 9:40 AM

Added test.

dfukalov accepted this revision.Mar 12 2021, 10:01 AM

Thanks!

This revision is now accepted and ready to land.Mar 12 2021, 10:01 AM
This revision was landed with ongoing or failed builds.Mar 12 2021, 10:20 AM
This revision was automatically updated to reflect the committed changes.