This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Tune inlining parameters for AMDGPU target
ClosedPublic

Authored by dfukalov on Jul 12 2019, 9:03 AM.

Details

Summary

Since the target gains no significant advantage from vectorization,
the vector-instruction threshold bonus should be optional.

The default value of the amdgpu-inline-arg-alloca-cost parameter and the
target's InliningThresholdMultiplier value are tuned accordingly.

Diff Detail

Repository
rL LLVM

Event Timeline

dfukalov created this revision. Jul 12 2019, 9:03 AM
Herald added a project: Restricted Project. Jul 12 2019, 9:03 AM
dfukalov added a project: Restricted Project. Jul 12 2019, 9:04 AM
arsenm added inline comments. Jul 12 2019, 9:10 AM
llvm/include/llvm/Analysis/TargetTransformInfo.h
276 ↗(On Diff #209505)

I think this needs a name indicating it's an inliner control. getInlinerVectorBonusPercent?

llvm/lib/Analysis/InlineCost.cpp
883 ↗(On Diff #209505)

How does it decide what "vector dense" means? We already report costs that approximately say to scalarize everything, and scalarization is free.

llvm/test/CodeGen/AMDGPU/amdgpu-inline.ll
28–41 ↗(On Diff #209505)

Why change this test? I would expect a separate version without the control flow.

Agree with Matt on the callback name change. Otherwise LGTM.

dfukalov marked 2 inline comments as done. Jul 15 2019, 8:58 AM
dfukalov added inline comments.
llvm/lib/Analysis/InlineCost.cpp
883 ↗(On Diff #209505)

They estimate this density as the percentage of LLVM IR instructions with vector arguments. If more than 50% of a function's instructions are vector instructions, the full bonus is added to the threshold. For functions with 10%-50% vector instructions, half of the bonus is added.
I guess this bonus logic is based on x86 extensions like MMX and others.
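
For illustration, the tiers described above can be sketched roughly as follows. This is a simplified stand-alone sketch, not the code in InlineCost.cpp: the function name, its signature, and the way the bonus is derived from a percentage of the base threshold are assumptions made for this example.

```cpp
// Hypothetical sketch of the tiered vector bonus; names and signature
// are illustrative and do not match the LLVM sources.
int thresholdWithVectorBonus(int baseThreshold, int bonusPercent,
                             unsigned numInstructions,
                             unsigned numVectorInstructions) {
  // Assumption: the full bonus is a percentage of the base threshold.
  const int vectorBonus = baseThreshold * bonusPercent / 100;

  // More than 50% vector instructions: add the full bonus.
  if (numVectorInstructions > numInstructions / 2)
    return baseThreshold + vectorBonus;

  // More than 10% (but at most 50%): add half of the bonus.
  if (numVectorInstructions > numInstructions / 10)
    return baseThreshold + vectorBonus / 2;

  // Otherwise the threshold is left unchanged.
  return baseThreshold;
}
```

With a bonus percentage of 0, all three tiers collapse to the base threshold, which is the effect of making the bonus optional for a target.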

llvm/test/CodeGen/AMDGPU/amdgpu-inline.ll
28–41 ↗(On Diff #209505)

Without the modification, the test @test_inliner_multi_pvt_ptr_cutoff starts to fail, since I decreased the threshold multiplier and the cost of the function became slightly higher.
The test is not about control flow; it should check the amdgpu-inline-arg-alloca-cutoff value: foo_private_ptr2 should be inlined in test_inliner_multi_pvt_ptr and should not be inlined in test_inliner_multi_pvt_ptr_cutoff.

dfukalov updated this revision to Diff 209878. Jul 15 2019, 8:58 AM

Diff updated as requested

dfukalov marked an inline comment as done. Jul 15 2019, 8:59 AM
This revision is now accepted and ready to land. Jul 15 2019, 9:16 AM
eraman added inline comments. Jul 16 2019, 9:26 PM
llvm/lib/Analysis/InlineCost.cpp
883 ↗(On Diff #209878)

The comment block explaining vector bonuses is still relevant after this change. Instead of removing it, you should modify it to say the bonus percentage is target dependent.

dfukalov marked 2 inline comments as done. Jul 17 2019, 7:20 AM
dfukalov added inline comments.
llvm/lib/Analysis/InlineCost.cpp
883 ↗(On Diff #209878)

The comment was not removed but moved to TargetTransformInfo.h, alongside the new function. A note about target dependence was also added.
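
As a rough illustration of a target-dependent bonus percentage, the idea could look like the sketch below. It is a deliberately simplified model: the real TargetTransformInfo interface is structured differently, and the struct names, the helper function, and the default value used here are assumptions for this example. Only the hook name getInlinerVectorBonusPercent comes from the review.

```cpp
// Simplified, hypothetical stand-in for a target hook; this is not the
// actual TargetTransformInfo class hierarchy.
struct TargetHooks {
  virtual ~TargetHooks() = default;

  // Percentage of the inline threshold granted as the vector bonus.
  // The default value is an assumption chosen for illustration.
  virtual int getInlinerVectorBonusPercent() const { return 150; }
};

// A target that gains little from vectorization can opt out of the bonus.
struct NoVectorBonusTarget : TargetHooks {
  int getInlinerVectorBonusPercent() const override { return 0; }
};

// Hypothetical helper: the inliner asks the target for its percentage
// instead of hard-coding one.
int vectorBonusFor(const TargetHooks &target, int threshold) {
  return threshold * target.getInlinerVectorBonusPercent() / 100;
}
```

Keeping the explanatory comment next to the hook means each target that overrides the percentage can see what the value controls.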

This revision was automatically updated to reflect the committed changes.
dfukalov marked an inline comment as done.