Page MenuHomePhabricator

Keep import function list for inlinee profile update
ClosedPublic

Authored by wenlei on Nov 1 2019, 1:22 PM.

Details

Summary

When adjusting function entry counts after inlining, Funciton::setEntryCount is called without providing an import function list. The side effect of that is the previously set import function list will be dropped. The import function list is used by ThinLTO to help import hot cross module callee for LTO inlining, so dropping that during ThinLTO pre-link may adversely affect LTO inlining. The fix is to keep the list while updating entry counts for inlining.

Separate but related, we don't need to update/scale profile for callee or cloned inlinee if we decided to inline a callee from sample profile loader with context-sensitive profile. This is fixed by passing a flag through InlineFunctionInfo to tell inliner when not to update profile.

Event Timeline

wenlei created this revision.Nov 1 2019, 1:22 PM
wenlei added a comment.Nov 1 2019, 1:31 PM

This is a minimal point fix (was trying to keep internal patch minimum), but it might actually make sense to have setEntryCount always copy over the existing import list (if it exists), instead of adding updateEntryCount? Or even further, we could have a separate metadata for the import list, instead of piggyback on function_entry_count? To me, it seems a bit weird to embed import list as part of function_entry_count - wondering what's the reason behind that?

wmi added a comment.Nov 3 2019, 7:50 PM

Thanks for the patch!

This is a minimal point fix (was trying to keep internal patch minimum), but it might actually make sense to have setEntryCount always copy over the existing import list (if it exists), instead of adding updateEntryCount? Or even further, we could have a separate metadata for the import list, instead of piggyback on function_entry_count? To me, it seems a bit weird to embed import list as part of function_entry_count - wondering what's the reason behind that?

I agree setEntryCount always copying the existing import list is better than adding updateEntryCount. Adding a separate metadata also sounds good to me. I don't know the history why enbedding import list in function_entry_count. @tejohnson may know.

llvm/include/llvm/Transforms/Utils/Cloning.h
208–211 ↗(On Diff #227522)

I don't see where UpdateProfile is used.

llvm/test/Transforms/SampleProfile/inline-callee-update.ll
78–81

I am curious why these two functions have non empty import lists because all the functions are defined within the module.

Then I find indirect call targets with inline instance are all added to import list. That is unnecessary if the call targets are defined in current module. That could be improved. At least, the test shouldn't depend on it.

wenlei marked 2 inline comments as done.Nov 4 2019, 8:50 AM

I agree setEntryCount always copying the existing import list is better than adding updateEntryCount. Adding a separate metadata also sounds good to me. I don't know the history why enbedding import list in function_entry_count. @tejohnson may know.

Updated the diff to have setEntryCount always copying the existing import list for now. I'd be happy to add a new metadata for import list too.. Thanks for reviewing!

llvm/include/llvm/Transforms/Utils/Cloning.h
208–211 ↗(On Diff #227522)

oops, merge error, thanks for catching this. fixed.

llvm/test/Transforms/SampleProfile/inline-callee-update.ll
78–81

Makes sense, changed the test case to rely only on cross module (declare-only) import. (we can trim same module indirect callee import in a separate diff)

wenlei updated this revision to Diff 227718.Nov 4 2019, 9:00 AM

address review feedback

wenlei updated this revision to Diff 227728.Nov 4 2019, 9:32 AM

update the right version of setEntryCount

Nice catch on the dropped import lists!

llvm/lib/Transforms/IPO/SampleProfile.cpp
876 ↗(On Diff #227728)

Document constant parameters.

llvm/lib/Transforms/Utils/InlineFunction.cpp
1723

Can you split this into a separate patch (with a test) since it isn't related to the headline of this patch? Are we currently getting incorrect profiles without this fix?

Suggest pulling the if (IFI.UpdateProfile) checks out into a single if for clarity.

llvm/test/Transforms/SampleProfile/inline-callee-update.ll
2

Is this intended to be here?

wmi added a comment.Nov 4 2019, 2:17 PM

Could you add a testcase to verify that no profile update for inlining in sampleloader pass?

llvm/test/Transforms/SampleProfile/inline-callee-update.ll
7

Please use "opt -instnamer" to change %1, %2, ... to names more than just numbers.

wenlei marked 4 inline comments as done.Nov 4 2019, 7:58 PM
wenlei added inline comments.
llvm/lib/Transforms/IPO/SampleProfile.cpp
876 ↗(On Diff #227728)

Reverted this as part of the split.

llvm/lib/Transforms/Utils/InlineFunction.cpp
1723

Sure.. the profile we are getting is still incorrect for some cases because line 1517 is still executed for inlining with CS profile. But I will put that into a separate diff later with its own test case.

llvm/test/Transforms/SampleProfile/inline-callee-update.ll
2

oops.. removed internal marker, but left the comment there.

7

Done.

wenlei updated this revision to Diff 227813.Nov 4 2019, 8:04 PM

Split UpdateProfile part, address other feedbacks.

Thanks for splitting the patch. This one lgtm, but please wait to see if @wmi has any more comments. Also the summary needs an update (to remove the bit about the CS profile fix).

wmi added a comment.Nov 5 2019, 7:37 AM

I don't have further comment. It LGTM. I am doing some performance test for it and will report back its performance impact on our side (expect to be positive).

wmi added a comment.Nov 5 2019, 5:42 PM

I saw 0.3% improvement in our server benchmark. That is a great improvement!

wmi accepted this revision.Nov 5 2019, 5:47 PM
This revision is now accepted and ready to land.Nov 5 2019, 5:47 PM
wenlei added a comment.Nov 5 2019, 6:04 PM
In D69736#1734892, @wmi wrote:

I saw 0.3% improvement in our server benchmark. That is a great improvement!

Thanks for sharing the data! Glad that it helped.

This revision was automatically updated to reflect the committed changes.