This is an archive of the discontinued LLVM Phabricator instance.

[SampleFDO] Expose an interface to return the size of a section or the size of the profile for profile in ExtBinary format
Closed Public

Authored by wmi on Sep 18 2019, 12:16 PM.

Details

Summary

Sometimes we want to limit the size of the profile by stripping functions with low sample counts, or by stripping function names with small text size from the profile symbol list. That requires the profile reader to provide interfaces that return the size of a section or the size of the whole profile. This patch adds those interfaces.

The patch also adds a dump facility to show the size of each section.

Diff Detail

Repository
rL LLVM

Event Timeline

wmi created this revision. Sep 18 2019, 12:16 PM
Herald added a project: Restricted Project. Sep 18 2019, 12:16 PM
wmi added a subscriber: congliu. Sep 18 2019, 1:06 PM

Missing tests.

wmi updated this revision to Diff 220767. Sep 18 2019, 4:10 PM

Add a test to verify that the size of the profile matches the size shown by llvm-profdata, and add an assertion to verify that the sizes of the individual sections and the header add up to the total file size.
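For readers following along, the invariant behind that assertion can be sketched in standalone C++. This is only an illustration: `SectionEntry`, `totalSectionSize`, `sizesAddUp`, and `checkFileSize` are hypothetical names, not the interfaces added by the patch, and the sketch assumes the file consists of nothing but one header followed by section payloads.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one entry of a section header table.
struct SectionEntry {
  uint64_t Offset; // byte offset of the section payload in the file
  uint64_t Size;   // size of the section payload in bytes
};

// Sum of all section payload sizes.
inline uint64_t totalSectionSize(const std::vector<SectionEntry> &Table) {
  uint64_t Total = 0;
  for (const SectionEntry &E : Table)
    Total += E.Size;
  return Total;
}

// True when the header plus all sections account for every byte of the file.
inline bool sizesAddUp(uint64_t HeaderSize,
                       const std::vector<SectionEntry> &Table,
                       uint64_t FileSize) {
  return HeaderSize + totalSectionSize(Table) == FileSize;
}

// Debug-build check in the spirit of the assertion described above.
inline void checkFileSize(uint64_t HeaderSize,
                          const std::vector<SectionEntry> &Table,
                          uint64_t FileSize) {
  assert(sizesAddUp(HeaderSize, Table, FileSize) &&
         "header and section sizes must add up to the file size");
  (void)HeaderSize; (void)Table; (void)FileSize; // silence NDEBUG warnings
}
```

If the check fails, some bytes of the file are not accounted for by any section, which would make per-section size limits unreliable.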

Is the intended workflow to create a profile with full symbol list and then have an option to trim symbol list according to size impact? Why not do the trimming at the time of the profile creation?

wmi added a comment. Sep 19 2019, 11:16 AM

Is the intended workflow to create a profile with full symbol list and then have an option to trim symbol list according to size impact? Why not do the trimming at the time of the profile creation?

Yes, that is the intended workflow. Before creating the profile, we don't know how large it will be, especially after compression. For the profile symbol list section, we could estimate the size after compression, but since we already trim the profile iteratively by dropping symbols in FunctionSamples, regenerating the profile, and reevaluating the size, I tend to handle the symbol list the same way. The con is longer profile generation time. The pro is precision: when we limit the profile size to 10M, the generated profile will be close to that limit but won't exceed it. If we only estimate the size of the profile, there may be outliers.

We can add ways to reduce the number of trimming iterations in the future. Currently it is not a problem.
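The iterative workflow described in this reply (drop cold functions, regenerate, remeasure) can be sketched as a small loop. This is a simplified model, not the actual tooling: `Profile`, `dropColdFunctions`, and `trimToLimit` are hypothetical names, the `MeasureSize` callback stands in for "write the profile and ask the reader for its size", and doubling the threshold on each round is an assumed policy that the thread does not specify.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>

// Hypothetical model of a profile: function name -> total sample count.
using Profile = std::map<std::string, uint64_t>;

// Keep only functions whose sample count is at least Threshold.
inline Profile dropColdFunctions(const Profile &P, uint64_t Threshold) {
  Profile Kept;
  for (const auto &KV : P)
    if (KV.second >= Threshold)
      Kept.insert(KV);
  return Kept;
}

// Raise the threshold until the measured size fits within Limit.
inline Profile
trimToLimit(const Profile &P, uint64_t Limit,
            const std::function<uint64_t(const Profile &)> &MeasureSize) {
  assert(MeasureSize && "a size callback is required");
  uint64_t Threshold = 1;
  Profile Current = P;
  while (!Current.empty() && MeasureSize(Current) > Limit) {
    if (Threshold > UINT64_MAX / 2) { // cannot double again without overflow
      Current.clear();
      break;
    }
    Threshold *= 2; // assumed policy: double the cutoff each round
    Current = dropColdFunctions(P, Threshold);
  }
  return Current;
}
```

Because the size is measured after every round rather than estimated up front, the result never exceeds the limit, at the cost of one regeneration per round.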

Sounds like a tricky situation to handle. Suppose the profile size exceeds some limit: what criteria should be used to trim the size? Trim the symbol list or drop more samples?

wmi added a comment. Sep 19 2019, 11:51 AM

Sounds like a tricky situation to handle. Suppose the profile size exceeds some limit: what criteria should be used to trim the size? Trim the symbol list or drop more samples?

We plan to use separate limits for the symbol list and the rest of the profile. If the size of the symbol list is over its limit, trim the symbol list; otherwise drop samples.
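As a rough sketch of that decision rule, assuming the two sizes have already been obtained from the reader: `TrimAction`, `chooseTrimAction`, and `describe` are hypothetical names, and the `None` result for a profile that is within both limits is an assumption added here for completeness.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical outcome of one size check.
enum class TrimAction { None, TrimSymbolList, DropSamples };

// The symbol list is checked against its own limit first; only when it fits
// do we look at the rest of the profile.
inline TrimAction chooseTrimAction(uint64_t SymListSize, uint64_t SymListLimit,
                                   uint64_t RestSize, uint64_t RestLimit) {
  if (SymListSize > SymListLimit)
    return TrimAction::TrimSymbolList;
  if (RestSize > RestLimit)
    return TrimAction::DropSamples;
  return TrimAction::None; // assumption: nothing to do when both fit
}

// Human-readable label for logging.
inline const char *describe(TrimAction A) {
  switch (A) {
  case TrimAction::None:
    return "within limits";
  case TrimAction::TrimSymbolList:
    return "trim symbol list";
  case TrimAction::DropSamples:
    return "drop samples";
  }
  assert(false && "unknown TrimAction");
  return "";
}
```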

davidxl added inline comments. Sep 19 2019, 2:02 PM
lib/ProfileData/SampleProfReader.cpp
683

Perhaps make this a virtual function and let the base implementation return 'false' (so that the caller can warn).
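The shape of this suggestion, sketched with standard streams rather than LLVM's own classes: `ProfileReader`, `SectionedReader`, and `showSectionInfo` are illustrative names, and the output format is made up.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base reader: formats without sections report 'false'.
class ProfileReader {
public:
  virtual ~ProfileReader() = default;
  virtual bool dumpSectionInfo(std::ostream &) { return false; }
};

// Hypothetical reader for a sectioned format; overrides the dump.
class SectionedReader : public ProfileReader {
  std::vector<std::pair<std::string, uint64_t>> Sections; // name, size

public:
  explicit SectionedReader(std::vector<std::pair<std::string, uint64_t>> S)
      : Sections(std::move(S)) {}

  bool dumpSectionInfo(std::ostream &OS) override {
    uint64_t Total = 0;
    for (const auto &S : Sections) {
      OS << S.first << " - Size: " << S.second << "\n";
      Total += S.second;
    }
    OS << "Total: " << Total << "\n";
    return true;
  }
};

// The caller warns when the reader cannot provide section information.
inline void showSectionInfo(ProfileReader &R, std::ostream &OS,
                            std::ostream &Err) {
  if (!R.dumpSectionInfo(OS))
    Err << "warning: section info is not available for this format\n";
}
```

With the default in the base class, callers do not need to know which concrete reader they hold; they only check the return value.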

wmi updated this revision to Diff 220917. Sep 19 2019, 4:10 PM

Address David's comment.

davidxl accepted this revision. Sep 19 2019, 4:17 PM

lgtm

This revision is now accepted and ready to land. Sep 19 2019, 4:17 PM
wmi closed this revision. Sep 20 2019, 4:29 PM

Committed at rL372439. Sorry, I forgot to add the Differential Revision line to the commit message, so I am closing this manually.