Page MenuHomePhabricator

[SampleFDO] Enable sample-profile-top-down-load by default.
ClosedPublic

Authored by wmi on Jun 30 2020, 3:32 PM.

Details

Summary

sample-profile-top-down-load is an internal option which can enable top-down order of inlining and profile annotation in sample profile load pass. It was found to be beneficial for better profile annotation.

Recently we found it could also solve some build time issue. Suppose function A has many callsites in function B. In the last release binary where sample profile was collected, the outline copy of A is large because there are many other functions inlined into A. However although all the callsites calling A in B are inlined, but every inlined body is small (A was inlined into B before other functions are inlined into A), there is no build time issue in last release.

In an optimized build using the sample profile collected from last release, without top-down inlining, we saw a case that A got very large because of inlining, and then multiple callsites of A got inlined into B, and that led to a huge B which caused significant build time issue besides profile annotation issue.

To solve that problem, the patch proposes to enable the flag sample-profile-top-down-load by default.

I reevaluated the performance again in two server benchmarks. Run one benchmark 6 times, it had no performance change in 4 runs and had 0.2% improvement in 2 runs. Run another benchmark 6 times and it had no performance change.

Diff Detail

Event Timeline

wmi created this revision.Jun 30 2020, 3:32 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 30 2020, 3:32 PM
Herald added a subscriber: hiraditya. · View Herald Transcript

Great. We use this switch together with -sample-profile-merge-inlinee, so the profile to drive top-down inlining is more accurate and that often gives best performance. May want to turn both on together?

wmi added a comment.Jun 30 2020, 4:53 PM

Sounds good to me. I will redo the performance test for it.

wmi updated this revision to Diff 275836.Jul 6 2020, 2:45 PM

Enable -sample-profile-merge-inlinee by default together with -sample-profile-top-down-load.

wmi added a comment.Jul 6 2020, 2:54 PM

I tested the performance with sample-profile-top-down-load and sample-profile-merge-inlinee both enabled. In different compiler versions I got different result. In one version about three weeks older, I got 0.4% improvement for one benchmark steadily in multiple runs and neutral for another. In the head llvm version, I saw neutral result for both benchmarks.

wmi updated this revision to Diff 275867.Jul 6 2020, 5:36 PM

Disable sample-profile-merge-inlinee when sample-profile-top-down-load is not effective (Currently sample-profile-top-down-load is only effective for new pass manager).

wenlei accepted this revision.Jul 6 2020, 8:03 PM

Thanks for measurement. LGTM.

This revision is now accepted and ready to land.Jul 6 2020, 8:03 PM
davidxl accepted this revision.Jul 8 2020, 8:30 AM

lgtm

This revision was automatically updated to reflect the committed changes.