HomePhabricator

[CSSPGO] Infrastructure for context-sensitive Sample PGO and Inlining

Authored by wenlei on Mar 23 2020, 11:50 PM.

Description

[CSSPGO] Infrastructure for context-sensitive Sample PGO and Inlining

This change adds the context-senstive sample PGO infracture described in CSSPGO RFC (https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s). It introduced an abstraction between input profile and profile loader that queries input profile for functions. Specifically, there's now the notion of base profile and context profile, and they are managed by the new SampleContextTracker for adjusting and merging profiles based on inline decisions. It works with top-down profiled guided inliner in profile loader (https://reviews.llvm.org/D70655) for better inlining with specialization and better post-inline profile fidelity. In the future, we can also expose this infrastructure to CGSCC inliner in order for it to take advantage of context-sensitive profile. This change is the consumption part of context-sensitive profile (The generation part is in this stack: https://reviews.llvm.org/D89707). We've seen good results internally in conjunction with Pseudo-probe (https://reviews.llvm.org/D86193). Pacthes for integration with Pseudo-probe coming up soon.

Currently the new infrastructure kick in when input profile contains the new context-sensitive profile; otherwise it's no-op and does not affect existing AutoFDO.

Interface

There're two sets of interfaces for query and tracking respectively exposed from SampleContextTracker. For query, now instead of simply getting a profile from input for a function, we can explicitly query base profile or context profile for given call path of a function. For tracking, there're separate APIs for marking context profile as inlined, or promoting and merging not inlined context profile.

  • Query base profile (getBaseSamplesFor)

Base profile is the merged synthetic profile for function's CFG profile from any outstanding (not inlined) context. We can query base profile by function.

  • Query context profile (getContextSamplesFor)

Context profile is a function's CFG profile for a given calling context. We can query context profile by context string.

  • Track inlined context profile (markContextSamplesInlined)

When a function is inlined for given calling context, we need to mark the context profile for that context as inlined. This is to make sure we don't include inlined context profile when synthesizing base profile for that inlined function.

  • Track not-inlined context profile (promoteMergeContextSamplesTree)

When a function is not inlined for given calling context, we need to promote the context profile tree so the not inlined context becomes top-level context. This preserve the sub-context under that function so later inline decision for that not inlined function will still have context profile for its call tree. Note that profile will be merged if needed when promoting a context profile tree if any of the node already exists at its promoted destination.

Implementation

Implementation-wise, SampleContext is created as abstraction for context. Currently it's a string for call path, and we can later optimize it to something more efficient, e.g. context id. Each SampleContext also has a ContextState indicating whether it's raw context profile from input, whether it's inlined or merged, whether it's synthetic profile created by compiler. Each FunctionSamples now has a SampleContext that tells whether it's base profile or context profile, and for context profile what is the context and state.

On top of the above context representation, a custom trie tree is implemented to track and manager context profiles. Specifically, SampleContextTracker is implemented that encapsulates a trie tree with ContextTireNode as node. Each node of the trie tree represents a frame in calling context, thus the path from root to a node represents a valid calling context. We also track FunctionSamples for each node, so this trie tree can serve efficient query for context profile. Accordingly, context profile tree promotion now becomes moving a subtree to be under the root of entire tree, and merge nodes for subtree if this move encounters existing nodes.

Integration

SampleContextTracker is now also integrated with AutoFDO, SampleProfileReader and SampleProfileLoader. When we detected input profile contains context-sensitive profile, SampleContextTracker will be used to track profiles, and all profile query will go to SampleContextTracker instead of SampleProfileReader automatically. Tracking APIs are called automatically for each inline decision from SampleProfileLoader.

Differential Revision: https://reviews.llvm.org/D90125