As we plan to support both CSSPGO and AutoFDO in llvm-profgen, we will have different kinds of perf samples and different kinds of sample counters (CS/non-CS, with/without pseudo probes), both of which need to be aggregated in a hash map. This change implements a hashable interface (Hashable) and unified base classes for them, for better extensibility and reusability.
Currently, the perf trace sample and the sample counter with context implement Hashable, and the class hierarchy looks like:
| Hashable
    | PerfSample
        | HybridSample
        | LBRSample
    | ContextKey
        | StringBasedCtxKey
        | ProbeBasedCtxKey
        | CallsiteBasedCtxKey
    | ...
For perf samples, HybridSample includes both the call stack and the LBR stack, while LBRSample includes only the LBR stack.
For the context key used in virtual unwinding counter aggregation, we use a string-based context (StringBasedCtxKey) by default for a good debugging experience. For pseudo probes, we switch to a stack of probe pointers (ProbeBasedCtxKey) to avoid redundant string handling. In the future, we can also speed up StringBasedCtxKey by using the original function frame stack, here named CallsiteBasedCtxKey.
Implementation
- A class implementing Hashable should implement getHashCode and isEqual. Here we make getHashCode a non-virtual function to avoid vtable overhead, so each derived class should calculate and assign the base class's HashCode manually. This also provides the flexibility to calculate the hash code incrementally (like a rolling hash) during frame stack unwinding.
- isEqual is a virtual function, which has some performance overhead. In the future, if we design a better hash function, we can skip it or switch to a non-virtual function.
- Added PerfSample and ContextKey as base classes for the perf sample and the counter context key, using LLVM-style RTTI.
- Added StringBasedCtxKey class extending ContextKey to use string as context id.
- Refactored AggregationCounter to take all kinds of PerfSample as key.
- Refactored ContextSampleCounter to take all kinds of ContextKey as key.
- Other refactoring work:
  - Create a wrapper class SampleCounter to wrap RangeCounter and BranchCounter
  - Hoist ContextId and FunctionProfile out of populateFunctionBodySamples and populateFunctionBoundarySamples to reuse them in ProfileGenerator
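
To make the Hashable design above concrete, here is a minimal sketch of the pattern: a non-virtual getHashCode that returns a value cached by the derived class, a virtual isEqual, and a hash map that aggregates counts through those two functions. It is not the code in this patch. The classes are simplified stand-ins, the Kind enum and classof check only imitate LLVM-style RTTI by hand, and KeyHash, KeyEqual, CounterMap and the context strings are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Simplified stand-in for the ContextKey base class (assumption: the real
// class in llvm-profgen differs in detail).
struct ContextKey {
  enum Kind { CK_StringBased, CK_ProbeBased };

  // Non-virtual: just returns the value the derived class cached.
  uint64_t getHashCode() const { return HashCode; }
  // Virtual: each derived key defines what equality means for it.
  virtual bool isEqual(const ContextKey *Other) const = 0;
  virtual ~ContextKey() = default;
  Kind getKind() const { return K; }

protected:
  explicit ContextKey(Kind K) : K(K) {}
  uint64_t HashCode = 0;

private:
  Kind K;
};

struct StringBasedCtxKey : ContextKey {
  std::string Context;

  explicit StringBasedCtxKey(std::string Ctx)
      : ContextKey(CK_StringBased), Context(std::move(Ctx)) {
    // The derived class computes and assigns the base-class hash manually.
    HashCode = std::hash<std::string>{}(Context);
  }

  // Hand-rolled kind check in the spirit of LLVM-style RTTI.
  static bool classof(const ContextKey *Key) {
    return Key->getKind() == CK_StringBased;
  }

  bool isEqual(const ContextKey *Other) const override {
    if (!classof(Other))
      return false;
    return Context == static_cast<const StringBasedCtxKey *>(Other)->Context;
  }
};

// Hypothetical adapters so std::unordered_map hashes and compares the
// pointed-to keys rather than the pointers themselves.
struct KeyHash {
  std::size_t operator()(const std::shared_ptr<ContextKey> &Key) const {
    return static_cast<std::size_t>(Key->getHashCode());
  }
};
struct KeyEqual {
  bool operator()(const std::shared_ptr<ContextKey> &A,
                  const std::shared_ptr<ContextKey> &B) const {
    return A->isEqual(B.get());
  }
};

// Hypothetical counter map: any ContextKey subclass can serve as the key.
using CounterMap = std::unordered_map<std::shared_ptr<ContextKey>, uint64_t,
                                      KeyHash, KeyEqual>;

// Adds Count to the entry for the given context string (made-up format).
inline void addSample(CounterMap &Map, const std::string &Ctx, uint64_t Count) {
  Map[std::make_shared<StringBasedCtxKey>(Ctx)] += Count;
}
```

In this sketch, two keys built from the same context string land in the same map entry even though they are distinct objects, because lookup goes through getHashCode and isEqual. Since the hash is stored in the base class, a derived key could also update it step by step while a stack is being unwound instead of hashing the finished context in one pass.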
Nit: could this be defined as an overloaded == operator?