This change adds an option to basic block sections to allow cold
clusters to be assigned a custom text prefix. With a custom prefix such
as ".text.split." (D87840), lld can place them in a separate output section.
The benefits are -
- Empirically shown to improve icache and itlb metrics by 3-5% (absolute) compared to placing split parts in .text.unlikely.
- Mitigates against poor profiles, eg samplePGO profiles used with the machine function splitter. Optimizations such as hugepage remapping can make different decisions at the section granularity.
- Enables section granularity hotness monitoring (checking on the decisions made during compilation vs sample data from production).
I think I will need to read your first LLVM patch to get some basic understanding of machine function splitter. I'd suggest you split the lld patch from the codegen one.
For your future lld patch: why can't the split sections be placed in .test.cold? This just affects how -z keep-text-section-prefix groups input sections into output sections. With the additional element, there may be an output section .text.split but if its purpose is similar to the others I am not sure you need it.