Page MenuHomePhabricator

[llvm] Add -bbsections-cold-text-prefix to emit cold clusters to a different section.

Authored by snehasish on Sep 16 2020, 9:54 PM.



This change adds an option to basic block sections to allow cold
clusters to be assigned a custom text prefix. With a custom prefix such
as ".text.split." (D87840), lld can place them in a separate output section.
The benefits are -

  • Empirically shown to improve icache and itlb metrics by 3-5% (absolute) compared to placing split parts in .text.unlikely.
  • Mitigates against poor profiles, eg samplePGO profiles used with the machine function splitter. Optimizations such as hugepage remapping can make different decisions at the section granularity.
  • Enables section granularity hotness monitoring (checking on the decisions made during compilation vs sample data from production).

Diff Detail

Event Timeline

snehasish created this revision.Sep 16 2020, 9:54 PM
snehasish requested review of this revision.Sep 16 2020, 9:54 PM
MaskRay added inline comments.Sep 16 2020, 10:57 PM
134 ↗(On Diff #292405)

I think I will need to read your first LLVM patch to get some basic understanding of machine function splitter. I'd suggest you split the lld patch from the codegen one.

For your future lld patch: why can't the split sections be placed in .test.cold? This just affects how -z keep-text-section-prefix groups input sections into output sections. With the additional element, there may be an output section .text.split but if its purpose is similar to the others I am not sure you need it.

Drop lld/ELF/Writer.cpp changes.

  • Moved the changes to D87840.
134 ↗(On Diff #292405)

Thanks, here's a link to the RFC for your reference:
I've split out the LLD change as a separate patch: D87840

I'll add the discussion for the rationale behind a new output section in the D87840 (assuming that you mean ".text.unlikely" instead of ".test.cold").

snehasish edited the summary of this revision. (Show Details)Sep 17 2020, 10:18 AM
tmsriram added inline comments.Sep 17 2020, 10:31 AM
134 ↗(On Diff #292405)

A couple of thoughts on this which you could help clarify:

  • Currently, known cold code is put into .text.unlikely and lukewarm stuff is put into .text.
  • Hugepage mappers that I am aware of avoid mapping .text.unlikely into hugepages for max hugepage itlb utilization.
  • For function splitting, when you play with the thresholds more there is a good chance you are going to end up splitting code that is lukewarm and .text.unlikely is not the ideal place for it.
  • In which case, such code can be put into .text itself. Is there a good reason to put this into .text.split? Does having a new output section give you more leverage on how to manage the mapping of such code? I think this is the part that is not fully clear to me. Thanks.
snehasish updated this revision to Diff 293258.Sep 21 2020, 2:41 PM

Rebase and update git commit message.

snehasish retitled this revision from [llvm][lld] Add an option to emit cold clusters to a different section. to [llvm]Add an option to emit cold clusters to a different section..Sep 21 2020, 2:42 PM
snehasish edited the summary of this revision. (Show Details)
snehasish removed a reviewer: espindola.
snehasish added a subscriber: espindola.

This LGTM after I understood the rationale:

  1. It allows fine-grained control of cold split part placement for spatial locality and either mapping to or excluding from huge pages. This becomes very relevant as we play with the coldness thresholds where lukewarm parts could be split from hot function bodies. For these reasons, reusing an existing prefix is not ideal.
  2. There is a precedent with ".text.unknown" which has similar motivations. The lld patch for this is also rather simple.
MaskRay added inline comments.Sep 24 2020, 10:47 AM

Worth a comment why the toggle exists.


The description usually does not end with a full stop


This is probably better declared in a header and named something with BB as a prefix, otherwise it can lead to confusion that this applies to the generic .text.unlikely


If -NEXT: applies please add it.

[llvm]Add an option to emit cold clusters to a different section.

Instead of saying "Add an option", "Add -bbsections-cold-prefix" will convey more information. (Not rarely I grep logs for when an option was introduced. It is helpful if I can find the option name in the message)

MaskRay added inline comments.Sep 24 2020, 10:58 AM

I think it'd be better to check .LBB0_2: and .LBB_END0_2: as well to enhance the test.

snehasish updated this revision to Diff 294161.Sep 24 2020, 2:17 PM
snehasish marked 5 inline comments as done.

Document flag, tighten test, rename var and option for clarity.

  • Flag name and var name: bbsection-cold-text-prefix - BBSectionColdTextPrefix
  • Add comment about why this option is useful.
  • Update test to be more rigourous.
  • Rebase.
snehasish updated this revision to Diff 294162.Sep 24 2020, 2:19 PM

Add another check for the test.

PTAL, thanks!

snehasish updated this revision to Diff 294165.Sep 24 2020, 2:25 PM

Update git commit message to specify option.

snehasish retitled this revision from [llvm]Add an option to emit cold clusters to a different section. to [llvm] Add -bbsections-cold-text-prefix to emit cold clusters to a different section..Sep 24 2020, 2:25 PM
MaskRay accepted this revision.Sep 24 2020, 2:58 PM



I know your intention is that in practice users should specify .text.split. , however, the comment is a bit unclear that normally users should not use .text.unlikely.


I usually indent instructions to emphasize that they are in the region covered by a label (like _Z3bazb.cold:)

This revision is now accepted and ready to land.Sep 24 2020, 2:58 PM
snehasish updated this revision to Diff 294184.Sep 24 2020, 3:13 PM

Update comment, test.

  • Specify how users can use the flag and take advantage of keep-text-section-prefix.
  • Indent the instructions in check for clarity.
snehasish marked 2 inline comments as done.Sep 24 2020, 3:16 PM

Thanks for the review.


Good idea, documented the prefix and relevant lld flag.

This revision was landed with ongoing or failed builds.Sep 24 2020, 3:30 PM
This revision was automatically updated to reflect the committed changes.
snehasish marked an inline comment as done.