
Details

Reviewers

tmsriram
MaskRay

Commits

rGd2696dec45cd: [llvm] Add -bbsections-cold-text-prefix to emit cold clusters to a different…

Summary

This change adds an option to basic block sections to allow cold
clusters to be assigned a custom text prefix. With a custom prefix such
as ".text.split." (D87840), lld can place them in a separate output section.
The benefits are:

Empirically shown to improve icache and iTLB metrics by 3-5% (absolute) compared to placing split parts in .text.unlikely.
Mitigates poor profiles, e.g. SamplePGO profiles used with the machine function splitter. Optimizations such as hugepage remapping can make different decisions at section granularity.
Enables section-granularity hotness monitoring (checking the decisions made during compilation against sample data from production).

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	90 ms	windows > LLVM.Other::change-printer.ll

Event Timeline

snehasish created this revision. · Sep 16 2020, 9:54 PM

Herald added a reviewer: espindola. · Sep 16 2020, 9:54 PM

Herald added a project: Restricted Project.

Herald added subscribers: llvm-commits, hiraditya, arichardson, emaste.

snehasish requested review of this revision. · Sep 16 2020, 9:54 PM

Harbormaster completed remote builds in B71972: Diff 292405. · Sep 16 2020, 10:31 PM

MaskRay added inline comments. · Sep 16 2020, 10:57 PM

lld/ELF/Writer.cpp
134	I think I will need to read your first LLVM patch to get some basic understanding of machine function splitter. I'd suggest you split the lld patch from the codegen one. For your future lld patch: why can't the split sections be placed in `.test.cold`? This just affects how `-z keep-text-section-prefix` groups input sections into output sections. With the additional element, there may be an output section .text.split but if its purpose is similar to the others I am not sure you need it.

Drop lld/ELF/Writer.cpp changes.

Moved the changes to D87840.

lld/ELF/Writer.cpp
134	Thanks, here's a link to the RFC for your reference: https://groups.google.com/g/llvm-dev/c/RUegaMg-iqc/m/wFAVxa6fCgAJ
	I've split out the LLD change as a separate patch: D87840.
	I'll add the discussion of the rationale behind a new output section to D87840 (assuming that you mean ".text.unlikely" instead of ".test.cold").

snehasish edited the summary of this revision. · Sep 17 2020, 10:18 AM

tmsriram added inline comments. · Sep 17 2020, 10:31 AM

lld/ELF/Writer.cpp
134	A couple of thoughts on this which you could help clarify:
	* Currently, known cold code is put into .text.unlikely and lukewarm stuff is put into .text. Hugepage mappers that I am aware of avoid mapping .text.unlikely into hugepages for max hugepage itlb utilization.
	* For function splitting, when you play with the thresholds more there is a good chance you are going to end up splitting code that is lukewarm and .text.unlikely is not the ideal place for it. In which case, such code can be put into .text itself.
	Is there a good reason to put this into .text.split? Does having a new output section give you more leverage on how to manage the mapping of such code? I think this is the part that is not fully clear to me. Thanks.

Harbormaster completed remote builds in B72042: Diff 292553. · Sep 17 2020, 10:51 AM

snehasish mentioned this in D87840: [lld] Make -z keep-text-section-prefix recognize .text.split. as a prefix. · Sep 17 2020, 12:21 PM

Rebase and update git commit message.

ping @tmsriram @MaskRay

This LGTM after I understood the rationale:

It allows fine-grained control of cold split part placement for spatial locality and either mapping to or excluding from huge pages. This becomes very relevant as we play with the coldness thresholds where lukewarm parts could be split from hot function bodies. For these reasons, reusing an existing prefix is not ideal.
There is a precedent with ".text.unknown" which has similar motivations. The lld patch for this is also rather simple.

Harbormaster completed remote builds in B72428: Diff 293258. · Sep 21 2020, 3:22 PM

MaskRay added inline comments. · Sep 24 2020, 10:47 AM

llvm/lib/CodeGen/BasicBlockSections.cpp
82	Worth a comment why the toggle exists.
84	The description usually does not end with a full stop
llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
69	This is probably better declared in a header and named something with `BB` as a prefix, otherwise it can lead to confusion that this applies to the generic `.text.unlikely`
llvm/test/CodeGen/X86/basic-block-sections-cold.ll
39	If `-NEXT:` applies please add it.

[llvm]Add an option to emit cold clusters to a different section.

Instead of saying "Add an option", "Add -bbsections-cold-prefix" will convey more information. (Not rarely I grep logs for when an option was introduced. It is helpful if I can find the option name in the message)

MaskRay added inline comments. · Sep 24 2020, 10:58 AM

llvm/test/CodeGen/X86/basic-block-sections-cold.ll
39	I think it'd be better to check `.LBB0_2:` and `.LBB_END0_2:` as well to enhance the test.

Document flag, tighten test, rename var and option for clarity.

Flag name and var name: bbsections-cold-text-prefix - BBSectionColdTextPrefix
Add comment about why this option is useful.
Update test to be more rigorous.
Rebase.

Add another check for the test.

PTAL, thanks!

Update git commit message to specify option.

snehasish retitled this revision from "[llvm]Add an option to emit cold clusters to a different section." to "[llvm] Add -bbsections-cold-text-prefix to emit cold clusters to a different section." · Sep 24 2020, 2:25 PM

Harbormaster completed remote builds in B72869: Diff 294161. · Sep 24 2020, 2:31 PM

Harbormaster completed remote builds in B72870: Diff 294162. · Sep 24 2020, 2:33 PM

Harbormaster completed remote builds in B72873: Diff 294165. · Sep 24 2020, 2:44 PM

LGTM.

llvm/lib/CodeGen/BasicBlockSections.cpp
82	I know your intention is that in practice users should specify .text.split. , however, the comment is a bit unclear that normally users should not use .text.unlikely.
llvm/test/CodeGen/X86/basic-block-sections-cold.ll
40	I usually indent instructions to emphasize that they are in the region covered by a label (like `_Z3bazb.cold:`)

This revision is now accepted and ready to land. · Sep 24 2020, 2:58 PM

Update comment, test.

Specify how users can use the flag and take advantage of keep-text-section-prefix.
Indent the instructions in check for clarity.

Thanks for the review.

llvm/lib/CodeGen/BasicBlockSections.cpp
82	Good idea, documented the prefix and relevant lld flag.

This revision was landed with ongoing or failed builds. · Sep 24 2020, 3:30 PM

Closed by commit rGd2696dec45cd: [llvm] Add -bbsections-cold-text-prefix to emit cold clusters to a different… (authored by snehasish).

This revision was automatically updated to reflect the committed changes.

snehasish marked an inline comment as done.

snehasish added a commit: rGd2696dec45cd: [llvm] Add -bbsections-cold-text-prefix to emit cold clusters to a different….

Harbormaster completed remote builds in B72883: Diff 294184. · Sep 24 2020, 3:34 PM

Diff 292405

lld/ELF/Writer.cpp

@@ ... @@ StringRef elf::getOutputSectionName(const InputSectionBase *s) {
   // ".text.unlikely.", ".text.startup." or ".text.exit." before others.
   // We provide an option -z keep-text-section-prefix to group such sections
   // into separate output sections. This is more flexible. See also
   // sortISDBySectionOrder().
   // ".text.unknown" means the hotness of the section is unknown. When
   // SampleFDO is used, if a function doesn't have sample, it could be very
   // cold or it could be a new function never being sampled. Those functions
   // will be kept in the ".text.unknown" section.
+  // ".text.split." holds symbols which are split out from functions in other
+  // input sections. For example, with -fsplit-machine-functions, placing the
+  // cold parts in .text.split instead of .text.unlikely mitigates against poor
+  // profile inaccuracy. Techniques such as hugepage remapping can make
+  // conservative decisions at the section granularity. Additionally we find
+  // small improvement in icache and tlb metrics due to improved locality.
   if (config->zKeepTextSectionPrefix)
     for (StringRef v : {".text.hot.", ".text.unknown.", ".text.unlikely.",
-                        ".text.startup.", ".text.exit."})
+                        ".text.startup.", ".text.exit.", ".text.split."})
       if (isSectionPrefix(v, s->name))
         return v.drop_back();

   for (StringRef v :
        {".text.", ".rodata.", ".data.rel.ro.", ".data.", ".bss.rel.ro.",
         ".bss.", ".init_array.", ".fini_array.", ".ctors.", ".dtors.", ".tbss.",
         ".gcc_except_table.", ".tdata.", ".ARM.exidx.", ".ARM.extab."})
     if (isSectionPrefix(v, s->name))
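
To illustrate the grouping that -z keep-text-section-prefix performs in the hunk above, here is a minimal standalone sketch. It is not lld's code: it uses std::string_view in place of StringRef, the names isSectionPrefix and groupedTextOutputSection are stand-ins chosen for this example, and the ".text" fallback is a simplifying assumption (lld goes on to try a longer list of prefixes).

```cpp
#include <array>
#include <cassert>
#include <string_view>

// Stand-in for the prefix check used in the hunk above (assumed behaviour):
// the prefix ends with '.', and a section named exactly like the prefix
// without its trailing dot also counts as a match.
bool isSectionPrefix(std::string_view prefix, std::string_view name) {
  assert(!prefix.empty() && prefix.back() == '.');
  return name.substr(0, prefix.size()) == prefix ||
         name == prefix.substr(0, prefix.size() - 1);
}

// Returns the output section an input text section is grouped into when
// prefix grouping is enabled. Falls back to ".text" (a simplification).
std::string_view groupedTextOutputSection(std::string_view name) {
  constexpr std::array<std::string_view, 6> prefixes = {
      ".text.hot.",     ".text.unknown.", ".text.unlikely.",
      ".text.startup.", ".text.exit.",    ".text.split."};
  for (std::string_view prefix : prefixes)
    if (isSectionPrefix(prefix, name))
      return prefix.substr(0, prefix.size() - 1); // drop the trailing '.'
  return ".text";
}
```

Under these assumptions, an input section named .text.split._Z3bazb is grouped into .text.split, while .text.unlikely._Z3bazb still goes to .text.unlikely.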

llvm/lib/CodeGen/BasicBlockSections.cpp

@@ ... @@
 #include "llvm/Target/TargetMachine.h"

 using llvm::SmallSet;
 using llvm::SmallVector;
 using llvm::StringMap;
 using llvm::StringRef;
 using namespace llvm;

+cl::opt<std::string> ColdSectionTextPrefix(
+    "bbsections-cold-prefix",
+    cl::desc("The text prefix to use for cold basic block clusters."),
+    cl::init(".text.unlikely."), cl::Hidden);
+
 namespace {

 // This struct represents the cluster information for a machine basic block.
 struct BBClusterInfo {
   // MachineBasicBlock ID.
   unsigned MBBNumber;
   // Cluster ID this basic block belongs to.
   unsigned ClusterID;

llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp

@@ ... @@
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Target/TargetMachine.h"
 #include <cassert>
 #include <string>

 using namespace llvm;
 using namespace dwarf;

+extern cl::opt<std::string> ColdSectionTextPrefix;
+
 static void GetObjCImageInfo(Module &M, unsigned &Version, unsigned &Flags,
                              StringRef &Section) {
   SmallVector<Module::ModuleFlagEntry, 8> ModuleFlags;
   M.getModuleFlagsMetadata(ModuleFlags);

   for (const auto &MFE: ModuleFlags) {
     // Ignore flags with 'Require' behaviour.
     if (MFE.Behavior == Module::Require)
@@ ... @@ MCSection *TargetLoweringObjectFileELF::getSectionForMachineBasicBlock(

   // For cold sections use the .text.unlikely prefix along with the parent
   // function name. All cold blocks for the same function go to the same
   // section. Similarly all exception blocks are grouped by symbol name
   // under the .text.eh prefix. For regular sections, we either use a unique
   // name, or a unique ID for the section.
   SmallString<128> Name;
   if (MBB.getSectionID() == MBBSectionID::ColdSectionID) {
-    Name += ".text.unlikely.";
+    Name += ColdSectionTextPrefix;
     Name += MBB.getParent()->getName();
   } else if (MBB.getSectionID() == MBBSectionID::ExceptionSectionID) {
     Name += ".text.eh.";
     Name += MBB.getParent()->getName();
   } else {
     Name += MBB.getParent()->getSection()->getName();
     if (TM.getUniqueBasicBlockSectionNames()) {
       Name += ".";
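
The cold-section naming changed in the hunk above comes down to concatenating the configured prefix with the parent function's name. A minimal standalone sketch of that composition follows. It assumes std::string in place of SmallString, and coldSectionName is a hypothetical helper introduced only for this illustration, not a function in LLVM.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds the section name for a function's cold cluster
// from a configurable prefix and the function name. The prefix is expected
// to end with '.', matching values such as ".text.unlikely." or
// ".text.split.".
std::string coldSectionName(const std::string &prefix,
                            const std::string &functionName) {
  assert(!prefix.empty() && prefix.back() == '.');
  return prefix + functionName;
}
```

With the default prefix this yields .text.unlikely._Z3bazb for the test function; with ".text.split." it yields .text.split._Z3bazb, which is the section name the LINUX-SPLIT check in the test expects.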

llvm/test/CodeGen/X86/basic-block-sections-cold.ll

 ; Check if basic blocks that don't get unique sections are placed in cold sections.
 ; Basic block with id 1 and 2 must be in the cold section.
 ; RUN: echo '!_Z3bazb' > %t
 ; RUN: echo '!!0' >> %t
-; RUN: llc < %s -mtriple=x86_64-pc-linux -function-sections -basic-block-sections=%t -unique-basic-block-section-names | FileCheck %s -check-prefix=LINUX-SECTIONS
+; RUN: llc < %s -mtriple=x86_64 -function-sections -basic-block-sections=%t -unique-basic-block-section-names | FileCheck %s -check-prefix=LINUX-SECTIONS
+; RUN: llc < %s -mtriple=x86_64 -function-sections -basic-block-sections=%t -unique-basic-block-section-names -bbsections-cold-prefix=".text.split." | FileCheck %s -check-prefix=LINUX-SPLIT

-define void @_Z3bazb(i1 zeroext) nounwind {
-  %2 = alloca i8, align 1
-  %3 = zext i1 %0 to i8
-  store i8 %3, i8* %2, align 1
-  %4 = load i8, i8* %2, align 1
-  %5 = trunc i8 %4 to i1
-  br i1 %5, label %6, label %8
-
-6: ; preds = %1
-  %7 = call i32 @_Z3barv()
-  br label %10
-
-8: ; preds = %1
-  %9 = call i32 @_Z3foov()
-  br label %10
-
-10: ; preds = %8, %6
+define void @_Z3bazb(i1 zeroext %0) nounwind {
+  br i1 %0, label %2, label %4
+
+2: ; preds = %1
+  %3 = call i32 @_Z3barv()
+  br label %6
+
+4: ; preds = %1
+  %5 = call i32 @_Z3foov()
+  br label %6
+
+6: ; preds = %2, %4
   ret void
 }

 declare i32 @_Z3barv() #1

 declare i32 @_Z3foov() #1

 ; LINUX-SECTIONS: .section .text._Z3bazb,"ax",@progbits
 ; LINUX-SECTIONS: _Z3bazb:
 ; Check that the basic block with id 1 doesn't get a section.
 ; LINUX-SECTIONS-NOT: .section .text._Z3bazb._Z3bazb.1,"ax",@progbits,unique
 ; Check that a single cold section is started here and id 1 and 2 blocks are placed here.
 ; LINUX-SECTIONS: .section .text.unlikely._Z3bazb,"ax",@progbits
 ; LINUX-SECTIONS: _Z3bazb.cold:
 ; LINUX-SECTIONS-NOT: .section .text._Z3bazb._Z3bazb.2,"ax",@progbits,unique
 ; LINUX-SECTIONS: .LBB0_2:
 ; LINUX-SECTIONS: .size _Z3bazb, .Lfunc_end{{[0-9]}}-_Z3bazb
+
+; LINUX-SPLIT: .section .text.split._Z3bazb,"ax",@progbits
+; LINUX-SPLIT: _Z3bazb.cold:

This is an archive of the discontinued LLVM Phabricator instance.

[llvm] Add -bbsections-cold-text-prefix to emit cold clusters to a different section.
Closed · Public
