This is an archive of the discontinued LLVM Phabricator instance.

Differential D145212

Only split cold blocks with more than a given number of instructions
Abandoned · Public

Authored by dhoekwater on Mar 2 2023, 8:23 PM.

Details

Reviewers

snehasish

Summary

On Arm, splitting a cold block may incur a thunk, a 16-byte snippet of code
that extends the range of an unconditional branch. Consequently, splitting
the block may actually inflate cold code size and hurt performance.

While thunk-aware splitting is a complex problem, only splitting cold blocks
larger than a thunk will get some wins without the risk of regression.
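
To make the "larger than a thunk" rule concrete, here is a minimal sketch of how a thunk size in bytes could map to an instruction-count threshold. It assumes fixed 4-byte instructions (as in the AArch64 A64 encoding), which the summary does not state. `thunkSizedThreshold` and `isLargeEnoughToSplit` are hypothetical helper names, not LLVM APIs.

```cpp
// Assumption: every instruction is 4 bytes wide (true for AArch64 A64,
// not for Thumb or other variable-width encodings).
constexpr unsigned kInstrBytes = 4;

// From the summary: a thunk is a 16-byte snippet of code.
constexpr unsigned kThunkBytes = 16;

// Hypothetical helper: the number of instructions that fit in one thunk.
constexpr unsigned thunkSizedThreshold(unsigned ThunkBytes,
                                       unsigned InstrBytes) {
  return ThunkBytes / InstrBytes;
}

// Same comparison direction as the patch: a block whose instruction count
// is at or below the threshold is retained, anything larger may be split.
constexpr bool isLargeEnoughToSplit(unsigned NumInstrs, unsigned Threshold) {
  return NumInstrs > Threshold;
}
```

Under these assumptions the threshold works out to 4 instructions, so a 4-instruction cold block would stay in place and a 5-instruction one would be eligible for splitting.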

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

dhoekwater created this revision. · Mar 2 2023, 8:23 PM

Herald added a project: Restricted Project. · Mar 2 2023, 8:23 PM

Herald added subscribers: hiraditya, kristof.beyls.

Harbormaster completed remote builds in B217105: Diff 502043. · Mar 2 2023, 9:04 PM

dhoekwater retitled this revision from Only split cold blocks with more than a given number of instructions. to Only split cold blocks with more than a given number of instructions. · Mar 2 2023, 11:18 PM

dhoekwater published this revision for review. · Mar 6 2023, 2:28 PM

Herald added a project: Restricted Project. · Mar 6 2023, 2:28 PM

Herald added a subscriber: llvm-commits.

Adding snehasish as a reviewer, as git blame suggests they designed and wrote most of the MachineFunctionSplitter.

It seems to me that this patch would also need:

Some regression test to test it actually changes code generation (only?) when targeting Arm.
As is, this patch will not result in changed code generation IIUC?

In D145212#4174199, @kristof.beyls wrote:

It seems to me that this patch would also need:

Some regression test to test it actually changes code generation (only?) when targeting Arm.

As is, this patch will not result in changed code generation IIUC?

Yes, a small test for this would be great. The existing tests live in llvm-project/llvm/test/CodeGen/X86/machine-function-splitter.ll. I don't think this enhancement has much utility on X86, so perhaps the tests should be under CodeGen/AArch64 instead?

Dropping for the time being as it doesn't look like thresholding is necessary for MFS performance.

Revision Contents

Path: llvm/lib/CodeGen/MachineFunctionSplitter.cpp

Size: 9 lines

Diff 502043

llvm/lib/CodeGen/MachineFunctionSplitter.cpp

@@ ... @@ PercentileCutoff("mfs-psi-cutoff",
                      cl::init(999950), cl::Hidden);
 
 static cl::opt<unsigned> ColdCountThreshold(
     "mfs-count-threshold",
     cl::desc(
         "Minimum number of times a block must be executed to be retained."),
     cl::init(1), cl::Hidden);
 
+static cl::opt<unsigned>
+    ColdSizeThreshold("mfs-size-threshold",
+                      cl::desc("Maximum number of instructions a cold block "
+                               "may have and still be retained."),
+                      cl::init(0), cl::Hidden);
+
 static cl::opt<bool> SplitAllEHCode(
     "mfs-split-ehcode",
     cl::desc("Splits all EH code and it's descendants by default."),
     cl::init(false), cl::Hidden);
 
 namespace {
 
 class MachineFunctionSplitter : public MachineFunctionPass {
@@ ... @@ static void setDescendantEHBlocksCold(MachineFunction &MF) {
   for (auto Block : EHBlocks) {
     Block->setSectionID(MBBSectionID::ColdSectionID);
   }
 }
 
 static bool isColdBlock(const MachineBasicBlock &MBB,
                         const MachineBlockFrequencyInfo *MBFI,
                         ProfileSummaryInfo *PSI) {
+  if (MBB.size() <= ColdSizeThreshold)
+    return false;
+
   std::optional<uint64_t> Count = MBFI->getBlockProfileCount(&MBB);
   if (!Count)
     return true;
 
   if (PercentileCutoff > 0) {
     return PSI->isColdCountNthPercentile(PercentileCutoff, *Count);
   }
   return (*Count < ColdCountThreshold);
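
The order of checks in the patched `isColdBlock` can be modelled outside LLVM. The sketch below is a simplified stand-in, not the pass itself: `SplitOptions` and `isColdBlockModel` are hypothetical names, the profile count is passed in directly rather than read from `MachineBlockFrequencyInfo`, and the `PercentileCutoff` path through `ProfileSummaryInfo` is left out, so it only mirrors the case where that cutoff is zero.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-ins for the cl::opt values in the patch, using the
// same defaults as the diff (cl::init(0) and cl::init(1)).
struct SplitOptions {
  unsigned ColdSizeThreshold = 0;  // models mfs-size-threshold
  unsigned ColdCountThreshold = 1; // models mfs-count-threshold
};

// Simplified model of the patched isColdBlock. NumInstrs plays the role of
// MBB.size(), and Count the role of MBFI->getBlockProfileCount(&MBB).
constexpr bool isColdBlockModel(unsigned NumInstrs,
                                std::optional<uint64_t> Count,
                                const SplitOptions &Opts) {
  // New check from the patch: small blocks are never treated as cold.
  if (NumInstrs <= Opts.ColdSizeThreshold)
    return false;
  // No profile count available: treat the block as cold.
  if (!Count)
    return true;
  // Otherwise compare the execution count against the count threshold.
  return *Count < Opts.ColdCountThreshold;
}
```

With the default threshold of 0, the size check only rejects empty blocks, which is consistent with the review comment that the patch as posted would not change code generation. A non-zero value, such as `SplitOptions{4, 1}`, keeps blocks of four or fewer instructions in place regardless of their profile count.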