This is an archive of the discontinued LLVM Phabricator instance.

Differential D85368

[llvm][CodeGen] Machine Function Splitter
Closed · Public

Authored by snehasish on Aug 5 2020, 3:41 PM.


Details

Reviewers

davidxl
eli.friedman
tmsriram
hiraditya

Commits

rG94faadaca4e1: [llvm][CodeGen] Machine Function Splitter

Summary

We introduce a codegen optimization pass which splits functions into hot and cold
parts. This pass leverages the basic block sections feature recently
introduced in LLVM from the Propeller project. The pass targets
functions with profile coverage, identifies cold blocks and moves them
to a separate section. The linker groups all cold blocks across
functions together, decreasing fragmentation and improving icache and
itlb utilization.
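
As a rough illustration of the idea in the summary (not the code in this patch), the split can be thought of as a partition of a function's blocks by profile count. The sketch below uses a hypothetical `Block` struct rather than LLVM's `MachineBasicBlock`, and assumes that blocks without a count stay hot and that the entry block is never moved; both are assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine basic block; not an LLVM type.
struct Block {
  std::string Name;
  bool HasCount;   // false when the profile has no data for this block
  uint64_t Count;  // execution count from the profile
};

struct SplitResult {
  std::vector<std::string> Hot;
  std::vector<std::string> Cold;
};

// Partition blocks into hot and cold parts, preserving their relative order.
// Assumptions for this sketch: the first block is the entry block and always
// stays hot, and blocks without profile data are conservatively kept hot.
SplitResult splitByProfile(const std::vector<Block> &Blocks,
                           uint64_t ColdCountThreshold = 1) {
  SplitResult R;
  for (std::size_t I = 0; I < Blocks.size(); ++I) {
    const Block &B = Blocks[I];
    bool IsCold = I != 0 && B.HasCount && B.Count < ColdCountThreshold;
    (IsCold ? R.Cold : R.Hot).push_back(B.Name);
  }
  return R;
}
```

With the default threshold of 1, only blocks that the profile recorded as never executed end up in the cold part.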

We evaluated the Machine Function Splitter pass on clang bootstrap and SPECInt 2017.
For clang bootstrap we observe a mean 2.33% runtime improvement with a
~32% reduction in itlb and stlb misses. Additionally, l1 icache misses
were reduced by 9.5% and l2 instruction misses by 20%.
For SPECInt we report the change in IntRate for the C/C++
benchmarks. All benchmarks apart from mcf and x264 improve, on average
by 0.6% with the max for deepsjeng at 1.6%.

Benchmark               % Change (IntRate)
500.perlbench_r          0.78
502.gcc_r                0.82
505.mcf_r               -0.30
520.omnetpp_r            0.18
523.xalancbmk_r          0.37
525.x264_r              -0.46
531.deepsjeng_r          1.61
541.leela_r              0.83
557.xz_r                 0.15

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	70 ms	linux > Polly.Isl/Ast::alias_checks_with_empty_context.ll

Event Timeline

snehasish created this revision.Aug 5 2020, 3:41 PM

Herald added subscribers: llvm-commits, mgrang, mgorny. · View Herald TranscriptAug 5 2020, 3:41 PM

Herald added a project: Restricted Project. · View Herald TranscriptAug 5 2020, 3:41 PM

snehasish requested review of this revision.Aug 5 2020, 3:42 PM

tmsriram added inline comments.Aug 5 2020, 3:55 PM

llvm/lib/CodeGen/BBSectionsPrepare.cpp
72	Should we rename the .h and .cpp so that they have the same prefix? Maybe BasicBlockSections.h and BasicBlockSections.cpp?
llvm/lib/CodeGen/TargetPassConfig.cpp
218	Why not call this split-machine-functions too for consistency?
llvm/test/CodeGen/X86/machine-function-splitter.ll
8	Also, check that the block moved to the cold region has the expected call instruction?

We probably need to discuss how to make basic-block-section stuff work on non-X86 targets at some point, but I guess we don't have to do it in this patch if it's off by default.

I am wondering what your opinion is on a machine unroller/reroller. Aggressive loop unrolling may destroy code cache too.

Regarding the reroller -- a compiler with PGO will adjust the aggressiveness of the unroller based on an estimate of the instruction working set size. Doing this in a later pass or in Propeller can help catch cases that are mishandled.

snehasish mentioned this in D85380: [NFC] Rename BBSectionsPrepare -> BasicBlockSections..Aug 5 2020, 5:16 PM

davidxl added inline comments.Aug 5 2020, 5:23 PM

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
89	Add an internal option here (the coldness threshold) for experimental purposes. I also suggest adding an option to specify a ProfileSummary-based coldness threshold, such as the 99.99th percentile. The default cutoff is 99.9999%, defined in ProfileSummaryInfo.cpp as ProfileSummaryCutoffCold.
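
To make the percentile suggestion concrete, here is a minimal sketch of how a cutoff could be turned into a count threshold. It assumes the threshold is the smallest count among the hottest blocks that together cover the requested fraction of the total count. The function names are hypothetical, and the exact rule and comparison used by ProfileSummaryInfo may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Returns the smallest count among the hottest entries that together cover
// at least CutoffFraction of the total count (e.g. 0.9999 for 99.99%).
// Hypothetical helper for illustration; not an LLVM API.
uint64_t countThresholdForCutoff(std::vector<uint64_t> Counts,
                                 double CutoffFraction) {
  assert(CutoffFraction > 0.0 && CutoffFraction <= 1.0);
  std::sort(Counts.begin(), Counts.end(), std::greater<uint64_t>());
  long double Total = 0;
  for (uint64_t C : Counts)
    Total += C;
  long double Covered = 0;
  for (uint64_t C : Counts) {
    Covered += C;
    if (Covered >= CutoffFraction * Total)
      return C;
  }
  return 0; // empty profile
}

// In this sketch a block is cold if its count is strictly below the threshold.
bool isColdAtCutoff(uint64_t Count, uint64_t Threshold) {
  return Count < Threshold;
}
```

A looser cutoff (for example 99% instead of 99.9%) yields a higher threshold and therefore classifies more blocks as cold.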

Harbormaster completed remote builds in B67210: Diff 283424.Aug 5 2020, 5:39 PM

dmajor added a subscriber: dmajor.Aug 5 2020, 6:12 PM

Please share performance numbers for publicly available workload(s).

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
77	Do we need to renumber?

snehasish mentioned this in rG8d943a928d25: [NFC] Rename BBSectionsPrepare -> BasicBlockSections..Aug 6 2020, 1:12 PM

tmsriram added inline comments.Aug 6 2020, 8:21 PM

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
77	Renumbering makes the sorting easy. The sorting will preserve the basic block order for the blocks that are not split.
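
The point about renumbering can be sketched as follows: once every block carries its current layout position, a comparator can order blocks by section first and by that number second, so blocks that are not split keep their relative order. The types below are hypothetical stand-ins, not the MachineFunction API.

```cpp
#include <list>
#include <string>
#include <vector>

enum class Section { Hot = 0, Cold = 1 };

// Hypothetical stand-in for a machine basic block.
struct Block {
  std::string Name;
  Section Sec;
  int Number; // layout position, assigned by renumber()
};

// Assign consecutive numbers in current layout order.
void renumber(std::list<Block> &Blocks) {
  int N = 0;
  for (Block &B : Blocks)
    B.Number = N++;
}

// Move cold blocks after hot blocks. Ties within a section are broken by the
// block number, which preserves the order chosen by earlier passes.
void sortBySection(std::list<Block> &Blocks) {
  renumber(Blocks);
  Blocks.sort([](const Block &X, const Block &Y) {
    if (X.Sec != Y.Sec)
      return X.Sec < Y.Sec;
    return X.Number < Y.Number;
  });
}

// Helper to inspect the resulting layout.
std::vector<std::string> layoutNames(const std::list<Block> &Blocks) {
  std::vector<std::string> Out;
  for (const Block &B : Blocks)
    Out.push_back(B.Name);
  return Out;
}
```

Because the comparator uses the fresh numbers as a tie-breaker, the result does not depend on whether the underlying sort happens to be stable.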

Updated diff based on review comments.

Added two mllvm options to control cold count and threshold based split.
Added tests for the new options.
Updated test to check for the absence of unexpected blocks.
Renamed BBSectionsPrepare pass and rebased this diff on the change.

snehasish edited the summary of this revision. (Show Details)Aug 7 2020, 12:14 PM

snehasish edited the summary of this revision. (Show Details)

snehasish edited the summary of this revision. (Show Details)Aug 7 2020, 12:17 PM

snehasish marked 5 inline comments as done.Aug 7 2020, 12:24 PM

snehasish added inline comments.

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
77	We need to ensure that the order is preserved so that we don't perturb the decisions made by prior passes such as MachineBlockPlacement. Renumbering simplifies the code that needs to be shared with the BasicBlockSections pass.
llvm/lib/CodeGen/TargetPassConfig.cpp
218	We can't register two options with the same string, i.e. "split-machine-functions".

Simplify the cold count check.

Harbormaster completed remote builds in B67499: Diff 283979.Aug 7 2020, 12:44 PM

Harbormaster completed remote builds in B67509: Diff 283991.Aug 7 2020, 1:26 PM

hiraditya added inline comments.Aug 7 2020, 5:23 PM

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
77	I see we are renumbering both in MachineFunctionSplitter and BasicBlockSections

We evaluated the Machine Function Splitter pass on clang bootstrap and SPECInt 2017.

Could you share the details of the machine as well? The improvements are well within noise.

For clang bootstrap we observe a mean 2.33% runtime improvement with a
~32% reduction in itlb and stlb misses

While itlb reduction looks quite impressive, it doesn't seem to translate quite well to the runtime improvement. Did we see consistent >2% improvement with multiple runs? Please share the numbers.

tmsriram added inline comments.Aug 7 2020, 5:32 PM

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
77	The "bbsections-prepare" pass and the machine function splitter pass are intentionally made mutually exclusive. If bbsections is explicitly requested, machine function splitter does not apply. Please see the change in TargetPassConfig.cpp
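
A minimal sketch of the mutual exclusion described above, assuming a simplified pipeline represented as a list of pass labels. The enum mirrors the `BasicBlockSection` values shown in TargetOptions.h in this diff; the function name and the string labels are hypothetical and only serve the example.

```cpp
#include <string>
#include <vector>

// Mirrors the enumerators shown in TargetOptions.h in this revision.
enum class BasicBlockSection { All, List, Labels, Preset, None };

// If basic block sections are explicitly requested, only that pass runs.
// Otherwise the splitter runs when it has been enabled.
std::vector<std::string> selectLatePasses(BasicBlockSection Requested,
                                          bool EnableSplitter) {
  std::vector<std::string> Passes;
  if (Requested != BasicBlockSection::None)
    Passes.push_back("bbsections-prepare");
  else if (EnableSplitter)
    Passes.push_back("machine-function-splitter");
  return Passes;
}
```

With this arrangement the two passes never renumber and sort the same function in one compilation.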

Could you share the details of the machine as well?

Sure, these were measured on a Lenovo P920 workstation -- Intel Skylake based Xeon(R) Gold 6154 CPU.

The improvements are well within noise.

For SPEC, the reported intrate improvement numbers are an average across 5 iterations. Note that SPEC binaries are tiny in size, so splitting may only improve code locality in some cases.

While itlb reduction looks quite impressive, it doesn't seem to translate quite well to the runtime improvement.

It stands to reason that removing the itlb bottleneck will expose the next one :) We could dig deeper by looking into how the top down profile changes with and without splitting.

Did we see consistent >2% improvement with multiple runs? Please share the numbers.

We see consistent 2%+ improvements over FDO optimized binaries. The numbers reported are averaged across 10 runs. Here is the data for one such experiment where 500 invocations of clang were executed and the overall end to end user time was measured. For completeness, I have included the data for a hot-cold-split optimized binary as well. Note this particular experiment does not use ThinLTO for any of the builds since I had some trouble running the hot-cold-split pass with ThinLTO enabled.

|----------------------------------|----------------------|----------------|-----------|
|                                  | User time in seconds ($ time run-commands.sh)     |
|----------------------------------|----------------------|----------------|-----------|
| Run #                            | FDO baseline         | Hot cold split | MFS       |
|                                1 |               484.65 |          479.2 |    466.93 |
|                                2 |                483.4 |         478.28 |    470.25 |
|                                3 |               485.57 |         479.15 |    470.36 |
|                                4 |               480.37 |         480.34 |    469.85 |
|                                5 |               482.97 |         478.18 |    471.93 |
|                                6 |               484.06 |         479.74 |    473.27 |
|                                7 |               482.67 |         477.42 |    472.56 |
|                                8 |               483.53 |         476.99 |    474.58 |
|                                9 |               486.43 |         480.76 |    473.92 |
|                               10 |               489.94 |         480.11 |    471.42 |
|----------------------------------|----------------------|----------------|-----------|
| 2 Tail Paired T-Test vs Baseline |                      |      0.0000636 | 0.0000006 |
|----------------------------------|----------------------|----------------|-----------|
| Average                          |              484.359 |        479.017 |   471.507 |
|----------------------------------|----------------------|----------------|-----------|
| % Change                         |                      |           1.10 |      2.65 |
|----------------------------------|----------------------|----------------|-----------|
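
The Average and % Change rows can be recomputed from the per-run times in the table. The sketch below does only that arithmetic; the helper names are made up for the example, and the paired t-test row is not reproduced here.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Arithmetic mean of a non-empty sample.
double mean(const std::vector<double> &Xs) {
  assert(!Xs.empty());
  return std::accumulate(Xs.begin(), Xs.end(), 0.0) / Xs.size();
}

// Relative reduction in run time versus the baseline, in percent.
double percentImprovement(double Baseline, double Candidate) {
  return (Baseline - Candidate) / Baseline * 100.0;
}

// User times in seconds, copied from the table above.
const std::vector<double> FdoBaseline = {484.65, 483.4,  485.57, 480.37, 482.97,
                                         484.06, 482.67, 483.53, 486.43, 489.94};
const std::vector<double> HotColdSplit = {479.2,  478.28, 479.15, 480.34, 478.18,
                                          479.74, 477.42, 476.99, 480.76, 480.11};
const std::vector<double> Mfs = {466.93, 470.25, 470.36, 469.85, 471.93,
                                 473.27, 472.56, 474.58, 473.92, 471.42};
```

Calling `percentImprovement(mean(FdoBaseline), mean(Mfs))` gives roughly 2.65, matching the last row of the table.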

Here is the data for TLB and icache. Each event was collected independently along with instructions to ensure no multiplexing. The variance reported by perf was less than 1% for each event (often less than 0.5%).

|-----------|--------------------------------------------|--------------------------------------------------------|
|           | $ perf stat -r 3 -e frontend_retired.${EVENT}:u,instructions:u -- run-commands.sh                   |
|-----------|--------------------------------------------|--------------------------------------------------------|
|           | Machine Function Splitter                  | FDO Baseline                                           |
|-----------|--------------------------------------------|--------------------------------------------------------|
| EVENT     | Misses        | Instructions      | MPKI   | Misses         | Instructions      | MPKI   | % Change |
| itlb_miss | 1,411,325,040 | 1,618,495,692,919 | 0.8720 |  2,066,003,373 | 1,618,097,715,534 | 1.2768 |    31.70 |
| stlb_miss |   131,949,440 | 1,618,466,757,079 | 0.0815 |    195,471,938 | 1,618,061,281,016 | 0.1208 |    32.51 |
| l1i_miss  | 9,678,255,804 | 1,618,479,987,914 | 5.9798 | 10,698,143,090 | 1,618,081,273,918 | 6.6116 |     9.56 |
| l2_miss   |   434,287,963 | 1,618,443,723,597 | 0.2683 |    542,869,835 | 1,618,081,904,973 | 0.3355 |    20.02 |
|-----------|--------------------------------------------|--------------------------------------------------------|
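
For readers checking the table, MPKI is misses per thousand instructions, and the % Change column is the relative reduction in MPKI versus the FDO baseline. A small sketch of that arithmetic, with hypothetical helper names:

```cpp
#include <cassert>
#include <cstdint>

// Misses per kilo-instruction.
double mpki(uint64_t Misses, uint64_t Instructions) {
  assert(Instructions > 0);
  return static_cast<double>(Misses) / static_cast<double>(Instructions) *
         1000.0;
}

// Relative reduction of Candidate versus Baseline, in percent.
double percentReduction(double Baseline, double Candidate) {
  return (Baseline - Candidate) / Baseline * 100.0;
}
```

For the itlb_miss row, `mpki(1411325040, 1618495692919)` is about 0.872 and `mpki(2066003373, 1618097715534)` is about 1.277, a reduction of roughly 31.7%.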

Update PSI metadata to fix assert failure.

Harbormaster completed remote builds in B67614: Diff 284169.Aug 8 2020, 11:50 PM

Thanks for adding the results, could you share the script to measure bootstrap numbers?

In HCS the ability to keep cold functions in a separate section was added in: D85331 (cc: @rjf ), can we try with -mllvm -enable-cold-section to compare with MachineFunctionSplitter?

In D85368#2205653, @hiraditya wrote:

Thanks for adding the results, could you share the script to measure bootstrap numbers?

I've uploaded a Makefile here which will allow you to run the bootstrap benchmarks. Applying this patch on a local llvm repo and pointing the Makefile at it should be sufficient to get you going.

In HCS the ability to keep cold functions in a separate section was added in: D85331 (cc: @rjf ), can we try with -mllvm -enable-cold-section to compare with MachineFunctionSplitter?

We already incorporate this in our evaluation since we link using lld along with the flag -z,keep-text-section-prefix. Since the extracted functions are marked cold, they are assigned a .text.unlikely prefix. Passing -z,keep-text-section-prefix to lld ensures that these functions are placed in the appropriate output section achieving the same goal of improving locality for hot code. The impact of this can be seen in the binary characteristics we shared in the original RFC which showed a 41% and 47% decrease in size of .text and .text.hot respectively for the hot cold split pass.

We applied patch D85331 and find similar results. Comparing the sections of the binary (hot cold split vs PGO baseline), we find a new __llvm_cold section along with similar fractions of code extracted from .text and .text.hot --

   FILE SIZE        VM SIZE    
--------------  -------------- 
 [NEW] +7.31Mi  [NEW] +7.31Mi    __llvm_cold
  +64% +3.21Mi   +64% +3.21Mi    .eh_frame
  +26% +2.77Mi  [ = ]       0    .strtab
  +27%  +711Ki  [ = ]       0    .symtab
  +31%  +236Ki   +31%  +236Ki    .eh_frame_hdr
 +3.1%     +12  [ = ]       0    .shstrtab
  +29%      +9   +29%      +9    [LOAD #3 [RX]]
 +0.8%      +6  +0.8%      +6    [LOAD #2 [R]]
 -7.1%      -1  [ = ]       0    [Unmapped]
 -0.0%    -246  -0.0%    -246    .dynstr
 -0.1% -4.05Ki  -0.1% -4.05Ki    .rodata
 -1.4% -5.73Ki  -1.4% -5.73Ki    .text.startup
 -1.6%  -536Ki  -1.6%  -536Ki    .text.unlikely
-46.7% -2.51Mi -46.7% -2.51Mi    .text.hot
-43.8% -2.89Mi -43.8% -2.89Mi    .text
  +10% +8.28Mi  +7.2% +4.82Mi    TOTAL

Running the benchmarks with the patch enabled and comparing against the MachineFunctionSplitter, we see similar performance numbers --

|----------|----------|----------------|---------------------------|
| Run #    | Baseline | Hot Cold Split | Machine Function Splitter |
|----------|----------|----------------|---------------------------|
|        1 |   501.25 |         490.84 |                    489.48 |
|        2 |   504.22 |         491.66 |                    493.42 |
|        3 |   500.04 |          492.7 |                    489.18 |
|        4 |    499.4 |         493.31 |                    489.47 |
|        5 |   495.62 |          496.1 |                    488.79 |
|        6 |   500.62 |         495.61 |                    488.41 |
|        7 |   501.81 |         494.45 |                    487.67 |
|        8 |   496.96 |         495.91 |                    490.91 |
|        9 |   500.22 |         497.17 |                       488 |
|       10 |   499.66 |         493.81 |                    489.65 |
|----------|----------|----------------|---------------------------|
| Average  |   499.98 |        494.156 |                   489.498 |
| % Change |          |           1.16 |                      2.10 |
|----------|----------|----------------|---------------------------|

Add an option to exclude specific sections.

Add option -mfs-excluded-sections to allow users to specify section names to exclude.
Add a test for the option.

Harbormaster completed remote builds in B67836: Diff 284560.Aug 10 2020, 8:17 PM

Overall approach looks good to me even when we don't see good SPEC-17 numbers as the optimization is intended to reduce page faults. The improvements would be more pronounced in large applications. I'll review the code in more detail in the next few days. Thanks for working on this.

hiraditya added a reviewer: hiraditya.Aug 14 2020, 12:07 AM

please run clang-format on the patch.

llvm/lib/CodeGen/BasicBlockSections.cpp
232 ↗	(On Diff #284560)	nit: tab?
277 ↗	(On Diff #284560)	Is this comment necessary?

This revision now requires changes to proceed.Aug 14 2020, 12:10 AM

Can we add more test cases to include eh_pad, invokeinst,

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
114	std::find?

tmsriram added inline comments.Aug 18 2020, 12:35 PM

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
112	I think excluding sections needs a bit more thought and we should do this as a separate patch if it is useful but I think a linker solution would be more favorable. From what I understand, when a user specifies section names using the section keyword, then the expectation is that all functions marked with that section name will be grouped together. With function splitting, since you attach the ".cold" suffix to such sections that are split, there is no guarantee that the linker will place them together as these are not prefixed as ".text". To overcome the above problem, the option to exclude such sections from being split is not ideal either as it moves the burden to the user to get this right with appropriate options. I think the temporary fix is to not split sections which are not prefixed as ".text". You can add a "FIXME:" comment here to describe why you are doing this. Moving forward, we can look at a linker solution where '.' is treated as a valid section name separator and sections with identical prefix before the "." are always grouped together even if they are not named ".text". I think we can move this handling as an enhancement in another patch.

tmsriram added inline comments.Aug 18 2020, 12:37 PM

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
112	Correction: I meant ".unlikely" and not ".cold".

tschuett added a subscriber: tschuett.Aug 18 2020, 12:38 PM

efriedma added inline comments.Aug 18 2020, 1:02 PM

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
112	I think I'd rather make splitting for functions with an explicit section attribute opt-in, rather than opt-out. The user might have a strong need to emit a function in a particular section (for example, if the name is mentioned in a linker script). If someone is messing with section attributes in the first place, I'd like to be conservative by default.
llvm/lib/CodeGen/TargetPassConfig.cpp
218	Why do we need two options to control the same thing?

Address review comments.

Remove excluded section attribute option.
Don't split functions which have a section attribute.
Add a test for ehpads.
Remove redundant comments and clang-format.
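
The conservative checks listed in this update can be summarized in a short sketch. The structs below are hypothetical stand-ins for function and block properties, and the threshold parameter is an assumption for the example; the actual pass queries LLVM data structures instead.

```cpp
#include <cstdint>
#include <string>

// Hypothetical summaries of the properties the checks look at.
struct FunctionInfo {
  bool HasProfile;         // function has profile coverage
  std::string SectionAttr; // explicit section attribute, empty if none
};

struct BlockInfo {
  bool IsEHPad;   // exception-handling landing pad
  uint64_t Count; // profile count for the block
};

// Functions without profile data or with an explicit section are left alone.
bool isEligibleForSplitting(const FunctionInfo &F) {
  return F.HasProfile && F.SectionAttr.empty();
}

// EH pads are never moved; other blocks move only if they are cold.
bool mayMoveToColdSection(const BlockInfo &B, uint64_t ColdCountThreshold) {
  if (B.IsEHPad)
    return false;
  return B.Count < ColdCountThreshold;
}
```

Keeping functions with a section attribute out of scope means a function named in a linker script is emitted exactly where the user asked.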

Thanks for the reviews!
@hiraditya I've added a test to ensure ehpads are not split out. Let me know if there are additional cases you want to cover.

llvm/lib/CodeGen/BasicBlockSections.cpp
232 ↗	(On Diff #284560)	This seems to be the output from clang-format. The diff does not have a tab but it looks like phabricator is showing it as one?
llvm/lib/CodeGen/MachineFunctionSplitter.cpp
112	@tmsriram I think you meant to exclude such functions from being split. I agree, taking into consideration @efriedma's comment, I've removed the option and made this a conservative check on the section attribute. Any function with the section attribute set is not split.
114	Code referencing this comment was removed.
llvm/lib/CodeGen/TargetPassConfig.cpp
218	In this patch we added two options: (1) an option in llvm/lib/CodeGen/CommandFlags.cpp, "split-machine-functions", so that llc can be used to invoke the pass in the tests; and (2) a temporary option in llvm/lib/CodeGen/TargetPassConfig.cpp so that it can be invoked when running with clang or lld (for LTO). AFAICT we can't use (2) for tests, and having (1) makes it easy to compile things without an intermediate llc step. We plan on removing (2) in a future patch which will add appropriate options to clang (-fsplit-machine-functions) and lld (--lto-split-machine-functions).

Harbormaster completed remote builds in B68836: Diff 286459.Aug 18 2020, 8:34 PM

snehasish added inline comments.Aug 18 2020, 9:33 PM

llvm/lib/CodeGen/TargetPassConfig.cpp
218	Correction: We can use (2) for tests by passing "-enable-split-machine-functions" to llc however since we plan to introduce clang and lld flags in the near future it seems cleaner to leave the llc flag in place and just remove (2) when that happens rather than reintroduce it. WDYT?

efriedma added subscribers: serge-sans-paille, MaskRay.Aug 19 2020, 4:22 PM

efriedma added inline comments.

llvm/lib/CodeGen/TargetPassConfig.cpp
218	clang doesn't call RegisterCodeGenFlags? That seems like something we should consider changing.

snehasish added inline comments.Aug 20 2020, 1:05 PM

llvm/lib/CodeGen/TargetPassConfig.cpp

218

The clang driver does not register the codegen flags, the only clang tool which does is clang-fuzzer. A small patch like the one below would do the trick for basic functionality. More plumbing might be needed to print the appropriate flags from the driver. I think this is probably worth more discussion and beyond the scope of this patch.

diff --git a/clang/tools/driver/cc1as_main.cpp b/clang/tools/driver/cc1as_main.cpp
index 87047be3c2b..0b9b5673d3e 100644
--- a/clang/tools/driver/cc1as_main.cpp
+++ b/clang/tools/driver/cc1as_main.cpp
@@ -21,6 +21,7 @@
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/StringSwitch.h"
 #include "llvm/ADT/Triple.h"
+#include "llvm/CodeGen/CommandFlags.h"
 #include "llvm/IR/DataLayout.h"
 #include "llvm/MC/MCAsmBackend.h"
 #include "llvm/MC/MCAsmInfo.h"
@@ -61,6 +62,8 @@ using namespace clang::driver::options;
 using namespace llvm;
 using namespace llvm::opt;
 
+static codegen::RegisterCodeGenFlags CGF;
+
 namespace {
 
 /// Helper class for representing a single invocation of the assembler.

LGTM, thanks for doing this!

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
110	Maybe a comment here that this is useful while sorting the blocks?
118	Do we need a fixme comment here to say that we could split out landing pads if the exception patch for bb sections lands?

Add a comment to explain renumbering, FIXME for ehpads.

Added a comment to explain why renumbering blocks is necessary.
Added a FIXME and pointer to exceptions splitting patch.

Thanks for the comments.
@efriedma @hiraditya - Let me know if you have any further comments, I will wait till EOD Thursday 08/26. If not I'll take that as go ahead to commit this change.

Harbormaster completed remote builds in B69514: Diff 287785.Aug 25 2020, 4:50 PM

hiraditya added inline comments.Aug 28 2020, 8:43 AM

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
124	nit: redundant braces,
llvm/lib/CodeGen/TargetPassConfig.cpp
216	Remove this FIXME. most passes continue to have cl::opt anyways.

nit: If the FIXMEs are mostly future work, then please replace them with TODOs.

Address reviewer comments.

Remove redundant braces.
Remove FIXME for cl::opt.
s/FIXME/TODO for future work.

snehasish marked 2 inline comments as done.Aug 28 2020, 10:04 AM

Harbormaster completed remote builds in B69935: Diff 288643.Aug 28 2020, 10:50 AM

Thanks for the comments all! The builds look green and I'm going to go ahead and push this.

This revision was not accepted when it landed; it landed in state Needs Review.Aug 28 2020, 11:13 AM

Closed by commit rG94faadaca4e1: [llvm][CodeGen] Machine Function Splitter (authored by snehasish). · Explain Why

This revision was automatically updated to reflect the committed changes.

snehasish added a commit: rG94faadaca4e1: [llvm][CodeGen] Machine Function Splitter.

wxiao3 mentioned this in D94215: [PostRASched] Breaking More CriticalAntiDeps.Apr 26 2021, 7:37 AM

shenhan mentioned this in D152399: [CodeGen] Fine tune MachineFunctionSplitPass (MFS) for FSAFDO. .Jun 7 2023, 2:44 PM

shenhan mentioned this in rG8df75969ae70: [CodeGen] Fine tune MachineFunctionSplitPass (MFS) for FSAFDO..Jul 10 2023, 4:02 PM

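Revision Contents

Path                                                  Size
llvm/include/llvm/CodeGen/BasicBlockSectionUtils.h    27 lines
llvm/include/llvm/CodeGen/CommandFlags.h              2 lines
llvm/include/llvm/CodeGen/MachineFunction.h           3 lines
llvm/include/llvm/CodeGen/Passes.h                    4 lines
llvm/include/llvm/InitializePasses.h                  1 line
llvm/include/llvm/Target/TargetOptions.h              14 lines
llvm/lib/CodeGen/BBSectionsPrepare.cpp                88 lines
llvm/lib/CodeGen/CMakeLists.txt                       1 line
llvm/lib/CodeGen/CommandFlags.cpp                     9 lines
llvm/lib/CodeGen/MachineFunctionSplitter.cpp          114 lines
llvm/lib/CodeGen/TargetPassConfig.cpp                 14 lines
llvm/test/CodeGen/X86/machine-function-splitter.ll    69 lines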

Diff 283424

llvm/include/llvm/CodeGen/BasicBlockSectionUtils.h

This file was added.

//===- BasicBlockSectionUtils.h - Utilities for basic block sections --===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_CODEGEN_BASICBLOCKSECTIONUTILS_H
#define LLVM_CODEGEN_BASICBLOCKSECTIONUTILS_H

#include "llvm/ADT/STLExtras.h"

namespace llvm {

class MachineFunction;
class MachineBasicBlock;

using MachineBasicBlockComparator =
    function_ref<bool(const MachineBasicBlock &, const MachineBasicBlock &)>;

void sortBasicBlocksAndUpdateBranches(MachineFunction &MF,
                                      MachineBasicBlockComparator MBBCmp);

} // end namespace llvm

#endif // LLVM_CODEGEN_BASICBLOCKSECTIONUTILS_H

llvm/include/llvm/CodeGen/CommandFlags.h

 llvm::DebuggerKind getDebuggerTuningOpt();

 bool getEnableStackSizeSection();

 bool getEnableAddrsig();

 bool getEmitCallSiteInfo();

+bool getEnableMachineFunctionSplitter();
+
 bool getEnableDebugEntryValues();

 bool getForceDwarfFrameSection();

 bool getXRayOmitFunctionIndex();

 /// Create this object with static storage to register codegen-related command
 /// line options.

llvm/include/llvm/CodeGen/MachineFunction.h

@@ public:
   StringRef getName() const;

   /// getFunctionNumber - Return a unique ID for the current function.
   unsigned getFunctionNumber() const { return FunctionNumber; }

   /// Returns true if this function has basic block sections enabled.
   bool hasBBSections() const {
     return (BBSectionsType == BasicBlockSection::All ||
-            BBSectionsType == BasicBlockSection::List);
+            BBSectionsType == BasicBlockSection::List ||
+            BBSectionsType == BasicBlockSection::Preset);
   }

   /// Returns true if basic block labels are to be generated for this function.
   bool hasBBLabels() const {
     return BBSectionsType == BasicBlockSection::Labels;
   }

   void setBBSectionsType(BasicBlockSection V) { BBSectionsType = V; }

llvm/include/llvm/CodeGen/Passes.h

@@ namespace llvm {
   FunctionPass *createUnreachableBlockEliminationPass();

   /// createBBSectionsPrepare Pass - This pass assigns sections to machine basic
   /// blocks and is enabled with -fbasic-block-sections.
   /// Buf is a memory buffer that contains the list of functions and basic
   /// block ids to selectively enable basic block sections.
   MachineFunctionPass *createBBSectionsPreparePass(const MemoryBuffer *Buf);

+  /// createMachineFunctionSplitterPass - This pass splits machine functions
+  /// using profile information.
+  MachineFunctionPass *createMachineFunctionSplitterPass();
+
   /// MachineFunctionPrinter pass - This pass prints out the machine function to
   /// the given stream as a debugging tool.
   MachineFunctionPass *
   createMachineFunctionPrinterPass(raw_ostream &OS,
                                    const std::string &Banner ="");

   /// MIRPrinting pass - this pass prints out the LLVM IR into the given stream
   /// using the MIR serialization format.

llvm/include/llvm/InitializePasses.h

 void initializeMachineBlockPlacementStatsPass(PassRegistry&);
 void initializeMachineBranchProbabilityInfoPass(PassRegistry&);
 void initializeMachineCSEPass(PassRegistry&);
 void initializeMachineCombinerPass(PassRegistry&);
 void initializeMachineCopyPropagationPass(PassRegistry&);
 void initializeMachineDominanceFrontierPass(PassRegistry&);
 void initializeMachineDominatorTreePass(PassRegistry&);
 void initializeMachineFunctionPrinterPassPass(PassRegistry&);
+void initializeMachineFunctionSplitterPass(PassRegistry &);
 void initializeMachineLICMPass(PassRegistry&);
 void initializeMachineLoopInfoPass(PassRegistry&);
 void initializeMachineModuleInfoWrapperPassPass(PassRegistry &);
 void initializeMachineOptimizationRemarkEmitterPassPass(PassRegistry&);
 void initializeMachineOutlinerPass(PassRegistry&);
 void initializeMachinePipelinerPass(PassRegistry&);
 void initializeMachinePostDominatorTreePass(PassRegistry&);
 void initializeMachineRegionInfoPassPass(PassRegistry&);

llvm/include/llvm/Target/TargetOptions.h

Show First 20 Lines • Show All 61 Lines • ▼ Show 20 Lines	enum class BasicBlockSection {
All, // Use Basic Block Sections for all basic blocks. A section		All, // Use Basic Block Sections for all basic blocks. A section
// for every basic block can significantly bloat object file sizes.		// for every basic block can significantly bloat object file sizes.
List, // Get list of functions & BBs from a file. Selectively enables		List, // Get list of functions & BBs from a file. Selectively enables
// basic block sections for a subset of basic blocks which can be		// basic block sections for a subset of basic blocks which can be
// used to control object size bloats from creating sections.		// used to control object size bloats from creating sections.
Labels, // Do not use Basic Block Sections but label basic blocks. This		Labels, // Do not use Basic Block Sections but label basic blocks. This
// is useful when associating profile counts from virtual addresses		// is useful when associating profile counts from virtual addresses
// to basic blocks.		// to basic blocks.
		Preset, // Similar to list but the blocks are identified by passes which
		// seek to use Basic Block Sections, e.g. MachineFunctionSplitter.
		// This option cannot be set via the command line.
None // Do not use Basic Block Sections.		None // Do not use Basic Block Sections.
};		};

enum class EABI {		enum class EABI {
Unknown,		Unknown,
Default, // Default means not specified		Default, // Default means not specified
EABI4, // Target-specific (either 4, 5 or gnu depending on triple).		EABI4, // Target-specific (either 4, 5 or gnu depending on triple).
EABI5,		EABI5,
[...]	TargetOptions()
GuaranteedTailCallOpt(false), StackSymbolOrdering(true),		GuaranteedTailCallOpt(false), StackSymbolOrdering(true),
EnableFastISel(false), EnableGlobalISel(false), UseInitArray(false),		EnableFastISel(false), EnableGlobalISel(false), UseInitArray(false),
DisableIntegratedAS(false), RelaxELFRelocations(false),		DisableIntegratedAS(false), RelaxELFRelocations(false),
FunctionSections(false), DataSections(false),		FunctionSections(false), DataSections(false),
UniqueSectionNames(true), UniqueBasicBlockSectionNames(false),		UniqueSectionNames(true), UniqueBasicBlockSectionNames(false),
TrapUnreachable(false), NoTrapAfterNoreturn(false), TLSSize(0),		TrapUnreachable(false), NoTrapAfterNoreturn(false), TLSSize(0),
EmulatedTLS(false), ExplicitEmulatedTLS(false), EnableIPRA(false),		EmulatedTLS(false), ExplicitEmulatedTLS(false), EnableIPRA(false),
EmitStackSizeSection(false), EnableMachineOutliner(false),		EmitStackSizeSection(false), EnableMachineOutliner(false),
SupportsDefaultOutlining(false), EmitAddrsig(false),		EnableMachineFunctionSplitter(false), SupportsDefaultOutlining(false),
EmitCallSiteInfo(false), SupportsDebugEntryValues(false),		EmitAddrsig(false), EmitCallSiteInfo(false),
EnableDebugEntryValues(false), ForceDwarfFrameSection(false),		SupportsDebugEntryValues(false), EnableDebugEntryValues(false),
XRayOmitFunctionIndex(false),		ForceDwarfFrameSection(false), XRayOmitFunctionIndex(false),
FPDenormalMode(DenormalMode::IEEE, DenormalMode::IEEE) {}		FPDenormalMode(DenormalMode::IEEE, DenormalMode::IEEE) {}

/// DisableFramePointerElim - This returns true if frame pointer elimination		/// DisableFramePointerElim - This returns true if frame pointer elimination
/// optimization should be disabled for the given machine function.		/// optimization should be disabled for the given machine function.
bool DisableFramePointerElim(const MachineFunction &MF) const;		bool DisableFramePointerElim(const MachineFunction &MF) const;

/// UnsafeFPMath - This flag is enabled when the		/// UnsafeFPMath - This flag is enabled when the
/// -enable-unsafe-fp-math flag is specified on the command line. When		/// -enable-unsafe-fp-math flag is specified on the command line. When
[...]	public:
unsigned EnableIPRA : 1;		unsigned EnableIPRA : 1;

/// Emit section containing metadata on function stack sizes.		/// Emit section containing metadata on function stack sizes.
unsigned EmitStackSizeSection : 1;		unsigned EmitStackSizeSection : 1;

/// Enables the MachineOutliner pass.		/// Enables the MachineOutliner pass.
unsigned EnableMachineOutliner : 1;		unsigned EnableMachineOutliner : 1;

		/// Enables the MachineFunctionSplitter pass.
		unsigned EnableMachineFunctionSplitter : 1;

/// Set if the target supports default outlining behaviour.		/// Set if the target supports default outlining behaviour.
unsigned SupportsDefaultOutlining : 1;		unsigned SupportsDefaultOutlining : 1;

/// Emit address-significance table.		/// Emit address-significance table.
unsigned EmitAddrsig : 1;		unsigned EmitAddrsig : 1;

/// Emit basic blocks into separate sections.		/// Emit basic blocks into separate sections.
BasicBlockSection BBSections = BasicBlockSection::None;		BasicBlockSection BBSections = BasicBlockSection::None;

llvm/lib/CodeGen/BBSectionsPrepare.cpp

[...]
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/ADT/Optional.h"		#include "llvm/ADT/Optional.h"
#include "llvm/ADT/SmallSet.h"		#include "llvm/ADT/SmallSet.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringMap.h"		#include "llvm/ADT/StringMap.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
		#include "llvm/CodeGen/BasicBlockSectionUtils.h"
		tmsriram (Not Done): Should we rename the .h and .cpp so that they have the same prefix? Maybe BasicBlockSections.h and BasicBlockSections.cpp?
#include "llvm/CodeGen/MachineFunction.h"		#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineFunctionPass.h"		#include "llvm/CodeGen/MachineFunctionPass.h"
#include "llvm/CodeGen/MachineModuleInfo.h"		#include "llvm/CodeGen/MachineModuleInfo.h"
#include "llvm/CodeGen/Passes.h"		#include "llvm/CodeGen/Passes.h"
#include "llvm/CodeGen/TargetInstrInfo.h"		#include "llvm/CodeGen/TargetInstrInfo.h"
#include "llvm/InitializePasses.h"		#include "llvm/InitializePasses.h"
#include "llvm/Support/Error.h"		#include "llvm/Support/Error.h"
#include "llvm/Support/LineIterator.h"		#include "llvm/Support/LineIterator.h"
[...]
// All explicitly specified clusters of basic blocks will be ordered		// All explicitly specified clusters of basic blocks will be ordered
// accordingly. All non-specified BBs go into a separate "Cold" section.		// accordingly. All non-specified BBs go into a separate "Cold" section.
// Additionally, if exception handling landing pads end up in more than one		// Additionally, if exception handling landing pads end up in more than one
// clusters, they are moved into a single "Exception" section. Eventually,		// clusters, they are moved into a single "Exception" section. Eventually,
// clusters are ordered in increasing order of their IDs, with the "Exception"		// clusters are ordered in increasing order of their IDs, with the "Exception"
// and "Cold" succeeding all other clusters.		// and "Cold" succeeding all other clusters.
// FuncBBClusterInfo represent the cluster information for basic blocks. If this		// FuncBBClusterInfo represent the cluster information for basic blocks. If this
// is empty, it means unique sections for all basic blocks in the function.		// is empty, it means unique sections for all basic blocks in the function.
static bool assignSectionsAndSortBasicBlocks(		static void
MachineFunction &MF,		assignSections(MachineFunction &MF,
const std::vector<Optional<BBClusterInfo>> &FuncBBClusterInfo) {		const std::vector<Optional<BBClusterInfo>> &FuncBBClusterInfo) {
assert(MF.hasBBSections() && "BB Sections is not set for function.");		assert(MF.hasBBSections() && "BB Sections is not set for function.");
// This variable stores the section ID of the cluster containing eh_pads (if		// This variable stores the section ID of the cluster containing eh_pads (if
// all eh_pads are one cluster). If more than one cluster contain eh_pads, we		// all eh_pads are one cluster). If more than one cluster contain eh_pads, we
// set it equal to ExceptionSectionID.		// set it equal to ExceptionSectionID.
Optional<MBBSectionID> EHPadsSectionID;		Optional<MBBSectionID> EHPadsSectionID;

for (auto &MBB : MF) {		for (auto &MBB : MF) {
// With the 'all' option, every basic block is placed in a unique section.		// With the 'all' option, every basic block is placed in a unique section.
[...]	assignSections(MachineFunction &MF,
}		}

// If EHPads are in more than one section, this places all of them in the		// If EHPads are in more than one section, this places all of them in the
// special exception section.		// special exception section.
if (EHPadsSectionID == MBBSectionID::ExceptionSectionID)		if (EHPadsSectionID == MBBSectionID::ExceptionSectionID)
for (auto &MBB : MF)		for (auto &MBB : MF)
if (MBB.isEHPad())		if (MBB.isEHPad())
MBB.setSectionID(EHPadsSectionID.getValue());		MBB.setSectionID(EHPadsSectionID.getValue());
		}

		// This function is exposed externally by BasicBlockSectionUtils.h
		void llvm::sortBasicBlocksAndUpdateBranches(
		MachineFunction &MF, MachineBasicBlockComparator MBBCmp) {
SmallVector<MachineBasicBlock *, 4> PreLayoutFallThroughs(		SmallVector<MachineBasicBlock *, 4> PreLayoutFallThroughs(
MF.getNumBlockIDs());		MF.getNumBlockIDs());
for (auto &MBB : MF)		for (auto &MBB : MF)
PreLayoutFallThroughs[MBB.getNumber()] = MBB.getFallThrough();		PreLayoutFallThroughs[MBB.getNumber()] = MBB.getFallThrough();

		MF.sort(MBBCmp);

		// Set IsBeginSection and IsEndSection according to the assigned section IDs.
		MF.assignBeginEndSections();

		// After reordering basic blocks, we must update basic block branches to
		// insert explicit fallthrough branches when required and optimize branches
		// when possible.
		updateBranches(MF, PreLayoutFallThroughs);
		}

		bool BBSectionsPrepare::runOnMachineFunction(MachineFunction &MF) {
		auto BBSectionsType = MF.getTarget().getBBSectionsType();
		assert(BBSectionsType != BasicBlockSection::None &&
		"BB Sections not enabled!");
		// Renumber blocks before sorting them for basic block sections. This is
		// useful during sorting, basic blocks in the same section will retain the
		// default order. This renumbering should also be done for basic block
		// labels to match the profiles with the correct blocks.
		MF.RenumberBlocks();

		if (BBSectionsType == BasicBlockSection::Labels) {
		MF.setBBSectionsType(BBSectionsType);
		MF.createBBLabels();
		return true;
		}

		std::vector<Optional<BBClusterInfo>> FuncBBClusterInfo;
		if (BBSectionsType == BasicBlockSection::List &&
		!getBBClusterInfoForFunction(MF, FuncAliasMap, ProgramBBClusterInfo,
		FuncBBClusterInfo))
		return true;
		MF.setBBSectionsType(BBSectionsType);
		MF.createBBLabels();
		assignSections(MF, FuncBBClusterInfo);

// We make sure that the cluster including the entry basic block precedes all		// We make sure that the cluster including the entry basic block precedes all
// other clusters.		// other clusters.
auto EntryBBSectionID = MF.front().getSectionID();		auto EntryBBSectionID = MF.front().getSectionID();

// Helper function for ordering BB sections as follows:		// Helper function for ordering BB sections as follows:
// * Entry section (section including the entry block).		// * Entry section (section including the entry block).
// * Regular sections (in increasing order of their Number).		// * Regular sections (in increasing order of their Number).
// ...		// ...
// * Exception section		// * Exception section
// * Cold section		// * Cold section
auto MBBSectionOrder = [EntryBBSectionID](const MBBSectionID &LHS,		auto MBBSectionOrder = [EntryBBSectionID](const MBBSectionID &LHS,
const MBBSectionID &RHS) {		const MBBSectionID &RHS) {
// We make sure that the section containing the entry block precedes all the		// We make sure that the section containing the entry block precedes all the
// other sections.		// other sections.
if (LHS == EntryBBSectionID || RHS == EntryBBSectionID)		if (LHS == EntryBBSectionID || RHS == EntryBBSectionID)
return LHS == EntryBBSectionID;		return LHS == EntryBBSectionID;
return LHS.Type == RHS.Type ? LHS.Number < RHS.Number : LHS.Type < RHS.Type;		return LHS.Type == RHS.Type ? LHS.Number < RHS.Number : LHS.Type < RHS.Type;
};		};

// We sort all basic blocks to make sure the basic blocks of every cluster are		// We sort all basic blocks to make sure the basic blocks of every cluster are
// contiguous and ordered accordingly. Furthermore, clusters are ordered in		// contiguous and ordered accordingly. Furthermore, clusters are ordered in
// increasing order of their section IDs, with the exception and the		// increasing order of their section IDs, with the exception and the
// cold section placed at the end of the function.		// cold section placed at the end of the function.
MF.sort([&](MachineBasicBlock &X, MachineBasicBlock &Y) {		auto Comparator = [&](const MachineBasicBlock &X,
		const MachineBasicBlock &Y) {
auto XSectionID = X.getSectionID();		auto XSectionID = X.getSectionID();
auto YSectionID = Y.getSectionID();		auto YSectionID = Y.getSectionID();
if (XSectionID != YSectionID)		if (XSectionID != YSectionID)
return MBBSectionOrder(XSectionID, YSectionID);		return MBBSectionOrder(XSectionID, YSectionID);
// If the two basic block are in the same section, the order is decided by		// If the two basic block are in the same section, the order is decided by
// their position within the section.		// their position within the section.
if (XSectionID.Type == MBBSectionID::SectionType::Default)		if (XSectionID.Type == MBBSectionID::SectionType::Default)
return FuncBBClusterInfo[X.getNumber()]->PositionInCluster <		return FuncBBClusterInfo[X.getNumber()]->PositionInCluster <
FuncBBClusterInfo[Y.getNumber()]->PositionInCluster;		FuncBBClusterInfo[Y.getNumber()]->PositionInCluster;
return X.getNumber() < Y.getNumber();		return X.getNumber() < Y.getNumber();
});		};

// Set IsBeginSection and IsEndSection according to the assigned section IDs.
MF.assignBeginEndSections();

// After reordering basic blocks, we must update basic block branches to
// insert explicit fallthrough branches when required and optimize branches
// when possible.
updateBranches(MF, PreLayoutFallThroughs);

return true;
}

bool BBSectionsPrepare::runOnMachineFunction(MachineFunction &MF) {
auto BBSectionsType = MF.getTarget().getBBSectionsType();
assert(BBSectionsType != BasicBlockSection::None &&
"BB Sections not enabled!");
// Renumber blocks before sorting them for basic block sections. This is
// useful during sorting, basic blocks in the same section will retain the
// default order. This renumbering should also be done for basic block
// labels to match the profiles with the correct blocks.
MF.RenumberBlocks();

if (BBSectionsType == BasicBlockSection::Labels) {
MF.setBBSectionsType(BBSectionsType);
MF.createBBLabels();
return true;
}

std::vector<Optional<BBClusterInfo>> FuncBBClusterInfo;		sortBasicBlocksAndUpdateBranches(MF, Comparator);
if (BBSectionsType == BasicBlockSection::List &&
!getBBClusterInfoForFunction(MF, FuncAliasMap, ProgramBBClusterInfo,
FuncBBClusterInfo))
return true;
MF.setBBSectionsType(BBSectionsType);
MF.createBBLabels();
assignSectionsAndSortBasicBlocks(MF, FuncBBClusterInfo);
return true;		return true;
}		}

// Basic Block Sections can be enabled for a subset of machine basic blocks.		// Basic Block Sections can be enabled for a subset of machine basic blocks.
// This is done by passing a file containing names of functions for which basic		// This is done by passing a file containing names of functions for which basic
// block sections are desired. Additionally, machine basic block ids of the		// block sections are desired. Additionally, machine basic block ids of the
// functions can also be specified for a finer granularity. Moreover, a cluster		// functions can also be specified for a finer granularity. Moreover, a cluster
// of basic blocks could be assigned to the same section.		// of basic blocks could be assigned to the same section.
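The section ordering used by the comparator above (entry section first, then regular sections by number, then the exception section, then the cold section) can be tried in isolation. The following is a minimal standalone sketch, not LLVM code: `SectionType`, `SectionID`, `Block`, and `orderBlocks` are hypothetical stand-ins for `MBBSectionID` and `MachineBasicBlock`, and the sketch orders blocks within a section by block number only (the `PositionInCluster` lookup for default sections is left out).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for MBBSectionID. The enumerator order is chosen so
// that Default < Exception < Cold, matching the ordering described above.
enum class SectionType { Default = 0, Exception, Cold };

struct SectionID {
  SectionType Type;
  unsigned Number;
  bool operator==(const SectionID &O) const {
    return Type == O.Type && Number == O.Number;
  }
  bool operator!=(const SectionID &O) const { return !(*this == O); }
};

// Hypothetical stand-in for MachineBasicBlock: a block number and a section.
struct Block {
  int Number;
  SectionID Section;
};

// Returns the block numbers in layout order. The first block is treated as
// the entry block, so its section is placed ahead of all other sections.
std::vector<int> orderBlocks(std::vector<Block> Blocks) {
  assert(!Blocks.empty() && "expected at least an entry block");
  const SectionID Entry = Blocks.front().Section;

  auto SectionOrder = [Entry](const SectionID &L, const SectionID &R) {
    // The section containing the entry block precedes every other section.
    if (L == Entry || R == Entry)
      return L == Entry;
    return L.Type == R.Type ? L.Number < R.Number : L.Type < R.Type;
  };

  std::stable_sort(Blocks.begin(), Blocks.end(),
                   [&](const Block &X, const Block &Y) {
                     if (X.Section != Y.Section)
                       return SectionOrder(X.Section, Y.Section);
                     // Same section: fall back to the block number.
                     return X.Number < Y.Number;
                   });

  std::vector<int> Order;
  for (const Block &B : Blocks)
    Order.push_back(B.Number);
  return Order;
}
```

For example, with blocks 0 and 4 in default section 1 (the entry section), block 2 in default section 0, block 3 in the exception section, and block 1 in the cold section, `orderBlocks` yields 0, 4, 2, 3, 1.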

llvm/lib/CodeGen/CMakeLists.txt

[...]	add_llvm_component_library(LLVMCodeGen
MachineCSE.cpp		MachineCSE.cpp
MachineDebugify.cpp		MachineDebugify.cpp
MachineDominanceFrontier.cpp		MachineDominanceFrontier.cpp
MachineDominators.cpp		MachineDominators.cpp
MachineFrameInfo.cpp		MachineFrameInfo.cpp
MachineFunction.cpp		MachineFunction.cpp
MachineFunctionPass.cpp		MachineFunctionPass.cpp
MachineFunctionPrinterPass.cpp		MachineFunctionPrinterPass.cpp
		MachineFunctionSplitter.cpp
MachineInstrBundle.cpp		MachineInstrBundle.cpp
MachineInstr.cpp		MachineInstr.cpp
MachineLICM.cpp		MachineLICM.cpp
MachineLoopInfo.cpp		MachineLoopInfo.cpp
MachineLoopUtils.cpp		MachineLoopUtils.cpp
MachineModuleInfo.cpp		MachineModuleInfo.cpp
MachineModuleInfoImpls.cpp		MachineModuleInfoImpls.cpp
MachineOperand.cpp		MachineOperand.cpp

llvm/lib/CodeGen/CommandFlags.cpp

[...]
CGOPT(bool, EmulatedTLS)		CGOPT(bool, EmulatedTLS)
CGOPT(bool, UniqueSectionNames)		CGOPT(bool, UniqueSectionNames)
CGOPT(bool, UniqueBasicBlockSectionNames)		CGOPT(bool, UniqueBasicBlockSectionNames)
CGOPT(EABI, EABIVersion)		CGOPT(EABI, EABIVersion)
CGOPT(DebuggerKind, DebuggerTuningOpt)		CGOPT(DebuggerKind, DebuggerTuningOpt)
CGOPT(bool, EnableStackSizeSection)		CGOPT(bool, EnableStackSizeSection)
CGOPT(bool, EnableAddrsig)		CGOPT(bool, EnableAddrsig)
CGOPT(bool, EmitCallSiteInfo)		CGOPT(bool, EmitCallSiteInfo)
		CGOPT(bool, EnableMachineFunctionSplitter)
CGOPT(bool, EnableDebugEntryValues)		CGOPT(bool, EnableDebugEntryValues)
CGOPT(bool, ForceDwarfFrameSection)		CGOPT(bool, ForceDwarfFrameSection)
CGOPT(bool, XRayOmitFunctionIndex)		CGOPT(bool, XRayOmitFunctionIndex)

codegen::RegisterCodeGenFlags::RegisterCodeGenFlags() {		codegen::RegisterCodeGenFlags::RegisterCodeGenFlags() {
#define CGBINDOPT(NAME) \		#define CGBINDOPT(NAME) \
do { \		do { \
NAME##View = std::addressof(NAME); \		NAME##View = std::addressof(NAME); \
[...]	#define CGBINDOPT(NAME) \
CGBINDOPT(EmitCallSiteInfo);		CGBINDOPT(EmitCallSiteInfo);

static cl::opt<bool> EnableDebugEntryValues(		static cl::opt<bool> EnableDebugEntryValues(
"debug-entry-values",		"debug-entry-values",
cl::desc("Enable debug info for the debug entry values."),		cl::desc("Enable debug info for the debug entry values."),
cl::init(false));		cl::init(false));
CGBINDOPT(EnableDebugEntryValues);		CGBINDOPT(EnableDebugEntryValues);

		static cl::opt<bool> EnableMachineFunctionSplitter(
		"split-machine-functions",
		cl::desc("Split out cold basic blocks from machine functions based on "
		"profile information"),
		cl::init(false));
		CGBINDOPT(EnableMachineFunctionSplitter);

static cl::opt<bool> ForceDwarfFrameSection(		static cl::opt<bool> ForceDwarfFrameSection(
"force-dwarf-frame-section",		"force-dwarf-frame-section",
cl::desc("Always emit a debug frame section."), cl::init(false));		cl::desc("Always emit a debug frame section."), cl::init(false));
CGBINDOPT(ForceDwarfFrameSection);		CGBINDOPT(ForceDwarfFrameSection);

static cl::opt<bool> XRayOmitFunctionIndex(		static cl::opt<bool> XRayOmitFunctionIndex(
"no-xray-index", cl::desc("Don't emit xray_fn_idx section"),		"no-xray-index", cl::desc("Don't emit xray_fn_idx section"),
cl::init(false));		cl::init(false));
[...]	TargetOptions codegen::InitTargetOptionsFromCodeGenFlags() {
Options.BBSections = getBBSectionsMode(Options);		Options.BBSections = getBBSectionsMode(Options);
Options.UniqueSectionNames = getUniqueSectionNames();		Options.UniqueSectionNames = getUniqueSectionNames();
Options.UniqueBasicBlockSectionNames = getUniqueBasicBlockSectionNames();		Options.UniqueBasicBlockSectionNames = getUniqueBasicBlockSectionNames();
Options.TLSSize = getTLSSize();		Options.TLSSize = getTLSSize();
Options.EmulatedTLS = getEmulatedTLS();		Options.EmulatedTLS = getEmulatedTLS();
Options.ExplicitEmulatedTLS = EmulatedTLSView->getNumOccurrences() > 0;		Options.ExplicitEmulatedTLS = EmulatedTLSView->getNumOccurrences() > 0;
Options.ExceptionModel = getExceptionModel();		Options.ExceptionModel = getExceptionModel();
Options.EmitStackSizeSection = getEnableStackSizeSection();		Options.EmitStackSizeSection = getEnableStackSizeSection();
		Options.EnableMachineFunctionSplitter = getEnableMachineFunctionSplitter();
Options.EmitAddrsig = getEnableAddrsig();		Options.EmitAddrsig = getEnableAddrsig();
Options.EmitCallSiteInfo = getEmitCallSiteInfo();		Options.EmitCallSiteInfo = getEmitCallSiteInfo();
Options.EnableDebugEntryValues = getEnableDebugEntryValues();		Options.EnableDebugEntryValues = getEnableDebugEntryValues();
Options.ForceDwarfFrameSection = getForceDwarfFrameSection();		Options.ForceDwarfFrameSection = getForceDwarfFrameSection();
Options.XRayOmitFunctionIndex = getXRayOmitFunctionIndex();		Options.XRayOmitFunctionIndex = getXRayOmitFunctionIndex();

Options.MCOptions = mc::InitMCTargetOptionsFromFlags();		Options.MCOptions = mc::InitMCTargetOptionsFromFlags();


llvm/lib/CodeGen/MachineFunctionSplitter.cpp

This file was added.

				//===-- MachineFunctionSplitter.cpp - Split machine functions //-----------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// \file
				// Uses profile information to split out cold blocks.
				//
				// This pass splits out cold machine basic blocks from the parent function. This
				// implementation leverages the basic block section framework. Blocks marked
				// cold by this pass are grouped together in a separate section prefixed with
				// ".text.unlikely.*". The linker can then group these together as a cold
				// section. The split part of the function is a contiguous region identified by
				// the symbol "foo.cold". Grouping all cold blocks across functions together
				// decreases fragmentation and improves icache and itlb utilization. Note that
				// the overall changes to the binary size are negligible; only a small number of
				// additional jump instructions may be introduced.
				//
				// For the original, RFC of this pass please see
				// http://lists.llvm.org/pipermail/llvm-dev/2020-August/144012.html
				//===----------------------------------------------------------------------===//

				#include "llvm/ADT/Statistic.h"
				#include "llvm/Analysis/ProfileSummaryInfo.h"
				#include "llvm/CodeGen/BasicBlockSectionUtils.h"
				#include "llvm/CodeGen/MachineBasicBlock.h"
				#include "llvm/CodeGen/MachineBlockFrequencyInfo.h"
				#include "llvm/CodeGen/MachineFunction.h"
				#include "llvm/CodeGen/MachineFunctionPass.h"
				#include "llvm/CodeGen/MachineModuleInfo.h"
				#include "llvm/CodeGen/Passes.h"
				#include "llvm/IR/Function.h"
				#include "llvm/IR/Module.h"
				#include "llvm/InitializePasses.h"
				#include "llvm/Support/CommandLine.h"

				using namespace llvm;

				namespace {

				class MachineFunctionSplitter : public MachineFunctionPass {
				public:
				static char ID;
				MachineFunctionSplitter() : MachineFunctionPass(ID) {
				initializeMachineFunctionSplitterPass(*PassRegistry::getPassRegistry());
				}

				StringRef getPassName() const override {
				return "Machine Function Splitter Transformation";
				}

				void getAnalysisUsage(AnalysisUsage &AU) const override;

				bool runOnMachineFunction(MachineFunction &F) override;
				};
				} // end anonymous namespace

				bool MachineFunctionSplitter::runOnMachineFunction(MachineFunction &MF) {
				// FIXME: We only target functions with profile data. Static information may
				// also be considered but we don't see performance improvements yet.
				if (!MF.getFunction().hasProfileData()) {
				return false;
				}

				// We don't want to proceed further for cold functions
				// or functions of unknown hotness. Lukewarm functions have no prefix.
				Optional<StringRef> SectionPrefix = MF.getFunction().getSectionPrefix();
				if (SectionPrefix.hasValue() &&
				(SectionPrefix.getValue().equals(".unlikely") ||
				SectionPrefix.getValue().equals(".unknown"))) {
				return false;
				}

				MF.RenumberBlocks();
				hiraditya (Done): Do we need to renumber?
				tmsriram (Not Done): Renumbering makes the sorting easy. The sorting will preserve the basic block order for the blocks that are not split.
				snehasish (author, Done): We need to ensure that the order is preserved so that we don't perturb the decisions made by prior passes such as MachineBlockPlacement. Renumbering simplifies the code that needs to be shared with the BasicBlockSections pass.
				hiraditya (Not Done): I see we are renumbering both in MachineFunctionSplitter and BasicBlockSections
				tmsriram (Not Done): The "bbsections-prepare" pass and the machine function splitter pass are intentionally made mutually exclusive. If bbsections is explicitly requested, machine function splitter does not apply. Please see the change in TargetPassConfig.cpp
				MF.setBBSectionsType(BasicBlockSection::Preset);

				auto *MBFI = &getAnalysis<MachineBlockFrequencyInfo>();

				for (auto &MBB : MF) {
				// We retain the entry block and conservatively keep all landing pad blocks
				// as part of the original function.
				if ((MBB.pred_empty() || MBB.isEHPad()))
				continue;
				// Any block with a non-zero profile count is retained.
				Optional<uint64_t> Count = MBFI->getBlockProfileCount(&MBB);
				if (!(Count.hasValue() && Count.getValue() > 0)) {
				davidxl (Done): Add an internal option here (the coldness threshold) for experimental purposes. I also suggest adding an option to specify a programSummary based coldness threshold, such as 99.99 percentile coldness. The default cutoff is 99.9999%, defined in ProfileSummaryInfo.cpp: ProfileSummaryCutoffCold
				MBB.setSectionID(MBBSectionID::ColdSectionID);
				}
				}

				auto Comparator = [](const MachineBasicBlock &X, const MachineBasicBlock &Y) {
				return X.getSectionID().Type < Y.getSectionID().Type;
				};
				llvm::sortBasicBlocksAndUpdateBranches(MF, Comparator);

				return true;
				}

				void MachineFunctionSplitter::getAnalysisUsage(AnalysisUsage &AU) const {
				AU.addRequired<MachineModuleInfoWrapperPass>();
				AU.addRequired<MachineBlockFrequencyInfo>();
				}

				char MachineFunctionSplitter::ID = 0;
				INITIALIZE_PASS(MachineFunctionSplitter, "machine-function-splitter",
				"Split machine functions using profile information", false,
				false)
				tmsriram (Done): Maybe a comment here that this is useful while sorting the blocks?

				MachineFunctionPass *llvm::createMachineFunctionSplitterPass() {
				tmsriram (Not Done): I think excluding sections needs a bit more thought and we should do this as a separate patch if it is useful but I think a linker solution would be more favorable.
				From what I understand, when a user specifies section names using the section keyword, then the expectation is that all functions marked with that section name will be grouped together. With function splitting, since you attach the ".cold" suffix to such sections that are split, there is no guarantee that the linker will place them together as these are not prefixed as ".text".
				To overcome the above problem, the option to exclude such sections from being split is not ideal either as it moves the burden to the user to get this right with appropriate options.
				I think the temporary fix is to not split sections which are not prefixed as ".text". You can add a "FIXME:" comment here to describe why you are doing this. Moving forward, we can look at a linker solution where '.' is treated as a valid section name separator and sections with identical prefix before the "." are always grouped together even if they are not named ".text". I think we can move this handling as an enhancement in another patch.
				tmsriram (Not Done): Correction: I meant ".unlikely" and not ".cold".
				efriedma (Not Done): I think I'd rather make splitting for functions with an explicit section attribute opt-in, rather than opt-out. The user might have a strong need to emit a function in a particular section (for example, if the name is mentioned in a linker script). If someone is messing with section attributes in the first place, I'd like to be conservative by default.
				snehasish (author, Done): @tmsriram I think you meant to exclude such functions from being split. I agree, taking into consideration @efriedma's comment, I've removed the option and made this a conservative check on the section attribute. Any function with the section attribute set is not split.
				return new MachineFunctionSplitter();
				}
				hiraditya (Not Done): std::find?
				snehasish (author, Done): Code referencing this comment was removed.
				tmsriram (Done): Do we need a fixme comment here to say that we could split out landing pads if the exception patch for bb sections lands?
				hiraditya (Done): nit: redundant braces
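The splitting logic in `runOnMachineFunction` above, together with the review point that sorting preserves the order of blocks that are not split, can be modeled outside LLVM. This is a minimal sketch under stated assumptions: `Block`, `SectionType`, and `splitColdBlocks` are hypothetical names, the `IsEntry` flag stands in for the `pred_empty()` check, `HasCount`/`Count` stand in for the `Optional<uint64_t>` profile count, and `std::stable_sort` is used here to model the order-preserving sort described in the review thread.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the section type of a machine basic block.
enum class SectionType { Default = 0, Exception, Cold };

// Hypothetical stand-in for MachineBasicBlock with only the fields the
// splitting decision looks at.
struct Block {
  int Number;
  bool IsEntry;        // models MBB.pred_empty()
  bool IsEHPad;        // models MBB.isEHPad()
  bool HasCount;       // models Optional<uint64_t>::hasValue()
  std::uint64_t Count; // profile count, meaningful only if HasCount is true
  SectionType Section = SectionType::Default;
};

// Marks blocks without a positive profile count as cold and moves them to the
// end, keeping the relative order of all blocks that stay in the same section.
void splitColdBlocks(std::vector<Block> &Blocks) {
  for (Block &B : Blocks) {
    // The entry block and landing pads stay in the original function.
    if (B.IsEntry || B.IsEHPad)
      continue;
    // Any block with a non-zero profile count is retained.
    if (!(B.HasCount && B.Count > 0))
      B.Section = SectionType::Cold;
  }
  // Compare on the section type only; the stable sort keeps the prior layout
  // within each section.
  std::stable_sort(Blocks.begin(), Blocks.end(),
                   [](const Block &X, const Block &Y) {
                     return X.Section < Y.Section;
                   });
}
```

Because the comparator looks only at the section type, hot blocks keep the layout chosen by earlier passes, which is the property the renumbering discussion above is concerned with.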

llvm/lib/CodeGen/TargetPassConfig.cpp

[...]	StopAfterOpt(StringRef(StopAfterOptName),
cl::desc("Stop compilation after a specific pass"),		cl::desc("Stop compilation after a specific pass"),
cl::value_desc("pass-name"), cl::init(""), cl::Hidden);		cl::value_desc("pass-name"), cl::init(""), cl::Hidden);

static cl::opt<std::string>		static cl::opt<std::string>
StopBeforeOpt(StringRef(StopBeforeOptName),		StopBeforeOpt(StringRef(StopBeforeOptName),
cl::desc("Stop compilation before a specific pass"),		cl::desc("Stop compilation before a specific pass"),
cl::value_desc("pass-name"), cl::init(""), cl::Hidden);		cl::value_desc("pass-name"), cl::init(""), cl::Hidden);

		/// Enable the machine function splitter pass.
		/// FIXME: Remove this once clang option for this feature has been added.
		hiraditya (Done): Remove this FIXME. Most passes continue to have cl::opt anyways.
		static cl::opt<bool> EnableMachineFunctionSplitter(
		"enable-split-machine-functions", cl::Hidden,
		tmsriram (Not Done): Why not call this split-machine-functions too for consistency?
		snehasish (author, Done): We can't register two options with the same string, i.e. "split-machine-functions".
		efriedma (Not Done): Why do we need two options to control the same thing?
		snehasish (author, Done): In this patch we added two options:
		1. An option in llvm/lib/CodeGen/CommandFlags.cpp, "split-machine-functions", so that llc can be used to invoke it in the tests.
		2. We added a temporary option in llvm/lib/CodeGen/TargetPassConfig.cpp so that it can be invoked when running with clang or lld (for LTO).
		AFAICT we can't use (2) for tests and having (1) makes it easy to compile things without an intermediate llc step. We plan on removing (2) in a future patch which will add appropriate options to clang (-fsplit-machine-functions) and lld (--lto-split-machine-functions).
		snehasish (author, Done): Correction: We can use (2) for tests by passing "-enable-split-machine-functions" to llc; however, since we plan to introduce clang and lld flags in the near future, it seems cleaner to leave the llc flag in place and just remove (2) when that happens rather than reintroduce it. WDYT?
		efriedma (Not Done): clang doesn't call RegisterCodeGenFlags? That seems like something we should consider changing.
		snehasishAuthorUnsubmitted Done Reply Inline Actions The clang driver does not register the codegen flags, the only clang tool which does is clang-fuzzer. A small patch like the one below would do the trick for basic functionality. More plumbing might be needed to print the appropriate flags from the driver. I think this is probably worth more discussion and beyond the scope of this patch. diff --git a/clang/tools/driver/cc1as_main.cpp b/clang/tools/driver/cc1as_main.cpp index 87047be3c2b..0b9b5673d3e 100644 --- a/clang/tools/driver/cc1as_main.cpp +++ b/clang/tools/driver/cc1as_main.cpp @@ -21,6 +21,7 @@ #include "llvm/ADT/STLExtras.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/ADT/Triple.h" +#include "llvm/CodeGen/CommandFlags.h" #include "llvm/IR/DataLayout.h" #include "llvm/MC/MCAsmBackend.h" #include "llvm/MC/MCAsmInfo.h" @@ -61,6 +62,8 @@ using namespace clang::driver::options; using namespace llvm; using namespace llvm::opt; +static codegen::RegisterCodeGenFlags CGF; + namespace { /// Helper class for representing a single invocation of the assembler. snehasish: The clang driver does not register the codegen flags, the only clang tool which does is clang…
		cl::desc("Split out cold blocks from machine functions based on profile "
		"information."));

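The exchange above turns on one constraint: a single option table cannot hold two entries with the same name, which is why the temporary flag is spelled "enable-split-machine-functions". The following is a minimal sketch of that constraint only. `OptionRegistry`, `registerOption`, and `enable` are hypothetical names invented for illustration; this is not how LLVM's `cl::opt` is implemented.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a global command-line option table.
// It only models name uniqueness; it is not LLVM's cl::opt machinery.
class OptionRegistry {
public:
  // Registers a boolean flag under Name. Returns false, and leaves the
  // table unchanged, if an option with this name already exists.
  bool registerOption(const std::string &Name, bool *Storage) {
    return Options.emplace(Name, Storage).second;
  }

  // Sets the named flag to true. Returns false if the name is unknown.
  bool enable(const std::string &Name) {
    auto It = Options.find(Name);
    if (It == Options.end())
      return false;
    *It->second = true;
    return true;
  }

private:
  std::map<std::string, bool *> Options;
};
```

In this model, registering "split-machine-functions" twice fails on the second attempt, while registering the second flag under the distinct name "enable-split-machine-functions" succeeds and can be toggled independently of the first.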
 /// Allow standard passes to be disabled by command line options. This supports
 /// simple binary flags that either suppress the pass or do nothing.
 /// i.e. -disable-mypass=false has no effect.
 /// These should be converted to boolOrDefault in order to use applyOverride.
 static IdentifyingPassPtr applyDisable(IdentifyingPassPtr PassID,
                                        bool Override) {
   if (Override)
     return IdentifyingPassPtr();
[786 lines not shown]
   if (TM->Options.EnableMachineOutliner && getOptLevel() != CodeGenOpt::None &&
       EnableMachineOutliner != NeverOutline) {
     bool RunOnAllFunctions = (EnableMachineOutliner == AlwaysOutline);
     bool AddOutliner = RunOnAllFunctions ||
                        TM->Options.SupportsDefaultOutlining;
     if (AddOutliner)
       addPass(createMachineOutlinerPass(RunOnAllFunctions));
   }

-  if (TM->getBBSectionsType() != llvm::BasicBlockSection::None)
+  // Machine function splitter uses the basic block sections feature. Both
+  // cannot be enabled at the same time.
+  if (TM->Options.EnableMachineFunctionSplitter ||
+      EnableMachineFunctionSplitter)
+    addPass(createMachineFunctionSplitterPass());
+  else if (TM->getBBSectionsType() != llvm::BasicBlockSection::None)
     addPass(llvm::createBBSectionsPreparePass(TM->getBBSectionsFuncListBuf()));

   // Add passes that directly emit MI after all other MI passes.
   addPreEmitPass2();

   AddingMachinePasses = false;
 }
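The if / else-if chain in this hunk makes the two features mutually exclusive: if either splitter flag is set, the splitter pass is added and the basic block sections pass is skipped. A minimal standalone sketch of that selection logic follows. The enum and function names are hypothetical stand-ins, not LLVM types, and the sketch assumes nothing beyond what the hunk shows.

```cpp
#include <cassert>

// Hypothetical stand-ins; these names are illustrative, not LLVM types.
enum class BBSections { None, All, List };
enum class SectionPass { None, MachineFunctionSplitter, BBSectionsPrepare };

// Mirrors the branch order in the hunk: either splitter flag takes
// precedence over basic block sections, so at most one pass is chosen.
SectionPass selectSectionPass(bool TargetOptionSet, bool HiddenFlagSet,
                              BBSections Type) {
  if (TargetOptionSet || HiddenFlagSet)
    return SectionPass::MachineFunctionSplitter;
  if (Type != BBSections::None)
    return SectionPass::BBSectionsPrepare;
  return SectionPass::None;
}
```

Note the consequence of the ordering: when a splitter flag and a basic block sections mode are both requested, the sections request is silently dropped rather than reported as a conflict.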

[Remaining 255 lines not shown]

llvm/test/CodeGen/X86/machine-function-splitter.ll

This file was added.

				; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -split-machine-functions | FileCheck %s

				define void @foo1(i1 zeroext %0) nounwind !prof !1 !section_prefix !2 {
				;; Check that cold block is moved to .text.unlikely.
				; CHECK-LABEL: foo1
				; CHECK: .section .text.unlikely.foo1
				; CHECK-NEXT: foo1.cold:
				; CHECK-NOT: callq bar
				tmsriram: Also, check if the block is moved to the cold region has the expected call instruction?
				br i1 %0, label %2, label %4, !prof !4

				2: ; preds = %1
				%3 = call i32 @bar()
				br label %6

				4: ; preds = %1
				%5 = call i32 @baz()
				br label %6

				6: ; preds = %4, %2
				%7 = tail call i32 @qux()
				ret void
				}

				define void @foo2(i1 zeroext %0) nounwind !prof !1 !section_prefix !3 {
				;; Check that function marked unlikely is not split.
				; CHECK-LABEL: foo2
				; CHECK-NOT: foo2.cold:
				br i1 %0, label %2, label %4, !prof !4

				2: ; preds = %1
				%3 = call i32 @bar()
				br label %6

				4: ; preds = %1
				%5 = call i32 @baz()
				br label %6

				6: ; preds = %4, %2
				%7 = tail call i32 @qux()
				ret void
				}

				define void @foo3(i1 zeroext %0) nounwind !section_prefix !2 {
				;; Check that function without profile data is not split.
				; CHECK-LABEL: foo3
				; CHECK-NOT: foo3.cold:
				br i1 %0, label %2, label %4

				2: ; preds = %1
				%3 = call i32 @bar()
				br label %6

				4: ; preds = %1
				%5 = call i32 @baz()
				br label %6

				6: ; preds = %4, %2
				%7 = tail call i32 @qux()
				ret void
				}

				declare i32 @bar()
				declare i32 @baz()
				declare i32 @qux()

				!1 = !{!"function_entry_count", i64 7}
				!2 = !{!"function_section_prefix", !".hot"}
				!3 = !{!"function_section_prefix", !".unlikely"}
				!4 = !{!"branch_weights", i32 7, i32 0}
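The three test functions differ in only two inputs: whether profile data is attached and which section prefix the function carries. The sketch below restates the expectations in the test comments as predicates. All names are hypothetical, and the cold-block rule (a profile count of zero) is an assumption chosen to match the `branch_weights` of 7 and 0 used here; the pass's actual threshold is not shown in this section.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical summary of the per-function facts the test varies.
struct FunctionInfo {
  bool HasProfileData;       // a function_entry_count is attached via !prof
  std::string SectionPrefix; // e.g. ".hot" or ".unlikely"; empty if unset
};

// Per the test comments: a function is considered for splitting only if
// it has profile data and is not already marked unlikely.
bool isSplitCandidate(const FunctionInfo &F) {
  if (!F.HasProfileData)
    return false; // the foo3 case
  if (F.SectionPrefix == ".unlikely")
    return false; // the foo2 case
  return true;    // the foo1 case
}

// Assumption: a block counts as cold when its profile count is zero.
bool isColdBlock(std::uint64_t ProfileCount) { return ProfileCount == 0; }
```

Under these assumptions only foo1 qualifies, and within it only the branch target with weight 0 would be placed in the `.text.unlikely.foo1` section.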


Unit Tests: Failed


Diff 283424
