This is an archive of the discontinued LLVM Phabricator instance.


Differential D85368

[llvm][CodeGen] Machine Function Splitter
ClosedPublic

Authored by snehasish on Aug 5 2020, 3:41 PM.


Details

Reviewers

davidxl
eli.friedman
tmsriram
hiraditya

Commits

rG94faadaca4e1: [llvm][CodeGen] Machine Function Splitter

Summary

We introduce a codegen optimization pass which splits functions into hot and cold
parts. This pass leverages the basic block sections feature recently
introduced in LLVM from the Propeller project. The pass targets
functions with profile coverage, identifies cold blocks and moves them
to a separate section. The linker groups all cold blocks across
functions together, decreasing fragmentation and improving icache and
itlb utilization.
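The idea in the summary can be sketched outside of LLVM. The following is a minimal illustration, not the pass itself: `Block`, `isCold`, and `splitHotCold` are hypothetical names, and the rule "cold means a profile count strictly below a threshold" is an assumption made for the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine basic block; not an LLVM type.
struct Block {
  std::string Name;
  uint64_t ProfileCount; // execution count taken from the profile
  bool IsEntry;          // the entry block always stays in the hot part
};

// Assumption for this sketch: a block is cold when its profile count is
// strictly below ColdCountThreshold. The entry block is never cold.
inline bool isCold(const Block &B, uint64_t ColdCountThreshold) {
  return !B.IsEntry && B.ProfileCount < ColdCountThreshold;
}

// Moves cold blocks behind the hot ones while keeping the relative order
// inside each group. Returns the index of the first cold block.
inline std::size_t splitHotCold(std::vector<Block> &Blocks,
                                uint64_t ColdCountThreshold) {
  assert((Blocks.empty() || Blocks.front().IsEntry) &&
         "expected the entry block to come first");
  auto FirstCold = std::stable_partition(
      Blocks.begin(), Blocks.end(), [ColdCountThreshold](const Block &B) {
        return !isCold(B, ColdCountThreshold);
      });
  return static_cast<std::size_t>(FirstCold - Blocks.begin());
}
```

Everything from the returned index onward would be the part placed in a separate section, which the linker can then group with the cold parts of other functions.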

We evaluated the Machine Function Splitter pass on clang bootstrap and SPECInt 2017.
For clang bootstrap we observe a mean 2.33% runtime improvement with a
~32% reduction in itlb and stlb misses. Additionally, l1 icache misses
reduced by 9.5% while l2 instruction misses reduced by 20%.
For SPECInt we report the change in IntRate for the C/C++
benchmarks. All benchmarks apart from mcf and x264 improve, on average
by 0.6% with the max for deepsjeng at 1.6%.

Benchmark               % Change (IntRate)
500.perlbench_r          0.78
502.gcc_r                0.82
505.mcf_r               -0.30
520.omnetpp_r            0.18
523.xalancbmk_r          0.37
525.x264_r              -0.46
531.deepsjeng_r          1.61
541.leela_r              0.83
557.xz_r                 0.15

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

snehasish created this revision. Aug 5 2020, 3:41 PM

Herald added subscribers: llvm-commits, mgrang, mgorny. Aug 5 2020, 3:41 PM

Herald added a project: Restricted Project. Aug 5 2020, 3:41 PM

snehasish requested review of this revision. Aug 5 2020, 3:42 PM

tmsriram added inline comments. Aug 5 2020, 3:55 PM

llvm/lib/CodeGen/BBSectionsPrepare.cpp
72 ↗	(On Diff #283424)	Should we rename the .h and .cpp so that they have the same prefix? Maybe BasicBlockSections.h and BasicBlockSections.cpp?
llvm/lib/CodeGen/TargetPassConfig.cpp
218	Why not call this split-machine-functions too for consistency?
llvm/test/CodeGen/X86/machine-function-splitter.ll
9	Also, check that the block moved to the cold region has the expected call instruction?

We probably need to discuss how to make basic-block-section stuff work on non-X86 targets at some point, but I guess we don't have to do it in this patch if it's off by default.

I am wondering what your opinion is on a machine unroller/reroller? Aggressive loop unrolling may destroy code cache too.

Regarding reroller -- compiler with PGO will adjust the aggressiveness of the unroller based on instruction workset size estimation. Doing this in a later pass or in Propeller can help catch cases that are mishandled.

snehasish mentioned this in D85380: [NFC] Rename BBSectionsPrepare -> BasicBlockSections. Aug 5 2020, 5:16 PM

davidxl added inline comments. Aug 5 2020, 5:23 PM

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
90	Add an internal option here (the coldness threshold) for experimental purposes. I also suggest adding an option to specify a profile-summary-based coldness threshold, such as 99.99th-percentile coldness. The default cutoff is 99.9999%, defined in ProfileSummaryInfo.cpp: ProfileSummaryCutoffCold
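To make the percentile suggestion concrete, here is a minimal sketch of how a percentile cutoff can be turned into a count threshold. It is an illustration under assumptions, not the code in ProfileSummaryInfo.cpp: `minCountForCutoff` is a hypothetical helper, and the cutoff is assumed to be expressed in parts per million (so 99.9999% is 999999).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <limits>
#include <vector>

// Returns the smallest count among the hottest blocks that together account
// for CutoffPerMillion / 1,000,000 of the total count.
inline uint64_t minCountForCutoff(std::vector<uint64_t> Counts,
                                  uint64_t CutoffPerMillion) {
  assert(CutoffPerMillion <= 1000000 && "cutoff is in parts per million");
  if (Counts.empty())
    return 0;
  // Visit the hottest blocks first.
  std::sort(Counts.begin(), Counts.end(), std::greater<uint64_t>());
  uint64_t Total = 0;
  for (uint64_t C : Counts)
    Total += C;
  // The sketch stays in 64-bit arithmetic, so it only accepts totals that
  // cannot overflow in the multiplication below.
  assert(Total <= std::numeric_limits<uint64_t>::max() / 1000000 &&
         "total count too large for this sketch");
  const uint64_t Desired = Total * CutoffPerMillion / 1000000;
  uint64_t Accumulated = 0;
  for (uint64_t C : Counts) {
    Accumulated += C;
    if (Accumulated >= Desired)
      return C;
  }
  return Counts.back();
}
```

A block whose count falls below the returned value could then be treated as cold for that cutoff; the exact comparison used by the pass is not shown in this thread.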

Harbormaster completed remote builds in B67210: Diff 283424. Aug 5 2020, 5:39 PM

dmajor added a subscriber: dmajor. Aug 5 2020, 6:12 PM

Please share performance numbers for publicly available workload(s).

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
78	Do we need to renumber?

snehasish mentioned this in rG8d943a928d25: [NFC] Rename BBSectionsPrepare -> BasicBlockSections. Aug 6 2020, 1:12 PM

tmsriram added inline comments. Aug 6 2020, 8:21 PM

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
78	Renumbering makes the sorting easy. The sorting will preserve the basic block order for the blocks that are not split.

Updated diff based on review comments.

Added two mllvm options to control cold count and threshold based split.
Added tests for the new options.
Updated test to check for the absence of unexpected blocks.
Renamed BBSectionsPrepare pass and rebased this diff on the change.

snehasish edited the summary of this revision. Aug 7 2020, 12:14 PM

snehasish edited the summary of this revision.

snehasish edited the summary of this revision. Aug 7 2020, 12:17 PM

snehasish marked 5 inline comments as done. Aug 7 2020, 12:24 PM

snehasish added inline comments.

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
78	We need to ensure that the order is preserved so that we don't perturb the decisions made by prior passes such as MachineBlockPlacement. Renumbering simplifies the code that needs to be shared with the BasicBlockSections pass.
llvm/lib/CodeGen/TargetPassConfig.cpp
218	We can't register two options with the same string, i.e. "split-machine-functions".
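The renumbering point in the line 78 comments can be illustrated with a small standalone sketch. The names `Block`, `renumberBlocks`, and `sortHotBeforeCold` are hypothetical and this is not the code shared with BasicBlockSections; it only shows why fresh numbers make the sort simple: once every block carries its current layout position, a comparator on (cold, number) is a strict total order, so even a non-stable sort keeps the prior layout inside each group.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a machine basic block; not an LLVM type.
struct Block {
  int Number;  // position in the layout chosen by earlier passes
  bool IsCold; // decided by the splitting heuristic
};

// Gives each block a number equal to its current layout position.
inline void renumberBlocks(std::vector<Block> &Blocks) {
  for (std::size_t I = 0; I < Blocks.size(); ++I)
    Blocks[I].Number = static_cast<int>(I);
}

// Places hot blocks first; ties are broken by the block number, which
// preserves the existing order among hot blocks and among cold blocks.
inline void sortHotBeforeCold(std::vector<Block> &Blocks) {
  // Precondition: renumberBlocks has run, so numbers are dense and unique.
  for (std::size_t I = 0; I < Blocks.size(); ++I)
    assert(Blocks[I].Number == static_cast<int>(I) && "renumber first");
  std::sort(Blocks.begin(), Blocks.end(), [](const Block &X, const Block &Y) {
    if (X.IsCold != Y.IsCold)
      return !X.IsCold; // hot sorts before cold
    return X.Number < Y.Number;
  });
}
```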

Simplify the cold count check.

Harbormaster completed remote builds in B67499: Diff 283979. Aug 7 2020, 12:44 PM

Harbormaster completed remote builds in B67509: Diff 283991. Aug 7 2020, 1:26 PM

hiraditya added inline comments. Aug 7 2020, 5:23 PM

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
78	I see we are renumbering both in MachineFunctionSplitter and BasicBlockSections

We evaluated the Machine Function Splitter pass on clang bootstrap and SPECInt 2017.

Could you share the details of the machine as well? The improvements are well within noise.

For clang bootstrap we observe a mean 2.33% runtime improvement with a
~32% reduction in itlb and stlb misses

While itlb reduction looks quite impressive, it doesn't seem to translate quite well to the runtime improvement. Did we see consistent >2% improvement with multiple runs? Please share the numbers.

tmsriram added inline comments. Aug 7 2020, 5:32 PM

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
78	The "bbsections-prepare" pass and the machine function splitter pass are intentionally made mutually exclusive. If bbsections is explicitly requested, machine function splitter does not apply. Please see the change in TargetPassConfig.cpp
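The mutual exclusion described above can be sketched as a small decision function. This is an illustration only: `LayoutPass` and `selectLayoutPass` are hypothetical names, the enum mirrors the `BasicBlockSection` values shown in TargetOptions.h in this revision, and the real logic lives in TargetPassConfig.cpp.

```cpp
#include <cassert>

// Mirrors the enumerators of BasicBlockSection as they appear in this diff.
enum class BasicBlockSection { All, List, Labels, Preset, None };

// Hypothetical result type for this sketch.
enum class LayoutPass { None, BasicBlockSections, MachineFunctionSplitter };

// An explicit basic block sections request takes priority; the splitter is
// only considered when no such request was made.
constexpr LayoutPass selectLayoutPass(BasicBlockSection Requested,
                                      bool SplitterEnabled) {
  return Requested != BasicBlockSection::None
             ? LayoutPass::BasicBlockSections
             : (SplitterEnabled ? LayoutPass::MachineFunctionSplitter
                                : LayoutPass::None);
}

// Usage example: requesting both still selects basic block sections only.
inline void usageExample() {
  assert(selectLayoutPass(BasicBlockSection::List, true) ==
         LayoutPass::BasicBlockSections);
}
```

Because only one of the two passes runs for a given function, the renumbering done by each of them never happens twice.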

Could you share the details of the machine as well?

Sure, these were measured on a Lenovo P920 workstation -- Intel Skylake based Xeon(R) Gold 6154 CPU.

The improvements are well within noise.

For SPEC, the reported intrate improvement numbers are an average across 5 iterations. Note that SPEC binaries are tiny in size, so splitting may only improve code locality in some cases.

While itlb reduction looks quite impressive, it doesn't seem to translate quite well to the runtime improvement.

It stands to reason that removing the itlb bottleneck will expose the next one :) We could dig deeper by looking into how the top down profile changes with and without splitting.

Did we see consistent >2% improvement with multiple runs? Please share the numbers.

We see consistent 2%+ improvements over FDO optimized binaries. The numbers reported are averaged across 10 runs. Here is the data for one such experiment, in which 500 invocations of clang were executed and the overall end-to-end user time was measured. For completeness, I have included the data for a hot-cold-split optimized binary as well. Note this particular experiment does not use ThinLTO for any of the builds since I had some trouble running the hot-cold-split pass with ThinLTO enabled.

|----------------------------------|----------------------|----------------|-----------|
|                                  | User time in seconds ($ time run-commands.sh)     |
|----------------------------------|----------------------|----------------|-----------|
| Run #                            | FDO baseline         | Hot cold split | MFS       |
|                                1 |               484.65 |          479.2 |    466.93 |
|                                2 |                483.4 |         478.28 |    470.25 |
|                                3 |               485.57 |         479.15 |    470.36 |
|                                4 |               480.37 |         480.34 |    469.85 |
|                                5 |               482.97 |         478.18 |    471.93 |
|                                6 |               484.06 |         479.74 |    473.27 |
|                                7 |               482.67 |         477.42 |    472.56 |
|                                8 |               483.53 |         476.99 |    474.58 |
|                                9 |               486.43 |         480.76 |    473.92 |
|                               10 |               489.94 |         480.11 |    471.42 |
|----------------------------------|----------------------|----------------|-----------|
| 2 Tail Paired T-Test vs Baseline |                      |      0.0000636 | 0.0000006 |
|----------------------------------|----------------------|----------------|-----------|
| Average                          |              484.359 |        479.017 |   471.507 |
|----------------------------------|----------------------|----------------|-----------|
| % Change                         |                      |           1.10 |      2.65 |
|----------------------------------|----------------------|----------------|-----------|
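For readers who want to reproduce the "Average" and "% Change" rows, the arithmetic is sketched below. The helper names `mean` and `percentImprovement` are made up for this example; the formula assumed is (baseline - candidate) / baseline * 100, which matches the values in the table.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Arithmetic mean of the per-run user times.
inline double mean(const std::vector<double> &Samples) {
  assert(!Samples.empty() && "need at least one run");
  return std::accumulate(Samples.begin(), Samples.end(), 0.0) /
         static_cast<double>(Samples.size());
}

// Relative reduction in time versus the baseline, in percent.
inline double percentImprovement(double Baseline, double Candidate) {
  assert(Baseline > 0.0 && "baseline time must be positive");
  return (Baseline - Candidate) / Baseline * 100.0;
}
```

Feeding in the ten FDO baseline times and the ten MFS times from the table gives averages of 484.359 and 471.507 seconds and an improvement of about 2.65%.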

Here is the data for TLB and icache. Each event was collected independently along with instructions to ensure no multiplexing. The variance reported by perf was less than 1% for each event (often less than 0.5%).

|-----------|--------------------------------------------|--------------------------------------------------------|
|           | $ perf stat -r 3 -e frontend_retired.${EVENT}:u,instructions:u -- run-commands.sh                   |
|-----------|--------------------------------------------|--------------------------------------------------------|
|           | Machine Function Splitter                  | FDO Baseline                                           |
|-----------|--------------------------------------------|--------------------------------------------------------|
| EVENT     | Misses        | Instructions      | MPKI   | Misses         | Instructions      | MPKI   | % Change |
| itlb_miss | 1,411,325,040 | 1,618,495,692,919 | 0.8720 |  2,066,003,373 | 1,618,097,715,534 | 1.2768 |    31.70 |
| stlb_miss |   131,949,440 | 1,618,466,757,079 | 0.0815 |    195,471,938 | 1,618,061,281,016 | 0.1208 |    32.51 |
| l1i_miss  | 9,678,255,804 | 1,618,479,987,914 | 5.9798 | 10,698,143,090 | 1,618,081,273,918 | 6.6116 |     9.56 |
| l2_miss   |   434,287,963 | 1,618,443,723,597 | 0.2683 |    542,869,835 | 1,618,081,904,973 | 0.3355 |    20.02 |
|-----------|--------------------------------------------|--------------------------------------------------------|
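The MPKI and "% Change" columns follow from the raw counters. A minimal sketch of that calculation is shown here; `mpki` and `percentReduction` are names chosen for this example, with MPKI taken as misses per thousand instructions.

```cpp
#include <cassert>
#include <cstdint>

// Misses per thousand (kilo) instructions.
inline double mpki(uint64_t Misses, uint64_t Instructions) {
  assert(Instructions > 0 && "instruction count must be non-zero");
  return static_cast<double>(Misses) * 1000.0 /
         static_cast<double>(Instructions);
}

// Reduction of a metric relative to its baseline value, in percent.
inline double percentReduction(double Baseline, double Candidate) {
  assert(Baseline > 0.0 && "baseline must be positive");
  return (Baseline - Candidate) / Baseline * 100.0;
}
```

For the itlb_miss row this gives roughly 0.872 MPKI with splitting versus 1.277 for the baseline, a reduction of about 31.7%.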

Update PSI metadata to fix assert failure.

Harbormaster completed remote builds in B67614: Diff 284169. Aug 8 2020, 11:50 PM

Thanks for adding the results, could you share the script to measure bootstrap numbers?

In HCS the ability to keep cold functions in a separate section was added in: D85331 (cc: @rjf ), can we try with -mllvm -enable-cold-section to compare with MachineFunctionSplitter?

In D85368#2205653, @hiraditya wrote:

Thanks for adding the results, could you share the script to measure bootstrap numbers?

I've uploaded a Makefile here which will allow you to run the bootstrap benchmarks. Applying this patch on a local llvm repo and pointing the Makefile at it should be sufficient to get you going.

In HCS the ability to keep cold functions in a separate section was added in: D85331 (cc: @rjf ), can we try with -mllvm -enable-cold-section to compare with MachineFunctionSplitter?

We already incorporate this in our evaluation since we link using lld along with the flag -z,keep-text-section-prefix. Since the extracted functions are marked cold, they are assigned a .text.unlikely prefix. Passing -z,keep-text-section-prefix to lld ensures that these functions are placed in the appropriate output section achieving the same goal of improving locality for hot code. The impact of this can be seen in the binary characteristics we shared in the original RFC which showed a 41% and 47% decrease in size of .text and .text.hot respectively for the hot cold split pass.

We applied patch D85331 and find similar results. Comparing the sections of the binary (hot cold split vs PGO baseline), we find a new __llvm_cold section along with similar fractions of code extracted from .text and .text.hot --

   FILE SIZE        VM SIZE    
--------------  -------------- 
 [NEW] +7.31Mi  [NEW] +7.31Mi    __llvm_cold
  +64% +3.21Mi   +64% +3.21Mi    .eh_frame
  +26% +2.77Mi  [ = ]       0    .strtab
  +27%  +711Ki  [ = ]       0    .symtab
  +31%  +236Ki   +31%  +236Ki    .eh_frame_hdr
 +3.1%     +12  [ = ]       0    .shstrtab
  +29%      +9   +29%      +9    [LOAD #3 [RX]]
 +0.8%      +6  +0.8%      +6    [LOAD #2 [R]]
 -7.1%      -1  [ = ]       0    [Unmapped]
 -0.0%    -246  -0.0%    -246    .dynstr
 -0.1% -4.05Ki  -0.1% -4.05Ki    .rodata
 -1.4% -5.73Ki  -1.4% -5.73Ki    .text.startup
 -1.6%  -536Ki  -1.6%  -536Ki    .text.unlikely
-46.7% -2.51Mi -46.7% -2.51Mi    .text.hot
-43.8% -2.89Mi -43.8% -2.89Mi    .text
  +10% +8.28Mi  +7.2% +4.82Mi    TOTAL

Running the benchmarks with the patch enabled and comparing against the MachineFunctionSplitter, we see similar performance numbers --

|----------|----------|----------------|---------------------------|
| Run #    | Baseline | Hot Cold Split | Machine Function Splitter |
|----------|----------|----------------|---------------------------|
|        1 |   501.25 |         490.84 |                    489.48 |
|        2 |   504.22 |         491.66 |                    493.42 |
|        3 |   500.04 |          492.7 |                    489.18 |
|        4 |    499.4 |         493.31 |                    489.47 |
|        5 |   495.62 |          496.1 |                    488.79 |
|        6 |   500.62 |         495.61 |                    488.41 |
|        7 |   501.81 |         494.45 |                    487.67 |
|        8 |   496.96 |         495.91 |                    490.91 |
|        9 |   500.22 |         497.17 |                       488 |
|       10 |   499.66 |         493.81 |                    489.65 |
|----------|----------|----------------|---------------------------|
| Average  |   499.98 |        494.156 |                   489.498 |
| % Change |          |           1.16 |                      2.10 |
|----------|----------|----------------|---------------------------|

Add an option to exclude specific sections.

Add option -mfs-excluded-sections to allow users to specify section names to exclude.
Add a test for the option.

Harbormaster completed remote builds in B67836: Diff 284560. Aug 10 2020, 8:17 PM

Overall approach looks good to me even when we don't see good SPEC-17 numbers as the optimization is intended to reduce page faults. The improvements would be more pronounced in large applications. I'll review the code in more detail in the next few days. Thanks for working on this.

hiraditya added a reviewer: hiraditya. Aug 14 2020, 12:07 AM

please run clang-format on the patch.

llvm/lib/CodeGen/BasicBlockSections.cpp
232	nit: tab?
277	Is this comment necessary?

This revision now requires changes to proceed. Aug 14 2020, 12:10 AM

Can we add more test cases to include eh_pad, invokeinst,

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
114	std::find?

tmsriram added inline comments. Aug 18 2020, 12:35 PM

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
112	I think excluding sections needs a bit more thought and we should do this as a separate patch if it is useful but I think a linker solution would be more favorable. From what I understand, when a user specifies section names using the section keyword, then the expectation is that all functions marked with that section name will be grouped together. With function splitting, since you attach the ".cold" suffix to such sections that are split, there is no guarantee that the linker will place them together as these are not prefixed as ".text". To overcome the above problem, the option to exclude such sections from being split is not ideal either as it moves the burden to the user to get this right with appropriate options. I think the temporary fix is to not split sections which are not prefixed as ".text". You can add a "FIXME:" comment here to describe why you are doing this. Moving forward, we can look at a linker solution where '.' is treated as a valid section name separator and sections with identical prefix before the "." are always grouped together even if they are not named ".text". I think we can move this handling as an enhancement in another patch.

tmsriram added inline comments. Aug 18 2020, 12:37 PM

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
112	Correction: I meant ".unlikely" and not ".cold".

tschuett added a subscriber: tschuett. Aug 18 2020, 12:38 PM

efriedma added inline comments. Aug 18 2020, 1:02 PM

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
112	I think I'd rather make splitting for functions with an explicit section attribute opt-in, rather than opt-out. The user might have a strong need to emit a function in a particular section (for example, if the name is mentioned in a linker script). If someone is messing with section attributes in the first place, I'd like to be conservative by default.
llvm/lib/CodeGen/TargetPassConfig.cpp
218	Why do we need two options to control the same thing?

Address review comments.

Remove excluded section attribute option.
Don't split functions which have a section attribute.
Add a test for ehpads.
Remove redundant comments and clang-format.
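The conservative checks agreed on in this round can be summarized in a short sketch. The types and function names below are hypothetical stand-ins rather than LLVM APIs, and the count comparison is an assumption; the three conditions themselves (profile coverage, no explicit section attribute, EH pads stay put) come from the summary and the review comments.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical summaries of a function and a block; not LLVM types.
struct FunctionInfo {
  bool HasProfileData;     // the pass targets functions with profile coverage
  std::string SectionAttr; // empty when no section attribute is set
};

struct BlockInfo {
  bool IsEntry;
  bool IsEHPad;
  uint64_t ProfileCount;
};

// Functions with an explicit section attribute are left alone, since the
// user may depend on that placement (for example via a linker script).
inline bool mayConsiderForSplitting(const FunctionInfo &F) {
  return F.HasProfileData && F.SectionAttr.empty();
}

// Entry blocks and EH pads are never moved in this sketch. The coldness rule
// (count strictly below the threshold) is an assumption for illustration.
inline bool mayMoveToColdSection(const BlockInfo &B,
                                 uint64_t ColdCountThreshold) {
  return !B.IsEntry && !B.IsEHPad && B.ProfileCount < ColdCountThreshold;
}

// Usage example: a profiled function placed in a custom section is skipped.
inline void usageExample() {
  assert(!mayConsiderForSplitting({true, ".mysection"}));
}
```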

Thanks for the reviews!
@hiraditya I've added a test to ensure ehpads are not split out. Let me know if there are additional cases you want to cover.

llvm/lib/CodeGen/BasicBlockSections.cpp
232	This seems to be the output from clang-format. The diff does not have a tab but it looks like phabricator is showing it as one?
llvm/lib/CodeGen/MachineFunctionSplitter.cpp
112	@tmsriram I think you meant to exclude such functions from being split. I agree, taking into consideration @efriedma's comment, I've removed the option and made this a conservative check on the section attribute. Any function with the section attribute set is not split.
114	Code referencing this comment was removed.
llvm/lib/CodeGen/TargetPassConfig.cpp
218	In this patch we added two options: (1) An option in llvm/lib/CodeGen/CommandFlags.cpp, "split-machine-functions", so that llc can be used to invoke it in the tests. (2) A temporary option in llvm/lib/CodeGen/TargetPassConfig.cpp so that it can be invoked when running with clang or lld (for LTO). AFAICT we can't use (2) for tests and having (1) makes it easy to compile things without an intermediate llc step. We plan on removing (2) in a future patch which will add appropriate options to clang (-fsplit-machine-functions) and lld (--lto-split-machine-functions).

Harbormaster completed remote builds in B68836: Diff 286459. Aug 18 2020, 8:34 PM

snehasish added inline comments. Aug 18 2020, 9:33 PM

llvm/lib/CodeGen/TargetPassConfig.cpp
218	Correction: We can use (2) for tests by passing "-enable-split-machine-functions" to llc. However, since we plan to introduce clang and lld flags in the near future, it seems cleaner to leave the llc flag in place and just remove (2) when that happens rather than reintroduce it. WDYT?

efriedma added subscribers: serge-sans-paille, MaskRay. Aug 19 2020, 4:22 PM

efriedma added inline comments.

llvm/lib/CodeGen/TargetPassConfig.cpp
218	clang doesn't call RegisterCodeGenFlags? That seems like something we should consider changing.

snehasish added inline comments. Aug 20 2020, 1:05 PM

llvm/lib/CodeGen/TargetPassConfig.cpp
218	The clang driver does not register the codegen flags; the only clang tool which does is clang-fuzzer. A small patch like the one below would do the trick for basic functionality. More plumbing might be needed to print the appropriate flags from the driver. I think this is probably worth more discussion and beyond the scope of this patch.

diff --git a/clang/tools/driver/cc1as_main.cpp b/clang/tools/driver/cc1as_main.cpp
index 87047be3c2b..0b9b5673d3e 100644
--- a/clang/tools/driver/cc1as_main.cpp
+++ b/clang/tools/driver/cc1as_main.cpp
@@ -21,6 +21,7 @@
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/StringSwitch.h"
 #include "llvm/ADT/Triple.h"
+#include "llvm/CodeGen/CommandFlags.h"
 #include "llvm/IR/DataLayout.h"
 #include "llvm/MC/MCAsmBackend.h"
 #include "llvm/MC/MCAsmInfo.h"
@@ -61,6 +62,8 @@ using namespace clang::driver::options;
 using namespace llvm;
 using namespace llvm::opt;
 
+static codegen::RegisterCodeGenFlags CGF;
+
 namespace {
 
 /// Helper class for representing a single invocation of the assembler.

LGTM, thanks for doing this!

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
110	Maybe a comment here that this is useful while sorting the blocks?
118	Do we need a fixme comment here to say that we could split out landing pads if the exception patch for bb sections lands?

Add a comment to explain renumbering, FIXME for ehpads.

Added a comment to explain why renumbering blocks is necessary.
Added a FIXME and pointer to exceptions splitting patch.

Thanks for the comments.
@efriedma @hiraditya - Let me know if you have any further comments, I will wait till EOD Thursday 08/26. If not I'll take that as go ahead to commit this change.

Harbormaster completed remote builds in B69514: Diff 287785. Aug 25 2020, 4:50 PM

hiraditya added inline comments. Aug 28 2020, 8:43 AM

llvm/lib/CodeGen/MachineFunctionSplitter.cpp
124	nit: redundant braces,
llvm/lib/CodeGen/TargetPassConfig.cpp
216	Remove this FIXME. most passes continue to have cl::opt anyways.

nit: If the FIXMEs are mostly future work, then please replace them with TODOs.

Address reviewer comments.

Remove redundant braces.
Remove FIXME for cl::opt.
s/FIXME/TODO for future work.

snehasish marked 2 inline comments as done. Aug 28 2020, 10:04 AM

Harbormaster completed remote builds in B69935: Diff 288643. Aug 28 2020, 10:50 AM

Thanks for the comments all! The builds look green and I'm going to go ahead and push this.

This revision was not accepted when it landed; it landed in state Needs Review. Aug 28 2020, 11:13 AM

Closed by commit rG94faadaca4e1: [llvm][CodeGen] Machine Function Splitter (authored by snehasish).

This revision was automatically updated to reflect the committed changes.

snehasish added a commit: rG94faadaca4e1: [llvm][CodeGen] Machine Function Splitter.

wxiao3 mentioned this in D94215: [PostRASched] Breaking More CriticalAntiDeps. Apr 26 2021, 7:37 AM

shenhan mentioned this in D152399: [CodeGen] Fine tune MachineFunctionSplitPass (MFS) for FSAFDO. Jun 7 2023, 2:44 PM

shenhan mentioned this in rG8df75969ae70: [CodeGen] Fine tune MachineFunctionSplitPass (MFS) for FSAFDO. Jul 10 2023, 4:02 PM

Revision Contents

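Path                                                 Size
llvm/include/llvm/CodeGen/BasicBlockSectionUtils.h   27 lines
llvm/include/llvm/CodeGen/CommandFlags.h             2 lines
llvm/include/llvm/CodeGen/MachineFunction.h          3 lines
llvm/include/llvm/CodeGen/Passes.h                   4 lines
llvm/include/llvm/InitializePasses.h                 1 line
llvm/include/llvm/Target/TargetOptions.h             15 lines
llvm/lib/CodeGen/BasicBlockSections.cpp              87 lines
llvm/lib/CodeGen/CMakeLists.txt                      1 line
llvm/lib/CodeGen/CommandFlags.cpp                    9 lines
llvm/lib/CodeGen/MachineFunctionSplitter.cpp         148 lines
llvm/lib/CodeGen/TargetPassConfig.cpp                14 lines
llvm/test/CodeGen/X86/machine-function-splitter.ll   218 lines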

Diff 288652

llvm/include/llvm/CodeGen/BasicBlockSectionUtils.h

This file was added.

//===- BasicBlockSectionUtils.h - Utilities for basic block sections --===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_CODEGEN_BASICBLOCKSECTIONUTILS_H
#define LLVM_CODEGEN_BASICBLOCKSECTIONUTILS_H

#include "llvm/ADT/STLExtras.h"

namespace llvm {

class MachineFunction;
class MachineBasicBlock;

using MachineBasicBlockComparator =
    function_ref<bool(const MachineBasicBlock &, const MachineBasicBlock &)>;

void sortBasicBlocksAndUpdateBranches(MachineFunction &MF,
                                      MachineBasicBlockComparator MBBCmp);

} // end namespace llvm

#endif // LLVM_CODEGEN_BASICBLOCKSECTIONUTILS_H

llvm/include/llvm/CodeGen/CommandFlags.h

[...]
 llvm::DebuggerKind getDebuggerTuningOpt();

 bool getEnableStackSizeSection();

 bool getEnableAddrsig();

 bool getEmitCallSiteInfo();

+bool getEnableMachineFunctionSplitter();
+
 bool getEnableDebugEntryValues();

 bool getValueTrackingVariableLocations();

 bool getForceDwarfFrameSection();

 bool getXRayOmitFunctionIndex();

[...]

llvm/include/llvm/CodeGen/MachineFunction.h

[...] public:
   StringRef getName() const;

   /// getFunctionNumber - Return a unique ID for the current function.
   unsigned getFunctionNumber() const { return FunctionNumber; }

   /// Returns true if this function has basic block sections enabled.
   bool hasBBSections() const {
     return (BBSectionsType == BasicBlockSection::All ||
-            BBSectionsType == BasicBlockSection::List);
+            BBSectionsType == BasicBlockSection::List ||
+            BBSectionsType == BasicBlockSection::Preset);
   }

   /// Returns true if basic block labels are to be generated for this function.
   bool hasBBLabels() const {
     return BBSectionsType == BasicBlockSection::Labels;
   }

   void setBBSectionsType(BasicBlockSection V) { BBSectionsType = V; }
[...]

llvm/include/llvm/CodeGen/Passes.h

[...] namespace llvm {
   FunctionPass *createUnreachableBlockEliminationPass();

   /// createBasicBlockSections Pass - This pass assigns sections to machine
   /// basic blocks and is enabled with -fbasic-block-sections. Buf is a memory
   /// buffer that contains the list of functions and basic block ids to
   /// selectively enable basic block sections.
   MachineFunctionPass *createBasicBlockSectionsPass(const MemoryBuffer *Buf);

+  /// createMachineFunctionSplitterPass - This pass splits machine functions
+  /// using profile information.
+  MachineFunctionPass *createMachineFunctionSplitterPass();
+
   /// MachineFunctionPrinter pass - This pass prints out the machine function to
   /// the given stream as a debugging tool.
   MachineFunctionPass *
   createMachineFunctionPrinterPass(raw_ostream &OS,
                                    const std::string &Banner ="");

   /// MIRPrinting pass - this pass prints out the LLVM IR into the given stream
   /// using the MIR serialization format.
[...]

llvm/include/llvm/InitializePasses.h

[...]
 void initializeMachineBlockPlacementStatsPass(PassRegistry&);
 void initializeMachineBranchProbabilityInfoPass(PassRegistry&);
 void initializeMachineCSEPass(PassRegistry&);
 void initializeMachineCombinerPass(PassRegistry&);
 void initializeMachineCopyPropagationPass(PassRegistry&);
 void initializeMachineDominanceFrontierPass(PassRegistry&);
 void initializeMachineDominatorTreePass(PassRegistry&);
 void initializeMachineFunctionPrinterPassPass(PassRegistry&);
+void initializeMachineFunctionSplitterPass(PassRegistry &);
 void initializeMachineLICMPass(PassRegistry&);
 void initializeMachineLoopInfoPass(PassRegistry&);
 void initializeMachineModuleInfoWrapperPassPass(PassRegistry &);
 void initializeMachineOptimizationRemarkEmitterPassPass(PassRegistry&);
 void initializeMachineOutlinerPass(PassRegistry&);
 void initializeMachinePipelinerPass(PassRegistry&);
 void initializeMachinePostDominatorTreePass(PassRegistry&);
 void initializeMachineRegionInfoPassPass(PassRegistry&);
[...]

llvm/include/llvm/Target/TargetOptions.h

[...] enum class BasicBlockSection {
   All,    // Use Basic Block Sections for all basic blocks. A section
           // for every basic block can significantly bloat object file sizes.
   List,   // Get list of functions & BBs from a file. Selectively enables
           // basic block sections for a subset of basic blocks which can be
           // used to control object size bloats from creating sections.
   Labels, // Do not use Basic Block Sections but label basic blocks. This
           // is useful when associating profile counts from virtual addresses
           // to basic blocks.
+  Preset, // Similar to list but the blocks are identified by passes which
+          // seek to use Basic Block Sections, e.g. MachineFunctionSplitter.
+          // This option cannot be set via the command line.
   None    // Do not use Basic Block Sections.
 };

 enum class EABI {
   Unknown,
   Default, // Default means not specified
   EABI4,   // Target-specific (either 4, 5 or gnu depending on triple).
   EABI5,
[...]	TargetOptions()
GuaranteedTailCallOpt(false), StackSymbolOrdering(true),		GuaranteedTailCallOpt(false), StackSymbolOrdering(true),
EnableFastISel(false), EnableGlobalISel(false), UseInitArray(false),		EnableFastISel(false), EnableGlobalISel(false), UseInitArray(false),
DisableIntegratedAS(false), RelaxELFRelocations(false),		DisableIntegratedAS(false), RelaxELFRelocations(false),
FunctionSections(false), DataSections(false),		FunctionSections(false), DataSections(false),
UniqueSectionNames(true), UniqueBasicBlockSectionNames(false),		UniqueSectionNames(true), UniqueBasicBlockSectionNames(false),
TrapUnreachable(false), NoTrapAfterNoreturn(false), TLSSize(0),		TrapUnreachable(false), NoTrapAfterNoreturn(false), TLSSize(0),
EmulatedTLS(false), ExplicitEmulatedTLS(false), EnableIPRA(false),		EmulatedTLS(false), ExplicitEmulatedTLS(false), EnableIPRA(false),
EmitStackSizeSection(false), EnableMachineOutliner(false),		EmitStackSizeSection(false), EnableMachineOutliner(false),
SupportsDefaultOutlining(false), EmitAddrsig(false),		EnableMachineFunctionSplitter(false), SupportsDefaultOutlining(false),
EmitCallSiteInfo(false), SupportsDebugEntryValues(false),		EmitAddrsig(false), EmitCallSiteInfo(false),
EnableDebugEntryValues(false), ValueTrackingVariableLocations(false),		SupportsDebugEntryValues(false), EnableDebugEntryValues(false),
ForceDwarfFrameSection(false), XRayOmitFunctionIndex(false),		ValueTrackingVariableLocations(false), ForceDwarfFrameSection(false),
		XRayOmitFunctionIndex(false),
FPDenormalMode(DenormalMode::IEEE, DenormalMode::IEEE) {}		FPDenormalMode(DenormalMode::IEEE, DenormalMode::IEEE) {}

/// DisableFramePointerElim - This returns true if frame pointer elimination		/// DisableFramePointerElim - This returns true if frame pointer elimination
/// optimization should be disabled for the given machine function.		/// optimization should be disabled for the given machine function.
bool DisableFramePointerElim(const MachineFunction &MF) const;		bool DisableFramePointerElim(const MachineFunction &MF) const;

/// UnsafeFPMath - This flag is enabled when the		/// UnsafeFPMath - This flag is enabled when the
/// -enable-unsafe-fp-math flag is specified on the command line. When		/// -enable-unsafe-fp-math flag is specified on the command line. When
[...]	public:
unsigned EnableIPRA : 1;		unsigned EnableIPRA : 1;

/// Emit section containing metadata on function stack sizes.		/// Emit section containing metadata on function stack sizes.
unsigned EmitStackSizeSection : 1;		unsigned EmitStackSizeSection : 1;

/// Enables the MachineOutliner pass.		/// Enables the MachineOutliner pass.
unsigned EnableMachineOutliner : 1;		unsigned EnableMachineOutliner : 1;

		/// Enables the MachineFunctionSplitter pass.
		unsigned EnableMachineFunctionSplitter : 1;

/// Set if the target supports default outlining behaviour.		/// Set if the target supports default outlining behaviour.
unsigned SupportsDefaultOutlining : 1;		unsigned SupportsDefaultOutlining : 1;

/// Emit address-significance table.		/// Emit address-significance table.
unsigned EmitAddrsig : 1;		unsigned EmitAddrsig : 1;

/// Emit basic blocks into separate sections.		/// Emit basic blocks into separate sections.
BasicBlockSection BBSections = BasicBlockSection::None;		BasicBlockSection BBSections = BasicBlockSection::None;

llvm/lib/CodeGen/BasicBlockSections.cpp

//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/ADT/Optional.h"		#include "llvm/ADT/Optional.h"
#include "llvm/ADT/SmallSet.h"		#include "llvm/ADT/SmallSet.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringMap.h"		#include "llvm/ADT/StringMap.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
		#include "llvm/CodeGen/BasicBlockSectionUtils.h"
#include "llvm/CodeGen/MachineFunction.h"		#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineFunctionPass.h"		#include "llvm/CodeGen/MachineFunctionPass.h"
#include "llvm/CodeGen/MachineModuleInfo.h"		#include "llvm/CodeGen/MachineModuleInfo.h"
#include "llvm/CodeGen/Passes.h"		#include "llvm/CodeGen/Passes.h"
#include "llvm/CodeGen/TargetInstrInfo.h"		#include "llvm/CodeGen/TargetInstrInfo.h"
#include "llvm/InitializePasses.h"		#include "llvm/InitializePasses.h"
#include "llvm/Support/Error.h"		#include "llvm/Support/Error.h"
#include "llvm/Support/LineIterator.h"		#include "llvm/Support/LineIterator.h"
[...]
// All explicitly specified clusters of basic blocks will be ordered		// All explicitly specified clusters of basic blocks will be ordered
// accordingly. All non-specified BBs go into a separate "Cold" section.		// accordingly. All non-specified BBs go into a separate "Cold" section.
// Additionally, if exception handling landing pads end up in more than one		// Additionally, if exception handling landing pads end up in more than one
// clusters, they are moved into a single "Exception" section. Eventually,		// clusters, they are moved into a single "Exception" section. Eventually,
// clusters are ordered in increasing order of their IDs, with the "Exception"		// clusters are ordered in increasing order of their IDs, with the "Exception"
// and "Cold" succeeding all other clusters.		// and "Cold" succeeding all other clusters.
// FuncBBClusterInfo represent the cluster information for basic blocks. If this		// FuncBBClusterInfo represent the cluster information for basic blocks. If this
// is empty, it means unique sections for all basic blocks in the function.		// is empty, it means unique sections for all basic blocks in the function.
static bool assignSectionsAndSortBasicBlocks(		static void
MachineFunction &MF,		assignSections(MachineFunction &MF,
const std::vector<Optional<BBClusterInfo>> &FuncBBClusterInfo) {		const std::vector<Optional<BBClusterInfo>> &FuncBBClusterInfo) {
		hiraditya (Not Done): nit: tab?
		snehasish (Author, Done): This seems to be the output from clang-format. The diff does not have a tab but it looks like phabricator is showing it as one?
assert(MF.hasBBSections() && "BB Sections is not set for function.");		assert(MF.hasBBSections() && "BB Sections is not set for function.");
// This variable stores the section ID of the cluster containing eh_pads (if		// This variable stores the section ID of the cluster containing eh_pads (if
// all eh_pads are one cluster). If more than one cluster contain eh_pads, we		// all eh_pads are one cluster). If more than one cluster contain eh_pads, we
// set it equal to ExceptionSectionID.		// set it equal to ExceptionSectionID.
Optional<MBBSectionID> EHPadsSectionID;		Optional<MBBSectionID> EHPadsSectionID;

for (auto &MBB : MF) {		for (auto &MBB : MF) {
// With the 'all' option, every basic block is placed in a unique section.		// With the 'all' option, every basic block is placed in a unique section.
[...]
}		}

// If EHPads are in more than one section, this places all of them in the		// If EHPads are in more than one section, this places all of them in the
// special exception section.		// special exception section.
if (EHPadsSectionID == MBBSectionID::ExceptionSectionID)		if (EHPadsSectionID == MBBSectionID::ExceptionSectionID)
for (auto &MBB : MF)		for (auto &MBB : MF)
if (MBB.isEHPad())		if (MBB.isEHPad())
MBB.setSectionID(EHPadsSectionID.getValue());		MBB.setSectionID(EHPadsSectionID.getValue());
		}

		void llvm::sortBasicBlocksAndUpdateBranches(
		hiraditya (Done): Is this comment necessary?
		MachineFunction &MF, MachineBasicBlockComparator MBBCmp) {
SmallVector<MachineBasicBlock *, 4> PreLayoutFallThroughs(		SmallVector<MachineBasicBlock *, 4> PreLayoutFallThroughs(
MF.getNumBlockIDs());		MF.getNumBlockIDs());
for (auto &MBB : MF)		for (auto &MBB : MF)
PreLayoutFallThroughs[MBB.getNumber()] = MBB.getFallThrough();		PreLayoutFallThroughs[MBB.getNumber()] = MBB.getFallThrough();

		MF.sort(MBBCmp);

		// Set IsBeginSection and IsEndSection according to the assigned section IDs.
		MF.assignBeginEndSections();

		// After reordering basic blocks, we must update basic block branches to
		// insert explicit fallthrough branches when required and optimize branches
		// when possible.
		updateBranches(MF, PreLayoutFallThroughs);
		}

		bool BasicBlockSections::runOnMachineFunction(MachineFunction &MF) {
		auto BBSectionsType = MF.getTarget().getBBSectionsType();
		assert(BBSectionsType != BasicBlockSection::None &&
		"BB Sections not enabled!");
		// Renumber blocks before sorting them for basic block sections. This is
		// useful during sorting, basic blocks in the same section will retain the
		// default order. This renumbering should also be done for basic block
		// labels to match the profiles with the correct blocks.
		MF.RenumberBlocks();

		if (BBSectionsType == BasicBlockSection::Labels) {
		MF.setBBSectionsType(BBSectionsType);
		MF.createBBLabels();
		return true;
		}

		std::vector<Optional<BBClusterInfo>> FuncBBClusterInfo;
		if (BBSectionsType == BasicBlockSection::List &&
		!getBBClusterInfoForFunction(MF, FuncAliasMap, ProgramBBClusterInfo,
		FuncBBClusterInfo))
		return true;
		MF.setBBSectionsType(BBSectionsType);
		MF.createBBLabels();
		assignSections(MF, FuncBBClusterInfo);

// We make sure that the cluster including the entry basic block precedes all		// We make sure that the cluster including the entry basic block precedes all
// other clusters.		// other clusters.
auto EntryBBSectionID = MF.front().getSectionID();		auto EntryBBSectionID = MF.front().getSectionID();

// Helper function for ordering BB sections as follows:		// Helper function for ordering BB sections as follows:
// * Entry section (section including the entry block).		// * Entry section (section including the entry block).
// * Regular sections (in increasing order of their Number).		// * Regular sections (in increasing order of their Number).
// ...		// ...
// * Exception section		// * Exception section
// * Cold section		// * Cold section
auto MBBSectionOrder = [EntryBBSectionID](const MBBSectionID &LHS,		auto MBBSectionOrder = [EntryBBSectionID](const MBBSectionID &LHS,
const MBBSectionID &RHS) {		const MBBSectionID &RHS) {
// We make sure that the section containing the entry block precedes all the		// We make sure that the section containing the entry block precedes all the
// other sections.		// other sections.
if (LHS == EntryBBSectionID || RHS == EntryBBSectionID)		if (LHS == EntryBBSectionID || RHS == EntryBBSectionID)
return LHS == EntryBBSectionID;		return LHS == EntryBBSectionID;
return LHS.Type == RHS.Type ? LHS.Number < RHS.Number : LHS.Type < RHS.Type;		return LHS.Type == RHS.Type ? LHS.Number < RHS.Number : LHS.Type < RHS.Type;
};		};

// We sort all basic blocks to make sure the basic blocks of every cluster are		// We sort all basic blocks to make sure the basic blocks of every cluster are
// contiguous and ordered accordingly. Furthermore, clusters are ordered in		// contiguous and ordered accordingly. Furthermore, clusters are ordered in
// increasing order of their section IDs, with the exception and the		// increasing order of their section IDs, with the exception and the
// cold section placed at the end of the function.		// cold section placed at the end of the function.
MF.sort([&](MachineBasicBlock &X, MachineBasicBlock &Y) {		auto Comparator = [&](const MachineBasicBlock &X,
		const MachineBasicBlock &Y) {
auto XSectionID = X.getSectionID();		auto XSectionID = X.getSectionID();
auto YSectionID = Y.getSectionID();		auto YSectionID = Y.getSectionID();
if (XSectionID != YSectionID)		if (XSectionID != YSectionID)
return MBBSectionOrder(XSectionID, YSectionID);		return MBBSectionOrder(XSectionID, YSectionID);
// If the two basic block are in the same section, the order is decided by		// If the two basic block are in the same section, the order is decided by
// their position within the section.		// their position within the section.
if (XSectionID.Type == MBBSectionID::SectionType::Default)		if (XSectionID.Type == MBBSectionID::SectionType::Default)
return FuncBBClusterInfo[X.getNumber()]->PositionInCluster <		return FuncBBClusterInfo[X.getNumber()]->PositionInCluster <
FuncBBClusterInfo[Y.getNumber()]->PositionInCluster;		FuncBBClusterInfo[Y.getNumber()]->PositionInCluster;
return X.getNumber() < Y.getNumber();		return X.getNumber() < Y.getNumber();
});		};

// Set IsBeginSection and IsEndSection according to the assigned section IDs.
MF.assignBeginEndSections();

// After reordering basic blocks, we must update basic block branches to
// insert explicit fallthrough branches when required and optimize branches
// when possible.
updateBranches(MF, PreLayoutFallThroughs);

return true;
}

bool BasicBlockSections::runOnMachineFunction(MachineFunction &MF) {
auto BBSectionsType = MF.getTarget().getBBSectionsType();
assert(BBSectionsType != BasicBlockSection::None &&
"BB Sections not enabled!");
// Renumber blocks before sorting them for basic block sections. This is
// useful during sorting, basic blocks in the same section will retain the
// default order. This renumbering should also be done for basic block
// labels to match the profiles with the correct blocks.
MF.RenumberBlocks();

if (BBSectionsType == BasicBlockSection::Labels) {
MF.setBBSectionsType(BBSectionsType);
MF.createBBLabels();
return true;
}

std::vector<Optional<BBClusterInfo>> FuncBBClusterInfo;		sortBasicBlocksAndUpdateBranches(MF, Comparator);
if (BBSectionsType == BasicBlockSection::List &&
!getBBClusterInfoForFunction(MF, FuncAliasMap, ProgramBBClusterInfo,
FuncBBClusterInfo))
return true;
MF.setBBSectionsType(BBSectionsType);
MF.createBBLabels();
assignSectionsAndSortBasicBlocks(MF, FuncBBClusterInfo);
return true;		return true;
}		}
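
The section ordering used by the comparator above (entry section first, then regular sections by number, then the exception and cold sections) can be sketched in isolation. This is a minimal illustration rather than the LLVM code: `SectionType`, `SectionID`, `sectionBefore`, and `orderSections` are hypothetical stand-ins for `MBBSectionID`, the `MBBSectionOrder` lambda, and `MF.sort`, and the enumerator order (Default, Exception, Cold) is an assumption chosen to match the ordering described in the comments.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for MBBSectionID::SectionType; the enumerator order
// is assumed so that Default < Exception < Cold.
enum class SectionType { Default = 0, Exception = 1, Cold = 2 };

// Hypothetical stand-in for MBBSectionID.
struct SectionID {
  SectionType Type;
  unsigned Number;
};

bool operator==(const SectionID &A, const SectionID &B) {
  return A.Type == B.Type && A.Number == B.Number;
}

// Returns true if LHS should be laid out before RHS. Like the lambda in the
// diff, this expects LHS and RHS to be different section IDs.
bool sectionBefore(const SectionID &Entry, const SectionID &LHS,
                   const SectionID &RHS) {
  // The section holding the entry block always comes first.
  if (LHS == Entry || RHS == Entry)
    return LHS == Entry;
  // Otherwise order by type, then by number within the same type.
  return LHS.Type == RHS.Type ? LHS.Number < RHS.Number : LHS.Type < RHS.Type;
}

// Sorts a list of section IDs into layout order.
std::vector<SectionID> orderSections(std::vector<SectionID> Sections,
                                     const SectionID &Entry) {
  std::sort(Sections.begin(), Sections.end(),
            [&Entry](const SectionID &X, const SectionID &Y) {
              if (X == Y)
                return false; // keep the comparator a strict ordering
              return sectionBefore(Entry, X, Y);
            });
  return Sections;
}
```

With an entry section of `{Default, 1}`, a regular section numbered 0 still sorts after the entry section, which is the special case the first branch of the comparator exists for.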

// Basic Block Sections can be enabled for a subset of machine basic blocks.		// Basic Block Sections can be enabled for a subset of machine basic blocks.
// This is done by passing a file containing names of functions for which basic		// This is done by passing a file containing names of functions for which basic
// block sections are desired. Additionally, machine basic block ids of the		// block sections are desired. Additionally, machine basic block ids of the
// functions can also be specified for a finer granularity. Moreover, a cluster		// functions can also be specified for a finer granularity. Moreover, a cluster
// of basic blocks could be assigned to the same section.		// of basic blocks could be assigned to the same section.

llvm/lib/CodeGen/CMakeLists.txt

[...]	add_llvm_component_library(LLVMCodeGen
MachineCSE.cpp		MachineCSE.cpp
MachineDebugify.cpp		MachineDebugify.cpp
MachineDominanceFrontier.cpp		MachineDominanceFrontier.cpp
MachineDominators.cpp		MachineDominators.cpp
MachineFrameInfo.cpp		MachineFrameInfo.cpp
MachineFunction.cpp		MachineFunction.cpp
MachineFunctionPass.cpp		MachineFunctionPass.cpp
MachineFunctionPrinterPass.cpp		MachineFunctionPrinterPass.cpp
		MachineFunctionSplitter.cpp
MachineInstrBundle.cpp		MachineInstrBundle.cpp
MachineInstr.cpp		MachineInstr.cpp
MachineLICM.cpp		MachineLICM.cpp
MachineLoopInfo.cpp		MachineLoopInfo.cpp
MachineLoopUtils.cpp		MachineLoopUtils.cpp
MachineModuleInfo.cpp		MachineModuleInfo.cpp
MachineModuleInfoImpls.cpp		MachineModuleInfoImpls.cpp
MachineOperand.cpp		MachineOperand.cpp

llvm/lib/CodeGen/CommandFlags.cpp

CGOPT(bool, EmulatedTLS)		CGOPT(bool, EmulatedTLS)
CGOPT(bool, UniqueSectionNames)		CGOPT(bool, UniqueSectionNames)
CGOPT(bool, UniqueBasicBlockSectionNames)		CGOPT(bool, UniqueBasicBlockSectionNames)
CGOPT(EABI, EABIVersion)		CGOPT(EABI, EABIVersion)
CGOPT(DebuggerKind, DebuggerTuningOpt)		CGOPT(DebuggerKind, DebuggerTuningOpt)
CGOPT(bool, EnableStackSizeSection)		CGOPT(bool, EnableStackSizeSection)
CGOPT(bool, EnableAddrsig)		CGOPT(bool, EnableAddrsig)
CGOPT(bool, EmitCallSiteInfo)		CGOPT(bool, EmitCallSiteInfo)
		CGOPT(bool, EnableMachineFunctionSplitter)
CGOPT(bool, EnableDebugEntryValues)		CGOPT(bool, EnableDebugEntryValues)
CGOPT(bool, ValueTrackingVariableLocations)		CGOPT(bool, ValueTrackingVariableLocations)
CGOPT(bool, ForceDwarfFrameSection)		CGOPT(bool, ForceDwarfFrameSection)
CGOPT(bool, XRayOmitFunctionIndex)		CGOPT(bool, XRayOmitFunctionIndex)

codegen::RegisterCodeGenFlags::RegisterCodeGenFlags() {		codegen::RegisterCodeGenFlags::RegisterCodeGenFlags() {
#define CGBINDOPT(NAME) \		#define CGBINDOPT(NAME) \
do { \		do { \
[...]
CGBINDOPT(EnableDebugEntryValues);		CGBINDOPT(EnableDebugEntryValues);

static cl::opt<bool> ValueTrackingVariableLocations(		static cl::opt<bool> ValueTrackingVariableLocations(
"experimental-debug-variable-locations",		"experimental-debug-variable-locations",
cl::desc("Use experimental new value-tracking variable locations"),		cl::desc("Use experimental new value-tracking variable locations"),
cl::init(false));		cl::init(false));
CGBINDOPT(ValueTrackingVariableLocations);		CGBINDOPT(ValueTrackingVariableLocations);

		static cl::opt<bool> EnableMachineFunctionSplitter(
		"split-machine-functions",
		cl::desc("Split out cold basic blocks from machine functions based on "
		"profile information"),
		cl::init(false));
		CGBINDOPT(EnableMachineFunctionSplitter);

static cl::opt<bool> ForceDwarfFrameSection(		static cl::opt<bool> ForceDwarfFrameSection(
"force-dwarf-frame-section",		"force-dwarf-frame-section",
cl::desc("Always emit a debug frame section."), cl::init(false));		cl::desc("Always emit a debug frame section."), cl::init(false));
CGBINDOPT(ForceDwarfFrameSection);		CGBINDOPT(ForceDwarfFrameSection);

static cl::opt<bool> XRayOmitFunctionIndex(		static cl::opt<bool> XRayOmitFunctionIndex(
"no-xray-index", cl::desc("Don't emit xray_fn_idx section"),		"no-xray-index", cl::desc("Don't emit xray_fn_idx section"),
cl::init(false));		cl::init(false));
[...]	TargetOptions codegen::InitTargetOptionsFromCodeGenFlags() {
Options.BBSections = getBBSectionsMode(Options);		Options.BBSections = getBBSectionsMode(Options);
Options.UniqueSectionNames = getUniqueSectionNames();		Options.UniqueSectionNames = getUniqueSectionNames();
Options.UniqueBasicBlockSectionNames = getUniqueBasicBlockSectionNames();		Options.UniqueBasicBlockSectionNames = getUniqueBasicBlockSectionNames();
Options.TLSSize = getTLSSize();		Options.TLSSize = getTLSSize();
Options.EmulatedTLS = getEmulatedTLS();		Options.EmulatedTLS = getEmulatedTLS();
Options.ExplicitEmulatedTLS = EmulatedTLSView->getNumOccurrences() > 0;		Options.ExplicitEmulatedTLS = EmulatedTLSView->getNumOccurrences() > 0;
Options.ExceptionModel = getExceptionModel();		Options.ExceptionModel = getExceptionModel();
Options.EmitStackSizeSection = getEnableStackSizeSection();		Options.EmitStackSizeSection = getEnableStackSizeSection();
		Options.EnableMachineFunctionSplitter = getEnableMachineFunctionSplitter();
Options.EmitAddrsig = getEnableAddrsig();		Options.EmitAddrsig = getEnableAddrsig();
Options.EmitCallSiteInfo = getEmitCallSiteInfo();		Options.EmitCallSiteInfo = getEmitCallSiteInfo();
Options.EnableDebugEntryValues = getEnableDebugEntryValues();		Options.EnableDebugEntryValues = getEnableDebugEntryValues();
Options.ValueTrackingVariableLocations = getValueTrackingVariableLocations();		Options.ValueTrackingVariableLocations = getValueTrackingVariableLocations();
Options.ForceDwarfFrameSection = getForceDwarfFrameSection();		Options.ForceDwarfFrameSection = getForceDwarfFrameSection();
Options.XRayOmitFunctionIndex = getXRayOmitFunctionIndex();		Options.XRayOmitFunctionIndex = getXRayOmitFunctionIndex();

Options.MCOptions = mc::InitMCTargetOptionsFromFlags();		Options.MCOptions = mc::InitMCTargetOptionsFromFlags();

llvm/lib/CodeGen/MachineFunctionSplitter.cpp

This file was added.

				//===-- MachineFunctionSplitter.cpp - Split machine functions //-----------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// \file
				// Uses profile information to split out cold blocks.
				//
				// This pass splits out cold machine basic blocks from the parent function. This
				// implementation leverages the basic block section framework. Blocks marked
				// cold by this pass are grouped together in a separate section prefixed with
				// ".text.unlikely.*". The linker can then group these together as a cold
				// section. The split part of the function is a contiguous region identified by
				// the symbol "foo.cold". Grouping all cold blocks across functions together
				// decreases fragmentation and improves icache and itlb utilization. Note that
				// the overall changes to the binary size are negligible; only a small number of
				// additional jump instructions may be introduced.
				//
				// For the original RFC of this pass please see
				// https://groups.google.com/d/msg/llvm-dev/RUegaMg-iqc/wFAVxa6fCgAJ
				//===----------------------------------------------------------------------===//

				#include "llvm/ADT/Statistic.h"
				#include "llvm/Analysis/ProfileSummaryInfo.h"
				#include "llvm/CodeGen/BasicBlockSectionUtils.h"
				#include "llvm/CodeGen/MachineBasicBlock.h"
				#include "llvm/CodeGen/MachineBlockFrequencyInfo.h"
				#include "llvm/CodeGen/MachineFunction.h"
				#include "llvm/CodeGen/MachineFunctionPass.h"
				#include "llvm/CodeGen/MachineModuleInfo.h"
				#include "llvm/CodeGen/Passes.h"
				#include "llvm/IR/Function.h"
				#include "llvm/IR/Module.h"
				#include "llvm/InitializePasses.h"
				#include "llvm/Support/CommandLine.h"

				using namespace llvm;

				static cl::opt<unsigned>
				PercentileCutoff("mfs-psi-cutoff",
				cl::desc("Percentile profile summary cutoff used to "
				"determine cold blocks. Unused if set to zero."),
				cl::init(0), cl::Hidden);

				static cl::opt<unsigned> ColdCountThreshold(
				"mfs-count-threshold",
				cl::desc(
				"Minimum number of times a block must be executed to be retained."),
				cl::init(1), cl::Hidden);

				namespace {

				class MachineFunctionSplitter : public MachineFunctionPass {
				public:
				static char ID;
				MachineFunctionSplitter() : MachineFunctionPass(ID) {
				initializeMachineFunctionSplitterPass(*PassRegistry::getPassRegistry());
				}

				StringRef getPassName() const override {
				return "Machine Function Splitter Transformation";
				}

				void getAnalysisUsage(AnalysisUsage &AU) const override;

				bool runOnMachineFunction(MachineFunction &F) override;
				};
				} // end anonymous namespace

				static bool isColdBlock(MachineBasicBlock &MBB,
				const MachineBlockFrequencyInfo *MBFI,
				ProfileSummaryInfo *PSI) {
				Optional<uint64_t> Count = MBFI->getBlockProfileCount(&MBB);
				if (!Count.hasValue())
				return true;
				hiraditya (Done): Do we need to renumber?
				tmsriram (Not Done): Renumbering makes the sorting easy. The sorting will preserve the basic block order for the blocks that are not split.
				snehasish (Author, Done): We need to ensure that the order is preserved so that we don't perturb the decisions made by prior passes such as MachineBlockPlacement. Renumbering simplifies the code that needs to be shared with the BasicBlockSections pass.
				hiraditya (Not Done): I see we are renumbering both in MachineFunctionSplitter and BasicBlockSections
				tmsriram (Not Done): The "bbsections-prepare" pass and the machine function splitter pass are intentionally made mutually exclusive. If bbsections is explicitly requested, machine function splitter does not apply. Please see the change in TargetPassConfig.cpp

				if (PercentileCutoff > 0) {
				return PSI->isColdCountNthPercentile(PercentileCutoff, *Count);
				}
				return (*Count < ColdCountThreshold);
				}
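
The decision made by `isColdBlock` above can be sketched without any LLVM dependencies. This is a minimal illustration under stated assumptions: `MaybeCount`, `SplitterConfig`, and `FakeProfileSummary` are hypothetical stand-ins for `Optional<uint64_t>`, the two `cl::opt` knobs, and `ProfileSummaryInfo`, and the percentile query is reduced to a comparison against a precomputed cutoff count.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for Optional<uint64_t>: a block may have no count.
struct MaybeCount {
  bool Known;
  uint64_t Value;
};

// Hypothetical stand-in for the two command-line knobs of the pass.
struct SplitterConfig {
  unsigned PercentileCutoff;   // mirrors -mfs-psi-cutoff; 0 means unused
  uint64_t ColdCountThreshold; // mirrors -mfs-count-threshold
};

// Hypothetical stand-in for the profile summary query. Assumption: a count is
// cold when it does not exceed a cutoff derived from the chosen percentile.
struct FakeProfileSummary {
  uint64_t ColdCutoffCount;
  bool isColdCount(uint64_t Count) const { return Count <= ColdCutoffCount; }
};

bool isColdBlock(const MaybeCount &Count, const SplitterConfig &Cfg,
                 const FakeProfileSummary &PSI) {
  // Blocks without a profile count are treated as cold.
  if (!Count.Known)
    return true;
  // A non-zero percentile cutoff takes precedence over the raw threshold.
  if (Cfg.PercentileCutoff > 0)
    return PSI.isColdCount(Count.Value);
  // Otherwise a block is cold if it ran fewer times than the threshold.
  return Count.Value < Cfg.ColdCountThreshold;
}
```

With a threshold of 1 and no percentile cutoff, which are the defaults in the diff, only blocks that never executed (or have no count at all) are classified as cold.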

				bool MachineFunctionSplitter::runOnMachineFunction(MachineFunction &MF) {
				// TODO: We only target functions with profile data. Static information may
				// also be considered but we don't see performance improvements yet.
				if (!MF.getFunction().hasProfileData())
				return false;
				davidxl (Done): Add an internal option here (the coldness threshold) for experimental purpose. I also suggest add an option to specify programSummary based coldness threshold such as 99.99 percentile coldness. The default cutoff is 99.9999% defined in ProfileSummaryInfo.cpp: ProfileSummaryCutoffCold

				// TODO: We don't split functions where a section attribute has been set
				// since the split part may not be placed in a contiguous region. It may also
				// be more beneficial to augment the linker to ensure contiguous layout of
				// split functions within the same section as specified by the attribute.
				if (!MF.getFunction().getSection().empty())
				return false;

				// We don't want to proceed further for cold functions
				// or functions of unknown hotness. Lukewarm functions have no prefix.
				Optional<StringRef> SectionPrefix = MF.getFunction().getSectionPrefix();
				if (SectionPrefix.hasValue() &&
				(SectionPrefix.getValue().equals(".unlikely") ||
				SectionPrefix.getValue().equals(".unknown"))) {
				return false;
				}

				// Renumbering blocks here preserves the order of the blocks as
				// sortBasicBlocksAndUpdateBranches uses the numeric identifier to sort
				// blocks. Preserving the order of blocks is essential to retaining decisions
				tmsriram (Done): Maybe a comment here that this is useful while sorting the blocks?
				// made by prior passes such as MachineBlockPlacement.
				MF.RenumberBlocks();
				tmsriram (Not Done): I think excluding sections needs a bit more thought and we should do this as a separate patch if it is useful but I think a linker solution would be more favorable. From what I understand, when a user specifies section names using the section keyword, then the expectation is that all functions marked with that section name will be grouped together. With function splitting, since you attach the ".cold" suffix to such sections that are split, there is no guarantee that the linker will place them together as these are not prefixed as ".text". To overcome the above problem, the option to exclude such sections from being split is not ideal either as it moves the burden to the user to get this right with appropriate options. I think the temporary fix is to not split sections which are not prefixed as ".text". You can add a "FIXME:" comment here to describe why you are doing this. Moving forward, we can look at a linker solution where '.' is treated as a valid section name separator and sections with identical prefix before the "." are always grouped together even if they are not named ".text". I think we can move this handling as an enhancement in another patch.
				tmsriram (Not Done): Correction: I meant ".unlikely" and not ".cold".
				efriedma (Not Done): I think I'd rather make splitting for functions with an explicit section attribute opt-in, rather than opt-out. The user might have a strong need to emit a function in a particular section (for example, if the name is mentioned in a linker script). If someone is messing with section attributes in the first place, I'd like to be conservative by default.
				snehasish (Author, Done): @tmsriram I think you meant to exclude such functions from being split. I agree, taking into consideration @efriedma's comment, I've removed the option and made this a conservative check on the section attribute. Any function with the section attribute set is not split.
				MF.setBBSectionsType(BasicBlockSection::Preset);
				auto *MBFI = &getAnalysis<MachineBlockFrequencyInfo>();
				hiraditya (Not Done): std::find?
				snehasish (Author, Done): Code referencing this comment was removed.
				auto *PSI = &getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI();

				for (auto &MBB : MF) {
				// FIXME: We retain the entry block and conservatively keep all landing pad
				tmsriram (Done): Do we need a fixme comment here to say that we could split out landing pads if the exception patch for bb sections lands?
				// blocks as part of the original function. Once D73739 is submitted, we can
				// improve the handling of ehpads.
				if ((MBB.pred_empty() || MBB.isEHPad()))
				continue;
				if (isColdBlock(MBB, MBFI, PSI))
				MBB.setSectionID(MBBSectionID::ColdSectionID);
				hiraditya (Done): nit: redundant braces
				}

				auto Comparator = [](const MachineBasicBlock &X, const MachineBasicBlock &Y) {
				return X.getSectionID().Type < Y.getSectionID().Type;
				};
				llvm::sortBasicBlocksAndUpdateBranches(MF, Comparator);

				return true;
				}
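
The effect of the comparator passed to `sortBasicBlocksAndUpdateBranches` above can be sketched on a plain vector. This is a minimal illustration, not the LLVM implementation: `Block`, `SectionType`, and `layoutAfterSplit` are hypothetical names, the enumerator order (Default, Exception, Cold) is assumed, and the sketch uses `std::stable_sort` to make explicit the assumption that blocks with the same section type keep their original relative order.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for MBBSectionID::SectionType; order is assumed.
enum class SectionType { Default = 0, Exception = 1, Cold = 2 };

// Hypothetical stand-in for a machine basic block after renumbering.
struct Block {
  int Number;       // position in the function before splitting
  SectionType Type; // section assigned by the splitter
};

// Returns the block numbers in layout order after splitting.
std::vector<int> layoutAfterSplit(std::vector<Block> Blocks) {
  // Compare by section type only; stability keeps the hot blocks (and the
  // cold blocks) in the order chosen by earlier passes.
  std::stable_sort(Blocks.begin(), Blocks.end(),
                   [](const Block &X, const Block &Y) {
                     return X.Type < Y.Type;
                   });
  std::vector<int> Order;
  for (const Block &B : Blocks)
    Order.push_back(B.Number);
  return Order;
}
```

For a function whose blocks 1 and 3 are marked cold, the hot blocks 0, 2, 4 stay in their original order and the cold blocks follow as a contiguous tail.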

				void MachineFunctionSplitter::getAnalysisUsage(AnalysisUsage &AU) const {
				  AU.addRequired<MachineModuleInfoWrapperPass>();
				  AU.addRequired<MachineBlockFrequencyInfo>();
				  AU.addRequired<ProfileSummaryInfoWrapperPass>();
				}

				char MachineFunctionSplitter::ID = 0;
				INITIALIZE_PASS(MachineFunctionSplitter, "machine-function-splitter",
				                "Split machine functions using profile information", false,
				                false)

				MachineFunctionPass *llvm::createMachineFunctionSplitterPass() {
				  return new MachineFunctionSplitter();
				}
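
The tail of `runOnMachineFunction` above tags cold blocks and then reorders the function so that blocks are grouped by section type. What follows is a minimal standalone sketch of that idea, not the pass itself. `Block`, `SectionType`, and `splitBlocks` are hypothetical stand-ins for `MachineBasicBlock`, `MBBSectionID`, and the pass body. It is also an assumption of this sketch that the reordering keeps the relative order of blocks within each section; the branch fix-up that `sortBasicBlocksAndUpdateBranches` performs is left out entirely.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for the section type; Default orders before Cold.
enum class SectionType { Default = 0, Cold = 1 };

// Hypothetical stand-in for MachineBasicBlock.
struct Block {
  std::string Name;
  bool HasNoPreds;     // plays the role of MBB.pred_empty()
  bool IsEHPad;        // plays the role of MBB.isEHPad()
  bool IsCold;         // plays the role of isColdBlock(MBB, MBFI, PSI)
  SectionType Section; // plays the role of the block's section ID
};

// Tag cold blocks, then group blocks by section type.
void splitBlocks(std::vector<Block> &Blocks) {
  for (Block &B : Blocks) {
    // Blocks without predecessors and landing pads stay where they are,
    // mirroring the conservative check in the diff.
    if (B.HasNoPreds || B.IsEHPad)
      continue;
    if (B.IsCold)
      B.Section = SectionType::Cold;
  }
  // Assumption: a stable sort, so order within a section is preserved.
  std::stable_sort(Blocks.begin(), Blocks.end(),
                   [](const Block &X, const Block &Y) {
                     return X.Section < Y.Section;
                   });
}
```

With blocks `entry, cold1, hot, lpad, cold2` (where `lpad` is a cold landing pad), the sketch leaves `entry, hot, lpad` in the default section and moves `cold1, cold2` to the end.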

llvm/lib/CodeGen/TargetPassConfig.cpp

[206 lines not shown]
     StopAfterOpt(StringRef(StopAfterOptName),
                  cl::desc("Stop compilation after a specific pass"),
                  cl::value_desc("pass-name"), cl::init(""), cl::Hidden);

 static cl::opt<std::string>
     StopBeforeOpt(StringRef(StopBeforeOptName),
                   cl::desc("Stop compilation before a specific pass"),
                   cl::value_desc("pass-name"), cl::init(""), cl::Hidden);

+/// Enable the machine function splitter pass.
+static cl::opt<bool> EnableMachineFunctionSplitter(
		hiraditya (Done): Remove this FIXME. Most passes continue to have cl::opt anyways.
+    "enable-split-machine-functions", cl::Hidden,
+    cl::desc("Split out cold blocks from machine functions based on profile "
		tmsriram (Not Done): Why not call this split-machine-functions too for consistency?
		snehasish (author, Done): We can't register two options with the same string, i.e. "split-machine-functions".
		efriedma (Not Done): Why do we need two options to control the same thing?
		snehasish (author, Done): In this patch we added two options:
		  1. An option in llvm/lib/CodeGen/CommandFlags.cpp, "split-machine-functions", so that llc can be used to invoke it in the tests.
		  2. A temporary option in llvm/lib/CodeGen/TargetPassConfig.cpp so that it can be invoked when running with clang or lld (for LTO).
		  AFAICT we can't use (2) for tests, and having (1) makes it easy to compile things without an intermediate llc step. We plan on removing (2) in a future patch which will add appropriate options to clang (-fsplit-machine-functions) and lld (--lto-split-machine-functions).
		snehasish (author, Done): Correction: We can use (2) for tests by passing "-enable-split-machine-functions" to llc; however, since we plan to introduce clang and lld flags in the near future, it seems cleaner to leave the llc flag in place and just remove (2) when that happens rather than reintroduce it. WDYT?
		efriedma (Not Done): clang doesn't call RegisterCodeGenFlags? That seems like something we should consider changing.
		snehasish (author, Done): The clang driver does not register the codegen flags; the only clang tool which does is clang-fuzzer. A small patch like the one below would do the trick for basic functionality. More plumbing might be needed to print the appropriate flags from the driver. I think this is probably worth more discussion and beyond the scope of this patch.
		  diff --git a/clang/tools/driver/cc1as_main.cpp b/clang/tools/driver/cc1as_main.cpp
		  index 87047be3c2b..0b9b5673d3e 100644
		  --- a/clang/tools/driver/cc1as_main.cpp
		  +++ b/clang/tools/driver/cc1as_main.cpp
		  @@ -21,6 +21,7 @@
		   #include "llvm/ADT/STLExtras.h"
		   #include "llvm/ADT/StringSwitch.h"
		   #include "llvm/ADT/Triple.h"
		  +#include "llvm/CodeGen/CommandFlags.h"
		   #include "llvm/IR/DataLayout.h"
		   #include "llvm/MC/MCAsmBackend.h"
		   #include "llvm/MC/MCAsmInfo.h"
		  @@ -61,6 +62,8 @@ using namespace clang::driver::options;
		   using namespace llvm;
		   using namespace llvm::opt;

		  +static codegen::RegisterCodeGenFlags CGF;
		  +
		   namespace {

		   /// Helper class for representing a single invocation of the assembler.
+             "information."));
+
 /// Allow standard passes to be disabled by command line options. This supports
 /// simple binary flags that either suppress the pass or do nothing.
 /// i.e. -disable-mypass=false has no effect.
 /// These should be converted to boolOrDefault in order to use applyOverride.
 static IdentifyingPassPtr applyDisable(IdentifyingPassPtr PassID,
                                        bool Override) {
   if (Override)
     return IdentifyingPassPtr();
[786 lines not shown]
   if (TM->Options.EnableMachineOutliner && getOptLevel() != CodeGenOpt::None &&
       EnableMachineOutliner != NeverOutline) {
     bool RunOnAllFunctions = (EnableMachineOutliner == AlwaysOutline);
     bool AddOutliner = RunOnAllFunctions ||
                        TM->Options.SupportsDefaultOutlining;
     if (AddOutliner)
       addPass(createMachineOutlinerPass(RunOnAllFunctions));
   }

-  if (TM->getBBSectionsType() != llvm::BasicBlockSection::None)
+  // Machine function splitter uses the basic block sections feature. Both
+  // cannot be enabled at the same time.
+  if (TM->Options.EnableMachineFunctionSplitter ||
+      EnableMachineFunctionSplitter) {
+    addPass(createMachineFunctionSplitterPass());
+  } else if (TM->getBBSectionsType() != llvm::BasicBlockSection::None) {
     addPass(llvm::createBasicBlockSectionsPass(TM->getBBSectionsFuncListBuf()));
+  }

   // Add passes that directly emit MI after all other MI passes.
   addPreEmitPass2();

   AddingMachinePasses = false;
 }

 /// Add passes that optimize machine instructions in SSA form.
[254 lines not shown]
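
The hunk above makes the splitter and the basic block sections pass mutually exclusive, with the splitter taking precedence when either the target option or the hidden `cl::opt` is set. The decision can be sketched in isolation as below. `Config`, `LatePass`, and `selectLatePass` are hypothetical names introduced only for illustration; they are not LLVM APIs, and the sketch models only the three conditions visible in the diff.

```cpp
// Hypothetical mirror of the three inputs consulted in the hunk.
struct Config {
  bool TargetOptionSplit;   // TM->Options.EnableMachineFunctionSplitter
  bool HiddenFlagSplit;     // the -enable-split-machine-functions cl::opt
  bool BBSectionsRequested; // TM->getBBSectionsType() != BasicBlockSection::None
};

// Which of the two passes, if any, would be added to the pipeline.
enum class LatePass { None, MachineFunctionSplitter, BasicBlockSections };

LatePass selectLatePass(const Config &C) {
  // Either switch enables the splitter, and it wins over basic block sections.
  if (C.TargetOptionSplit || C.HiddenFlagSplit)
    return LatePass::MachineFunctionSplitter;
  if (C.BBSectionsRequested)
    return LatePass::BasicBlockSections;
  return LatePass::None;
}
```

One consequence worth noting: a request for basic block sections is silently superseded whenever the splitter is enabled, since the `else if` branch is never reached in that case.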

llvm/test/CodeGen/X86/machine-function-splitter.ll

This file was added.

				; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -split-machine-functions | FileCheck %s -check-prefix=MFS-DEFAULTS
				; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -split-machine-functions -mfs-count-threshold=2000 | FileCheck %s --dump-input=always -check-prefix=MFS-OPTS1
				; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -split-machine-functions -mfs-psi-cutoff=950000 | FileCheck %s -check-prefix=MFS-OPTS2

				define void @foo1(i1 zeroext %0) nounwind !prof !14 !section_prefix !15 {
				;; Check that cold block is moved to .text.unlikely.
				; MFS-DEFAULTS-LABEL: foo1
				; MFS-DEFAULTS: .section .text.unlikely.foo1
				; MFS-DEFAULTS-NEXT: foo1.cold:
				tmsriram (Done): Also, check that the block moved to the cold region has the expected call instruction?
				; MFS-DEFAULTS-NOT: callq bar
				; MFS-DEFAULTS-NEXT: callq baz
				br i1 %0, label %2, label %4, !prof !17

				2: ; preds = %1
				%3 = call i32 @bar()
				br label %6

				4: ; preds = %1
				%5 = call i32 @baz()
				br label %6

				6: ; preds = %4, %2
				%7 = tail call i32 @qux()
				ret void
				}

				define void @foo2(i1 zeroext %0) nounwind !prof !23 !section_prefix !16 {
				;; Check that function marked unlikely is not split.
				; MFS-DEFAULTS-LABEL: foo2
				; MFS-DEFAULTS-NOT: foo2.cold:
				br i1 %0, label %2, label %4, !prof !17

				2: ; preds = %1
				%3 = call i32 @bar()
				br label %6

				4: ; preds = %1
				%5 = call i32 @baz()
				br label %6

				6: ; preds = %4, %2
				%7 = tail call i32 @qux()
				ret void
				}

				define void @foo3(i1 zeroext %0) nounwind !section_prefix !15 {
				;; Check that function without profile data is not split.
				; MFS-DEFAULTS-LABEL: foo3
				; MFS-DEFAULTS-NOT: foo3.cold:
				br i1 %0, label %2, label %4

				2: ; preds = %1
				%3 = call i32 @bar()
				br label %6

				4: ; preds = %1
				%5 = call i32 @baz()
				br label %6

				6: ; preds = %4, %2
				%7 = tail call i32 @qux()
				ret void
				}

				define void @foo4(i1 zeroext %0, i1 zeroext %1) nounwind !prof !20 {
				;; Check that count threshold works.
				; MFS-OPTS1-LABEL: foo4
				; MFS-OPTS1: .section .text.unlikely.foo4
				; MFS-OPTS1-NEXT: foo4.cold:
				; MFS-OPTS1-NOT: callq bar
				; MFS-OPTS1-NOT: callq baz
				; MFS-OPTS1-NEXT: callq bam
				br i1 %0, label %3, label %7, !prof !18

				3:
				%4 = call i32 @bar()
				br label %7

				5:
				%6 = call i32 @baz()
				br label %7

				7:
				br i1 %1, label %8, label %10, !prof !19

				8:
				%9 = call i32 @bam()
				br label %12

				10:
				%11 = call i32 @baz()
				br label %12

				12:
				%13 = tail call i32 @qux()
				ret void
				}

				define void @foo5(i1 zeroext %0, i1 zeroext %1) nounwind !prof !20 {
				;; Check that profile summary info cutoff works.
				; MFS-OPTS2-LABEL: foo5
				; MFS-OPTS2: .section .text.unlikely.foo5
				; MFS-OPTS2-NEXT: foo5.cold:
				; MFS-OPTS2-NOT: callq bar
				; MFS-OPTS2-NOT: callq baz
				; MFS-OPTS2-NEXT: callq bam
				br i1 %0, label %3, label %7, !prof !21

				3:
				%4 = call i32 @bar()
				br label %7

				5:
				%6 = call i32 @baz()
				br label %7

				7:
				br i1 %1, label %8, label %10, !prof !22

				8:
				%9 = call i32 @bam()
				br label %12

				10:
				%11 = call i32 @baz()
				br label %12

				12:
				%13 = call i32 @qux()
				ret void
				}

				define void @foo6(i1 zeroext %0) nounwind section "nosplit" !prof !14 {
				;; Check that function with section attribute is not split.
				; MFS-DEFAULTS-LABEL: foo6
				; MFS-DEFAULTS-NOT: foo6.cold:
				br i1 %0, label %2, label %4, !prof !17

				2: ; preds = %1
				%3 = call i32 @bar()
				br label %6

				4: ; preds = %1
				%5 = call i32 @baz()
				br label %6

				6: ; preds = %4, %2
				%7 = tail call i32 @qux()
				ret void
				}

				define i32 @foo7(i1 zeroext %0) personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) !prof !14 {
				;; Check that cold ehpads are not split out.
				; MFS-DEFAULTS-LABEL: foo7
				; MFS-DEFAULTS: .section .text.unlikely.foo7,"ax",@progbits
				; MFS-DEFAULTS-NEXT: foo7.cold:
				; MFS-DEFAULTS-NOT: callq _Unwind_Resume
				; MFS-DEFAULTS: callq baz
				entry:
				invoke void @_Z1fv()
				to label %try.cont unwind label %lpad

				lpad:
				%1 = landingpad { i8*, i32 }
				cleanup
				catch i8* bitcast (i8** @_ZTIi to i8*)
				resume { i8*, i32 } %1

				try.cont:
				br i1 %0, label %2, label %4, !prof !17

				2: ; preds = try.cont
				%3 = call i32 @bar()
				br label %6

				4: ; preds = %1
				%5 = call i32 @baz()
				br label %6

				6: ; preds = %4, %2
				%7 = tail call i32 @qux()
				ret i32 %7
				}

				declare i32 @bar()
				declare i32 @baz()
				declare i32 @bam()
				declare i32 @qux()
				declare void @_Z1fv()
				declare i32 @__gxx_personality_v0(...)

				@_ZTIi = external constant i8*

				!llvm.module.flags = !{!0}
				!0 = !{i32 1, !"ProfileSummary", !1}
				!1 = !{!2, !3, !4, !5, !6, !7, !8, !9}
				!2 = !{!"ProfileFormat", !"InstrProf"}
				!3 = !{!"TotalCount", i64 10000}
				!4 = !{!"MaxCount", i64 10}
				!5 = !{!"MaxInternalCount", i64 1}
				!6 = !{!"MaxFunctionCount", i64 1000}
				!7 = !{!"NumCounts", i64 3}
				!8 = !{!"NumFunctions", i64 5}
				!9 = !{!"DetailedSummary", !10}
				!10 = !{!11, !12, !13}
				!11 = !{i32 10000, i64 100, i32 1}
				!12 = !{i32 999900, i64 100, i32 1}
				!13 = !{i32 999999, i64 1, i32 2}
				!14 = !{!"function_entry_count", i64 7000}
				!15 = !{!"function_section_prefix", !".hot"}
				!16 = !{!"function_section_prefix", !".unlikely"}
				!17 = !{!"branch_weights", i32 7000, i32 0}
				!18 = !{!"branch_weights", i32 3000, i32 4000}
				!19 = !{!"branch_weights", i32 1000, i32 6000}
				!20 = !{!"function_entry_count", i64 10000}
				!21 = !{!"branch_weights", i32 6000, i32 4000}
				!22 = !{!"branch_weights", i32 80, i32 9920}
				!23 = !{!"function_entry_count", i64 7}
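
To see why the `MFS-OPTS1` checks expect only the `bam` block of `@foo4` in the cold section, it helps to estimate block counts from the metadata. The sketch below assumes that a block's count is its predecessor's count scaled by the branch weight ratio, and that a block is treated as cold when its count falls below the value given to `-mfs-count-threshold`. Both are simplifying assumptions for illustration: the pass consults `MachineBlockFrequencyInfo` and `ProfileSummaryInfo`, whose values may be rounded differently. `estimateCount` and `isBelowThreshold` are hypothetical helpers, not part of the patch.

```cpp
#include <cstdint>

// Assumed model: count(successor) = count(predecessor) * weight / total weight.
std::uint64_t estimateCount(std::uint64_t PredCount, std::uint64_t Weight,
                            std::uint64_t TotalWeight) {
  return TotalWeight == 0 ? 0 : PredCount * Weight / TotalWeight;
}

// Assumed coldness rule for the count-threshold mode.
bool isBelowThreshold(std::uint64_t Count, std::uint64_t Threshold) {
  return Count < Threshold;
}
```

For `@foo4`, the entry count is 10000 (`!20`), the first branch carries weights 3000/4000 (`!18`), and the second carries 1000/6000 (`!19`). Under the model above, the `bar` block gets about 10000 * 3000 / 7000 = 4285, the `bam` block about 10000 * 1000 / 7000 = 1428, and the second `baz` block about 8571. With a threshold of 2000 only the `bam` block falls below it, which is consistent with the `MFS-OPTS1` checks.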

Revision Contents

Diff 288652
