
Details

Reviewers

davidxl
danielcdh
wenlei

Commits

rGaa2ddfc73d6e: [SampleFDO] For functions without profiles, provide an option to put them in a…

Summary

For SampleFDO, the optimized build uses a profile generated from a previous release, so previously we could not tell whether a function without a profile was truly cold or just newly created. We therefore had to treat such functions conservatively and put them in the .text section instead of .text.unlikely. As a result, when we pursued the best performance by locking .text.hot and .text in memory, we wasted a lot of memory keeping cold functions resident.

In https://reviews.llvm.org/D66374, we introduced the profile symbol list to distinguish cold functions from newly added ones. This mechanism works quite well for regular use cases in AutoFDO. However, in some cases we can only have a partial profile when optimizing a target. The partial profile may be an aggregated profile collected from many targets. The profile symbol list method used for regular SampleFDO profiles is not applicable to the partial profile use case because the list may be too large and introduce many false positives.

To solve the problem for the partial profile use case, I want to resurrect this patch. In this patch, we provide an option called --profile-unknown-in-special-section. Functions without a profile are still treated conservatively in compiler optimizations; for example, the inliner treats them as warm instead of cold. When we use profile info to add a section prefix for functions, we distinguish functions known to be not cold from functions without a profile (unknown), and we put the unknown functions in a special text section called .text.unknown. The runtime system then has the flexibility to decide where to put the special section in order to balance performance against memory savings.
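The section-prefix decision described in the summary can be sketched as standalone C++. This is only an illustration under stated assumptions: FunctionProfile and pickSectionPrefix are hypothetical names that do not exist in LLVM, and the hot/cold bits stand in for the ProfileSummaryInfo queries used by the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the per-function facts the compiler consults.
struct FunctionProfile {
  bool HotInCallGraph = false;  // function contains hot code
  bool ColdInCallGraph = false; // function contains only cold code
  bool HasEntryCount = false;   // false: no profile exists for the function
};

// Returns the section prefix to attach, or "" to leave the function in .text.
// UnknownInSpecialSection models --profile-unknown-in-special-section and
// HasPartialSampleProfile models "the input is a partial sample profile".
std::string pickSectionPrefix(const FunctionProfile &F,
                              bool UnknownInSpecialSection,
                              bool HasPartialSampleProfile) {
  assert(!(F.HotInCallGraph && F.ColdInCallGraph) &&
         "a function cannot be both hot and cold");
  if (F.HotInCallGraph)
    return ".hot";
  if (F.ColdInCallGraph)
    return ".unlikely";
  // Only functions with no profile at all go to the special section, and
  // only when both the option and the partial-profile condition hold.
  if (UnknownInSpecialSection && HasPartialSampleProfile && !F.HasEntryCount)
    return ".unknown";
  return "";
}
```

With the option off, or with a profile that is not partial, a function without a profile keeps the empty prefix and stays in .text, which is the conservative behavior the summary describes.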

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

wmi created this revision. May 28 2019, 11:41 AM

Herald added a project: Restricted Project. May 28 2019, 11:41 AM

Herald added subscribers: kristof.beyls, eraman, javed.absar.

I don't object to this functionality, but I wonder if this problem would be better solved in some other way. For example, why can't the sampling (post-)processing record the list of (MD5 hashes of) function names in the line table (or symbol table or whatever is relevant). If you have the list, then we don't need to guess whether a function missing from the profile is very cold or hadn't been added yet. I realize that this list might be large, but some kind of on-disk Bloom filter might work well (the Bloom filter will have some small possibility of false positives, but if rarely a cold function is treated as warm, that might be an acceptable tradeoff).
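The Bloom filter idea in the comment above can be sketched as follows. This is a minimal in-memory illustration, not a proposal for the profile format: SymbolBloomFilter is a hypothetical class, and seeded FNV-1a hashing is used here as a stand-in for the MD5 hashes mentioned in the comment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical Bloom filter over function names.
class SymbolBloomFilter {
public:
  SymbolBloomFilter(std::size_t NumBits, unsigned NumHashes)
      : Bits(NumBits, false), NumHashes(NumHashes) {
    assert(NumBits > 0 && NumHashes > 0 && "filter must be non-empty");
  }

  // Records a function name that existed when the profile was collected.
  void insert(const std::string &Name) {
    for (unsigned I = 0; I < NumHashes; ++I)
      Bits[index(Name, I)] = true;
  }

  // false: the name was definitely never inserted.
  // true:  the name was probably inserted (false positives are possible).
  bool mayContain(const std::string &Name) const {
    for (unsigned I = 0; I < NumHashes; ++I)
      if (!Bits[index(Name, I)])
        return false;
    return true;
  }

private:
  // 64-bit FNV-1a, perturbed by a seed so two hash values can be derived.
  static std::uint64_t fnv1a(const std::string &S, std::uint64_t Seed) {
    std::uint64_t H = 14695981039346656037ULL ^ Seed;
    for (unsigned char C : S) {
      H ^= C;
      H *= 1099511628211ULL;
    }
    return H;
  }

  // Double hashing: the I-th probe position is H1 + I * H2.
  std::size_t index(const std::string &Name, unsigned I) const {
    std::uint64_t H1 = fnv1a(Name, 0);
    std::uint64_t H2 = fnv1a(Name, 0x9e3779b97f4a7c15ULL) | 1;
    return static_cast<std::size_t>((H1 + I * H2) % Bits.size());
  }

  std::vector<bool> Bits;
  unsigned NumHashes;
};
```

The filter never reports an inserted name as absent, so the only error it can make is reporting a name as present when it was never inserted. How often that happens depends on the number of bits and hash probes relative to the number of symbols.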

Longer term, the plan is to put a white list of the symbols into the profile data so that the compiler can decide if a function is newly created or simply cold.

Wei, if that is in place, is there a need for this patch?

Longer term, the plan is to put a white list of the symbols into the profile data so that the compiler can decide if a function is newly created or simply cold.

Wei, if that is in place, is there a need for this patch?

Yes, that is the longer term plan, and it should bring the extra benefit that compiler optimizations can make use of the additional information.

We use the current patch as a temporary solution because we saw some urgent demand to save a lot of memory from it. It won't be needed if the longer term plan is in place.

what is the main blocker for the longer term solution?

In D62540#1520087, @davidxl wrote:

what is the main blocker for the longer term solution?

As Hal mentioned, the symbol list may be large since it includes all the symbols. The current autofdo profile only includes symbols that are hot/warm, so the whole symbol list would be much larger than the symbol section in the current autofdo profile. I haven't looked at whether a Bloom filter can help here. If we use md5, we will have a problem with profile remapping (https://llvm.org/docs/CommandGuide/llvm-profdata.html#cmdoption-llvm-profdata-merge-remapping-file).

I don't see other blockers.

Why is symbol remapping an issue? old MD5 --> input symbol name -> output symbol name --> mapped MD5sum ?

In D62540#1520129, @davidxl wrote:

Why is symbol remapping an issue? old MD5 --> input symbol name -> output symbol name --> mapped MD5sum ?

We want to use profile remapping support to find the counterparts in the profile for symbols that were renamed, for example when the optimized build follows a large scale codebase refactoring. The profile remapping support doesn't work well if the symbols in the profile are md5. If the symbol whitelist contains md5, then after refactoring, profile remapping will not find the old, renamed symbols in the whitelist and will think they are all new symbols.

Existing profile remapping doesn't support comparing the old MD5 with the mapped MD5. Profile remapping is based on C++ mangling, so only symbol names are supported during remapping.
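The difference between a name-based and a hash-based symbol list can be illustrated with a small sketch. Everything here is a simplification and an assumption: real profile remapping works on C++ mangled names, whereas canonicalize below is a hypothetical helper that applies a single textual rename rule, and hashName uses 64-bit FNV-1a as a stand-in for MD5.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>

// Stand-in for MD5: 64-bit FNV-1a of the symbol name.
std::uint64_t hashName(const std::string &Name) {
  std::uint64_t H = 14695981039346656037ULL;
  for (unsigned char C : Name) {
    H ^= C;
    H *= 1099511628211ULL;
  }
  return H;
}

// Hypothetical remapping rule: rewrite the first occurrence of From to To.
std::string canonicalize(const std::string &Name, const std::string &From,
                         const std::string &To) {
  assert(!From.empty() && "rename rule needs a non-empty source");
  std::string Out = Name;
  std::size_t Pos = Out.find(From);
  if (Pos != std::string::npos)
    Out.replace(Pos, From.size(), To);
  return Out;
}

// Name-based list: both sides can be canonicalized before comparing, so a
// renamed symbol is still recognized as an existing one.
bool nameListContains(const std::set<std::string> &OldNames,
                      const std::string &Query, const std::string &From,
                      const std::string &To) {
  const std::string Want = canonicalize(Query, From, To);
  for (const std::string &Old : OldNames)
    if (canonicalize(Old, From, To) == Want)
      return true;
  return false;
}

// Hash-based list: the stored hashes cannot be canonicalized because the
// original names are gone, so only an exact-name lookup is possible.
bool hashListContains(const std::set<std::uint64_t> &OldHashes,
                      const std::string &Query) {
  return OldHashes.count(hashName(Query)) != 0;
}
```

With a list built from old::f and a rule renaming old:: to new::, the name-based lookup recognizes new::f, while the hash-based lookup only matches the exact old spelling and therefore classifies new::f as a new symbol.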

A new version of the patch.

Herald added a subscriber: hiraditya. May 4 2020, 5:46 PM

I updated the patch summary describing the motivation to resurrect the patch.

davidxl added inline comments. May 5 2020, 10:38 AM

llvm/include/llvm/Analysis/ProfileSummaryInfo.h
118	--> isFunctionHotnessUnknown (or isFunctionProfileUnknown) seems better.
llvm/lib/Analysis/ProfileSummaryInfo.cpp
200	assert (F && hasSampleProfile()) ?
llvm/lib/CodeGen/CodeGenPrepare.cpp
462	is this used?

wmi marked 3 inline comments as done. May 5 2020, 4:11 PM

wmi added inline comments.

llvm/include/llvm/Analysis/ProfileSummaryInfo.h
118	Ok, changed.
llvm/lib/Analysis/ProfileSummaryInfo.cpp
200	Added.
llvm/lib/CodeGen/CodeGenPrepare.cpp
462	Removed.

Address David's comment.

Now that the use case is for partial profile, I think an umbrella switch to tell whether the input profile is a partial profile would be helpful. That'd be complementary to the partial flag in the profile itself, because whether a profile is partial actually depends on the use case too, e.g. a profile from service A would be a partial profile for service B, but not partial for A itself.

We could then tie all optimization tweaks for partial profile to that switch (and the partial flag from profile itself), which makes it easier to use. If we want the flexibility for individual tuning, we could have specific flags like this one as narrower override for the umbrella switch. What do you think?
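One way to read this proposal is sketched below. The names are hypothetical (PartialProfileConfig, Override, useUnknownSection), and the semantics of the narrower override are an assumption of this sketch: when set, it wins over the umbrella switch.

```cpp
#include <cassert>

// Tri-state for a per-tweak flag: not given, forced off, or forced on.
enum class Override { Unset, Off, On };

// Hypothetical configuration mirroring the proposal in the comment above.
struct PartialProfileConfig {
  bool UmbrellaSwitch = false;     // user says: treat the input as partial
  bool ProfileSaysPartial = false; // partial flag recorded in the profile
  Override UnknownSection = Override::Unset; // narrower per-tweak override
};

// The profile counts as partial if either source says so.
bool isPartialProfile(const PartialProfileConfig &C) {
  return C.UmbrellaSwitch || C.ProfileSaysPartial;
}

// The tweak follows the umbrella decision unless the narrower flag is set.
bool useUnknownSection(const PartialProfileConfig &C) {
  switch (C.UnknownSection) {
  case Override::On:
    return true;
  case Override::Off:
    return false;
  case Override::Unset:
    return isPartialProfile(C);
  }
  assert(false && "unhandled Override value");
  return false;
}
```

The patch as landed is stricter than this sketch: the .unknown prefix is applied only when both -profile-unknown-in-special-section and the partial-profile condition hold, and the specific flag defaults to off.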

In D62540#2021848, @wenlei wrote:

Now that the use case is for partial profile, I think an umbrella switch to tell whether the input profile is a partial profile would be helpful. That'd be complementary to the partial flag in the profile itself, because whether a profile is partial actually depends on the use case too, e.g. a profile from service A would be a partial profile for service B, but not partial for A itself.

We could then tie all optimization tweaks for partial profile to that switch (and the partial flag from profile itself), which makes it easier to use. If we want the flexibility for individual tuning, we could have specific flags like this one as narrower override for the umbrella switch. What do you think?

That makes sense. I updated the patch to add such a flag.

Address Wenlei's comment.

wenlei added inline comments. May 6 2020, 12:31 PM

llvm/lib/CodeGen/CodeGenPrepare.cpp
181	Set the default to true here so `partial-profile` alone can enable all related tweaks?

change the test.

davidxl added inline comments. May 6 2020, 12:50 PM

llvm/lib/CodeGen/CodeGenPrepare.cpp
181	This should default to be true since the partial profile is also used as a guard -- but perhaps as a follow up after more testing.

LGTM, thanks.

llvm/lib/CodeGen/CodeGenPrepare.cpp
181	Ok, sounds good.

This revision is now accepted and ready to land. May 6 2020, 3:55 PM

wmi marked an inline comment as done. May 6 2020, 4:26 PM

wmi added inline comments.

llvm/lib/CodeGen/CodeGenPrepare.cpp
181	Yes, that is the plan. We need some extra work in linker and in runtime to make the unknown section work as we expect, and we need to make sure even if there is no support in runtime, the special section should behave just like .text section. Before that is done and tested, we want to keep it off by default.

lgtm

Thanks for the review. I am going to wait a little bit and let the linker change (https://reviews.llvm.org/D79590) go in first in case any change to the name of the output section is needed.

Closed by commit rGaa2ddfc73d6e: [SampleFDO] For functions without profiles, provide an option to put them in a… (authored by wmi). May 8 2020, 11:48 AM

This revision was automatically updated to reflect the committed changes.

Diff 262924

llvm/include/llvm/Analysis/ProfileSummaryInfo.h

[... 66 lines not shown; context: public: ...]
   bool hasProfileSummary() { return computeSummary(); }

   /// Returns true if module \c M has sample profile.
   bool hasSampleProfile() {
     return hasProfileSummary() &&
            Summary->getKind() == ProfileSummary::PSK_Sample;
   }

-  /// Returns true if module \c M has partial-profile sample profile.
-  bool hasPartialSampleProfile() {
-    return hasProfileSummary() &&
-           Summary->getKind() == ProfileSummary::PSK_Sample &&
-           Summary->isPartialProfile();
-  }

   /// Returns true if module \c M has instrumentation profile.
   bool hasInstrumentationProfile() {
     return hasProfileSummary() &&
            Summary->getKind() == ProfileSummary::PSK_Instr;
   }

   /// Returns true if module \c M has context sensitive instrumentation profile.
   bool hasCSInstrumentationProfile() {
[... 11 lines not shown; context: bool invalidate(Module &, const PreservedAnalyses &, ...]
                   ModuleAnalysisManager::Invalidator &) {
     return false;
   }

   /// Returns the profile count for \p CallInst.
   Optional<uint64_t> getProfileCount(const CallBase &CallInst,
                                      BlockFrequencyInfo *BFI,
                                      bool AllowSynthetic = false);
+  /// Returns true if module \c M has partial-profile sample profile.
+  bool hasPartialSampleProfile();
   /// Returns true if the working set size of the code is considered huge.
   bool hasHugeWorkingSetSize();
   /// Returns true if the working set size of the code is considered large.
   bool hasLargeWorkingSetSize();
   /// Returns true if \p F has hot function entry.
   bool isFunctionEntryHot(const Function *F);
   /// Returns true if \p F contains hot code.
   bool isFunctionHotInCallGraph(const Function *F, BlockFrequencyInfo &BFI);
   /// Returns true if \p F has cold function entry.
   bool isFunctionEntryCold(const Function *F);
   /// Returns true if \p F contains only cold code.
   bool isFunctionColdInCallGraph(const Function *F, BlockFrequencyInfo &BFI);
+  /// Returns true if the hotness of \p F is unknown.
+  bool isFunctionHotnessUnknown(const Function &F);
   /// Returns true if \p F contains hot code with regard to a given hot
   /// percentile cutoff value.
   bool isFunctionHotInCallGraphNthPercentile(int PercentileCutoff,
                                              const Function *F,
                                              BlockFrequencyInfo &BFI);
   /// Returns true if \p F contains cold code with regard to a given cold
   /// percentile cutoff value.
   bool isFunctionColdInCallGraphNthPercentile(int PercentileCutoff,
                                               const Function *F,
[... 98 lines not shown ...]

llvm/lib/Analysis/ProfileSummaryInfo.cpp

[... 60 lines not shown; context: static cl::opt<int> ProfileSummaryHotCount( ...]
     cl::desc("A fixed hot count that overrides the count derived from"
              " profile-summary-cutoff-hot"));

 static cl::opt<int> ProfileSummaryColdCount(
     "profile-summary-cold-count", cl::ReallyHidden, cl::ZeroOrMore,
     cl::desc("A fixed cold count that overrides the count derived from"
              " profile-summary-cutoff-cold"));

+static cl::opt<bool> PartialProfile(
+    "partial-profile", cl::Hidden, cl::init(false),
+    cl::desc("Specify the current profile is used as a partial profile."));

 // Find the summary entry for a desired percentile of counts.
 static const ProfileSummaryEntry &getEntryForPercentile(SummaryEntryVector &DS,
                                                         uint64_t Percentile) {
   auto It = partition_point(DS, [=](const ProfileSummaryEntry &Entry) {
     return Entry.Cutoff < Percentile;
   });
   // The required percentile has to be <= one of the percentiles in the
   // detailed summary.
[... 110 lines not shown; context: if (!isColdCount(TotalCallCount)) ...]
       return false;
   }
   for (const auto &BB : *F)
     if (!isColdBlock(&BB, &BFI))
       return false;
   return true;
 }

+bool ProfileSummaryInfo::isFunctionHotnessUnknown(const Function &F) {
+  assert(hasPartialSampleProfile() && "Expect partial sample profile");
+  return !F.getEntryCount().hasValue();
+}

 template<bool isHot>
 bool ProfileSummaryInfo::isFunctionHotOrColdInCallGraphNthPercentile(
     int PercentileCutoff, const Function *F, BlockFrequencyInfo &BFI) {
   if (!F || !computeSummary())
     return false;
   if (auto FunctionCount = F->getEntryCount()) {
     if (isHot &&
         isHotCountNthPercentile(PercentileCutoff, FunctionCount.getCount()))
[... 191 lines not shown; context: bool ProfileSummaryInfo::isColdCallSite(const CallBase &CB, ...]
   if (C)
     return isColdCount(*C);

   // In SamplePGO, if the caller has been sampled, and there is no profile
   // annotated on the callsite, we consider the callsite as cold.
   return hasSampleProfile() && CB.getCaller()->hasProfileData();
 }

+bool ProfileSummaryInfo::hasPartialSampleProfile() {
+  return hasProfileSummary() &&
+         Summary->getKind() == ProfileSummary::PSK_Sample &&
+         (PartialProfile || Summary->isPartialProfile());
+}

 INITIALIZE_PASS(ProfileSummaryInfoWrapperPass, "profile-summary-info",
                 "Profile summary info", false, true)

 ProfileSummaryInfoWrapperPass::ProfileSummaryInfoWrapperPass()
     : ImmutablePass(ID) {
   initializeProfileSummaryInfoWrapperPassPass(*PassRegistry::getPassRegistry());
 }

[... 33 lines not shown ...]

llvm/lib/CodeGen/CodeGenPrepare.cpp

[... 171 lines not shown ...]
 static cl::opt<bool> DisablePreheaderProtect(
     "disable-preheader-prot", cl::Hidden, cl::init(false),
     cl::desc("Disable protection against removing loop preheaders"));

 static cl::opt<bool> ProfileGuidedSectionPrefix(
     "profile-guided-section-prefix", cl::Hidden, cl::init(true), cl::ZeroOrMore,
     cl::desc("Use profile info to add section prefix for hot/cold functions"));

+static cl::opt<bool> ProfileUnknownInSpecialSection(
+    "profile-unknown-in-special-section", cl::Hidden, cl::init(false),
+    cl::ZeroOrMore,
+    cl::desc("In profiling mode like sampleFDO, if a function doesn't have "
+             "profile, we cannot tell the function is cold for sure because "
+             "it may be a function newly added without ever being sampled. "
+             "With the flag enabled, compiler can put such profile unknown "
+             "functions into a special section, so runtime system can choose "
+             "to handle it in a different way than .text section, to save "
+             "RAM for example. "));

 static cl::opt<unsigned> FreqRatioToSkipMerge(
     "cgp-freq-ratio-to-skip-merge", cl::Hidden, cl::init(2),
     cl::desc("Skip merging empty blocks if (frequency of empty block) / "
              "(frequency of destination block) is greater than this ratio"));

 static cl::opt<bool> ForceSplitStore(
     "force-split-store", cl::Hidden, cl::init(false),
     cl::desc("Force store splitting no matter what the target query says."));
[... 255 lines not shown; context: bool CodeGenPrepare::runOnFunction(Function &F) { ...]
   TLInfo = &getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);
   TTI = &getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);
   LI = &getAnalysis<LoopInfoWrapperPass>().getLoopInfo();
   BPI.reset(new BranchProbabilityInfo(F, *LI));
   BFI.reset(new BlockFrequencyInfo(F, *BPI, *LI));
   PSI = &getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI();
   OptSize = F.hasOptSize();
   if (ProfileGuidedSectionPrefix) {
     if (PSI->isFunctionHotInCallGraph(&F, *BFI))
       F.setSectionPrefix(".hot");
     else if (PSI->isFunctionColdInCallGraph(&F, *BFI))
       F.setSectionPrefix(".unlikely");
+    else if (ProfileUnknownInSpecialSection && PSI->hasPartialSampleProfile() &&
+             PSI->isFunctionHotnessUnknown(F))
+      F.setSectionPrefix(".unknown");
   }

   /// This optimization identifies DIV instructions that can be
   /// profitably bypassed and carried out with a shorter, faster divide.
   if (!OptSize && !PSI->hasHugeWorkingSetSize() && TLI->isSlowDivBypassed()) {
     const DenseMap<unsigned int, unsigned int> &BypassWidths =
         TLI->getBypassSlowDivWidths();
     BasicBlock* BB = &*F.begin();
[... 7,260 lines not shown ...]

llvm/test/Transforms/SampleProfile/section-accurate-samplepgo.ll

 ; REQUIRES: x86-registered-target
 ; RUN: opt < %s -sample-profile -sample-profile-file=%S/Inputs/inline.prof -codegenprepare -S | FileCheck %s
+; RUN: opt < %s -sample-profile -sample-profile-file=%S/Inputs/inline.prof -codegenprepare -profile-unknown-in-special-section -partial-profile -S | FileCheck %s --check-prefix UNKNOWN
 ; RUN: opt < %s -sample-profile -sample-profile-file=%S/Inputs/inline.prof -codegenprepare -profile-sample-accurate -S | FileCheck %s --check-prefix ACCURATE

 target triple = "x86_64-pc-linux-gnu"

 ; The test checks that function without profile gets unlikely section prefix
 ; if -profile-sample-accurate is specified or the function has the
 ; profile-sample-accurate attribute.

 declare void @hot_func()

 ; CHECK-NOT: foo_not_in_profile{{.*}}!section_prefix
-; CHECK: foo_not_in_profile{{.*}}!prof ![[UNKNOWN_ID:[0-9]+]]
+; CHECK: foo_not_in_profile{{.*}}!prof ![[NOPROFILE_ID:[0-9]+]]
+; UNKNOWN: foo_not_in_profile{{.*}}!prof ![[NOPROFILE_ID:[0-9]+]] !section_prefix ![[UNKNOWN_ID:[0-9]+]]
 ; ACCURATE: foo_not_in_profile{{.*}}!prof ![[ZERO_ID:[0-9]+]] !section_prefix ![[COLD_ID:[0-9]+]]
 ; The function not appearing in profile is cold when -profile-sample-accurate
 ; is on.
 define void @foo_not_in_profile() {
   call void @hot_func()
   ret void
 }

 ; CHECK: bar_not_in_profile{{.*}}!prof ![[ZERO_ID:[0-9]+]] !section_prefix ![[COLD_ID:[0-9]+]]
 ; ACCURATE: bar_not_in_profile{{.*}}!prof ![[ZERO_ID:[0-9]+]] !section_prefix ![[COLD_ID:[0-9]+]]
 ; The function not appearing in profile is cold when the func has
 ; profile-sample-accurate attribute.
 define void @bar_not_in_profile() #0 {
   call void @hot_func()
   ret void
 }

 attributes #0 = { "profile-sample-accurate" }

-; CHECK: ![[UNKNOWN_ID]] = !{!"function_entry_count", i64 -1}
+; CHECK: ![[NOPROFILE_ID]] = !{!"function_entry_count", i64 -1}
 ; CHECK: ![[ZERO_ID]] = !{!"function_entry_count", i64 0}
 ; CHECK: ![[COLD_ID]] = !{!"function_section_prefix", !".unlikely"}
+; UNKNOWN: ![[NOPROFILE_ID]] = !{!"function_entry_count", i64 -1}
+; UNKNOWN: ![[UNKNOWN_ID]] = !{!"function_section_prefix", !".unknown"}
 ; ACCURATE: ![[ZERO_ID]] = !{!"function_entry_count", i64 0}
 ; ACCURATE: ![[COLD_ID]] = !{!"function_section_prefix", !".unlikely"}
 !llvm.module.flags = !{!1}
 !1 = !{i32 1, !"ProfileSummary", !2}
 !2 = !{!3, !4, !5, !6, !7, !8, !9, !10}
 !3 = !{!"ProfileFormat", !"SampleProfile"}
 !4 = !{!"TotalCount", i64 10000}
 !5 = !{!"MaxCount", i64 1000}
[... 9 lines not shown ...]

This is an archive of the discontinued LLVM Phabricator instance.

[SampleFDO] For functions without profiles, provide an option to put them in a special text section
ClosedPublic

