Details

Reviewers

Commits

rGc5f76f734744: Add prefix based function layout when profile is available.
rL261582: Add prefix based function layout when profile is available.

Summary

If a function is hot, put it in the .text.hot section.

Diff Detail

Event Timeline

danielcdh updated this revision to Diff 48540. Feb 19 2016, 11:55 AM

danielcdh retitled this revision to "Add prefix based function layout when profile is available."

danielcdh updated this object.

danielcdh added a reviewer: davidxl.

danielcdh added a subscriber: llvm-commits.

mcrosier added a subscriber: mcrosier. Feb 19 2016, 12:04 PM

davidxl added inline comments. Feb 19 2016, 12:23 PM

lib/CodeGen/TargetLoweringObjectFileImpl.cpp
303	Please introduce placeholder query APIs in include/llvm/ProfileData/ProfileCommon.h for this purpose. They can be static methods of the ProfileSummary class:
	bool ProfileSummary::isFunctionHot(Function *);
	bool ProfileSummary::isFunctionUnlikely(Function *);

	Using the existence of the entry profile count as the criterion is wrong: the entry count can be 0 or a very low value. The right tuning will need to look at the max block count of the function and the program summary, so I think we need to leave that part out and wait until Easwaran's tuning work is in.

	Before the summary-based work is in, implement a simpler heuristic similar to what the current inliner uses:
	a) check the 'hot' attribute of the function (in practice, this may not work well: after inlining, the out-of-line copy of the original hot function may become cold);
	b) check the 'cold' attribute of the function (this is more reliable);
	c) if the function has a zero bb count/sample, mark it as cold (the inliner's heuristic is 0.01 * maxFunctionCount, but I think we can be more conservative here by checking only the zero-count ones).

	In short, the initial heuristic should be conservative in the sense that we only filter out the functions we are really confident about, mainly cold ones that are never executed, which can go a long way.

	Define two helper functions to return the .hot and .unlikely suffixes instead of using hard-coded strings here.

	We can also introduce an internal option to turn this layout optimization on|off. By default it can be off. When it is off, the query functions above will simply return 'false'.
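
The conservative heuristic proposed in this comment can be sketched in isolation. What follows is a minimal, hypothetical model and not the patch's code: `FunctionInfo`, `isHot`, and `isUnlikely` are illustrative names, plain fields stand in for LLVM's function attributes, and the entry count stands in for the "zero bb count/sample" check. It assumes C++17 for `std::optional`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the facts the heuristic consults; the real patch
// queries llvm::Function instead.
struct FunctionInfo {
  bool HasHotAttr = false;
  bool HasColdAttr = false;
  std::optional<uint64_t> EntryCount; // empty: no profile for this function
};

// (a) Trust only an explicit 'hot' attribute; no count-based tuning yet.
inline bool isHot(const FunctionInfo &F, bool Enabled) {
  return Enabled && F.HasHotAttr;
}

// (b) An explicit 'cold' attribute, or (c) a recorded count of exactly zero.
// A missing count means "unknown", so it is deliberately not treated as cold.
inline bool isUnlikely(const FunctionInfo &F, bool Enabled) {
  if (!Enabled)
    return false; // option off: every query answers false
  if (F.HasColdAttr)
    return true;
  return F.EntryCount.has_value() && *F.EntryCount == 0;
}
```

The `Enabled` parameter models the internal on|off option: when it is false, both queries return false and the layout is unchanged.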

Integrate David's comments.

davidxl added inline comments. Feb 19 2016, 2:21 PM

include/llvm/ProfileData/ProfileCommon.h
73	Document this method -- what does it do?
lib/CodeGen/TargetLoweringObjectFileImpl.cpp
248	Move these two functions into ProfileSummary.h (as a standalone function in llvm:: namespace)
252	Better name it as: GroupFunctionsByHotness
255	I think the initial value should be false. To turn it on by default, more data is needed (and announced more widely).
lib/ProfileData/ProfileSummary.cpp
82	use FIXME
88	FIXME
99	FIXME

davidxl added a subscriber: eraman. Feb 19 2016, 2:21 PM

Integrate comments.

done

davidxl added inline comments. Feb 19 2016, 3:19 PM

lib/ProfileData/ProfileSummary.cpp
81	If this takes Function * as an argument, it can be a wrapper of getEntryCount call
92	I suggest changing this interface's name to 'isFunctionRarelyCalled'. It is different from 'isFunctionCold', which should check the internal block count. These two interfaces serve different purposes: the latter can be used to guide decisions such as optimizing for size, while the former can be used for layout decisions to reduce ITLB misses. Similarly, isFunctionHot does not mean the function is frequently called either, so that interface needs to be split up too.
96	the check of hasProfile is probably redundant.

eraman added inline comments. Feb 19 2016, 3:39 PM

include/llvm/ProfileData/ProfileCommon.h
33	These two functions should be elsewhere. Perhaps in TargetLowering ?
lib/ProfileData/ProfileSummary.cpp
92	This and isFunctionHot should be made member functions of Function.
96	I think what David says is true (!F->getEntryCount() implies !hasProfile()).

davidxl added inline comments. Feb 19 2016, 3:42 PM

include/llvm/ProfileData/ProfileCommon.h
33	I think it is fine to be here -- it is generic enough and it is profile related.
lib/ProfileData/ProfileSummary.cpp
92	I think it is better to keep all/most of the profile related query APIs here in this file.

eraman added inline comments. Feb 19 2016, 4:03 PM

include/llvm/ProfileData/ProfileCommon.h
33	The strings themselves are not profile related. One might want to use .unlikely for error handling routines even without profile data.
lib/ProfileData/ProfileSummary.cpp
92	Again, the function itself is not fully profile dependent. I think ProfileCommon should only have APIs like isCountConsideredHot(uint64_t Count).

davidxl added inline comments. Feb 19 2016, 4:13 PM

include/llvm/ProfileData/ProfileCommon.h
33	Aren't those related to profile too (it is profile information that can be determined statically)?
lib/ProfileData/ProfileSummary.cpp
92	You have a point about Function being not profile dependent, but I think we should not restrict the scope of ProfileCommon.h too much like this. It is reasonable for ProfileCommon to reference IR related headers.

update

lib/ProfileData/ProfileSummary.cpp
81	!getEntryCount() does not mean no profile for the binary, but no profile for the function. We need to have this function to indicate if this is an FDO/AFDO build.
92	After a second thought, I think it still makes sense to group "hot" functions together instead of "frequently called" ones. The hot path of an "infrequently called" hot function may contain many calls to a frequently called function, so if these two functions are far apart, an ITLB entry is wasted. Additionally, this would pollute the heat map and make performance tuning more difficult.
96	I've updated the logic, PTAL.

davidxl added inline comments. Feb 19 2016, 4:51 PM

include/llvm/ProfileData/ProfileCommon.h
74	If it is used to indicate profile availability at program level, it should take Module as argument.
lib/ProfileData/ProfileSummary.cpp
100	Should return false -- as we don't know (no profile data)
103	This will work well for instr profile but may produce false positive for sampleFDO -- this will be fixed when summary based computation is used.

danielcdh marked an inline comment as done. Feb 19 2016, 5:09 PM

danielcdh added inline comments.

lib/ProfileData/ProfileSummary.cpp
100	For AutoFDO, F->setEntryCount() is only called when there are samples in that function. So for 0-sampled functions, F->getEntryCount() will evaluate to false.
103	Yes, but we can always set the entry count to no less than 1 if the function has a profile. Otherwise we have no way to calculate bb counts.
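
One way to read these two replies: an absent entry count means "no profile for this function", while a present count is clamped so it is never zero. The sketch below illustrates that convention with a hypothetical helper, `entryCountFromSamples`; the name and parameters are assumptions for illustration and this is not the sample profile loader's actual code. It assumes C++17.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper: derive a function's entry count from sample data.
// TotalSamples is the number of samples attributed anywhere in the function;
// HeadSamples is the number attributed to its entry.
inline std::optional<uint64_t> entryCountFromSamples(uint64_t TotalSamples,
                                                     uint64_t HeadSamples) {
  if (TotalSamples == 0)
    return std::nullopt; // unsampled: leave the entry count unset
  // Sampled: never report zero, so block counts can still be derived.
  return std::max<uint64_t>(HeadSamples, 1);
}
```

Under this convention a zero-count check never classifies a sampled function as unlikely, and unsampled functions are treated as unknown rather than cold.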

update

Looks good in general.

Need a test case covering both the regular case and -ffunction-sections (-function-sections for llc).

Need a follow-up discussion about introducing a user-level option -freorder-function (turned on by default with profile) in LLVM. Also talk to the maintainers of the other main platforms (MachO, etc.) about adding this support there.

eraman added inline comments. Feb 21 2016, 1:43 PM

lib/ProfileData/ProfileSummary.cpp
83	return M->getMaximumFunctionCount().hasValue(); I know this doesn't work for sample profile, but it works for instrumentation profile and is better than returning false.
100	I think the sample profile loader pass should set the entry count to 0. The meaning of getEntryCount() should not depend on the profile format that is used. (To be clear, I am not saying that should be fixed as part of this patch.)

add test and remove hasProfile.

Added the unit test. Also removed hasProfile and used getEntryCount()'s boolean value for the check instead. I will have a follow-up patch to make getEntryCount() behave consistently between sample profile and instr profile.

davidxl added inline comments. Feb 22 2016, 1:31 PM

test/CodeGen/X86/partition-sections.ll
2	Can you also add a test without -function-sections option.

add more test

LGTM.

To repeat the follow ups needed:

tuning, performance number and turning on by default for PGO
support for other platforms (and consider user options if applicable).

This revision is now accepted and ready to land. Feb 22 2016, 2:01 PM

danielcdh closed this revision. Feb 22 2016, 2:18 PM

This is broken on OSX.

The test case is missing the -mtriple option.

Dehao, we should not leave buildbot failing for long period of time.
The buildbot message should arrive at your inbox very shortly after
the commit -- can you check if there is any issue there (e.g. in spam
filter)?

thanks,

David

Diff 48723

include/llvm/ProfileData/ProfileCommon.h

[15 lines not shown]
 #include <functional>
 #include <map>
 #include <vector>

 #ifndef LLVM_PROFILEDATA_PROFILE_COMMON_H
 #define LLVM_PROFILEDATA_PROFILE_COMMON_H

 namespace llvm {
+class Function;
+class Module;
 namespace IndexedInstrProf {
 struct Summary;
 }
 namespace sampleprof {
 class FunctionSamples;
 }
 struct InstrProfRecord;
+inline const char *getHotSectionPrefix() { return ".hot"; }
+inline const char *getUnlikelySectionPrefix() { return ".unlikely"; }
 // The profile summary is one or more (Cutoff, MinCount, NumCounts) triplets.
 // The semantics of counts depend on the type of profile. For instrumentation
 // profile, counts are block counts and for sample profile, counts are
 // per-line samples. Given a target counts percentile, we compute the minimum
 // number of counts needed to reach this target and the minimum among these
 // counts.
 struct ProfileSummaryEntry {
   uint32_t Cutoff; ///< The required percentile of counts.
[22 lines not shown; context: protected:]
   ProfileSummary(std::vector<ProfileSummaryEntry> DetailedSummary,
                  uint64_t TotalCount, uint64_t MaxCount, uint32_t NumCounts)
       : DetailedSummary(DetailedSummary), TotalCount(TotalCount),
         MaxCount(MaxCount), NumCounts(NumCounts) {}
   inline void addCount(uint64_t Count);

 public:
   static const int Scale = 1000000;
+  // \brief Returns true if F is a hot function.
+  static bool isFunctionHot(const Function *F);
+  // \brief Returns true if F is unlikley executed.
+  static bool isFunctionUnlikely(const Function *F);
   inline std::vector<ProfileSummaryEntry> &getDetailedSummary();
   void computeDetailedSummary();
   /// \brief A vector of useful cutoff values for detailed summary.
   static const std::vector<uint32_t> DefaultCutoffs;
 };

 class InstrProfSummary : public ProfileSummary {
   uint64_t MaxInternalBlockCount, MaxFunctionCount;
[57 lines not shown]
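
The header comment in this file describes each summary entry as a (Cutoff, MinCount, NumCounts) triplet. A standalone sketch of how such entries could be computed from a list of counts follows. `Entry`, `SummaryScale`, and `computeSummary` are hypothetical names, and the loop is a simplified reading of the comment, not a copy of `computeDetailedSummary`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical mirror of the triplet described in the header comment.
struct Entry {
  uint32_t Cutoff;    // target percentile, scaled by SummaryScale
  uint64_t MinCount;  // smallest count among those needed to reach the target
  uint64_t NumCounts; // how many counts were needed
};

constexpr uint64_t SummaryScale = 1000000;

// Assumes Cutoffs is sorted in ascending order, so the running sum can be
// carried from one cutoff to the next.
std::vector<Entry> computeSummary(std::vector<uint64_t> Counts,
                                  const std::vector<uint32_t> &Cutoffs) {
  // Visit the largest counts first.
  std::sort(Counts.begin(), Counts.end(), std::greater<uint64_t>());
  uint64_t Total = 0;
  for (uint64_t C : Counts)
    Total += C;

  std::vector<Entry> Result;
  uint64_t Sum = 0, Seen = 0, Min = 0;
  std::size_t I = 0;
  for (uint32_t Cutoff : Cutoffs) {
    // Simplification: this product can overflow for very large totals.
    uint64_t Desired = Total * Cutoff / SummaryScale;
    while (Sum < Desired && I < Counts.size()) {
      Min = Counts[I];
      Sum += Counts[I];
      ++Seen;
      ++I;
    }
    Result.push_back({Cutoff, Min, Seen});
  }
  return Result;
}
```

For the counts {100, 50, 30, 20} and a cutoff of 900000 (90%), the target is 180 of the 200 total, which takes the three largest counts, so the entry records a minimum count of 30 and three counts seen.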

lib/CodeGen/TargetLoweringObjectFileImpl.cpp

[28 lines not shown]
 #include "llvm/MC/MCExpr.h"
 #include "llvm/MC/MCSectionCOFF.h"
 #include "llvm/MC/MCSectionELF.h"
 #include "llvm/MC/MCSectionMachO.h"
 #include "llvm/MC/MCStreamer.h"
 #include "llvm/MC/MCSymbolELF.h"
 #include "llvm/MC/MCValue.h"
 #include "llvm/ProfileData/InstrProf.h"
+#include "llvm/ProfileData/ProfileCommon.h"
 #include "llvm/Support/COFF.h"
 #include "llvm/Support/Dwarf.h"
 #include "llvm/Support/ELF.h"
 #include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Target/TargetLowering.h"
 #include "llvm/Target/TargetMachine.h"
 #include "llvm/Target/TargetSubtargetInfo.h"
[194 lines not shown; context: static StringRef getSectionPrefixForGlobal(SectionKind Kind) {]
   if (Kind.isThreadBSS())
     return ".tbss";
   if (Kind.isData())
     return ".data";
   assert(Kind.isReadOnlyWithRel() && "Unknown section kind");
   return ".data.rel.ro";
 }

+static cl::opt<bool> GroupFunctionsByHotness(
+    "group-functions-by-hotness",
+    llvm::cl::desc("Partition hot/cold functions by sections prefix"),
+    cl::init(false));
+
 static MCSectionELF *
 selectELFSectionForGlobal(MCContext &Ctx, const GlobalValue *GV,
                           SectionKind Kind, Mangler &Mang,
                           const TargetMachine &TM, bool EmitUniqueSection,
                           unsigned Flags, unsigned *NextUniqueID) {
   unsigned EntrySize = 0;
   if (Kind.isMergeableCString()) {
     if (Kind.isMergeable2ByteCString()) {
       EntrySize = 2;
     } else if (Kind.isMergeable4ByteCString()) {
       EntrySize = 4;
[31 lines not shown; context: if (Kind.isMergeableCString()) {]
     Name = SizeSpec + utostr(Align);
   } else if (Kind.isMergeableConst()) {
     Name = ".rodata.cst";
     Name += utostr(EntrySize);
   } else {
     Name = getSectionPrefixForGlobal(Kind);
   }

+  if (GroupFunctionsByHotness) {
+    if (const Function *F = dyn_cast<Function>(GV)) {
+      if (ProfileSummary::isFunctionHot(F)) {
+        Name += getHotSectionPrefix();
+      } else if (ProfileSummary::isFunctionUnlikely(F)) {
+        Name += getUnlikelySectionPrefix();
+      }
+    }
+  }
+
   if (EmitUniqueSection && UniqueSectionNames) {
     Name.push_back('.');
     TM.getNameWithPrefix(Name, GV, Mang, true);
   }
   unsigned UniqueID = ~0;
   if (EmitUniqueSection && !UniqueSectionNames) {
     UniqueID = *NextUniqueID;
     (*NextUniqueID)++;
[780 lines not shown]
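
The change to selectELFSectionForGlobal composes the section name from a kind prefix, an optional hotness suffix, and, when unique section names are emitted, the symbol name. A simplified string-only sketch of that ordering is below. The `Hotness` enum and `sectionName` helper are hypothetical and replace the MC and TargetMachine machinery, and symbol mangling is assumed to have already happened.

```cpp
#include <cassert>
#include <string>

// Hypothetical classification result; the patch asks ProfileSummary instead.
enum class Hotness { Hot, Unlikely, Neither };

// Hypothetical helper that appends in the same order as the patch:
// kind prefix, then hotness suffix, then '.' plus the symbol name.
std::string sectionName(const std::string &KindPrefix, Hotness H,
                        bool GroupByHotness, bool UniqueSection,
                        const std::string &Symbol) {
  std::string Name = KindPrefix;
  if (GroupByHotness) {
    if (H == Hotness::Hot)
      Name += ".hot";
    else if (H == Hotness::Unlikely)
      Name += ".unlikely";
  }
  if (UniqueSection) {
    Name.push_back('.');
    Name += Symbol;
  }
  return Name;
}
```

Under these assumptions the results line up with the expectations in test/CodeGen/X86/partition-sections.ll: an unlikely function with function sections enabled lands in `.text.unlikely._Z3foov`, and the same function with grouping disabled lands in `.text._Z3foov`.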

lib/ProfileData/ProfileSummary.cpp

 //=-- Profilesummary.cpp - Profile summary computation ----------------------=//
 //
 // The LLVM Compiler Infrastructure
 //
 // This file is distributed under the University of Illinois Open Source
 // License. See LICENSE.TXT for details.
 //
 //===----------------------------------------------------------------------===//
 //
 // This file contains support for computing profile summary data.
 //
 //===----------------------------------------------------------------------===//

+#include "llvm/IR/Attributes.h"
+#include "llvm/IR/Function.h"
+#include "llvm/IR/Module.h"
 #include "llvm/ProfileData/InstrProf.h"
 #include "llvm/ProfileData/ProfileCommon.h"
 #include "llvm/ProfileData/SampleProf.h"

 using namespace llvm;

 // A set of cutoff values. Each value, when divided by ProfileSummary::Scale
 // (which is 1000000) is a desired percentile of total counts.
[48 lines not shown; context: while (CurrSum < DesiredCount && Iter != End) {]
       Iter++;
     }
     assert(CurrSum >= DesiredCount);
     ProfileSummaryEntry PSE = {Cutoff, Count, CountsSeen};
     DetailedSummary.push_back(PSE);
   }
 }

+// Returns true if the function is a hot function.
+bool ProfileSummary::isFunctionHot(const Function *F) {
+  // FIXME: update when summary data is stored in module's metadata.
+  return false;
+}
+
+// Returns true if the function is a cold function.
+bool ProfileSummary::isFunctionUnlikely(const Function *F) {
+  if (F->hasFnAttribute(Attribute::Cold)) {
+    return true;
+  }
+  if (!F->getEntryCount()) {
+    return false;
+  }
+  // FIXME: update when summary data is stored in module's metadata.
+  return (*F->getEntryCount()) == 0;
+}
+
 InstrProfSummary::InstrProfSummary(const IndexedInstrProf::Summary &S)
     : ProfileSummary(), MaxInternalBlockCount(S.get(
                             IndexedInstrProf::Summary::MaxInternalBlockCount)),
       MaxFunctionCount(S.get(IndexedInstrProf::Summary::MaxFunctionCount)),
       NumFunctions(S.get(IndexedInstrProf::Summary::TotalNumFunctions)) {

   TotalCount = S.get(IndexedInstrProf::Summary::TotalBlockCount);
   MaxCount = S.get(IndexedInstrProf::Summary::MaxBlockCount);
   NumCounts = S.get(IndexedInstrProf::Summary::TotalNumBlocks);

   for (unsigned I = 0; I < S.NumCutoffEntries; I++) {
     const IndexedInstrProf::Summary::Entry &Ent = S.getEntry(I);
     DetailedSummary.emplace_back((uint32_t)Ent.Cutoff, Ent.MinBlockCount,
[15 lines not shown]

test/CodeGen/X86/partition-sections.ll

This file was added.

; RUN: llc < %s -group-functions-by-hotness=true | FileCheck %s -check-prefix=PARTITION
; RUN: llc < %s -function-sections -group-functions-by-hotness=false | FileCheck %s -check-prefix=NO-PARTITION-FUNCTION-SECTION
; RUN: llc < %s -function-sections -group-functions-by-hotness=true | FileCheck %s -check-prefix=PARTITION-FUNCTION-SECTION

; PARTITION: .text.unlikely
; PARTITION: .globl _Z3foov
; NO-PARTITION-FUNCTION-SECTION: .text._Z3foov
; PARTITION-FUNCTION-SECTION: .text.unlikely._Z3foov
define i32 @_Z3foov() #0 {
  ret i32 0
}

; PARTITION: .globl _Z3barv
; NO-PARTITION-FUNCTION-SECTION: .text._Z3barv
; PARTITION-FUNCTION-SECTION: .text.unlikely._Z3barv
define i32 @_Z3barv() #1 !prof !0 {
  ret i32 1
}

; PARTITION: .text
; PARTITION: .globl _Z3bazv
; NO-PARTITION-FUNCTION-SECTION: .text._Z3bazv
; PARTITION-FUNCTION-SECTION: .text._Z3bazv
define i32 @_Z3bazv() #1 {
  ret i32 2
}

attributes #0 = { nounwind uwtable cold }
attributes #1 = { nounwind uwtable }

!0 = !{!"function_entry_count", i64 0}

This is an archive of the discontinued LLVM Phabricator instance.

Add prefix based function layout when profile is available.
ClosedPublic
