This is an archive of the discontinued LLVM Phabricator instance.


Differential D136698

[SampleFDO] Persist profile staleness metrics into binary
ClosedPublic

Authored by wlei on Oct 25 2022, 10:12 AM.

Details

Reviewers

hoy
wenlei
xur
davidxl
kazu
mtrofin

Commits

rG47b0758049ea: [SampleFDO] Persist profile staleness metrics into binary

Summary

With https://reviews.llvm.org/D136627, we now have profile staleness metrics based on profile statistics. Monitoring profile staleness in real time can help users quickly identify performance issues. In a production scenario the build is usually incremental, so to get real-time metrics we would have to store or cache the metrics of all the old objects somewhere and pull them in at post-build time. To make this more convenient, this patch adds an option to persist the metrics into the object binary, so they can be reported right away by decoding the binary rather than by polling previous stdout/stderr output from a cache system.

For the implementation, the statistics are first written into a new named metadata node (llvm.stats) and then encoded into a special ELF .llvm_stats section. The section data is formatted as a list of key/value pairs so that future statistics can be added easily. The feature is guarded by a new switch (-persist-profile-staleness).

In terms of size overhead, the metrics are computed at module level, so the overhead should be small. Measured on one of our internal services, it costs less than 1MB for a 10GB+ binary.
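The key/value layout described in the summary can be sketched in standalone C++. This is only an illustration under assumptions: each record is framed as a ULEB128 size followed by the raw bytes, first for the key and then for the value, which mirrors the emitULEB128IntValue/emitBytes calls in the diff further down. The helper names appendULEB128 and encodeStatsRecords are hypothetical and are not part of LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Append the unsigned LEB128 encoding of Value to Out.
void appendULEB128(uint64_t Value, std::string &Out) {
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // More bytes follow.
    Out.push_back(static_cast<char>(Byte));
  } while (Value != 0);
}

// Hypothetical helper: serialize (key, value) records as
//   ULEB128 key size, key bytes, ULEB128 value size, value bytes.
// The value bytes are treated as opaque here.
std::string encodeStatsRecords(
    const std::vector<std::pair<std::string, std::string>> &Records) {
  std::string Out;
  for (const auto &Record : Records) {
    appendULEB128(Record.first.size(), Out);
    Out += Record.first;
    appendULEB128(Record.second.size(), Out);
    Out += Record.second;
  }
  return Out;
}
```

Because every field carries its own size, a reader can skip keys it does not recognize, which is what makes adding new statistics cheap.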

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

wlei created this revision. Oct 25 2022, 10:12 AM

Herald added a project: Restricted Project. Oct 25 2022, 10:12 AM

Herald added subscribers: ormris, hoy, wenlei, hiraditya.

wlei requested review of this revision. Oct 25 2022, 10:12 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B194215: Diff 470542. Oct 25 2022, 10:12 AM

wlei edited the summary of this revision. Oct 25 2022, 1:45 PM

wlei added reviewers: hoy, wenlei, xur, davidxl.

The use case of the feature is still a little fuzzy to me. Why is the mismatch issue not caught and handled at build time?

In D136698#3883720, @davidxl wrote:

The use case of the feature is still a little fuzzy to me. Why is the mismatch issue not caught and handled at build time?

With fleet-wide profiling for sample PGO, there's not much people can do at build time if we don't want to block builds when a stale profile is found. It's also not scalable to address this at build time for thousands of services. We aim to build a system that monitors profile staleness for thousands of services using sample PGO, which then drives the operational work of tightening up the PGO pipeline (done by people outside of the compiler team). This is the compiler part for collecting such data.

In D136698#3883720, @davidxl wrote:

The use case of the feature is still a little fuzzy to me. Why is the mismatch issue not caught and handled at build time?

Sorry it was not clear. It is caught and handled at build time for one object file: the previous patch runs at compile time, not at link time, and produces one set of metrics per object file. We'd like to report one aggregated set of metrics for the whole binary, so this patch is mostly about merging/aggregating them.

We'd also like to catch and monitor it in real time rather than during offline investigation. One issue we hit is incremental builds: object files are built (sometimes remotely) and cached in a database (which can be shared with other users), so to aggregate the metrics from old cached object files we would also need to cache the compile-time warnings/stdout, which requires some work on the build infra side. Hence we chose the approach in this patch for the merge.
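As a rough illustration of the merge step described above, the sketch below sums per-object counters by key and derives a staleness ratio. It assumes that aggregation is a plain per-key sum, which the thread does not spell out, and the names ModuleStats, aggregateStats and ratio are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// One object file's metrics as (key, counter) pairs.
using ModuleStats = std::vector<std::pair<std::string, uint64_t>>;

// Assumption: counters with the same key are simply summed across objects.
std::map<std::string, uint64_t>
aggregateStats(const std::vector<ModuleStats> &Modules) {
  std::map<std::string, uint64_t> Total;
  for (const ModuleStats &Stats : Modules)
    for (const auto &KV : Stats)
      Total[KV.first] += KV.second;
  return Total;
}

// Ratio of two aggregated counters; 0 if either key is missing or the
// denominator is zero.
double ratio(const std::map<std::string, uint64_t> &Total,
             const std::string &Num, const std::string &Den) {
  auto N = Total.find(Num);
  auto D = Total.find(Den);
  if (N == Total.end() || D == Total.end() || D->second == 0)
    return 0.0;
  return static_cast<double>(N->second) / static_cast<double>(D->second);
}
```

For example, ratio(Total, "MismatchedCallsiteSamples", "TotalCallsiteSamples") would give the fraction of callsite samples discarded across the whole binary.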

wenlei added inline comments. Oct 25 2022, 3:02 PM

llvm/lib/MC/MCObjectFileInfo.cpp
534	prof_stats might still be a bit narrow. persistent key-value in obj can be a general mechanism used for other purpose as well. maybe `llvm-stats`?

For incremental build, are the artifacts (build logs) also cached?

Anyway, I can see the usefulness of the post-build analysis case. However should we consider more general mechanism for message persistence? I will copy some folks in the review.

davidxl added reviewers: kazu, mtrofin. Oct 25 2022, 4:04 PM

In D136698#3883954, @davidxl wrote:

For incremental build, are the artifacts (build logs) also cached?

Anyway, I can see the usefulness of the post-build analysis case. However should we consider more general mechanism for message persistence? I will copy some folks in the review.

Yes, this should be more general than profile staleness. Actually the other use case we have in mind is for persisting static performance proxy, which can also be used as full reward function for MLGO.

In D136698#3883954, @davidxl wrote:

For incremental build, are the artifacts (build logs) also cached?

Our infra doesn't support this right now.

Anyway, I can see the usefulness of the post-build analysis case. However should we consider more general mechanism for message persistence? I will copy some folks in the review.

Thanks for the feedback. Yes, this key/value structure is intended to be extended for other stats. I'd appreciate more feedback to make it general.

llvm/lib/MC/MCObjectFileInfo.cpp
534	Sounds good, renamed all to `llvm-stats`

rename to llvm stats.

Harbormaster completed remote builds in B194441: Diff 470852. Oct 26 2022, 10:10 AM

This can be dealt with later but do we need separate support (i.e. MCAsmStreamer?) for llvm-objdump to print out stats nicely?

llvm/lib/MC/MCObjectFileInfo.cpp
534	Use explicit section type `ELF::SHT_PROGBITS` instead of `DebugSecType`.

hoy added inline comments. Oct 26 2022, 11:31 AM

llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
371	S could end up null. Suggest `getLLVMStatsSection` to return a non-null ptr even if it's non-elf. The work should work for non-elf since we are not using comdat concept here.

hoy added inline comments. Oct 26 2022, 11:35 AM

llvm/lib/Transforms/IPO/SampleProfile.cpp
2198	nit: StringRef should work here since all keys are literal constants.

In D136698#3886105, @wenlei wrote:

This can be dealt with later but do we need separate support (i.e. MCAsmStreamer?) for llvm-objdump to print out stats nicely?

Good idea to dump it in a human readable way, will try it later.

llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
371	I see, removed the ELF condition.
llvm/lib/MC/MCObjectFileInfo.cpp
534	fixed, thanks!
llvm/lib/Transforms/IPO/SampleProfile.cpp
2198	fixed, thanks!

Updating D136698: [SampleFDO] Persist profile staleness metrics into binary

Harbormaster completed remote builds in B194530: Diff 470978. Oct 26 2022, 5:36 PM

lgtm, though I'd wait a few days in case others have comments.

This revision is now accepted and ready to land. Oct 28 2022, 9:27 AM

In D136698#3883961, @wenlei wrote:

In D136698#3883954, @davidxl wrote:

For incremental build, are the artifacts (build logs) also cached?

Anyway, I can see the usefulness of the post-build analysis case. However should we consider more general mechanism for message persistence? I will copy some folks in the review.

Yes, this should be more general than profile staleness. Actually the other use case we have in mind is for persisting static performance proxy, which can also be used as full reward function for MLGO.

(sorry for the delay) SGTM, I assume, if we want the values to be more complex, we can base64-encode them or something like that (i.e. we can evolve the format to not assume the values are ints); also, we may want to allow, at a later stage, the section to be populated with values available to the assembler - like exact MBB sizes - I don't think that'd be difficult to fit in at a later stage (some sort of callback to TargetLoweringObjectFileImpl.cpp or something like that). I'm listing these to check if there's any assumptions that such later evolution might invalidate.

tmsriram added a subscriber: tmsriram. Oct 31 2022, 3:11 PM

In D136698#3892484, @mtrofin wrote:

In D136698#3883961, @wenlei wrote:

In D136698#3883954, @davidxl wrote:

For incremental build, are the artifacts (build logs) also cached?

Anyway, I can see the usefulness of the post-build analysis case. However should we consider more general mechanism for message persistence? I will copy some folks in the review.

Yes, this should be more general than profile staleness. Actually the other use case we have in mind is for persisting static performance proxy, which can also be used as full reward function for MLGO.

(sorry for the delay) SGTM, I assume, if we want the values to be more complex, we can base64-encode them or something like that (i.e. we can evolve the format to not assume the values are ints); also, we may want to allow, at a later stage, the section to be populated with values available to the assembler - like exact MBB sizes - I don't think that'd be difficult to fit in at a later stage (some sort of callback to TargetLoweringObjectFileImpl.cpp or something like that). I'm listing these to check if there's any assumptions that such later evolution might invalidate.

Thank you for the feedback! Changed to use base64-encode for the value encoding.

we may want to allow, at a later stage, the section to be populated with values available to the assembler - like exact MBB sizes - I don't think that'd be difficult to fit in at a later stage (some sort of callback to TargetLoweringObjectFileImpl.cpp or something like that).

I was trying to play with the later-stage values. It seems the current framework can be extended for this feature, i.e. it's covered as long as the values can be emitted to the metadata. For MC-level values (like the MBB sizes), IIUC, the metadata is still accessible, and later, in the AsmPrinter, TLOF.emitModuleMetadata(*OutStreamer, M); is called in AsmPrinter::doFinalization(Module &M){...}. My understanding is that doFinalization is already a very late stage, after the assembling work. Please let me know if I missed anything.

Changed to use base64-encode for the value encoding.
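The value encoding can be illustrated outside of LLVM. The diff below calls encodeBase64 on the decimal text of the counter (via Twine(...).str()); the sketch here reproduces that idea with a hand-written standard Base64 encoder. base64Encode and encodeStatValue are hypothetical stand-ins, not the LLVM implementation, and the sketch assumes the standard alphabet with '=' padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Standard Base64 with '=' padding (assumed to match the encoding used).
std::string base64Encode(const std::string &In) {
  static const char Table[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string Out;
  std::size_t I = 0;
  // Full 3-byte groups become 4 output characters.
  for (; I + 2 < In.size(); I += 3) {
    uint32_t V = (static_cast<uint8_t>(In[I]) << 16) |
                 (static_cast<uint8_t>(In[I + 1]) << 8) |
                 static_cast<uint8_t>(In[I + 2]);
    Out += Table[(V >> 18) & 63];
    Out += Table[(V >> 12) & 63];
    Out += Table[(V >> 6) & 63];
    Out += Table[V & 63];
  }
  if (I + 1 == In.size()) { // One trailing byte: two chars plus "==".
    uint32_t V = static_cast<uint8_t>(In[I]) << 16;
    Out += Table[(V >> 18) & 63];
    Out += Table[(V >> 12) & 63];
    Out += "==";
  } else if (I + 2 == In.size()) { // Two trailing bytes: three chars plus "=".
    uint32_t V = (static_cast<uint8_t>(In[I]) << 16) |
                 (static_cast<uint8_t>(In[I + 1]) << 8);
    Out += Table[(V >> 18) & 63];
    Out += Table[(V >> 12) & 63];
    Out += Table[(V >> 6) & 63];
    Out += '=';
  }
  return Out;
}

// Hypothetical helper: a counter is stored as Base64 of its decimal text.
std::string encodeStatValue(uint64_t Value) {
  return base64Encode(std::to_string(Value));
}
```

Encoding the decimal text rather than raw integer bytes keeps the value field a plain string, so later statistics need not be integers.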

Harbormaster completed remote builds in B195553: Diff 472410. Nov 1 2022, 3:57 PM

fix one bug: StringRef --> string

Plan to commit this tomorrow if there are no objections.

Harbormaster completed remote builds in B196795: Diff 474116. Nov 8 2022, 5:50 PM

wlei edited the summary of this revision. Nov 9 2022, 10:33 PM

Closed by commit rG47b0758049ea: [SampleFDO] Persist profile staleness metrics into binary (authored by wlei). Nov 9 2022, 10:35 PM

This revision was automatically updated to reflect the committed changes.

wlei added a commit: rG47b0758049ea: [SampleFDO] Persist profile staleness metrics into binary.

Revision Contents

Path                                                                   Size
llvm/include/llvm/IR/MDBuilder.h                                       4 lines
llvm/include/llvm/MC/MCObjectFileInfo.h                                5 lines
llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp                      26 lines
llvm/lib/IR/MDBuilder.cpp                                              12 lines
llvm/lib/MC/MCObjectFileInfo.cpp                                       6 lines
llvm/lib/Transforms/IPO/SampleProfile.cpp                              53 lines
llvm/test/Transforms/SampleProfile/profile-mismatch.ll                 9 lines
llvm/test/Transforms/SampleProfile/pseudo-probe-profile-mismatch.ll    9 lines

Diff 474444

llvm/include/llvm/IR/MDBuilder.h

[74 lines hidden; enclosing context: MDNode *createFunctionEntryCount(uint64_t Count, bool Synthetic,]
                                    const DenseSet<GlobalValue::GUID> *Imports);

   /// Return metadata containing the section prefix for a function.
   MDNode *createFunctionSectionPrefix(StringRef Prefix);

   /// Return metadata containing the pseudo probe descriptor for a function.
   MDNode *createPseudoProbeDesc(uint64_t GUID, uint64_t Hash, Function *F);

+  /// Return metadata containing llvm statistics.
+  MDNode *
+  createLLVMStats(ArrayRef<std::pair<StringRef, uint64_t>> LLVMStatsVec);
+
   //===------------------------------------------------------------------===//
   // Range metadata.
   //===------------------------------------------------------------------===//

   /// Return metadata describing the range [Lo, Hi).
   MDNode *createRange(const APInt &Lo, const APInt &Hi);

   /// Return metadata describing the range [Lo, Hi).
[142 lines hidden]

llvm/include/llvm/MC/MCObjectFileInfo.h

[174 lines hidden; enclosing context: protected:]

   /// Section containing metadata on function stack sizes.
   MCSection *StackSizesSection = nullptr;

   /// Section for pseudo probe information used by AutoFDO
   MCSection *PseudoProbeSection = nullptr;
   MCSection *PseudoProbeDescSection = nullptr;

+  // Section for metadata of llvm statistics.
+  MCSection *LLVMStatsSection = nullptr;
+
   // ELF specific sections.
   MCSection *DataRelROSection = nullptr;
   MCSection *MergeableConst4Section = nullptr;
   MCSection *MergeableConst8Section = nullptr;
   MCSection *MergeableConst16Section = nullptr;
   MCSection *MergeableConst32Section = nullptr;

   // MachO specific sections.
[170 lines hidden; enclosing context: public:]
   MCSection *getBBAddrMapSection(const MCSection &TextSec) const;

   MCSection *getKCFITrapSection(const MCSection &TextSec) const;

   MCSection *getPseudoProbeSection(const MCSection &TextSec) const;

   MCSection *getPseudoProbeDescSection(StringRef FuncName) const;

+  MCSection *getLLVMStatsSection() const;
+
   MCSection *getPCSection(StringRef Name, const MCSection *TextSec) const;

   // ELF specific sections.
   MCSection *getDataRelROSection() const { return DataRelROSection; }
   const MCSection *getMergeableConst4Section() const {
     return MergeableConst4Section;
   }
   const MCSection *getMergeableConst8Section() const {
[121 lines hidden]

llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp

[52 lines hidden]
 #include "llvm/MC/MCSectionWasm.h"
 #include "llvm/MC/MCSectionXCOFF.h"
 #include "llvm/MC/MCStreamer.h"
 #include "llvm/MC/MCSymbol.h"
 #include "llvm/MC/MCSymbolELF.h"
 #include "llvm/MC/MCValue.h"
 #include "llvm/MC/SectionKind.h"
 #include "llvm/ProfileData/InstrProf.h"
+#include "llvm/Support/Base64.h"
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/CodeGen.h"
 #include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/Format.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Target/TargetMachine.h"
 #include <cassert>
 #include <string>
[292 lines hidden; enclosing context: for (const auto *Operand : FuncInfo->operands()) {]
       Streamer.switchSection(S);
       Streamer.emitInt64(GUID->getZExtValue());
       Streamer.emitInt64(Hash->getZExtValue());
       Streamer.emitULEB128IntValue(Name->getString().size());
       Streamer.emitBytes(Name->getString());
     }
   }

+  if (NamedMDNode *LLVMStats = M.getNamedMetadata("llvm.stats")) {
+    // Emit the metadata for llvm statistics into .llvm_stats section, which is
+    // formatted as a list of key/value pair, the value is base64 encoded.
+    auto *S = C.getObjectFileInfo()->getLLVMStatsSection();
+    Streamer.switchSection(S);
+    for (const auto *Operand : LLVMStats->operands()) {
+      const auto *MD = cast<MDNode>(Operand);
+      assert(MD->getNumOperands() % 2 == 0 &&
+             ("Operand num should be even for a list of key/value pair"));
+      for (size_t I = 0; I < MD->getNumOperands(); I += 2) {
+        // Encode the key string size.
+        auto *Key = cast<MDString>(MD->getOperand(I));
+        Streamer.emitULEB128IntValue(Key->getString().size());
+        Streamer.emitBytes(Key->getString());
+        // Encode the value into a Base64 string.
+        std::string Value = encodeBase64(
+            Twine(mdconst::dyn_extract<ConstantInt>(MD->getOperand(I + 1))
+                      ->getZExtValue())
+                .str());
+        Streamer.emitULEB128IntValue(Value.size());
+        Streamer.emitBytes(Value);
+      }
+    }
+  }
+
   unsigned Version = 0;
   unsigned Flags = 0;
   StringRef Section;

   GetObjCImageInfo(M, Version, Flags, Section);
   if (!Section.empty()) {
     auto *S = C.getELFSection(Section, ELF::SHT_PROGBITS, ELF::SHF_ALLOC);
     Streamer.switchSection(S);
[2,252 lines hidden]
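
For readers who want to consume the section, here is a minimal standalone decoder sketch for the byte layout that the emission loop in this file produces (ULEB128 key size, key bytes, ULEB128 value size, Base64 of the decimal value). It assumes the raw contents of .llvm_stats have already been extracted into a string and that records are simply laid out back to back. readULEB128, base64Decode and decodeStatsSection are hypothetical helpers, not LLVM APIs, and overflow of very large decimal values is not checked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Read one ULEB128 value at Pos and advance Pos; false on truncated input.
bool readULEB128(const std::string &Data, std::size_t &Pos, uint64_t &Value) {
  Value = 0;
  for (unsigned Shift = 0; Pos < Data.size() && Shift < 64; Shift += 7) {
    uint8_t Byte = static_cast<uint8_t>(Data[Pos++]);
    Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0)
      return true;
  }
  return false;
}

// Decode standard Base64; stops at '=' padding, false on invalid characters.
bool base64Decode(const std::string &In, std::string &Out) {
  Out.clear();
  uint32_t Acc = 0;
  int Bits = 0;
  for (char C : In) {
    int V;
    if (C >= 'A' && C <= 'Z') V = C - 'A';
    else if (C >= 'a' && C <= 'z') V = C - 'a' + 26;
    else if (C >= '0' && C <= '9') V = C - '0' + 52;
    else if (C == '+') V = 62;
    else if (C == '/') V = 63;
    else if (C == '=') break;
    else return false;
    Acc = (Acc << 6) | static_cast<uint32_t>(V);
    Bits += 6;
    if (Bits >= 8) {
      Bits -= 8;
      Out.push_back(static_cast<char>((Acc >> Bits) & 0xff));
    }
  }
  return true;
}

// Parse all records in Data; false if the data is malformed.
bool decodeStatsSection(const std::string &Data,
                        std::vector<std::pair<std::string, uint64_t>> &Stats) {
  std::size_t Pos = 0;
  while (Pos < Data.size()) {
    std::string Fields[2]; // [0] = key, [1] = Base64-encoded value.
    for (std::string &Field : Fields) {
      uint64_t Size;
      if (!readULEB128(Data, Pos, Size) || Size > Data.size() - Pos)
        return false;
      Field = Data.substr(Pos, Size);
      Pos += Size;
    }
    std::string Decimal;
    if (!base64Decode(Fields[1], Decimal) || Decimal.empty())
      return false;
    uint64_t Value = 0;
    for (char C : Decimal) {
      if (C < '0' || C > '9')
        return false;
      Value = Value * 10 + static_cast<uint64_t>(C - '0');
    }
    Stats.emplace_back(Fields[0], Value);
  }
  return true;
}
```

Since the parser loops until the end of the buffer, it would also accept several modules' records concatenated together, under the assumption that nothing else is interleaved in the section.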

llvm/lib/IR/MDBuilder.cpp

[338 lines hidden; enclosing context: MDNode *MDBuilder::createPseudoProbeDesc(uint64_t GUID, uint64_t Hash,]
                                          Function *F) {
   auto *Int64Ty = Type::getInt64Ty(Context);
   SmallVector<Metadata *, 3> Ops(3);
   Ops[0] = createConstant(ConstantInt::get(Int64Ty, GUID));
   Ops[1] = createConstant(ConstantInt::get(Int64Ty, Hash));
   Ops[2] = createString(F->getName());
   return MDNode::get(Context, Ops);
 }
+
+MDNode *
+MDBuilder::createLLVMStats(ArrayRef<std::pair<StringRef, uint64_t>> LLVMStats) {
+  auto *Int64Ty = Type::getInt64Ty(Context);
+  SmallVector<Metadata *, 4> Ops(LLVMStats.size() * 2);
+  for (size_t I = 0; I < LLVMStats.size(); I++) {
+    Ops[I * 2] = createString(LLVMStats[I].first);
+    Ops[I * 2 + 1] =
+        createConstant(ConstantInt::get(Int64Ty, LLVMStats[I].second));
+  }
+  return MDNode::get(Context, Ops);
+}

llvm/lib/MC/MCObjectFileInfo.cpp

[525 lines hidden; enclosing context: void MCObjectFileInfo::initELFMCObjectFileInfo(const Triple &T, bool Large) {]
   EHFrameSection =
       Ctx->getELFSection(".eh_frame", EHSectionType, EHSectionFlags);

   StackSizesSection = Ctx->getELFSection(".stack_sizes", ELF::SHT_PROGBITS, 0);

   PseudoProbeSection = Ctx->getELFSection(".pseudo_probe", DebugSecType, 0);
   PseudoProbeDescSection =
       Ctx->getELFSection(".pseudo_probe_desc", DebugSecType, 0);
+
+  LLVMStatsSection = Ctx->getELFSection(".llvm_stats", ELF::SHT_PROGBITS, 0);
 }

 void MCObjectFileInfo::initGOFFMCObjectFileInfo(const Triple &T) {
   TextSection =
       Ctx->getGOFFSection(".text", SectionKind::getText(), nullptr, nullptr);
   BSSSection =
       Ctx->getGOFFSection(".bss", SectionKind::getBSS(), nullptr, nullptr);
   PPA1Section =
[652 lines hidden; enclosing context: if (Ctx->getTargetTriple().supportsCOMDAT() && !FuncName.empty()) {]
                                 S->getEntrySize(),
                                 S->getName() + "_" + FuncName,
                                 /*IsComdat=*/true);
     }
   }
   return PseudoProbeDescSection;
 }

+MCSection *MCObjectFileInfo::getLLVMStatsSection() const {
+  return LLVMStatsSection;
+}
+
 MCSection *MCObjectFileInfo::getPCSection(StringRef Name,
                                           const MCSection *TextSec) const {
   if (Ctx->getObjectFileType() != MCContext::IsELF)
     return nullptr;

   // SHF_WRITE for relocations, and let user post-process data in-place.
   unsigned Flags = ELF::SHF_WRITE | ELF::SHF_ALLOC | ELF::SHF_LINK_ORDER;

[13 lines hidden]

llvm/lib/Transforms/IPO/SampleProfile.cpp

[127 lines hidden]
 static cl::opt<std::string> SampleProfileRemappingFile(
     "sample-profile-remapping-file", cl::init(""), cl::value_desc("filename"),
     cl::desc("Profile remapping file loaded by -sample-profile"), cl::Hidden);

 static cl::opt<bool> ReportProfileStaleness(
     "report-profile-staleness", cl::Hidden, cl::init(false),
     cl::desc("Compute and report stale profile statistical metrics."));

+static cl::opt<bool> PersistProfileStaleness(
+    "persist-profile-staleness", cl::Hidden, cl::init(false),
+    cl::desc("Compute stale profile statistical metrics and write it into the "
+             "native object file(.llvm_stats section)."));
+
 static cl::opt<bool> ProfileSampleAccurate(
     "profile-sample-accurate", cl::Hidden, cl::init(false),
     cl::desc("If the sample profile is accurate, we will mark all un-sampled "
              "callsite and function as having 0 samples. Otherwise, treat "
              "un-sampled callsites and functions conservatively as unknown. "));

 static cl::opt<bool> ProfileSampleBlockAccurate(
     "profile-sample-block-accurate", cl::Hidden, cl::init(false),
[1,892 lines hidden; enclosing context: if (!ProbeManager->moduleIsProbed(M)) {]
       const char *Msg =
           "Pseudo-probe-based profile requires SampleProfileProbePass";
       Ctx.diagnose(DiagnosticInfoSampleProfile(M.getModuleIdentifier(), Msg,
                                                DS_Warning));
       return false;
     }
   }

-  if (ReportProfileStaleness) {
+  if (ReportProfileStaleness || PersistProfileStaleness) {
     MatchingManager =
         std::make_unique<SampleProfileMatcher>(M, *Reader, ProbeManager.get());
   }

   return true;
 }

 void SampleProfileMatcher::detectProfileMismatch(const Function &F,
[92 lines hidden; enclosing context: for (auto &F : M) {]
     if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile"))
       continue;
     FunctionSamples *FS = Reader.getSamplesFor(F);
     if (!FS)
       continue;
     detectProfileMismatch(F, *FS);
   }

+  if (ReportProfileStaleness) {
     if (FunctionSamples::ProfileIsProbeBased) {
       errs() << "(" << NumMismatchedFuncHash << "/" << TotalProfiledFunc << ")"
              << " of functions' profile are invalid and "
              << " (" << MismatchedFuncHashSamples << "/" << TotalFuncHashSamples
              << ")"
              << " of samples are discarded due to function hash mismatch.\n";
     }
-  errs() << "(" << NumMismatchedCallsite << "/" << TotalProfiledCallsite << ")"
+    errs() << "(" << NumMismatchedCallsite << "/" << TotalProfiledCallsite
+           << ")"
            << " of callsites' profile are invalid and "
            << "(" << MismatchedCallsiteSamples << "/" << TotalCallsiteSamples
            << ")"
            << " of samples are discarded due to callsite location mismatch.\n";
   }
+
+  if (PersistProfileStaleness) {
+    LLVMContext &Ctx = M.getContext();
+    MDBuilder MDB(Ctx);
+
+    SmallVector<std::pair<StringRef, uint64_t>> ProfStatsVec;
+    if (FunctionSamples::ProfileIsProbeBased) {
+      ProfStatsVec.emplace_back("NumMismatchedFuncHash", NumMismatchedFuncHash);
+      ProfStatsVec.emplace_back("TotalProfiledFunc", TotalProfiledFunc);
+      ProfStatsVec.emplace_back("MismatchedFuncHashSamples",
+                                MismatchedFuncHashSamples);
+      ProfStatsVec.emplace_back("TotalFuncHashSamples", TotalFuncHashSamples);
+    }
+    ProfStatsVec.emplace_back("MismatchedCallsiteSamples",
+                              MismatchedCallsiteSamples);
+    ProfStatsVec.emplace_back("TotalCallsiteSamples", TotalCallsiteSamples);
+
+    auto *MD = MDB.createLLVMStats(ProfStatsVec);
+    auto *NMD = M.getOrInsertNamedMetadata("llvm.stats");
+    NMD->addOperand(MD);
+  }
+}

 bool SampleProfileLoader::runOnModule(Module &M, ModuleAnalysisManager *AM,
                                       ProfileSummaryInfo *_PSI, CallGraph *CG) {
   GUIDToFuncNameMapper Mapper(M, *Reader, GUIDToFuncNameMap);

   PSI = _PSI;
   if (M.getProfileSummary(/* IsCS */ false) == nullptr) {
     M.setProfileSummary(Reader->getSummary().getMD(M.getContext()),
                         ProfileSummary::PSK_Sample);
     PSI->refresh();
   }
   // Compute the total number of samples collected in this profile.
[25 lines hidden; enclosing context: if (Remapper) {]
         if (*MapName != OrigName && !MapName->empty())
           SymbolMap.insert(std::make_pair(*MapName, F));
       }
     }
   }
   assert(SymbolMap.count(StringRef()) == 0 &&
          "No empty StringRef should be added in SymbolMap");

-  if (ReportProfileStaleness)
+  if (ReportProfileStaleness || PersistProfileStaleness)
     MatchingManager->detectProfileMismatch();

   bool retval = false;
   for (auto *F : buildFunctionOrder(M, CG)) {
     assert(!F->isDeclaration());
     clearFunctionData();
     retval |= runOnFunction(*F, AM);
   }
[110 lines hidden]

llvm/test/Transforms/SampleProfile/profile-mismatch.ll

-; RUN: opt < %s -passes=sample-profile -sample-profile-file=%S/Inputs/profile-mismatch.prof -report-profile-staleness -S 2>%t
+; RUN: opt < %s -passes=sample-profile -sample-profile-file=%S/Inputs/profile-mismatch.prof -report-profile-staleness -persist-profile-staleness -S 2>%t -o %t.ll
 ; RUN: FileCheck %s --input-file %t
+; RUN: FileCheck %s --input-file %t.ll -check-prefix=CHECK-MD
+; RUN: llc < %t.ll -filetype=obj -o %t.obj
+; RUN: llvm-objdump --section-headers %t.obj | FileCheck %s --check-prefix=CHECK-OBJ

 ; CHECK: (2/3) of callsites' profile are invalid and (20/30) of samples are discarded due to callsite location mismatch.

+; CHECK-MD: ![[#]] = !{!"MismatchedCallsiteSamples", i64 20, !"TotalCallsiteSamples", i64 30}
+
+; CHECK-OBJ: .llvm_stats
+
 target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-unknown-linux-gnu"

 @x = dso_local global i32 0, align 4, !dbg !0

 ; Function Attrs: nounwind uwtable
 define dso_local i32 @foo(i32 noundef %x) #0 !dbg !12 {
 entry:
[184 lines hidden]

llvm/test/Transforms/SampleProfile/pseudo-probe-profile-mismatch.ll

-; RUN: opt < %s -passes=sample-profile -sample-profile-file=%S/Inputs/pseudo-probe-profile-mismatch.prof -report-profile-staleness -S 2>%t
+; RUN: opt < %s -passes=sample-profile -sample-profile-file=%S/Inputs/pseudo-probe-profile-mismatch.prof -report-profile-staleness -persist-profile-staleness -S 2>%t -o %t.ll
 ; RUN: FileCheck %s --input-file %t
+; RUN: FileCheck %s --input-file %t.ll -check-prefix=CHECK-MD
+; RUN: llc < %t.ll -filetype=obj -o %t.obj
+; RUN: llvm-objdump --section-headers %t.obj | FileCheck %s --check-prefix=CHECK-OBJ

 ; CHECK: (1/3) of functions' profile are invalid and (10/50) of samples are discarded due to function hash mismatch.
 ; CHECK: (2/3) of callsites' profile are invalid and (20/30) of samples are discarded due to callsite location mismatch.

+; CHECK-MD: ![[#]] = !{!"NumMismatchedFuncHash", i64 1, !"TotalProfiledFunc", i64 3, !"MismatchedFuncHashSamples", i64 10, !"TotalFuncHashSamples", i64 50, !"MismatchedCallsiteSamples", i64 20, !"TotalCallsiteSamples", i64 30}
+
+; CHECK-OBJ: .llvm_stats
+
 target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-unknown-linux-gnu"

 @x = dso_local global i32 0, align 4, !dbg !0

 ; Function Attrs: nounwind uwtable
 define dso_local i32 @foo(i32 noundef %x) #0 !dbg !16 {
 entry:
[219 lines hidden]