This is an archive of the discontinued LLVM Phabricator instance.

Differential D159460

[BOLT][NFC] Speedup YAML profile processing
ClosedPublic

Authored by Amir on Sep 5 2023, 6:54 PM.

Details

Reviewers

rafauler
maksfb

Group Reviewers

Restricted Project

Commits

rG7b750943d722: [BOLT][NFC] Speedup YAML profile processing

Summary

Reduce YAML profile processing times:

preprocessProfile: speed up buildNameMaps by replacing the ProfileNameToProfile mapping with a ProfileFunctionNames set and a ProfileBFs vector. Look up the YamlBF->BF correspondence once up front and memoize it in ProfileBFs.
readProfile: replace iteration over all functions in the binary with iteration over profile functions (strict match and LTO name match).
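
The two changes above can be sketched outside of BOLT. This is a minimal illustration, not the code in this revision: `Function`, `ProfileEntry`, `memoizeMatches` and `assignCounts` are hypothetical names, and a plain `std::unordered_map` stands in for BOLT's symbol lookup. It shows the lookup result being stored once per profile entry in a vector parallel to the profile, so that later passes walk only the profile entries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for BinaryFunction and a YAML profile entry.
struct Function {
  std::string Name;
  uint64_t ExecCount = 0;
};
struct ProfileEntry {
  std::string Name;
  uint64_t ExecCount = 0;
};

// One name lookup per profile entry. The result (possibly nullptr) is stored
// at the same index as the entry, so later passes never consult the map.
std::vector<Function *>
memoizeMatches(const std::vector<ProfileEntry> &Profile,
               const std::unordered_map<std::string, Function *> &ByName) {
  std::vector<Function *> Matched;
  Matched.reserve(Profile.size());
  for (const ProfileEntry &P : Profile) {
    auto It = ByName.find(P.Name);
    Matched.push_back(It == ByName.end() ? nullptr : It->second);
  }
  return Matched;
}

// Later pass: iterate over the profile entries only, skipping entries that
// had no matching function. Returns the number of functions updated.
std::size_t assignCounts(const std::vector<ProfileEntry> &Profile,
                         const std::vector<Function *> &Matched) {
  assert(Profile.size() == Matched.size() && "vectors must stay parallel");
  std::size_t NumAssigned = 0;
  for (std::size_t I = 0; I < Profile.size(); ++I) {
    if (Function *F = Matched[I]) {
      F->ExecCount = Profile[I].ExecCount;
      ++NumAssigned;
    }
  }
  return NumAssigned;
}
```

With a binary of millions of functions and a profile of tens of thousands, the second loop does work proportional to the profile size rather than to the binary size, which is the effect the summary describes.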

On a large binary (1.9M functions) with a large YAML profile (121MB, 30k functions),
this reduces the runtime of the profile steps:
pre-process profile data: 12.4953s -> 10.7123s
process profile data: 9.8195s -> 5.6639s

Compared to fdata profile reading:
pre-process profile data: 8.0268s
process profile data: 1.0265s
process profile data pre-CFG: 0.1644s
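
For reference, the YAML timings above work out to roughly a 14% reduction for pre-processing and a 42% reduction for processing. The helpers below are a small sketch for reproducing that arithmetic; `reduction` and `speedup` are names introduced here and are not part of BOLT.

```cpp
#include <cassert>
#include <cmath>

// Fraction of runtime removed when going from Before to After (seconds).
double reduction(double Before, double After) {
  return (Before - After) / Before;
}

// Speedup factor: how many times faster After is than Before.
double speedup(double Before, double After) { return Before / After; }
```

For example, `reduction(9.8195, 5.6639)` is about 0.423 and `speedup(9.8195, 5.6639)` is about 1.73.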

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

Amir created this revision. Sep 5 2023, 6:54 PM

Herald added a reviewer: rafauler. Sep 5 2023, 6:54 PM

Herald added a reviewer: maksfb.

Herald added a project: Restricted Project.

Herald added subscribers: treapster, ayermolo.

Amir requested review of this revision. Sep 5 2023, 6:54 PM

Herald added a project: Restricted Project. Sep 5 2023, 6:54 PM

Herald added subscribers: llvm-commits, yota9.

Harbormaster completed remote builds in B256665: Diff 555960. Sep 5 2023, 7:00 PM

Nice - good improvement. Do you know where the majority of the profile-processing time is spent with this change?

bolt/include/bolt/Profile/YAMLProfileReader.h
84	Is the change to always return `true` necessary?
bolt/lib/Profile/YAMLProfileReader.cpp
52	This should always return non-null value.

Address comments

maksfb accepted this revision. Sep 11 2023, 3:28 PM

This revision is now accepted and ready to land. Sep 11 2023, 3:28 PM

Amir marked an inline comment as done. Sep 11 2023, 3:28 PM

Amir added inline comments.

bolt/include/bolt/Profile/YAMLProfileReader.h
84	Not really, just to avoid curly braces in `matchProfile` lambda. Reverted this change back to returning `void` and added curly braces down below.

In D159460#4643578, @maksfb wrote:

Nice - good improvement. Do you know where the majority of the profile-processing time is spent with this change?

Here's what perf report shows when narrowed down to YAMLProfileReader class:

Samples: 49M of event 'cycles', Event count (approx.): 19618

-  100.00%              llvm-bolt
   -  100.00%              llvm-bolt
      -   49.06%              [.] llvm::bolt::YAMLProfileReader::mayHaveProfileData
         +   18.37%              [.] llvm::StringMapImpl::FindKey
         +   15.31%              [.] llvm::bolt::YAMLProfileReader::mayHaveProfileDat
         +    6.72%              [.] llvm::bolt::BinaryFunction::forEachName<llvm::bo
         +    3.94%              [.] operator delete@plt
         +    2.66%              [.] llvm::bolt::RewriteInstance::selectFunctionsToPr
         +    2.06%              [.] llvm::bolt::getLTOCommonName
         +    0.01%              [k] asm_sysvec_apic_timer_interrupt
      +   25.63%              [.] llvm::bolt::YAMLProfileReader::parseFunctionProfile
      +   22.19%              [.] llvm::bolt::YAMLProfileReader::buildNameMaps
      +    1.56%              [.] llvm::bolt::YAMLProfileReader::readProfile
      +    0.68%              [.] llvm::bolt::YAMLProfileReader::matchProfileToFunction
      +    0.32%              [.] llvm::bolt::YAMLProfileReader::parseFunctionProfile(llvm::bolt::BinaryFunction&
      +    0.32%              [.] llvm::bolt::YAMLProfileReader::~YAMLProfileReader
      +    0.24%              [.] llvm::bolt::YAMLProfileReader::preprocessProfile

Quick analysis:

mayHaveProfileData is not called from profile {pre-,}processing. It's only called from RI::selectFunctionsToProcess.
parseFunctionProfile is invoked once per profile function and does the heavy lifting of attaching the profile to the CFG.
buildNameMaps loops over profile functions and binary functions, hence the cost. I tried to reduce this overhead as described in the summary.
readProfile actually reads the YAML profile, but the bulk of the overhead is in llvm::yaml code; see below.
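
To illustrate why a name-map build of this kind scales with the number of names it visits, here is a minimal sketch of grouping names by an LTO common prefix. It is an illustration under stated assumptions, not BOLT's implementation: `commonName` and `groupByCommonName` are hypothetical helpers, and the single `.lto_priv.` marker is assumed for the example (the real getLTOCommonName may recognise other suffixes).

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <unordered_map>
#include <vector>

// Assumed marker for this sketch only.
constexpr std::string_view LTOMarker = ".lto_priv.";

// Return the name up to and including the marker, or nullopt if the marker
// does not occur in Name.
std::optional<std::string> commonName(std::string_view Name) {
  std::size_t Pos = Name.find(LTOMarker);
  if (Pos == std::string_view::npos)
    return std::nullopt;
  return std::string(Name.substr(0, Pos + LTOMarker.size()));
}

// Bucket names by common prefix in a single pass. The cost is one substring
// search plus one hash-map insertion per name that carries the marker.
std::unordered_map<std::string, std::vector<std::string>>
groupByCommonName(const std::vector<std::string> &Names) {
  std::unordered_map<std::string, std::vector<std::string>> Groups;
  for (const std::string &Name : Names)
    if (std::optional<std::string> Common = commonName(Name))
      Groups[*Common].push_back(Name);
  return Groups;
}
```

Running a pass like this over every symbol in a binary with millions of functions costs far more than running it over the profile's functions alone, which is consistent with buildNameMaps remaining visible in the report above.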

llvm::yaml methods:

Samples: 49M of event 'cycles', Event count (approx.): 298169

-  100.00%              llvm-bolt
   -  100.00%              llvm-bolt
      +   15.42%              [.] llvm::yaml::Scanner::peekNext
      +    8.81%              [.] llvm::yaml::Scanner::scanPlainScalar
      +    8.18%              [.] llvm::yaml::Scanner::removeStaleSimpleKeyCandidates
      +    7.88%              [.] llvm::yaml::Scanner::fetchMoreTokens
      +    4.94%              [.] llvm::yaml::Document::parseBlockNode
      +    4.86%              [.] llvm::yaml::Scanner::getNext
      +    4.53%              [.] llvm::yaml::Scanner::scanToNextToken
      +    4.49%              [.] llvm::yaml::Input::createHNodes

As you can see createHNodes is no longer the most expensive part. I don't see easy optimization opportunities here.

This revision was landed with ongoing or failed builds. Sep 11 2023, 4:08 PM

Closed by commit rG7b750943d722: [BOLT][NFC] Speedup YAML profile processing (authored by Amir).

This revision was automatically updated to reflect the committed changes.

Amir added a commit: rG7b750943d722: [BOLT][NFC] Speedup YAML profile processing.

Amir mentioned this in D159529: [BOLT][YAML] Only read first profile per function. Sep 18 2023, 6:17 PM

Amir mentioned this in rG6a1cf545cc0b: [BOLT][YAML] Only read first profile per function. Sep 18 2023, 8:41 PM

Revision Contents

Path	Size
bolt/include/bolt/Profile/YAMLProfileReader.h	11 lines
bolt/lib/Profile/YAMLProfileReader.cpp	139 lines

Diff 556502

bolt/include/bolt/Profile/YAMLProfileReader.h

(55 lines not shown)	private:
/// is attributed.		/// is attributed.
FunctionSet ProfiledFunctions;		FunctionSet ProfiledFunctions;

/// For LTO symbol resolution.		/// For LTO symbol resolution.
/// Map a common LTO prefix to a list of YAML profiles matching the prefix.		/// Map a common LTO prefix to a list of YAML profiles matching the prefix.
StringMap<std::vector<yaml::bolt::BinaryFunctionProfile *>> LTOCommonNameMap;		StringMap<std::vector<yaml::bolt::BinaryFunctionProfile *>> LTOCommonNameMap;

/// Map a common LTO prefix to a set of binary functions.		/// Map a common LTO prefix to a set of binary functions.
StringMap<FunctionSet> LTOCommonNameFunctionMap;		StringMap<std::unordered_set<BinaryFunction *>> LTOCommonNameFunctionMap;

/// Strict matching of a name in a profile to its contents.		/// Function names in profile.
StringMap<yaml::bolt::BinaryFunctionProfile *> ProfileNameToProfile;		StringSet<> ProfileFunctionNames;

		/// BinaryFunction pointers indexed by YamlBP functions.
		std::vector<BinaryFunction *> ProfileBFs;

/// Populate \p Function profile with the one supplied in YAML format.		/// Populate \p Function profile with the one supplied in YAML format.
bool parseFunctionProfile(BinaryFunction &Function,		bool parseFunctionProfile(BinaryFunction &Function,
const yaml::bolt::BinaryFunctionProfile &YamlBF);		const yaml::bolt::BinaryFunctionProfile &YamlBF);

/// Infer function profile from stale data (collected on older binaries).		/// Infer function profile from stale data (collected on older binaries).
bool inferStaleProfile(BinaryFunction &Function,		bool inferStaleProfile(BinaryFunction &Function,
const yaml::bolt::BinaryFunctionProfile &YamlBF);		const yaml::bolt::BinaryFunctionProfile &YamlBF);

/// Initialize maps for profile matching.		/// Initialize maps for profile matching.
void buildNameMaps(std::map<uint64_t, BinaryFunction> &Functions);		void buildNameMaps(BinaryContext &BC);

/// Update matched YAML -> BinaryFunction pair.		/// Update matched YAML -> BinaryFunction pair.
void matchProfileToFunction(yaml::bolt::BinaryFunctionProfile &YamlBF,		void matchProfileToFunction(yaml::bolt::BinaryFunctionProfile &YamlBF,
		maksfb: Is the change to always return `true` necessary?
		Amir: Not really, just to avoid curly braces in `matchProfile` lambda. Reverted this change back to returning `void` and added curly braces down below.
BinaryFunction &BF) {		BinaryFunction &BF) {
if (YamlBF.Id >= YamlProfileToFunction.size())		if (YamlBF.Id >= YamlProfileToFunction.size())
YamlProfileToFunction.resize(YamlBF.Id + 1);		YamlProfileToFunction.resize(YamlBF.Id + 1);
YamlProfileToFunction[YamlBF.Id] = &BF;		YamlProfileToFunction[YamlBF.Id] = &BF;
YamlBF.Used = true;		YamlBF.Used = true;

assert(!ProfiledFunctions.count(&BF) &&		assert(!ProfiledFunctions.count(&BF) &&
"function already has an assigned profile");		"function already has an assigned profile");
(11 lines not shown)

bolt/lib/Profile/YAMLProfileReader.cpp

//===- bolt/Profile/YAMLProfileReader.cpp - YAML profile de-serializer ----===//		//===- bolt/Profile/YAMLProfileReader.cpp - YAML profile de-serializer ----===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "bolt/Profile/YAMLProfileReader.h"		#include "bolt/Profile/YAMLProfileReader.h"
#include "bolt/Core/BinaryBasicBlock.h"		#include "bolt/Core/BinaryBasicBlock.h"
#include "bolt/Core/BinaryFunction.h"		#include "bolt/Core/BinaryFunction.h"
#include "bolt/Passes/MCF.h"		#include "bolt/Passes/MCF.h"
#include "bolt/Profile/ProfileYAMLMapping.h"		#include "bolt/Profile/ProfileYAMLMapping.h"
#include "bolt/Utils/Utils.h"		#include "bolt/Utils/Utils.h"
		#include "llvm/ADT/STLExtras.h"
#include "llvm/Support/CommandLine.h"		#include "llvm/Support/CommandLine.h"

using namespace llvm;		using namespace llvm;

namespace opts {		namespace opts {

extern cl::opt<unsigned> Verbosity;		extern cl::opt<unsigned> Verbosity;
extern cl::OptionCategory BoltOptCategory;		extern cl::OptionCategory BoltOptCategory;
(17 lines not shown)	if (auto MB = MemoryBuffer::getFileOrSTDIN(Filename)) {
StringRef Buffer = (*MB)->getBuffer();		StringRef Buffer = (*MB)->getBuffer();
return Buffer.startswith("---\n");		return Buffer.startswith("---\n");
} else {		} else {
report_error(Filename, MB.getError());		report_error(Filename, MB.getError());
}		}
return false;		return false;
}		}

void YAMLProfileReader::buildNameMaps(		void YAMLProfileReader::buildNameMaps(BinaryContext &BC) {
std::map<uint64_t, BinaryFunction> &Functions) {		auto lookupFunction = [&](StringRef Name) -> BinaryFunction * {
		if (BinaryData *BD = BC.getBinaryDataByName(Name))
		return BC.getFunctionForSymbol(BD->getSymbol());
		maksfb: This should always return non-null value.
		return nullptr;
		};

		ProfileBFs.reserve(YamlBP.Functions.size());

for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions) {		for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions) {
StringRef Name = YamlBF.Name;		StringRef Name = YamlBF.Name;
const size_t Pos = Name.find("(*");		const size_t Pos = Name.find("(*");
if (Pos != StringRef::npos)		if (Pos != StringRef::npos)
Name = Name.substr(0, Pos);		Name = Name.substr(0, Pos);
ProfileNameToProfile[Name] = &YamlBF;		ProfileFunctionNames.insert(Name);
		ProfileBFs.push_back(lookupFunction(Name));
if (const std::optional<StringRef> CommonName = getLTOCommonName(Name))		if (const std::optional<StringRef> CommonName = getLTOCommonName(Name))
LTOCommonNameMap[*CommonName].push_back(&YamlBF);		LTOCommonNameMap[*CommonName].push_back(&YamlBF);
}		}
for (auto &BFI : Functions) {		for (auto &[Symbol, BF] : BC.SymbolToFunctionMap) {
const BinaryFunction &Function = BFI.second;		StringRef Name = Symbol->getName();
for (StringRef Name : Function.getNames())
if (const std::optional<StringRef> CommonName = getLTOCommonName(Name))		if (const std::optional<StringRef> CommonName = getLTOCommonName(Name))
LTOCommonNameFunctionMap[*CommonName].insert(&Function);		LTOCommonNameFunctionMap[*CommonName].insert(BF);
}		}
}		}

bool YAMLProfileReader::hasLocalsWithFileName() const {		bool YAMLProfileReader::hasLocalsWithFileName() const {
return llvm::any_of(ProfileNameToProfile.keys(), [](StringRef FuncName) {		return llvm::any_of(ProfileFunctionNames.keys(), [](StringRef FuncName) {
return FuncName.count('/') == 2 && FuncName[0] != '/';		return FuncName.count('/') == 2 && FuncName[0] != '/';
});		});
}		}

bool YAMLProfileReader::parseFunctionProfile(		bool YAMLProfileReader::parseFunctionProfile(
BinaryFunction &BF, const yaml::bolt::BinaryFunctionProfile &YamlBF) {		BinaryFunction &BF, const yaml::bolt::BinaryFunctionProfile &YamlBF) {
BinaryContext &BC = BF.getBinaryContext();		BinaryContext &BC = BF.getBinaryContext();

(201 lines not shown)	return make_error<StringError>(
inconvertibleErrorCode());		inconvertibleErrorCode());

if (YamlBP.Header.EventNames.find(',') != StringRef::npos)		if (YamlBP.Header.EventNames.find(',') != StringRef::npos)
return make_error<StringError>(		return make_error<StringError>(
Twine("multiple events in profile are not supported"),		Twine("multiple events in profile are not supported"),
inconvertibleErrorCode());		inconvertibleErrorCode());

// Match profile to function based on a function name.		// Match profile to function based on a function name.
buildNameMaps(BC.getBinaryFunctions());		buildNameMaps(BC);

// Preliminary assign function execution count.		// Preliminary assign function execution count.
for (auto &KV : BC.getBinaryFunctions()) {		for (auto [YamlBF, BF] : llvm::zip_equal(YamlBP.Functions, ProfileBFs))
BinaryFunction &BF = KV.second;		if (BF)
for (StringRef Name : BF.getNames()) {		BF->setExecutionCount(YamlBF.ExecCount);
auto PI = ProfileNameToProfile.find(Name);
if (PI != ProfileNameToProfile.end()) {
yaml::bolt::BinaryFunctionProfile &YamlBF = *PI->getValue();
BF.setExecutionCount(YamlBF.ExecCount);
break;
}
}
}

return Error::success();		return Error::success();
}		}

bool YAMLProfileReader::mayHaveProfileData(const BinaryFunction &BF) {		bool YAMLProfileReader::mayHaveProfileData(const BinaryFunction &BF) {
for (StringRef Name : BF.getNames()) {		for (StringRef Name : BF.getNames())
if (ProfileNameToProfile.contains(Name))		if (ProfileFunctionNames.contains(Name))
return true;		return true;
		for (StringRef Name : BF.getNames()) {
if (const std::optional<StringRef> CommonName = getLTOCommonName(Name)) {		if (const std::optional<StringRef> CommonName = getLTOCommonName(Name)) {
if (LTOCommonNameMap.contains(*CommonName))		if (LTOCommonNameMap.contains(*CommonName))
return true;		return true;
}		}
}		}

return false;		return false;
}		}

Error YAMLProfileReader::readProfile(BinaryContext &BC) {		Error YAMLProfileReader::readProfile(BinaryContext &BC) {
YamlProfileToFunction.resize(YamlBP.Functions.size() + 1);		YamlProfileToFunction.resize(YamlBP.Functions.size() + 1);

auto profileMatches = [](const yaml::bolt::BinaryFunctionProfile &Profile,		auto profileMatches = [](const yaml::bolt::BinaryFunctionProfile &Profile,
BinaryFunction &BF) {		BinaryFunction &BF) {
if (opts::IgnoreHash)		if (opts::IgnoreHash)
return Profile.NumBasicBlocks == BF.size();		return Profile.NumBasicBlocks == BF.size();
return Profile.Hash == static_cast<uint64_t>(BF.getHash());		return Profile.Hash == static_cast<uint64_t>(BF.getHash());
};		};

// We have to do 2 passes since LTO introduces an ambiguity in function		// We have to do 2 passes since LTO introduces an ambiguity in function
// names. The first pass assigns profiles that match 100% by name and		// names. The first pass assigns profiles that match 100% by name and
// by hash. The second pass allows name ambiguity for LTO private functions.		// by hash. The second pass allows name ambiguity for LTO private functions.
for (auto &BFI : BC.getBinaryFunctions()) {		for (auto [YamlBF, BF] : llvm::zip_equal(YamlBP.Functions, ProfileBFs)) {
BinaryFunction &Function = BFI.second;		if (!BF)
		continue;
		BinaryFunction &Function = *BF;
// Clear function call count that may have been set while pre-processing		// Clear function call count that may have been set while pre-processing
// the profile.		// the profile.
Function.setExecutionCount(BinaryFunction::COUNT_NO_PROFILE);		Function.setExecutionCount(BinaryFunction::COUNT_NO_PROFILE);

// Recompute hash once per function.		// Recompute hash once per function.
if (!opts::IgnoreHash)		if (!opts::IgnoreHash)
Function.computeHash(YamlBP.Header.IsDFSOrder);		Function.computeHash(YamlBP.Header.IsDFSOrder);

for (StringRef FunctionName : Function.getNames()) {		if (profileMatches(YamlBF, Function))
auto PI = ProfileNameToProfile.find(FunctionName);
if (PI == ProfileNameToProfile.end())
continue;

yaml::bolt::BinaryFunctionProfile &YamlBF = *PI->getValue();
if (profileMatches(YamlBF, Function)) {
matchProfileToFunction(YamlBF, Function);		matchProfileToFunction(YamlBF, Function);
break;
}
}		}
}

for (auto &BFI : BC.getBinaryFunctions()) {
BinaryFunction &Function = BFI.second;

if (ProfiledFunctions.count(&Function))		for (auto &[CommonName, LTOProfiles]: LTOCommonNameMap) {
		if (!LTOCommonNameFunctionMap.contains(CommonName))
continue;		continue;
		std::unordered_set<BinaryFunction *> &Functions =
for (StringRef FunctionName : Function.getNames()) {		LTOCommonNameFunctionMap[CommonName];
const std::optional<StringRef> CommonName =		// Return true if a given profile is matched to one of BinaryFunctions with
getLTOCommonName(FunctionName);		// matching LTO common name.
if (CommonName) {		auto matchProfile = [&](yaml::bolt::BinaryFunctionProfile *YamlBF) {
auto I = LTOCommonNameMap.find(*CommonName);
if (I == LTOCommonNameMap.end())
continue;

bool ProfileMatched = false;
std::vector<yaml::bolt::BinaryFunctionProfile *> &LTOProfiles =
I->getValue();
for (yaml::bolt::BinaryFunctionProfile *YamlBF : LTOProfiles) {
if (YamlBF->Used)		if (YamlBF->Used)
continue;		return false;
if ((ProfileMatched = profileMatches(*YamlBF, Function))) {		for (BinaryFunction *BF : Functions) {
matchProfileToFunction(*YamlBF, Function);		if (!ProfiledFunctions.count(BF) && profileMatches(*YamlBF, *BF)) {
break;		matchProfileToFunction(*YamlBF, *BF);
		return true;
}		}
}		}
if (ProfileMatched)		return false;
break;		};
		bool ProfileMatched = llvm::any_of(LTOProfiles, matchProfile);

// If there's only one function with a given name, try to		// If there's only one function with a given name, try to match it
// match it partially.		// partially.
if (LTOProfiles.size() == 1 &&		if (!ProfileMatched && LTOProfiles.size() == 1 && Functions.size() == 1 &&
LTOCommonNameFunctionMap[*CommonName].size() == 1 &&		!LTOProfiles.front()->Used &&
!LTOProfiles.front()->Used) {		!ProfiledFunctions.count(*Functions.begin()))
matchProfileToFunction(*LTOProfiles.front(), Function);		matchProfileToFunction(*LTOProfiles.front(), **Functions.begin());
break;
}		}
} else {
auto PI = ProfileNameToProfile.find(FunctionName);
if (PI == ProfileNameToProfile.end())
continue;

yaml::bolt::BinaryFunctionProfile &YamlBF = *PI->getValue();		for (auto [YamlBF, BF] : llvm::zip_equal(YamlBP.Functions, ProfileBFs))
if (!YamlBF.Used) {		if (!YamlBF.Used && BF && !ProfiledFunctions.count(BF))
matchProfileToFunction(YamlBF, Function);		matchProfileToFunction(YamlBF, *BF);
break;
}
}
}
}

for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions)		for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions)
if (!YamlBF.Used && opts::Verbosity >= 1)		if (!YamlBF.Used && opts::Verbosity >= 1)
errs() << "BOLT-WARNING: profile ignored for function " << YamlBF.Name		errs() << "BOLT-WARNING: profile ignored for function " << YamlBF.Name
<< '\n';		<< '\n';

// Set for parseFunctionProfile().		// Set for parseFunctionProfile().
NormalizeByInsnCount = usesEvent("cycles") || usesEvent("instructions");		NormalizeByInsnCount = usesEvent("cycles") || usesEvent("instructions");
(26 lines not shown)