This is an archive of the discontinued LLVM Phabricator instance.

Differential D92621

[SampleFDO] Store fixed length MD5 in NameTable instead of using ULEB128 if MD5 is used.
Closed, Public

Authored by wmi on Dec 3 2020, 5:17 PM.

Details

Reviewers

davidxl
wenlei
hoy

Commits

rG64e768536889: [SampleFDO] Store fixed length MD5 in NameTable instead of using ULEB128 if

Summary

Currently, during sample profile loading, the whole NameTable has to be loaded up front before any name string can be retrieved. That is because the NameTable is stored using ULEB128 encoding and cannot be indexed directly like an array. However, when MD5 is used to represent names in the NameTable, every entry has a fixed length. If MD5 names are stored as uint64_t instead of ULEB128, the NameTable can be accessed like an array, and in many cases only part of it has to be read. This reduces compile time, especially when a small source file is compiled.
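The difference in access pattern can be sketched as follows. This is a minimal illustration, not the LLVM implementation: `decodeOneULEB128`, `readULEB128Entry` and `readFixedEntry` are hypothetical helpers, the buffer is assumed to contain only the encoded entries, and fixed-length values are assumed to be stored little-endian.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: decode one ULEB128 value starting at Pos and advance
// Pos past it. Each byte carries 7 payload bits; the high bit means "more".
inline uint64_t decodeOneULEB128(const std::vector<uint8_t> &Buf, size_t &Pos) {
  uint64_t Value = 0;
  for (unsigned Shift = 0; Shift < 64; Shift += 7) {
    assert(Pos < Buf.size() && "truncated ULEB128 value");
    uint8_t Byte = Buf[Pos++];
    Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0)
      break;
  }
  return Value;
}

// Variable-length entries: entry Idx can only be located by decoding every
// entry before it, so the cost grows with Idx.
inline uint64_t readULEB128Entry(const std::vector<uint8_t> &Buf, size_t Idx) {
  size_t Pos = 0;
  uint64_t Value = 0;
  for (size_t I = 0; I <= Idx; ++I)
    Value = decodeOneULEB128(Buf, Pos);
  return Value;
}

// Fixed-length entries: entry Idx lives at byte offset Idx * 8, so it can be
// read directly without touching any other entry.
inline uint64_t readFixedEntry(const std::vector<uint8_t> &Buf, size_t Idx) {
  size_t Off = Idx * sizeof(uint64_t);
  assert(Off + sizeof(uint64_t) <= Buf.size() && "index out of range");
  uint64_t Value = 0;
  for (size_t B = 0; B < sizeof(uint64_t); ++B)
    Value |= static_cast<uint64_t>(Buf[Off + B]) << (8 * B);
  return Value;
}
```

With the fixed-length layout, a compilation that needs only a handful of names reads only those 8-byte slots, which is where the savings for small source files would come from.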

We find that after this change, the elapsed time of a distributed build of a large application is reduced by 5%, and the cumulative CPU time used for the build is also reduced by 5%. The size of the profile is slightly reduced by this change (~0.2%), which also indicates that encoding MD5 in ULEB128 doesn't save storage space.
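The size observation is consistent with how ULEB128 works: each encoded byte carries 7 payload bits, so a value that uses all 64 bits needs 10 bytes rather than 8. The sketch below computes the encoded length; the helper name `uleb128Size` is made up for this illustration. Assuming the MD5-derived identifiers are roughly uniform over 64 bits, about half of them have the top bit set and take 10 bytes, and nearly all of the rest take 9.

```cpp
#include <cassert>
#include <cstdint>

// Number of bytes ULEB128 needs to encode Value: one byte per 7 bits of
// payload, with a minimum of one byte for zero.
inline unsigned uleb128Size(uint64_t Value) {
  unsigned Size = 0;
  do {
    Value >>= 7;
    ++Size;
  } while (Value != 0);
  return Size;
}
```

Only values below 2^56 fit in 8 bytes or fewer, so the variable-length encoding pays off for small integers but not for hash values.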

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

wmi created this revision. Dec 3 2020, 5:17 PM

Herald added a subscriber: hiraditya. Dec 3 2020, 5:17 PM

wmi requested review of this revision. Dec 3 2020, 5:17 PM

Herald added a project: Restricted Project. Dec 3 2020, 5:17 PM

wmi retitled this revision from "[SampleFDO] [SampleFDO] Store fixed length MD5 in NameTable instead of using ULEB128 if MD5 is used." to "[SampleFDO] Store fixed length MD5 in NameTable instead of using ULEB128 if MD5 is used.". Dec 3 2020, 5:17 PM

davidxl added inline comments. Dec 3 2020, 7:38 PM

llvm/lib/ProfileData/SampleProfReader.cpp
802	Does FixLengthMD5 imply IsMD5, so IsMD5 is enough?

hoy added a subscriber: hoy. Dec 3 2020, 8:57 PM

hoy added inline comments.

llvm/include/llvm/ProfileData/SampleProf.h
171	Nit: SecFlagFixedLengthMD5?
llvm/lib/ProfileData/SampleProfWriter.cpp
177–179	I guess the writer does this unconditionally because we no longer want the variant encoding. The reader supports both for downwards compatibility?

wmi added inline comments. Dec 3 2020, 10:12 PM

llvm/include/llvm/ProfileData/SampleProf.h
171	Will change it.
llvm/lib/ProfileData/SampleProfReader.cpp
802	That is right. Will fix it.
llvm/lib/ProfileData/SampleProfWriter.cpp
177–179	Yes, you are right.

Address David and Hongtao's comments.

hoy added inline comments. Dec 4 2020, 3:00 PM

llvm/lib/ProfileData/SampleProfReader.cpp
385	Can this be just `readStringIndex(NameTable)` since `NameTable` is preallocated?

wmi added inline comments. Dec 4 2020, 4:28 PM

llvm/lib/ProfileData/SampleProfReader.cpp
385	Good point. That makes the code simpler.

Address Hongtao's comment.

hoy accepted this revision. Dec 4 2020, 11:47 PM

hoy added inline comments.

llvm/lib/ProfileData/SampleProfReader.cpp
742	Nit: this might still be helpful to the non-fixed MD5 path.
782	Nit: should `NameTable` be always empty right before here? Could an assert be useful?

This revision is now accepted and ready to land. Dec 4 2020, 11:47 PM

hoy removed a reviewer: hoyFB. Dec 4 2020, 11:48 PM

We find that after this change, the elapsed time of a distributed build of a large application is reduced by 5%, and the cumulative CPU time used for the build is also reduced by 5%.

This is great, thanks for the patch! Just to clarify, did you mean 5% of the end-to-end ThinLTO+AutoFDO build? I'm curious what share of total time you saw spent on profile reading, and how that share differs between md5 and non-md5 profiles. Currently we mostly use non-md5 profiles, on the assumption that md5 profiles are mainly about saving profile size.

llvm/lib/ProfileData/SampleProfReader.cpp
385	Actually, it looks like both the fixed-length path and the ULEB128 path need to access `NameTable[*Idx]` in addition to calling `readStringIndex(NameTable)`. What about just calling `SampleProfileReaderBinary::readStringFromTable` for both and diverging only after that: return for the ULEB128 path, check for an error or an empty StringRef for the fixed-length path?
393	If we return an error code before reaching this point, the `Data` pointer will not be restored. Is that OK? What about using RAII to make sure restoration always happens? I also noticed this is not the only place where we save and restore the `Data` pointer, but the other instances are not part of this change.
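The RAII idea raised here could look roughly like the sketch below. `ScopedRestore` and `ByteReader` are hypothetical types written for this illustration; LLVM ships a similar utility, `llvm::SaveAndRestore`, which is not used here so the example stays self-contained. `readAt` stands in for a reader method that temporarily repositions its cursor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical guard: remembers the value of a variable on construction and
// writes it back on destruction, including on early returns.
template <typename T> class ScopedRestore {
  T &Ref;
  T Saved;

public:
  explicit ScopedRestore(T &Var) : Ref(Var), Saved(Var) {}
  ~ScopedRestore() { Ref = Saved; }
  ScopedRestore(const ScopedRestore &) = delete;
  ScopedRestore &operator=(const ScopedRestore &) = delete;
};

// Stand-in for a reader with a cursor. readAt() jumps to Offset, reads one
// byte into Out, and leaves Cursor where it was whether or not it succeeds.
struct ByteReader {
  const uint8_t *Begin;
  const uint8_t *End;
  const uint8_t *Cursor;

  bool readAt(size_t Offset, uint8_t &Out) {
    ScopedRestore<const uint8_t *> Guard(Cursor);
    size_t Size = static_cast<size_t>(End - Begin);
    Cursor = Begin + (Offset < Size ? Offset : Size); // clamp to End
    if (Cursor == End)
      return false; // early return: Guard still restores Cursor
    Out = *Cursor;
    return true;
  }
};
```

The guard makes the restore unconditional, at the cost of one extra object per call; whether that is worth it depends on whether callers ever continue reading after an error.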

In D92621#2435924, @wenlei wrote:

We find that after this change, the elapsed time of a distributed build of a large application is reduced by 5%, and the cumulative CPU time used for the build is also reduced by 5%.

This is great, thanks for the patch! Just to clarify, did you mean 5% of the end-to-end ThinLTO+AutoFDO build? I'm curious what share of total time you saw spent on profile reading, and how that share differs between md5 and non-md5 profiles. Currently we mostly use non-md5 profiles, on the assumption that md5 profiles are mainly about saving profile size.

Yes, it is a 5% saving in end-to-end ThinLTO + AutoFDO build time.

The share of time spent reading the profile differs from module to module. It is negligible for a large source module but can be significant for a small one. I tried it on a very small file:

The build time using the md5 profile was 0.55 seconds.
The build time using the non-md5 profile was 0.75 seconds.
The build time using the fixlenmd5 profile was 0.15 seconds.
The build time without any profile was 0.1 seconds.

In a large project many source files are small or medium-sized, so the build time for small files matters, especially for build CPU resources. It also matters for end-to-end build time, though usually less. In the experiment I did (with a 20M binary profile), the end-to-end build time saving was about the same as the aggregate build CPU saving (both were 5%), but in our experience with larger profiles (around 300M), the change affects aggregate build CPU resources more than end-to-end build time.
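Subtracting the no-profile baseline from the four timings reported for the small file gives a rough estimate of the cost attributable to each profile format. The snippet below is only that arithmetic; the constant and function names are made up for the illustration, and attributing the whole difference to profile handling is an assumption.

```cpp
#include <cassert>
#include <cmath>

// Build times in seconds for the small file, as reported in the comment.
constexpr double NoProfile = 0.10;
constexpr double NonMD5 = 0.75;
constexpr double MD5 = 0.55;
constexpr double FixLenMD5 = 0.15;

// Time attributed to the profile: total build time minus the baseline.
constexpr double profileOverhead(double BuildTime) {
  return BuildTime - NoProfile;
}

// Fraction of the ULEB128-MD5 profile overhead removed by fixed-length MD5.
inline double overheadReduction() {
  return 1.0 - profileOverhead(FixLenMD5) / profileOverhead(MD5);
}
```

Under this reading the overhead is about 0.65 s for the non-md5 profile, 0.45 s for the md5 profile, and 0.05 s for the fixlenmd5 profile, roughly a ninefold reduction relative to md5.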

llvm/lib/ProfileData/SampleProfReader.cpp
385	For the fixed-length MD5 path, we want a reference to NameTable[*Idx] so that we can update it after reading the real name from memory. This differs from the ULEB128 path, where NameTable has been populated up front and readStringIndex will only return a copy of the StringRef.
393	When an error happens, the reader/writer won't proceed and will return to the caller of the read/write API, so the inconsistent Data pointer doesn't matter. We don't want to issue a fatal error because the caller may want to skip the current problematic profile and proceed to the next one.
742	Good catch.
782	Currently NameTable is always empty before here, but it is possible to have multiple NameTable sections. I have a follow-up NFC to support that.
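The reference-versus-copy point can be illustrated with a small lazily filled table. This is a sketch under assumptions, not the reader's actual code: `LazyNameTable` and its members are hypothetical, a `const std::string *` slot plays the role of the `StringRef` element, and the raw table is assumed to hold little-endian 64-bit values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

class LazyNameTable {
  const uint8_t *RawStart;                // start of the fixed-length entries
  std::vector<const std::string *> Slots; // nullptr means "not decoded yet"
  std::vector<std::string> Storage;       // owns the decoded strings

public:
  LazyNameTable(const uint8_t *Raw, size_t NumEntries)
      : RawStart(Raw), Slots(NumEntries, nullptr) {
    // Reserving up front keeps pointers into Storage valid: push_back never
    // reallocates while at most NumEntries strings are added.
    Storage.reserve(NumEntries);
  }

  const std::string &get(size_t Idx) {
    assert(Idx < Slots.size() && "index out of range");
    // Bind a reference to the slot so the decoded result is cached in place;
    // a copy of the slot would be updated and then thrown away.
    const std::string *&Slot = Slots[Idx];
    if (!Slot) {
      uint64_t Hash = 0;
      for (size_t B = 0; B < sizeof(uint64_t); ++B)
        Hash |= static_cast<uint64_t>(RawStart[Idx * 8 + B]) << (8 * B);
      Storage.push_back(std::to_string(Hash));
      Slot = &Storage.back();
    }
    return *Slot;
  }

  size_t numDecoded() const { return Storage.size(); }
};
```

Each entry is decoded at most once, and entries that are never requested are never read, which matches the intent described for the fixed-length path.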

In D92621#2437658, @wmi wrote:

Yes, it is a 5% saving in end-to-end ThinLTO + AutoFDO build time.

The share of time spent reading the profile differs from module to module. It is negligible for a large source module but can be significant for a small one. I tried it on a very small file:

The build time using the md5 profile was 0.55 seconds.
The build time using the non-md5 profile was 0.75 seconds.
The build time using the fixlenmd5 profile was 0.15 seconds.
The build time without any profile was 0.1 seconds.

In a large project many source files are small or medium-sized, so the build time for small files matters, especially for build CPU resources. It also matters for end-to-end build time, though usually less. In the experiment I did (with a 20M binary profile), the end-to-end build time saving was about the same as the aggregate build CPU saving (both were 5%), but in our experience with larger profiles (around 300M), the change affects aggregate build CPU resources more than end-to-end build time.

Thanks for sharing those numbers, and it's good evidence that the name string in profile can be non-trivial for build time. I think CSSPGO may need to mitigate the overhead introduced by long names - we will get to that.

Closed by commit rG64e768536889: [SampleFDO] Store fixed length MD5 in NameTable instead of using ULEB128 if (authored by wmi). Dec 8 2020, 4:21 PM

This revision was automatically updated to reflect the committed changes.

wmi added a commit: rG64e768536889: [SampleFDO] Store fixed length MD5 in NameTable instead of using ULEB128 if.

Revision Contents

Path                                                                        Size
llvm/include/llvm/ProfileData/SampleProf.h                                  5 lines
llvm/include/llvm/ProfileData/SampleProfReader.h                            12 lines
llvm/include/llvm/ProfileData/SampleProfWriter.h                            3 lines
llvm/lib/ProfileData/SampleProfReader.cpp                                   56 lines
llvm/lib/ProfileData/SampleProfWriter.cpp                                   10 lines
llvm/test/Transforms/SampleProfile/Inputs/inline.fixlenmd5.extbinary.afdo
llvm/test/Transforms/SampleProfile/profile-format.ll                        2 lines
llvm/test/tools/llvm-profdata/show-prof-info.test                           2 lines

Diff 310384

llvm/include/llvm/ProfileData/SampleProf.h

@@ 159 lines hidden @@ enum class SecCommonFlags : uint32_t {
   SecFlagCompress = (1 << 0)
 };

 // Section specific flags are defined here.
 // !!!Note: Everytime a new enum class is created here, please add
 // a new check in verifySecFlag.
 enum class SecNameTableFlags : uint32_t {
   SecFlagInValid = 0,
-  SecFlagMD5Name = (1 << 0)
+  SecFlagMD5Name = (1 << 0),
+  // Store MD5 in fixed length instead of ULEB128 so NameTable can be
+  // accessed like an array.
+  SecFlagFixedLengthMD5 = (1 << 1)
 };
 enum class SecProfSummaryFlags : uint32_t {
   SecFlagInValid = 0,
   /// SecFlagPartial means the profile is for common/shared code.
   /// The common profile is usually merged from profiles collected
   /// from running other targets.
   SecFlagPartial = (1 << 0)
 };
@@ 732 lines hidden @@

llvm/include/llvm/ProfileData/SampleProfReader.h

@@ 612 lines hidden @@ protected:
   std::error_code readProfileSymbolList();

   virtual std::error_code readHeader() override;
   virtual std::error_code verifySPMagic(uint64_t Magic) override = 0;
   virtual std::error_code readOneSection(const uint8_t *Start, uint64_t Size,
                                          const SecHdrTableEntry &Entry);
   // placeholder for subclasses to dispatch their own section readers.
   virtual std::error_code readCustomSection(const SecHdrTableEntry &Entry) = 0;
+  virtual ErrorOr<StringRef> readStringFromTable() override;

   std::unique_ptr<ProfileSymbolList> ProfSymList;

   /// The table mapping from function name to the offset of its FunctionSample
   /// towards file start.
   DenseMap<StringRef, uint64_t> FuncOffsetTable;
   /// The set containing the functions to use when compiling a module.
   DenseSet<StringRef> FuncsToUse;
   /// Use all functions from the input profile.
   bool UseAllFuncs = true;

+  /// Use fixed length MD5 instead of ULEB128 encoding so NameTable doesn't
+  /// need to be read in up front and can be directly accessed using index.
+  bool FixedLengthMD5 = false;
+  /// The starting address of NameTable containing fixed length MD5.
+  const uint8_t *MD5NameMemStart = nullptr;
+
   /// If MD5 is used in NameTable section, the section saves uint64_t data.
   /// The uint64_t data has to be converted to a string and then the string
   /// will be used to initialize StringRef in NameTable.
   /// Note NameTable contains StringRef so it needs another buffer to own
   /// the string data. MD5StringBuf serves as the string buffer that is
   /// referenced by NameTable (vector of StringRef). We make sure
   /// the lifetime of MD5StringBuf is not shorter than that of NameTable.
   std::unique_ptr<std::vector<std::string>> MD5StringBuf;
@@ 11 lines hidden @@ public:
   /// Get the total size of header and all sections.
   uint64_t getFileSize();
   virtual bool dumpSectionInfo(raw_ostream &OS = dbgs()) override;

   /// Collect functions with definitions in Module \p M.
   void collectFuncsFrom(const Module &M) override;

   /// Return whether names in the profile are all MD5 numbers.
-  virtual bool useMD5() override {
-    assert(!NameTable.empty() && "NameTable should have been initialized");
-    return MD5StringBuf && !MD5StringBuf->empty();
-  }
+  virtual bool useMD5() override { return MD5StringBuf.get(); }

   virtual std::unique_ptr<ProfileSymbolList> getProfileSymbolList() override {
     return std::move(ProfSymList);
   };
 };

 class SampleProfileReaderExtBinary : public SampleProfileReaderExtBinaryBase {
 private:
@@ 109 lines hidden @@

llvm/include/llvm/ProfileData/SampleProfWriter.h

@@ 152 lines hidden @@ public:
   virtual void setToCompressAllSections() override;
   void setToCompressSection(SecType Type);
   virtual std::error_code writeSample(const FunctionSamples &S) override;

   // Set to use MD5 to represent string in NameTable.
   virtual void setUseMD5() override {
     UseMD5 = true;
     addSectionFlag(SecNameTable, SecNameTableFlags::SecFlagMD5Name);
+    // MD5 will be stored as plain uint64_t instead of variable-length
+    // quantity format in NameTable section.
+    addSectionFlag(SecNameTable, SecNameTableFlags::SecFlagFixedLengthMD5);
   }

   // Set the profile to be partial. It means the profile is for
   // common/shared code. The common profile is usually merged from
   // profiles collected from running other targets.
   virtual void setPartialProfile() override {
     addSectionFlag(SecProfSummary, SecProfSummaryFlags::SecFlagPartial);
   }
@@ 170 lines hidden @@

llvm/lib/ProfileData/SampleProfReader.cpp

@@ 361 lines hidden @@
 ErrorOr<StringRef> SampleProfileReaderBinary::readStringFromTable() {
   auto Idx = readStringIndex(NameTable);
   if (std::error_code EC = Idx.getError())
     return EC;

   return NameTable[*Idx];
 }

+ErrorOr<StringRef> SampleProfileReaderExtBinaryBase::readStringFromTable() {
+  if (!FixedLengthMD5)
+    return SampleProfileReaderBinary::readStringFromTable();
+
+  // read NameTable index.
+  auto Idx = readStringIndex(NameTable);
+  if (std::error_code EC = Idx.getError())
+    return EC;
+
+  // Check whether the name to be accessed has been accessed before,
+  // if not, read it from memory directly.
+  StringRef &SR = NameTable[*Idx];
+  if (SR.empty()) {
+    const uint8_t *SavedData = Data;
+    Data = MD5NameMemStart + ((*Idx) * sizeof(uint64_t));
+    auto FID = readUnencodedNumber<uint64_t>();
+    if (std::error_code EC = FID.getError())
+      return EC;
+    // Save the string converted from uint64_t in MD5StringBuf. All the
+    // references to the name are all StringRefs refering to the string
+    // in MD5StringBuf.
+    MD5StringBuf->push_back(std::to_string(*FID));
+    SR = MD5StringBuf->back();
+    Data = SavedData;
+  }
+  return SR;
+}
+
 ErrorOr<StringRef> SampleProfileReaderCompactBinary::readStringFromTable() {
   auto Idx = readStringIndex(NameTable);
   if (std::error_code EC = Idx.getError())
     return EC;

   return StringRef(NameTable[*Idx]);
 }

@@ 111 lines hidden @@ std::error_code SampleProfileReaderExtBinaryBase::readOneSection(
   End = Start + Size;
   switch (Entry.Type) {
   case SecProfSummary:
     if (std::error_code EC = readSummary())
       return EC;
     if (hasSecFlag(Entry, SecProfSummaryFlags::SecFlagPartial))
       Summary->setPartialProfile(true);
     break;
-  case SecNameTable:
-    if (std::error_code EC = readNameTableSec(
-            hasSecFlag(Entry, SecNameTableFlags::SecFlagMD5Name)))
+  case SecNameTable: {
+    FixedLengthMD5 =
+        hasSecFlag(Entry, SecNameTableFlags::SecFlagFixedLengthMD5);
+    bool UseMD5 = hasSecFlag(Entry, SecNameTableFlags::SecFlagMD5Name);
+    assert((!FixedLengthMD5 || UseMD5) &&
+           "If FixedLengthMD5 is true, UseMD5 has to be true");
+    if (std::error_code EC = readNameTableSec(UseMD5))
       return EC;
     break;
+  }
   case SecLBRProfile:
     if (std::error_code EC = readFuncProfiles())
       return EC;
     break;
   case SecFuncOffsetTable:
     if (std::error_code EC = readFuncOffsetTable())
       return EC;
     break;
@@ 224 lines hidden @@ std::error_code SampleProfileReaderBinary::readNameTable() {

   return sampleprof_error::success;
 }

 std::error_code SampleProfileReaderExtBinaryBase::readMD5NameTable() {
   auto Size = readNumber<uint64_t>();
   if (std::error_code EC = Size.getError())
     return EC;
-  NameTable.reserve(*Size);
   MD5StringBuf = std::make_unique<std::vector<std::string>>();
   MD5StringBuf->reserve(*Size);
+  if (FixedLengthMD5) {
+    // Preallocate and initialize NameTable so we can check whether a name
+    // index has been read before by checking whether the element in the
+    // NameTable is empty, meanwhile readStringIndex can do the boundary
+    // check using the size of NameTable.
+    NameTable.resize(*Size + NameTable.size());
+
+    MD5NameMemStart = Data;
+    Data = Data + (*Size) * sizeof(uint64_t);
+    return sampleprof_error::success;
+  }
+  NameTable.reserve(*Size);
   for (uint32_t I = 0; I < *Size; ++I) {
     auto FID = readNumber<uint64_t>();
     if (std::error_code EC = FID.getError())
       return EC;
     MD5StringBuf->push_back(std::to_string(*FID));
     // NameTable is a vector of StringRef. Here it is pushing back a
     // StringRef initialized with the last string in MD5stringBuf.
     NameTable.push_back(MD5StringBuf->back());
   }
   return sampleprof_error::success;
 }

 std::error_code SampleProfileReaderExtBinaryBase::readNameTableSec(bool IsMD5) {
   if (IsMD5)
     return readMD5NameTable();
   return SampleProfileReaderBinary::readNameTable();
 }

 std::error_code SampleProfileReaderCompactBinary::readNameTable() {
   auto Size = readNumber<uint64_t>();
   if (std::error_code EC = Size.getError())
     return EC;
@@ 85 lines hidden @@ static std::string getSecFlagsStr(const SecHdrTableEntry &Entry) {
   std::string Flags;
   if (hasSecFlag(Entry, SecCommonFlags::SecFlagCompress))
     Flags.append("{compressed,");
   else
     Flags.append("{");

   switch (Entry.Type) {
   case SecNameTable:
-    if (hasSecFlag(Entry, SecNameTableFlags::SecFlagMD5Name))
+    if (hasSecFlag(Entry, SecNameTableFlags::SecFlagFixedLengthMD5))
+      Flags.append("fixlenmd5,");
+    else if (hasSecFlag(Entry, SecNameTableFlags::SecFlagMD5Name))
       Flags.append("md5,");
     break;
   case SecProfSummary:
     if (hasSecFlag(Entry, SecProfSummaryFlags::SecFlagPartial))
       Flags.append("partial,");
     break;
   default:
     break;
@@ 599 lines hidden @@
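
The order of the two checks in getSecFlagsStr matters because a fixed-length MD5 name table has both flag bits set. A reduced sketch of that precedence follows; the bit values mirror the ones in this diff, while `NameTableFlag`, `nameTableFlagStr` and the closing-brace handling are assumptions made for the illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Bit values mirror SecFlagMD5Name and SecFlagFixedLengthMD5 from the diff.
enum NameTableFlag : uint32_t {
  MD5Name = 1u << 0,
  FixedLengthMD5 = 1u << 1,
};

inline std::string nameTableFlagStr(uint32_t Flags, bool Compressed) {
  std::string S = Compressed ? "{compressed," : "{";
  // Test the more specific flag first: a fixed-length table also has MD5Name
  // set, so testing MD5Name first would always report "md5".
  if (Flags & FixedLengthMD5)
    S += "fixlenmd5,";
  else if (Flags & MD5Name)
    S += "md5,";
  // Turn a trailing comma into the closing brace, or append one.
  if (S.back() == ',')
    S.back() = '}';
  else
    S += '}';
  return S;
}
```

For a compressed table with both bits set this yields `{compressed,fixlenmd5}`, the string the updated show-prof-info.test checks for.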

llvm/lib/ProfileData/SampleProfWriter.cpp

@@ 168 lines hidden @@
 std::error_code SampleProfileWriterExtBinaryBase::writeNameTable() {
   if (!UseMD5)
     return SampleProfileWriterBinary::writeNameTable();

   auto &OS = *OutputStream;
   std::set<StringRef> V;
   stablizeNameTable(V);

-  // Write out the name table.
+  // Write out the MD5 name table. We wrote unencoded MD5 so reader can
+  // retrieve the name using the name index without having to read the
+  // whole name table.
   encodeULEB128(NameTable.size(), OS);
-  for (auto N : V) {
-    encodeULEB128(MD5Hash(N), OS);
-  }
+  support::endian::Writer Writer(OS, support::little);
+  for (auto N : V)
+    Writer.write(MD5Hash(N));
   return sampleprof_error::success;
 }

 std::error_code SampleProfileWriterExtBinaryBase::writeNameTableSection(
     const StringMap<FunctionSamples> &ProfileMap) {
   for (const auto &I : ProfileMap) {
     addName(I.first());
     addNames(I.second);
@@ 473 lines hidden @@
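
The layout this writer produces can be pictured as a ULEB128 entry count followed by one 8-byte little-endian value per name. The functions below are a hypothetical stand-in for the writer, not LLVM code: they append to a byte vector instead of a `raw_ostream` and take already-computed hash values as input.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append Value in ULEB128 form: 7 payload bits per byte, high bit set on
// every byte except the last.
inline void appendULEB128(uint64_t Value, std::vector<uint8_t> &Out) {
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // more bytes follow
    Out.push_back(Byte);
  } while (Value != 0);
}

// Append Value as 8 bytes, least significant byte first.
inline void appendLE64(uint64_t Value, std::vector<uint8_t> &Out) {
  for (unsigned B = 0; B < 8; ++B)
    Out.push_back(static_cast<uint8_t>(Value >> (8 * B)));
}

// Name table layout: variable-length count, then fixed-length entries.
inline std::vector<uint8_t>
writeFixedNameTable(const std::vector<uint64_t> &Hashes) {
  std::vector<uint8_t> Out;
  appendULEB128(Hashes.size(), Out);
  for (uint64_t H : Hashes)
    appendLE64(H, Out);
  return Out;
}
```

Only the count stays variable-length; once the reader has consumed it, entry I starts at a fixed offset of I * 8 bytes from the first entry.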

llvm/test/Transforms/SampleProfile/Inputs/inline.fixlenmd5.extbinary.afdo

This binary file was added.

llvm/test/Transforms/SampleProfile/profile-format.ll

 ; RUN: opt < %s -sample-profile -sample-profile-file=%S/Inputs/inline.prof -S | FileCheck %s
 ; RUN: opt < %s -passes=sample-profile -sample-profile-file=%S/Inputs/inline.prof -S | FileCheck %s
 ; RUN: opt < %s -sample-profile -sample-profile-file=%S/Inputs/inline.compactbinary.afdo -S | FileCheck %s
 ; RUN: opt < %s -passes=sample-profile -sample-profile-file=%S/Inputs/inline.compactbinary.afdo -S | FileCheck %s
 ; RUN: opt < %s -sample-profile -sample-profile-file=%S/Inputs/inline.extbinary.afdo -S | FileCheck %s
 ; RUN: opt < %s -passes=sample-profile -sample-profile-file=%S/Inputs/inline.extbinary.afdo -S | FileCheck %s
 ; RUN: opt < %s -sample-profile -sample-profile-file=%S/Inputs/inline.md5extbinary.afdo -S | FileCheck %s
 ; RUN: opt < %s -passes=sample-profile -sample-profile-file=%S/Inputs/inline.md5extbinary.afdo -S | FileCheck %s
+; RUN: opt < %s -sample-profile -sample-profile-file=%S/Inputs/inline.fixlenmd5.extbinary.afdo -S | FileCheck %s
+; RUN: opt < %s -passes=sample-profile -sample-profile-file=%S/Inputs/inline.fixlenmd5.extbinary.afdo -S | FileCheck %s

 ; Original C++ test case
 ;
 ; #include <stdio.h>
 ;
 ; int sum(int x, int y) {
 ;   return x + y;
 ; }
@@ 111 lines hidden @@

llvm/test/tools/llvm-profdata/show-prof-info.test

 REQUIRES: zlib
 ; RUN: llvm-profdata merge -sample -extbinary -use-md5 -compress-all-sections -gen-partial-profile -prof-sym-list=%S/Inputs/profile-symbol-list-1.text %S/Inputs/sample-profile.proftext -o %t.1.output
 ; RUN: wc -c < %t.1.output > %t.txt
 ; RUN: llvm-profdata show -sample -show-sec-info-only %t.1.output >> %t.txt
 ; RUN: FileCheck %s --input-file=%t.txt
 ; CHECK: [[FILESIZE:.*]]
 ; To check llvm-profdata shows the correct flags for ProfileSummarySection.
 ; CHECK: ProfileSummarySection {{.*}} Flags: {compressed,partial}
 ; To check llvm-profdata shows the correct flags for NameTableSection.
-; CHECK: NameTableSection {{.*}} Flags: {compressed,md5}
+; CHECK: NameTableSection {{.*}} Flags: {compressed,fixlenmd5}
 ; To check llvm-profdata shows the correct file size.
 ; CHECK: [[FILESIZE]]