This is an archive of the discontinued LLVM Phabricator instance.

Differential D94435

[SampleFDO] Add the support to split the function profiles with context into separate sections.
ClosedPublic

Authored by wmi on Jan 11 2021, 11:06 AM.

Details

Reviewers

hoy
wenlei
davidxl

Commits

rG21b1ad0340a7: [SampleFDO] Add the support to split the function profiles with context into

Summary

For ThinLTO, all function profiles without context have already been annotated onto outline functions, where possible, in the prelink phase. In the postlink phase, profile annotation is therefore only meaningful for function profiles with context. If the profile is large, it is better to split it into two parts, one with context and one without, so that profile reading in the postlink phase only has to read the part with context. To support profile splitting, we extend the ExtBinary format to support different section arrangements. This makes it flexible to add other section layouts in the future without needing to create a new class inheriting from the ExtBinary class.
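
The split described in the summary can be pictured with a small stand-alone model. This is only a sketch under stated assumptions: `FunctionProfile`, `SplitProfiles`, and `splitByContext` are hypothetical names rather than LLVM's API, and the model assumes a profile counts as "with context" when it records samples for at least one inlined callsite.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a function profile; not LLVM's FunctionSamples.
struct FunctionProfile {
  uint64_t TotalSamples;
  // Callees whose samples are nested inside this profile (inline context).
  std::vector<std::string> InlinedCallees;
};

// The two parts a split profile would be written as.
struct SplitProfiles {
  std::map<std::string, FunctionProfile> WithContext;
  std::map<std::string, FunctionProfile> Flat;
};

// Partition profiles so that a postlink reader could load only WithContext.
// Assumption: "with context" means at least one inlined callsite is recorded.
inline SplitProfiles
splitByContext(const std::map<std::string, FunctionProfile> &Profiles) {
  SplitProfiles Out;
  for (const auto &KV : Profiles) {
    if (KV.second.InlinedCallees.empty())
      Out.Flat.insert(KV);
    else
      Out.WithContext.insert(KV);
  }
  return Out;
}
```

In this model the flat part is the one a ThinLTO postlink compile would not need to read.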

Diff Detail

Repository: rL LLVM

Event Timeline

wmi created this revision. Jan 11 2021, 11:06 AM

Herald added a subscriber: hiraditya. Jan 11 2021, 11:06 AM

wmi requested review of this revision. Jan 11 2021, 11:06 AM

Herald added a project: Restricted Project. Jan 11 2021, 11:06 AM

Thanks for working on this, which will help ThinLTO throughput. I guess there will be a separate patch to turn on this work, i.e., setting CtxSplitLayout?

llvm/lib/Transforms/IPO/SampleProfile.cpp
462	I'm wondering if we can just use one field for the phase it is currently in. We may also want to check it for fullLTO in the future.
2126	`F` may get a chance to update its entry count with a post-inline count during top-down processing with the default layout. Maybe put this under the check against `SkipNoContextProf`?
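
The suggestion on line 462 can be sketched as follows. This is an illustration only: `LTOPhase`, `LoaderConfig`, and `shouldSkipFlatProfiles` are hypothetical names and are not claimed to match what the patch or any follow-up change uses.

```cpp
// Hypothetical model: one enum replaces the pair of booleans
// IsThinLTOPreLink / IsThinLTOPostLink, so the invalid combination
// "both true" cannot be expressed and fullLTO phases could be added later.
enum class LTOPhase { None, ThinLTOPreLink, ThinLTOPostLink };

struct LoaderConfig {
  LTOPhase Phase;
  LoaderConfig() : Phase(LTOPhase::None) {}
};

// In this sketch only the ThinLTO postlink phase skips profiles
// without context; every other phase reads the whole profile.
inline bool shouldSkipFlatProfiles(const LoaderConfig &C) {
  return C.Phase == LTOPhase::ThinLTOPostLink;
}
```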

I guess there will be a separate patch to turn on this work, i.e., setting CtxSplitLayout?

I will use the resetSecLayout interface to select CtxSplitLayout. That will be done in our own branch of autofdo tool. I think for now most of the cases will still want to use the DefaultLayout.
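
As a rough picture of how selecting a layout through resetSecLayout behaves, here is a simplified stand-alone model. It mirrors the shape of the table and the assertion in the diff, but `SecHdrEntry`, `layoutTable`, and `WriterModel` are hypothetical stand-ins and do not use the real LLVM headers.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-ins for the enums in the patch.
enum SecType { SecProfSummary, SecNameTable, SecFuncOffsetTable,
               SecLBRProfile, SecProfileSymbolList, SecFuncMetadata };
enum SectionLayout { DefaultLayout, CtxSplitLayout, NumOfLayout };

struct SecHdrEntry {
  SecType Type;
  uint64_t Flags;
  SecHdrEntry(SecType T) : Type(T), Flags(0) {}
};

// One section order per layout, mirroring the table in the patch.
inline const std::array<std::vector<SecHdrEntry>, NumOfLayout> &layoutTable() {
  static const std::array<std::vector<SecHdrEntry>, NumOfLayout> Table = {{
      // DefaultLayout
      {SecProfSummary, SecNameTable, SecFuncOffsetTable, SecLBRProfile,
       SecProfileSymbolList, SecFuncMetadata},
      // CtxSplitLayout: first offset-table/profile pair holds profiles with
      // context, the second pair holds profiles without context.
      {SecProfSummary, SecNameTable, SecFuncOffsetTable, SecLBRProfile,
       SecFuncOffsetTable, SecLBRProfile, SecProfileSymbolList,
       SecFuncMetadata},
  }};
  return Table;
}

class WriterModel {
public:
  WriterModel() : Layout(layoutTable()[DefaultLayout]) {}

  // The layout has to be chosen before any section flag is set, because
  // replacing the table afterwards would silently drop those flags.
  void resetSecLayout(SectionLayout SL) {
    for (const SecHdrEntry &E : Layout) {
      assert(E.Flags == 0 && "resetSecLayout must precede flag setting");
      (void)E;
    }
    Layout = layoutTable()[SL];
  }

  const std::vector<SecHdrEntry> &layout() const { return Layout; }

private:
  std::vector<SecHdrEntry> Layout;
};
```

A tool that wants the split layout would call `resetSecLayout(CtxSplitLayout)` right after creating the writer; everything else keeps the default.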

llvm/lib/Transforms/IPO/SampleProfile.cpp
462	Good point. Will change it.
2126	I think in any case we don't want to reinitialize F's entry count to initialEntryCount if it already has a valid value. Even if SkipNoContextProf is false, in the LTO/ThinLTO postlink phase it is possible that emitAnnotations doesn't change anything that could affect profile annotation (i.e., the variable Changed in emitAnnotations is false). So if a function that got a valid entry count in the prelink phase is reinitialized to -1 before entering emitAnnotations in postlink, emitAnnotations may not be able to update it with a valid entry count again. In addition, if F has a valid entry count on entering emitAnnotations, emitAnnotations can still update it without any problem.
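
The guard argued for above can be sketched with a tiny model. `FunctionModel`, `UnknownEntryCount`, and `prepareForAnnotation` are hypothetical names used only for illustration; the sketch assumes -1 is the "unknown" sentinel, as the comment states.

```cpp
#include <cstdint>

// Sentinel meaning "no usable entry count yet" (the -1 in the discussion).
const uint64_t UnknownEntryCount = static_cast<uint64_t>(-1);

// Hypothetical stand-in for the entry-count state of a function.
struct FunctionModel {
  bool HasEntryCount;
  uint64_t EntryCount;
  FunctionModel() : HasEntryCount(false), EntryCount(0) {}
};

// Initialize the entry count only when none exists. A count set during
// prelink survives, so it is still there if annotation changes nothing.
inline void prepareForAnnotation(FunctionModel &F) {
  if (F.HasEntryCount)
    return;
  F.HasEntryCount = true;
  F.EntryCount = UnknownEntryCount;
}
```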

Thanks for the change.

If the profile is large, it is better to split the profile into two parts, one with context and one without, so the profile reading in postlink phase only has to read the part with context.

I guess the speedup is more visible with a merged partial profile with explicit profile flattening? Or did you see it speed up FDO with a regular full profile too? For a regular FDO profile, I'd expect most function profiles to have some context, except for really small functions.

hoy added inline comments. Jan 13 2021, 9:23 AM

llvm/lib/Transforms/IPO/SampleProfile.cpp
2126	Oh yeah, entrycount can be set in `emitAnnotations`. Thanks for pointing it out. I was wondering if `F`'s profile can change due to the counts returning from its inliner in postlink phase, thus it may need an update in postlink though itself doesn't have context samples. That might happen with fullLTO but with thinLTO, the counts returning is already less accurate since it cannot be done cross-threads.

In D94435#2495298, @wenlei wrote:

Thanks for the change.

If the profile is large, it is better to split the profile into two parts, one with context and one without, so the profile reading in postlink phase only has to read the part with context.

I guess the speedup is more visible with a merged partial profile with explicit profile flattening? Or did you see it speed up FDO with a regular full profile too? For a regular FDO profile, I'd expect most function profiles to have some context, except for really small functions.

Right, it is more visible with merged partial profiles. Currently the speedup with a regular profile is minor. However, we saw in an experiment that keeping context for the hottest functions and flattening the rest of the warm/cold functions in a regular FDO profile can shrink the profile size significantly without hurting performance. That was only an experiment for one target. We haven't done the evaluation for other targets because regular full profile size is not a major issue for us at the moment. But if we reach the same conclusion for more targets, we can flatten more functions in the profile, and profile splitting will be more effective at reducing profile reading cost in LTO/ThinLTO postlink.

wmi added inline comments. Jan 13 2021, 10:01 AM

llvm/lib/Transforms/IPO/SampleProfile.cpp
462	Sent out an NFC change in https://reviews.llvm.org/D94613 as preparation for the change here.

Address Hongtao's comment.

hoy added inline comments. Jan 13 2021, 5:38 PM

llvm/include/llvm/ProfileData/SampleProf.h
169	Nit: how about naming this `SecFlagNoCalleeContext` or `SecFlagFlat` or something? Just want to be a bit more clear, as CS profile also uses the term context.
llvm/include/llvm/ProfileData/SampleProfWriter.h
159	Nit: name it `ExtBinaryHdrLayoutTable`?

Address Hongtao's comment.

wmi marked an inline comment as done. Jan 13 2021, 10:28 PM

wmi added inline comments.

llvm/include/llvm/ProfileData/SampleProf.h
169	Makes sense. I chose SecFlagFlat.
llvm/include/llvm/ProfileData/SampleProfWriter.h
159	That is better.

LGTM, thanks.

This revision is now accepted and ready to land. Jan 13 2021, 10:41 PM

wenlei added inline comments. Jan 14 2021, 11:01 PM

llvm/lib/Transforms/IPO/SampleProfile.cpp
2126	"I was wondering if F's profile can change due to the counts returning from its inliner in postlink phase, thus it may need an update in postlink though itself doesn't have context samples." I think in this case `emitAnnotations` should still be the right place for the update to happen, though we don't capture such a profile merge as a "change" today, so `emitAnnotations` won't be triggered.
llvm/test/Transforms/SampleProfile/ctxsplit.ll
4	I noticed the input is a binary profile. Is the new layout of the extended binary profile still two-way convertible with the text profile? Wouldn't a text profile work better for small test cases?

wmi marked an inline comment as done. Jan 15 2021, 9:38 AM

wmi added inline comments.

llvm/test/Transforms/SampleProfile/ctxsplit.ll
4	Yes, I can start with a text profile, convert it to an extbinary profile, then run the test. Then we need another option in llvm-profdata to switch the default layout to ctxsplit Layout. Currently I didn't add the option because I felt there were few cases using ctxsplit, and adding another option to llvm-profdata may not be worth it considering it already has many options. Going forward, whether we need an option in llvm-profdata for every change in the extbinary format is an open question.

wenlei added inline comments. Jan 18 2021, 8:20 AM

llvm/test/Transforms/SampleProfile/ctxsplit.ll
4	Ok, makes sense if your create_llvm_prof only generates extbin profile today. This is not critical. "Then we need another option in llvm-profdata to switch the default layout to ctxsplit Layout." This is one step further. If we just want to have ctx layout, you could let create_llvm_prof produce a text profile directly, or use llvm-profgen to convert extbinary with ctx layout to text with ctx layout? llvm-profgen doesn't have to support the conversion between the two layouts, which would need an extra option. I guess what I missed is that there's not going to be a text representation of the ctx layout, i.e. there's not going to be 1:1 mapping and round trip conversion between text and extbin for ctx layout?

wmi added inline comments. Jan 18 2021, 10:44 PM

llvm/test/Transforms/SampleProfile/ctxsplit.ll
4	"I guess what I missed is that there's not going to be a text representation of the ctx layout, i.e. there's not going to be 1:1 mapping and round trip conversion between text and extbin for ctx layout?" Right. There is not going to be a 1:1 mapping and round-trip conversion between the extbin and text formats. We can convert a ctxlayout extbinary profile to a text profile using llvm-profdata, but we cannot go the other way without adding an option, because the default layout in the extbinary format is not ctxlayout. This means that currently, if we want to create a test using a ctxlayout extbinary profile, we need to change a little code and rebuild llvm-profdata before we can generate the profile. That is inconvenient, but it is a tradeoff in order to have fewer options in llvm-profdata. In the test, I want to use a ctxlayout extbinary profile instead of a text profile converted from it because I want to verify that in the ThinLTO postlink phase the compiler won't read the part without context. That cannot be achieved with a text profile.
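
The behaviour the test is meant to verify, a reader that skips flagged sections, can be modelled roughly as below. This is a sketch, not the reader in the patch: `SecHdrEntry` and `sectionsToRead` are hypothetical, and the flag value `1 << 1` is taken from the common-flag definition shown in the diff.

```cpp
#include <cstdint>
#include <vector>

enum SecType { SecProfSummary, SecNameTable, SecFuncOffsetTable,
               SecLBRProfile, SecProfileSymbolList, SecFuncMetadata };

// Common flag marking a section that holds only profiles without context.
const uint64_t SecFlagFlat = 1u << 1;

// Hypothetical, reduced section header entry.
struct SecHdrEntry {
  SecType Type;
  uint64_t Flags;
};

// Return the indices of the sections a reader would actually load.
// With SkipFlatProf set (the ThinLTO postlink case), flat sections are
// skipped, which is why a text profile cannot exercise this path.
inline std::vector<unsigned>
sectionsToRead(const std::vector<SecHdrEntry> &Hdr, bool SkipFlatProf) {
  std::vector<unsigned> Out;
  for (unsigned I = 0; I < Hdr.size(); ++I) {
    if (SkipFlatProf && (Hdr[I].Flags & SecFlagFlat) != 0)
      continue;
    Out.push_back(I);
  }
  return Out;
}
```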

wenlei accepted this revision. Jan 18 2021, 11:09 PM

wenlei added inline comments.

llvm/test/Transforms/SampleProfile/ctxsplit.ll
4	Got it, thanks for explaining.

Closed by commit rG21b1ad0340a7: [SampleFDO] Add the support to split the function profiles with context into (authored by wmi). Jan 19 2021, 3:16 PM

This revision was automatically updated to reflect the committed changes.

wmi added a commit: rG21b1ad0340a7: [SampleFDO] Add the support to split the function profiles with context into.

Revision Contents

Path	Size
llvm/include/llvm/ProfileData/SampleProf.h	4 lines
llvm/include/llvm/ProfileData/SampleProfReader.h	12 lines
llvm/include/llvm/ProfileData/SampleProfWriter.h	93 lines
llvm/include/llvm/Transforms/IPO/SampleProfile.h	7 lines
llvm/lib/Passes/PassBuilder.cpp	12 lines
llvm/lib/ProfileData/SampleProfReader.cpp	8 lines
llvm/lib/ProfileData/SampleProfWriter.cpp	58 lines
llvm/lib/Transforms/IPO/SampleProfile.cpp	21 lines
llvm/test/Transforms/SampleProfile/Inputs/ctxsplit.extbinary.afdo
llvm/test/Transforms/SampleProfile/ctxsplit.ll	59 lines

Diff 315866

llvm/include/llvm/ProfileData/SampleProf.h

[...]	struct SecHdrTableEntry {
uint32_t LayoutIndex;		uint32_t LayoutIndex;
};		};

// Flags common for all sections are defined here. In SecHdrTableEntry::Flags,		// Flags common for all sections are defined here. In SecHdrTableEntry::Flags,
// common flags will be saved in the lower 32bits and section specific flags		// common flags will be saved in the lower 32bits and section specific flags
// will be saved in the higher 32 bits.		// will be saved in the higher 32 bits.
enum class SecCommonFlags : uint32_t {		enum class SecCommonFlags : uint32_t {
SecFlagInValid = 0,		SecFlagInValid = 0,
SecFlagCompress = (1 << 0)		SecFlagCompress = (1 << 0),
		// Indicate the section contains only profile without context.
		SecFlagNoContext = (1 << 1)
		hoy: Nit: how about naming this `SecFlagNoCalleeContext` or `SecFlagFlat` or something? Just want to be a bit more clear, as CS profile also uses the term context.
		wmi: Makes sense. I chose SecFlagFlat.
};		};

// Section specific flags are defined here.		// Section specific flags are defined here.
// !!!Note: Everytime a new enum class is created here, please add		// !!!Note: Everytime a new enum class is created here, please add
// a new check in verifySecFlag.		// a new check in verifySecFlag.
enum class SecNameTableFlags : uint32_t {		enum class SecNameTableFlags : uint32_t {
SecFlagInValid = 0,		SecFlagInValid = 0,
SecFlagMD5Name = (1 << 0),		SecFlagMD5Name = (1 << 0),
[...]
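
The comment in this header says common flags live in the lower 32 bits of SecHdrTableEntry::Flags and section-specific flags in the higher 32 bits. A minimal sketch of that packing follows; the helper names (`addCommonFlag`, `addNameTableFlag`, and so on) are hypothetical and are not the helpers LLVM defines, and `SecFlagFlat` is the name chosen in review for the new flag.

```cpp
#include <cstdint>

enum class SecCommonFlags : uint32_t {
  SecFlagInValid = 0,
  SecFlagCompress = (1 << 0),
  SecFlagFlat = (1 << 1)
};

enum class SecNameTableFlags : uint32_t {
  SecFlagInValid = 0,
  SecFlagMD5Name = (1 << 0)
};

// Common flags occupy bits 0..31 of the 64-bit Flags word.
inline void addCommonFlag(uint64_t &Flags, SecCommonFlags F) {
  Flags |= static_cast<uint64_t>(F);
}
inline bool hasCommonFlag(uint64_t Flags, SecCommonFlags F) {
  return (Flags & static_cast<uint64_t>(F)) != 0;
}

// Section-specific flags are shifted into bits 32..63, so the same bit
// value can be reused by different sections without clashing with the
// common flags.
inline void addNameTableFlag(uint64_t &Flags, SecNameTableFlags F) {
  Flags |= static_cast<uint64_t>(F) << 32;
}
inline bool hasNameTableFlag(uint64_t Flags, SecNameTableFlags F) {
  return (Flags & (static_cast<uint64_t>(F) << 32)) != 0;
}
```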

llvm/include/llvm/ProfileData/SampleProfReader.h

[...]	public:
/// It includes all the names that have samples either in outline instance		/// It includes all the names that have samples either in outline instance
/// or inline instance.		/// or inline instance.
virtual std::vector<StringRef> *getNameTable() { return nullptr; }		virtual std::vector<StringRef> *getNameTable() { return nullptr; }
virtual bool dumpSectionInfo(raw_ostream &OS = dbgs()) { return false; };		virtual bool dumpSectionInfo(raw_ostream &OS = dbgs()) { return false; };

/// Return whether names in the profile are all MD5 numbers.		/// Return whether names in the profile are all MD5 numbers.
virtual bool useMD5() { return false; }		virtual bool useMD5() { return false; }

		/// Don't read profile without context if the flag is set. This is only meaningful
		/// for ExtBinary format.
		virtual void setSkipNoContextProf(bool Skip) {}

SampleProfileReaderItaniumRemapper *getRemapper() { return Remapper.get(); }		SampleProfileReaderItaniumRemapper *getRemapper() { return Remapper.get(); }

protected:		protected:
/// Map every function to its associated profile.		/// Map every function to its associated profile.
///		///
/// The profile of every function executed at runtime is collected		/// The profile of every function executed at runtime is collected
/// in the structure FunctionSamples. This maps function objects		/// in the structure FunctionSamples. This maps function objects
/// to their corresponding profiles.		/// to their corresponding profiles.
[...]	protected:
/// The uint64_t data has to be converted to a string and then the string		/// The uint64_t data has to be converted to a string and then the string
/// will be used to initialize StringRef in NameTable.		/// will be used to initialize StringRef in NameTable.
/// Note NameTable contains StringRef so it needs another buffer to own		/// Note NameTable contains StringRef so it needs another buffer to own
/// the string data. MD5StringBuf serves as the string buffer that is		/// the string data. MD5StringBuf serves as the string buffer that is
/// referenced by NameTable (vector of StringRef). We make sure		/// referenced by NameTable (vector of StringRef). We make sure
/// the lifetime of MD5StringBuf is not shorter than that of NameTable.		/// the lifetime of MD5StringBuf is not shorter than that of NameTable.
std::unique_ptr<std::vector<std::string>> MD5StringBuf;		std::unique_ptr<std::vector<std::string>> MD5StringBuf;

		/// If SkipNoContextProf is true, skip the sections with
		/// SecFlagNoContext flag.
		bool SkipNoContextProf = false;

public:		public:
SampleProfileReaderExtBinaryBase(std::unique_ptr<MemoryBuffer> B,		SampleProfileReaderExtBinaryBase(std::unique_ptr<MemoryBuffer> B,
LLVMContext &C, SampleProfileFormat Format)		LLVMContext &C, SampleProfileFormat Format)
: SampleProfileReaderBinary(std::move(B), C, Format) {}		: SampleProfileReaderBinary(std::move(B), C, Format) {}

/// Read sample profiles in extensible format from the associated file.		/// Read sample profiles in extensible format from the associated file.
std::error_code readImpl() override;		std::error_code readImpl() override;

/// Get the total size of all \p Type sections.		/// Get the total size of all \p Type sections.
uint64_t getSectionSize(SecType Type);		uint64_t getSectionSize(SecType Type);
/// Get the total size of header and all sections.		/// Get the total size of header and all sections.
uint64_t getFileSize();		uint64_t getFileSize();
virtual bool dumpSectionInfo(raw_ostream &OS = dbgs()) override;		virtual bool dumpSectionInfo(raw_ostream &OS = dbgs()) override;

/// Collect functions with definitions in Module \p M.		/// Collect functions with definitions in Module \p M.
void collectFuncsFrom(const Module &M) override;		void collectFuncsFrom(const Module &M) override;

/// Return whether names in the profile are all MD5 numbers.		/// Return whether names in the profile are all MD5 numbers.
virtual bool useMD5() override { return MD5StringBuf.get(); }		virtual bool useMD5() override { return MD5StringBuf.get(); }

virtual std::unique_ptr<ProfileSymbolList> getProfileSymbolList() override {		virtual std::unique_ptr<ProfileSymbolList> getProfileSymbolList() override {
return std::move(ProfSymList);		return std::move(ProfSymList);
};		};

		virtual void setSkipNoContextProf(bool Skip) override {
		SkipNoContextProf = Skip;
		}
};		};

class SampleProfileReaderExtBinary : public SampleProfileReaderExtBinaryBase {		class SampleProfileReaderExtBinary : public SampleProfileReaderExtBinaryBase {
private:		private:
virtual std::error_code verifySPMagic(uint64_t Magic) override;		virtual std::error_code verifySPMagic(uint64_t Magic) override;
virtual std::error_code		virtual std::error_code
readCustomSection(const SecHdrTableEntry &Entry) override {		readCustomSection(const SecHdrTableEntry &Entry) override {
return sampleprof_error::success;		return sampleprof_error::success;
[...]

llvm/include/llvm/ProfileData/SampleProfWriter.h

[...]
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
#ifndef LLVM_PROFILEDATA_SAMPLEPROFWRITER_H		#ifndef LLVM_PROFILEDATA_SAMPLEPROFWRITER_H
#define LLVM_PROFILEDATA_SAMPLEPROFWRITER_H		#define LLVM_PROFILEDATA_SAMPLEPROFWRITER_H

#include "llvm/ADT/MapVector.h"		#include "llvm/ADT/MapVector.h"
#include "llvm/ADT/StringMap.h"		#include "llvm/ADT/StringMap.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
		#include "llvm/ADT/StringSet.h"
#include "llvm/IR/ProfileSummary.h"		#include "llvm/IR/ProfileSummary.h"
#include "llvm/ProfileData/SampleProf.h"		#include "llvm/ProfileData/SampleProf.h"
#include "llvm/Support/ErrorOr.h"		#include "llvm/Support/ErrorOr.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include <algorithm>		#include <algorithm>
#include <cstdint>		#include <cstdint>
#include <memory>		#include <memory>
#include <set>		#include <set>
#include <system_error>		#include <system_error>

namespace llvm {		namespace llvm {
namespace sampleprof {		namespace sampleprof {

		enum SectionLayout {
		DefaultLayout,
		// The layout splits profile with context information from profile without
		// context information. When Thinlto is enabled, ThinLTO postlink phase only
		// has to load profile with context information and can skip the other part.
		CtxSplitLayout,
		NumOfLayout,
		};

/// Sample-based profile writer. Base class.		/// Sample-based profile writer. Base class.
class SampleProfileWriter {		class SampleProfileWriter {
public:		public:
virtual ~SampleProfileWriter() = default;		virtual ~SampleProfileWriter() = default;

/// Write sample profiles in \p S.		/// Write sample profiles in \p S.
///		///
/// \returns status code of the file update operation.		/// \returns status code of the file update operation.
[...]	public:
/// For testing.		/// For testing.
static ErrorOr<std::unique_ptr<SampleProfileWriter>>		static ErrorOr<std::unique_ptr<SampleProfileWriter>>
create(std::unique_ptr<raw_ostream> &OS, SampleProfileFormat Format);		create(std::unique_ptr<raw_ostream> &OS, SampleProfileFormat Format);

virtual void setProfileSymbolList(ProfileSymbolList *PSL) {}		virtual void setProfileSymbolList(ProfileSymbolList *PSL) {}
virtual void setToCompressAllSections() {}		virtual void setToCompressAllSections() {}
virtual void setUseMD5() {}		virtual void setUseMD5() {}
virtual void setPartialProfile() {}		virtual void setPartialProfile() {}
		virtual void resetSecLayout(SectionLayout SL) {}

protected:		protected:
SampleProfileWriter(std::unique_ptr<raw_ostream> &OS)		SampleProfileWriter(std::unique_ptr<raw_ostream> &OS)
: OutputStream(std::move(OS)) {}		: OutputStream(std::move(OS)) {}

/// Write a file header for the profile file.		/// Write a file header for the profile file.
virtual std::error_code		virtual std::error_code
writeHeader(const StringMap<FunctionSamples> &ProfileMap) = 0;		writeHeader(const StringMap<FunctionSamples> &ProfileMap) = 0;
[...]	private:
SampleProfileWriter::create(std::unique_ptr<raw_ostream> &OS,		SampleProfileWriter::create(std::unique_ptr<raw_ostream> &OS,
SampleProfileFormat Format);		SampleProfileFormat Format);
};		};

class SampleProfileWriterRawBinary : public SampleProfileWriterBinary {		class SampleProfileWriterRawBinary : public SampleProfileWriterBinary {
using SampleProfileWriterBinary::SampleProfileWriterBinary;		using SampleProfileWriterBinary::SampleProfileWriterBinary;
};		};

		const std::array<SmallVector<SecHdrTableEntry, 8>, NumOfLayout>
		ExtBinaryLayoutTable = {
		hoy: Nit: name it `ExtBinaryHdrLayoutTable`?
		wmi: That is better.
		// Note that SecFuncOffsetTable section is written after SecLBRProfile
		// in the profile, but is put before SecLBRProfile in SectionHdrLayout.
		// This is because sample reader follows the order in SectionHdrLayout
		// to read each section. To read function profiles on demand, sample
		// reader need to get the offset of each function profile first.
		//
		// DefaultLayout
		SmallVector<SecHdrTableEntry, 8>({{SecProfSummary},
		{SecNameTable},
		{SecFuncOffsetTable},
		{SecLBRProfile},
		{SecProfileSymbolList},
		{SecFuncMetadata}}),
		// CtxSplitLayout
		SmallVector<SecHdrTableEntry, 8>({{SecProfSummary},
		{SecNameTable},
		// profile with context
		// for next two sections
		{SecFuncOffsetTable},
		{SecLBRProfile},
		// profile without context
		// for next two sections
		{SecFuncOffsetTable},
		{SecLBRProfile},
		{SecProfileSymbolList},
		{SecFuncMetadata}}),
		};

class SampleProfileWriterExtBinaryBase : public SampleProfileWriterBinary {		class SampleProfileWriterExtBinaryBase : public SampleProfileWriterBinary {
using SampleProfileWriterBinary::SampleProfileWriterBinary;		using SampleProfileWriterBinary::SampleProfileWriterBinary;
public:		public:
virtual std::error_code		virtual std::error_code
write(const StringMap<FunctionSamples> &ProfileMap) override;		write(const StringMap<FunctionSamples> &ProfileMap) override;

virtual void setToCompressAllSections() override;		virtual void setToCompressAllSections() override;
void setToCompressSection(SecType Type);		void setToCompressSection(SecType Type);
[...]	public:
virtual void setPartialProfile() override {		virtual void setPartialProfile() override {
addSectionFlag(SecProfSummary, SecProfSummaryFlags::SecFlagPartial);		addSectionFlag(SecProfSummary, SecProfSummaryFlags::SecFlagPartial);
}		}

virtual void setProfileSymbolList(ProfileSymbolList *PSL) override {		virtual void setProfileSymbolList(ProfileSymbolList *PSL) override {
ProfSymList = PSL;		ProfSymList = PSL;
};		};

		virtual void resetSecLayout(SectionLayout SL) override {
		verifySecLayout(SL);
		// Make sure resetSecLayout is called before any flag setting.
		for (auto &Entry : SectionHdrLayout) {
		assert(Entry.Flags == 0 &&
		"resetSecLayout has to be called before any flag setting");
		}
		SecLayout = SL;
		SectionHdrLayout = ExtBinaryLayoutTable[SL];
		}

protected:		protected:
uint64_t markSectionStart(SecType Type, uint32_t LayoutIdx);		uint64_t markSectionStart(SecType Type, uint32_t LayoutIdx);
std::error_code addNewSection(SecType Sec, uint32_t LayoutIdx,		std::error_code addNewSection(SecType Sec, uint32_t LayoutIdx,
uint64_t SectionStart);		uint64_t SectionStart);
template <class SecFlagType>		template <class SecFlagType>
void addSectionFlag(SecType Type, SecFlagType Flag) {		void addSectionFlag(SecType Type, SecFlagType Flag) {
for (auto &Entry : SectionHdrLayout) {		for (auto &Entry : SectionHdrLayout) {
if (Entry.Type == Type)		if (Entry.Type == Type)
addSecFlag(Entry, Flag);		addSecFlag(Entry, Flag);
}		}
}		}
		template <class SecFlagType>
		void addSectionFlag(uint32_t SectionIdx, SecFlagType Flag) {
		addSecFlag(SectionHdrLayout[SectionIdx], Flag);
		}

// placeholder for subclasses to dispatch their own section writers.		// placeholder for subclasses to dispatch their own section writers.
virtual std::error_code writeCustomSection(SecType Type) = 0;		virtual std::error_code writeCustomSection(SecType Type) = 0;
		// Verify the SecLayout is supported by the format.
		virtual void verifySecLayout(SectionLayout SL) = 0;

virtual void initSectionHdrLayout() = 0;
// specify the order to write sections.		// specify the order to write sections.
virtual std::error_code		virtual std::error_code
writeSections(const StringMap<FunctionSamples> &ProfileMap) = 0;		writeSections(const StringMap<FunctionSamples> &ProfileMap) = 0;

// Dispatch section writer for each section. \p LayoutIdx is the sequence		// Dispatch section writer for each section. \p LayoutIdx is the sequence
// number indicating where the section is located in SectionHdrLayout.		// number indicating where the section is located in SectionHdrLayout.
virtual std::error_code		virtual std::error_code
writeOneSection(SecType Type, uint32_t LayoutIdx,		writeOneSection(SecType Type, uint32_t LayoutIdx,
const StringMap<FunctionSamples> &ProfileMap);		const StringMap<FunctionSamples> &ProfileMap);

// Helper function to write name table.		// Helper function to write name table.
virtual std::error_code writeNameTable() override;		virtual std::error_code writeNameTable() override;

std::error_code writeFuncMetadata(const StringMap<FunctionSamples> &Profiles);		std::error_code writeFuncMetadata(const StringMap<FunctionSamples> &Profiles);

// Functions to write various kinds of sections.		// Functions to write various kinds of sections.
std::error_code		std::error_code
writeNameTableSection(const StringMap<FunctionSamples> &ProfileMap);		writeNameTableSection(const StringMap<FunctionSamples> &ProfileMap);
std::error_code writeFuncOffsetTable();		std::error_code writeFuncOffsetTable();
std::error_code writeProfileSymbolListSection();		std::error_code writeProfileSymbolListSection();

		SectionLayout SecLayout = DefaultLayout;
// Specifiy the order of sections in section header table. Note		// Specifiy the order of sections in section header table. Note
// the order of sections in SecHdrTable may be different that the		// the order of sections in SecHdrTable may be different that the
// order in SectionHdrLayout. sample Reader will follow the order		// order in SectionHdrLayout. sample Reader will follow the order
// in SectionHdrLayout to read each section.		// in SectionHdrLayout to read each section.
SmallVector<SecHdrTableEntry, 8> SectionHdrLayout;		SmallVector<SecHdrTableEntry, 8> SectionHdrLayout =
		ExtBinaryLayoutTable[DefaultLayout];

// Save the start of SecLBRProfile so we can compute the offset to the		// Save the start of SecLBRProfile so we can compute the offset to the
// start of SecLBRProfile for each Function's Profile and will keep it		// start of SecLBRProfile for each Function's Profile and will keep it
// in FuncOffsetTable.		// in FuncOffsetTable.
uint64_t SecLBRProfileStart = 0;		uint64_t SecLBRProfileStart = 0;

private:		private:
void allocSecHdrTable();		void allocSecHdrTable();
[...]	private:
bool UseMD5 = false;		bool UseMD5 = false;

ProfileSymbolList *ProfSymList = nullptr;		ProfileSymbolList *ProfSymList = nullptr;
};		};

class SampleProfileWriterExtBinary : public SampleProfileWriterExtBinaryBase {		class SampleProfileWriterExtBinary : public SampleProfileWriterExtBinaryBase {
public:		public:
SampleProfileWriterExtBinary(std::unique_ptr<raw_ostream> &OS)		SampleProfileWriterExtBinary(std::unique_ptr<raw_ostream> &OS)
: SampleProfileWriterExtBinaryBase(OS) {		: SampleProfileWriterExtBinaryBase(OS) {}
initSectionHdrLayout();
}

private:		private:
virtual void initSectionHdrLayout() override {		std::error_code
// Note that SecFuncOffsetTable section is written after SecLBRProfile		writeDefaultLayout(const StringMap<FunctionSamples> &ProfileMap);
// in the profile, but is put before SecLBRProfile in SectionHdrLayout.		std::error_code
//		writeCtxSplitLayout(const StringMap<FunctionSamples> &ProfileMap);
// This is because sample reader follows the order of SectionHdrLayout to
// read each section, to read function profiles on demand sample reader
// need to get the offset of each function profile first.
//
// SecFuncOffsetTable section is written after SecLBRProfile in the
// profile because FuncOffsetTable needs to be populated while section
// SecLBRProfile is written.
SectionHdrLayout = {
{SecProfSummary, 0, 0, 0, 0}, {SecNameTable, 0, 0, 0, 0},
{SecFuncOffsetTable, 0, 0, 0, 0}, {SecLBRProfile, 0, 0, 0, 0},
{SecProfileSymbolList, 0, 0, 0, 0}, {SecFuncMetadata, 0, 0, 0, 0}};
};
virtual std::error_code		virtual std::error_code
writeSections(const StringMap<FunctionSamples> &ProfileMap) override;		writeSections(const StringMap<FunctionSamples> &ProfileMap) override;

virtual std::error_code writeCustomSection(SecType Type) override {		virtual std::error_code writeCustomSection(SecType Type) override {
return sampleprof_error::success;		return sampleprof_error::success;
};		};

		virtual void verifySecLayout(SectionLayout SL) override {
		assert((SL == DefaultLayout || SL == CtxSplitLayout) &&
		"Unsupported layout");
		}
};		};

// CompactBinary is a compact format of binary profile which both reduces		// CompactBinary is a compact format of binary profile which both reduces
// the profile size and the load time needed when compiling. It has two		// the profile size and the load time needed when compiling. It has two
// major difference with Binary format.		// major difference with Binary format.
// 1. It represents all the strings in name table using md5 hash.		// 1. It represents all the strings in name table using md5 hash.
// 2. It saves a function offset table which maps function name index to		// 2. It saves a function offset table which maps function name index to
// the offset of its function profile to the start of the binary profile,		// the offset of its function profile to the start of the binary profile,
[...]

llvm/include/llvm/Transforms/IPO/SampleProfile.h

	[...]
	namespace llvm {			namespace llvm {

	class Module;			class Module;

	/// The sample profiler data loader pass.			/// The sample profiler data loader pass.
	class SampleProfileLoaderPass : public PassInfoMixin<SampleProfileLoaderPass> {			class SampleProfileLoaderPass : public PassInfoMixin<SampleProfileLoaderPass> {
	public:			public:
	SampleProfileLoaderPass(std::string File = "", std::string RemappingFile = "",			SampleProfileLoaderPass(std::string File = "", std::string RemappingFile = "",
	bool IsThinLTOPreLink = false)			bool IsThinLTOPreLink = false,
				bool IsThinLTOPostLink = false)
	: ProfileFileName(File), ProfileRemappingFileName(RemappingFile),			: ProfileFileName(File), ProfileRemappingFileName(RemappingFile),
	IsThinLTOPreLink(IsThinLTOPreLink) {}			IsThinLTOPreLink(IsThinLTOPreLink),
				IsThinLTOPostLink(IsThinLTOPostLink) {}

	PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);			PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);

	private:			private:
	std::string ProfileFileName;			std::string ProfileFileName;
	std::string ProfileRemappingFileName;			std::string ProfileRemappingFileName;
	bool IsThinLTOPreLink;			bool IsThinLTOPreLink;
				bool IsThinLTOPostLink;
	};			};

	} // end namespace llvm			} // end namespace llvm

	#endif // LLVM_TRANSFORMS_SAMPLEPROFILE_H			#endif // LLVM_TRANSFORMS_SAMPLEPROFILE_H

llvm/lib/Passes/PassBuilder.cpp

Show First 20 Lines • Show All 1,064 Lines • ▼ Show 20 Lines	PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
// FIXME: revisit how SampleProfileLoad/Inliner/ICP is structured.		// FIXME: revisit how SampleProfileLoad/Inliner/ICP is structured.
if (LoadSampleProfile)		if (LoadSampleProfile)
EarlyFPM.addPass(InstCombinePass());		EarlyFPM.addPass(InstCombinePass());
MPM.addPass(createModuleToFunctionPassAdaptor(std::move(EarlyFPM)));		MPM.addPass(createModuleToFunctionPassAdaptor(std::move(EarlyFPM)));

if (LoadSampleProfile) {		if (LoadSampleProfile) {
// Annotate sample profile right after early FPM to ensure freshness of		// Annotate sample profile right after early FPM to ensure freshness of
// the debug info.		// the debug info.
MPM.addPass(SampleProfileLoaderPass(PGOOpt->ProfileFile,		MPM.addPass(SampleProfileLoaderPass(
PGOOpt->ProfileRemappingFile,		PGOOpt->ProfileFile, PGOOpt->ProfileRemappingFile,
Phase == ThinLTOPhase::PreLink));		Phase == ThinLTOPhase::PreLink, Phase == ThinLTOPhase::PostLink));
// Cache ProfileSummaryAnalysis once to avoid the potential need to insert		// Cache ProfileSummaryAnalysis once to avoid the potential need to insert
// RequireAnalysisPass for PSI before subsequent non-module passes.		// RequireAnalysisPass for PSI before subsequent non-module passes.
MPM.addPass(RequireAnalysisPass<ProfileSummaryAnalysis, Module>());		MPM.addPass(RequireAnalysisPass<ProfileSummaryAnalysis, Module>());
// Do not invoke ICP in the ThinLTOPrelink phase as it makes it hard		// Do not invoke ICP in the ThinLTOPrelink phase as it makes it hard
// for the profile annotation to be accurate in the ThinLTO backend.		// for the profile annotation to be accurate in the ThinLTO backend.
if (Phase != ThinLTOPhase::PreLink)		if (Phase != ThinLTOPhase::PreLink)
// We perform early indirect call promotion here, before globalopt.		// We perform early indirect call promotion here, before globalopt.
// This is important for the ThinLTO backend phase because otherwise		// This is important for the ThinLTO backend phase because otherwise
▲ Show 20 Lines • Show All 456 Lines • ▼ Show 20 Lines	if (Level == OptimizationLevel::O0) {
// Emit annotation remarks.		// Emit annotation remarks.
addAnnotationRemarksPass(MPM);		addAnnotationRemarksPass(MPM);

return MPM;		return MPM;
}		}

if (PGOOpt && PGOOpt->Action == PGOOptions::SampleUse) {		if (PGOOpt && PGOOpt->Action == PGOOptions::SampleUse) {
// Load sample profile before running the LTO optimization pipeline.		// Load sample profile before running the LTO optimization pipeline.
MPM.addPass(SampleProfileLoaderPass(PGOOpt->ProfileFile,		MPM.addPass(SampleProfileLoaderPass(
PGOOpt->ProfileRemappingFile,		PGOOpt->ProfileFile, PGOOpt->ProfileRemappingFile,
false /* ThinLTOPhase::PreLink */));		false /* ThinLTOPhase::PreLink */, false /* ThinLTOPhase::PostLink */));
// Cache ProfileSummaryAnalysis once to avoid the potential need to insert		// Cache ProfileSummaryAnalysis once to avoid the potential need to insert
// RequireAnalysisPass for PSI before subsequent non-module passes.		// RequireAnalysisPass for PSI before subsequent non-module passes.
MPM.addPass(RequireAnalysisPass<ProfileSummaryAnalysis, Module>());		MPM.addPass(RequireAnalysisPass<ProfileSummaryAnalysis, Module>());
}		}

// Remove unused virtual tables to improve the quality of code generated by		// Remove unused virtual tables to improve the quality of code generated by
// whole-program devirtualization and bitset lowering.		// whole-program devirtualization and bitset lowering.
MPM.addPass(GlobalDCEPass());		MPM.addPass(GlobalDCEPass());
▲ Show 20 Lines • Show All 1,481 Lines • Show Last 20 Lines
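
The call sites in this file derive two booleans from a single `Phase` value. A reviewer on this revision suggests keeping one phase field instead. The following is only a minimal sketch of that idea; `LTOPhaseSketch` and `LoaderOptionsSketch` are hypothetical names invented for illustration and are not the types used in LLVM.

```cpp
#include <cassert>

// Hypothetical stand-in for the pipeline phase (assumption, not LLVM's enum).
enum class LTOPhaseSketch { None, ThinLTOPreLink, ThinLTOPostLink };

// Hypothetical loader options that store the phase once instead of two bools.
struct LoaderOptionsSketch {
  LTOPhaseSketch Phase = LTOPhaseSketch::None;

  bool isThinLTOPreLink() const {
    return Phase == LTOPhaseSketch::ThinLTOPreLink;
  }
  bool isThinLTOPostLink() const {
    return Phase == LTOPhaseSketch::ThinLTOPostLink;
  }

  // Mirrors the intent of Reader->setSkipNoContextProf(IsThinLTOPostLink)
  // in this diff: only the postlink phase skips profiles without context.
  bool skipNoContextProf() const { return isThinLTOPostLink(); }
};

// With a single field, the two derived flags can never both be true.
inline void checkPhaseInvariant(const LoaderOptionsSketch &O) {
  assert(!(O.isThinLTOPreLink() && O.isThinLTOPostLink()));
}
```

One possible benefit of this shape is that adding a further phase later only extends the enum rather than adding another constructor parameter.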

llvm/lib/ProfileData/SampleProfReader.cpp

Show First 20 Lines • Show All 734 Lines • ▼ Show 20 Lines	std::error_code SampleProfileReaderExtBinaryBase::readImpl() {
const uint8_t *BufStart =		const uint8_t *BufStart =
reinterpret_cast<const uint8_t *>(Buffer->getBufferStart());		reinterpret_cast<const uint8_t *>(Buffer->getBufferStart());

for (auto &Entry : SecHdrTable) {		for (auto &Entry : SecHdrTable) {
// Skip empty section.		// Skip empty section.
if (!Entry.Size)		if (!Entry.Size)
continue;		continue;

		// Skip sections without context when SkipNoContextProf is true.
		if (SkipNoContextProf &&
		hasSecFlag(Entry, SecCommonFlags::SecFlagNoContext))
		continue;

const uint8_t *SecStart = BufStart + Entry.Offset;		const uint8_t *SecStart = BufStart + Entry.Offset;
uint64_t SecSize = Entry.Size;		uint64_t SecSize = Entry.Size;

// If the section is compressed, decompress it into a buffer		// If the section is compressed, decompress it into a buffer
// DecompressBuf before reading the actual data. The pointee of		// DecompressBuf before reading the actual data. The pointee of
// 'Data' will be changed to buffer hold by DecompressBuf		// 'Data' will be changed to buffer hold by DecompressBuf
// temporarily when reading the actual data.		// temporarily when reading the actual data.
bool isCompressed = hasSecFlag(Entry, SecCommonFlags::SecFlagCompress);		bool isCompressed = hasSecFlag(Entry, SecCommonFlags::SecFlagCompress);
▲ Show 20 Lines • Show All 230 Lines • ▼ Show 20 Lines

static std::string getSecFlagsStr(const SecHdrTableEntry &Entry) {		static std::string getSecFlagsStr(const SecHdrTableEntry &Entry) {
std::string Flags;		std::string Flags;
if (hasSecFlag(Entry, SecCommonFlags::SecFlagCompress))		if (hasSecFlag(Entry, SecCommonFlags::SecFlagCompress))
Flags.append("{compressed,");		Flags.append("{compressed,");
else		else
Flags.append("{");		Flags.append("{");

		if (hasSecFlag(Entry, SecCommonFlags::SecFlagNoContext))
		Flags.append("nocontext,");

switch (Entry.Type) {		switch (Entry.Type) {
case SecNameTable:		case SecNameTable:
if (hasSecFlag(Entry, SecNameTableFlags::SecFlagFixedLengthMD5))		if (hasSecFlag(Entry, SecNameTableFlags::SecFlagFixedLengthMD5))
Flags.append("fixlenmd5,");		Flags.append("fixlenmd5,");
else if (hasSecFlag(Entry, SecNameTableFlags::SecFlagMD5Name))		else if (hasSecFlag(Entry, SecNameTableFlags::SecFlagMD5Name))
Flags.append("md5,");		Flags.append("md5,");
break;		break;
case SecProfSummary:		case SecProfSummary:
▲ Show 20 Lines • Show All 604 Lines • Show Last 20 Lines
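
The skip logic added to `readImpl()` above can be illustrated in isolation. This is a rough sketch under stated assumptions: the flag bit values, `SecEntrySketch`, and `sectionsToRead` are hypothetical and do not reproduce the real section header table or flag encoding in SampleProf.h.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical flag bits (assumption: the real values are defined elsewhere).
enum SecFlagSketch : uint32_t {
  FlagNone = 0,
  FlagCompress = 1u << 0,
  FlagNoContext = 1u << 1,
};

// Hypothetical, simplified section header entry.
struct SecEntrySketch {
  std::string Name;
  uint32_t Flags;
  uint64_t Size;
};

inline bool hasFlag(const SecEntrySketch &E, SecFlagSketch F) {
  assert(F != FlagNone && "query a concrete flag");
  return (E.Flags & F) != 0;
}

// Returns the names of the sections a reader would actually process.
inline std::vector<std::string>
sectionsToRead(const std::vector<SecEntrySketch> &Table,
               bool SkipNoContextProf) {
  std::vector<std::string> Out;
  for (const auto &E : Table) {
    if (!E.Size)
      continue; // Skip empty sections.
    if (SkipNoContextProf && hasFlag(E, FlagNoContext))
      continue; // Skip sections marked as holding profiles without context.
    Out.push_back(E.Name);
  }
  return Out;
}
```

In this sketch a section is skipped on its header flag alone, so its payload never has to be decompressed or parsed, which is the saving the summary of this revision describes for the postlink phase.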

llvm/lib/ProfileData/SampleProfWriter.cpp

Show All 13 Lines
//		//
// See lib/ProfileData/SampleProfReader.cpp for documentation on each of the		// See lib/ProfileData/SampleProfReader.cpp for documentation on each of the
// supported formats.		// supported formats.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/ProfileData/SampleProfWriter.h"		#include "llvm/ProfileData/SampleProfWriter.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
		#include "llvm/ADT/StringSet.h"
#include "llvm/ProfileData/ProfileCommon.h"		#include "llvm/ProfileData/ProfileCommon.h"
#include "llvm/ProfileData/SampleProf.h"		#include "llvm/ProfileData/SampleProf.h"
#include "llvm/Support/Compression.h"		#include "llvm/Support/Compression.h"
#include "llvm/Support/Endian.h"		#include "llvm/Support/Endian.h"
#include "llvm/Support/EndianStream.h"		#include "llvm/Support/EndianStream.h"
#include "llvm/Support/ErrorOr.h"		#include "llvm/Support/ErrorOr.h"
#include "llvm/Support/FileSystem.h"		#include "llvm/Support/FileSystem.h"
#include "llvm/Support/LEB128.h"		#include "llvm/Support/LEB128.h"
▲ Show 20 Lines • Show All 229 Lines • ▼ Show 20 Lines	if (auto EC = writeCustomSection(Type))
return EC;		return EC;
break;		break;
}		}
if (std::error_code EC = addNewSection(Type, LayoutIdx, SectionStart))		if (std::error_code EC = addNewSection(Type, LayoutIdx, SectionStart))
return EC;		return EC;
return sampleprof_error::success;		return sampleprof_error::success;
}		}

std::error_code SampleProfileWriterExtBinary::writeSections(		std::error_code SampleProfileWriterExtBinary::writeDefaultLayout(
const StringMap<FunctionSamples> &ProfileMap) {		const StringMap<FunctionSamples> &ProfileMap) {
// The const indices passed to writeOneSection below are specifying the		// The const indices passed to writeOneSection below are specifying the
// positions of the sections in SectionHdrLayout. Look at		// positions of the sections in SectionHdrLayout. Look at
// initSectionHdrLayout to find out where each section is located in		// initSectionHdrLayout to find out where each section is located in
// SectionHdrLayout.		// SectionHdrLayout.
if (auto EC = writeOneSection(SecProfSummary, 0, ProfileMap))		if (auto EC = writeOneSection(SecProfSummary, 0, ProfileMap))
return EC;		return EC;
if (auto EC = writeOneSection(SecNameTable, 1, ProfileMap))		if (auto EC = writeOneSection(SecNameTable, 1, ProfileMap))
return EC;		return EC;
if (auto EC = writeOneSection(SecLBRProfile, 3, ProfileMap))		if (auto EC = writeOneSection(SecLBRProfile, 3, ProfileMap))
return EC;		return EC;
if (auto EC = writeOneSection(SecProfileSymbolList, 4, ProfileMap))		if (auto EC = writeOneSection(SecProfileSymbolList, 4, ProfileMap))
return EC;		return EC;
if (auto EC = writeOneSection(SecFuncOffsetTable, 2, ProfileMap))		if (auto EC = writeOneSection(SecFuncOffsetTable, 2, ProfileMap))
return EC;		return EC;
if (auto EC = writeOneSection(SecFuncMetadata, 5, ProfileMap))		if (auto EC = writeOneSection(SecFuncMetadata, 5, ProfileMap))
return EC;		return EC;
return sampleprof_error::success;		return sampleprof_error::success;
}		}

		static void
		splitProfileMapToTwo(const StringMap<FunctionSamples> &ProfileMap,
		StringMap<FunctionSamples> &ContextProfileMap,
		StringMap<FunctionSamples> &NoContextProfileMap) {
		for (const auto &I : ProfileMap) {
		if (I.second.getCallsiteSamples().size())
		ContextProfileMap.insert({I.first(), I.second});
		else
		NoContextProfileMap.insert({I.first(), I.second});
		}
		}

		std::error_code SampleProfileWriterExtBinary::writeCtxSplitLayout(
		const StringMap<FunctionSamples> &ProfileMap) {
		StringMap<FunctionSamples> ContextProfileMap, NoContextProfileMap;
		splitProfileMapToTwo(ProfileMap, ContextProfileMap, NoContextProfileMap);

		if (auto EC = writeOneSection(SecProfSummary, 0, ProfileMap))
		return EC;
		if (auto EC = writeOneSection(SecNameTable, 1, ProfileMap))
		return EC;
		if (auto EC = writeOneSection(SecLBRProfile, 3, ContextProfileMap))
		return EC;
		if (auto EC = writeOneSection(SecFuncOffsetTable, 2, ContextProfileMap))
		return EC;
		// Mark the section to have no context. Note section flag needs to be set
		// before writing the section.
		addSectionFlag(5, SecCommonFlags::SecFlagNoContext);
		if (auto EC = writeOneSection(SecLBRProfile, 5, NoContextProfileMap))
		return EC;
		// Mark the section to have no context. Note section flag needs to be set
		// before writing the section.
		addSectionFlag(4, SecCommonFlags::SecFlagNoContext);
		if (auto EC = writeOneSection(SecFuncOffsetTable, 4, NoContextProfileMap))
		return EC;
		if (auto EC = writeOneSection(SecProfileSymbolList, 6, ProfileMap))
		return EC;
		if (auto EC = writeOneSection(SecFuncMetadata, 7, ProfileMap))
		return EC;

		return sampleprof_error::success;
		}

		std::error_code SampleProfileWriterExtBinary::writeSections(
		const StringMap<FunctionSamples> &ProfileMap) {
		std::error_code EC;
		if (SecLayout == DefaultLayout)
		EC = writeDefaultLayout(ProfileMap);
		else if (SecLayout == CtxSplitLayout)
		EC = writeCtxSplitLayout(ProfileMap);
		else
		llvm_unreachable("Unsupported layout");
		return EC;
		}

std::error_code SampleProfileWriterCompactBinary::write(		std::error_code SampleProfileWriterCompactBinary::write(
const StringMap<FunctionSamples> &ProfileMap) {		const StringMap<FunctionSamples> &ProfileMap) {
if (std::error_code EC = SampleProfileWriter::write(ProfileMap))		if (std::error_code EC = SampleProfileWriter::write(ProfileMap))
return EC;		return EC;
if (std::error_code EC = writeFuncOffsetTable())		if (std::error_code EC = writeFuncOffsetTable())
return EC;		return EC;
return sampleprof_error::success;		return sampleprof_error::success;
}		}
▲ Show 20 Lines • Show All 409 Lines • Show Last 20 Lines
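
The partitioning done by `splitProfileMapToTwo` above can be tried outside LLVM with standard containers. This is a simplified sketch, not the writer's code: `FuncProfileSketch`, `ProfileMapSketch`, and `splitByContext` are hypothetical, and a plain counter stands in for `getCallsiteSamples().size()`.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical, reduced function profile.
struct FuncProfileSketch {
  unsigned long long HeadSamples = 0;
  // Stand-in for getCallsiteSamples().size() in the diff (assumption).
  std::size_t NumCallsiteSamples = 0;
};

using ProfileMapSketch = std::map<std::string, FuncProfileSketch>;

struct SplitResultSketch {
  ProfileMapSketch WithContext;
  ProfileMapSketch NoContext;
};

// Every profile lands in exactly one of the two output maps.
inline SplitResultSketch splitByContext(const ProfileMapSketch &All) {
  SplitResultSketch R;
  for (const auto &KV : All) {
    if (KV.second.NumCallsiteSamples != 0)
      R.WithContext.insert(KV);
    else
      R.NoContext.insert(KV);
  }
  assert(R.WithContext.size() + R.NoContext.size() == All.size());
  return R;
}
```

Note that, as in the diff, both maps hold copies of the profiles, so peak memory during writing grows with the profile size.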

llvm/lib/Transforms/IPO/SampleProfile.cpp

Show First 20 Lines • Show All 316 Lines • ▼ Show 20 Lines
///		///
/// This pass reads profile data from the file specified by		/// This pass reads profile data from the file specified by
/// -sample-profile-file and annotates every affected function with the		/// -sample-profile-file and annotates every affected function with the
/// profile information found in that file.		/// profile information found in that file.
class SampleProfileLoader {		class SampleProfileLoader {
public:		public:
SampleProfileLoader(		SampleProfileLoader(
StringRef Name, StringRef RemapName, bool IsThinLTOPreLink,		StringRef Name, StringRef RemapName, bool IsThinLTOPreLink,
		bool IsThinLTOPostLink,
std::function<AssumptionCache &(Function &)> GetAssumptionCache,		std::function<AssumptionCache &(Function &)> GetAssumptionCache,
std::function<TargetTransformInfo &(Function &)> GetTargetTransformInfo,		std::function<TargetTransformInfo &(Function &)> GetTargetTransformInfo,
std::function<const TargetLibraryInfo &(Function &)> GetTLI)		std::function<const TargetLibraryInfo &(Function &)> GetTLI)
: GetAC(std::move(GetAssumptionCache)),		: GetAC(std::move(GetAssumptionCache)),
GetTTI(std::move(GetTargetTransformInfo)), GetTLI(std::move(GetTLI)),		GetTTI(std::move(GetTargetTransformInfo)), GetTLI(std::move(GetTLI)),
CoverageTracker(*this), Filename(std::string(Name)),		CoverageTracker(*this), Filename(std::string(Name)),
RemappingFilename(std::string(RemapName)),		RemappingFilename(std::string(RemapName)),
IsThinLTOPreLink(IsThinLTOPreLink) {}		IsThinLTOPreLink(IsThinLTOPreLink),
		IsThinLTOPostLink(IsThinLTOPostLink) {}

bool doInitialization(Module &M, FunctionAnalysisManager *FAM = nullptr);		bool doInitialization(Module &M, FunctionAnalysisManager *FAM = nullptr);
bool runOnModule(Module &M, ModuleAnalysisManager *AM,		bool runOnModule(Module &M, ModuleAnalysisManager *AM,
ProfileSummaryInfo _PSI, CallGraph CG);		ProfileSummaryInfo _PSI, CallGraph CG);

void dump() { Reader->dump(); }		void dump() { Reader->dump(); }

protected:		protected:
▲ Show 20 Lines • Show All 107 Lines • ▼ Show 20 Lines	protected:
/// Flag indicating whether input profile is context-sensitive		/// Flag indicating whether input profile is context-sensitive
bool ProfileIsCS = false;		bool ProfileIsCS = false;

/// Flag indicating if the pass is invoked in ThinLTO compile phase.		/// Flag indicating if the pass is invoked in ThinLTO compile phase.
///		///
/// In this phase, in annotation, we should not promote indirect calls.		/// In this phase, in annotation, we should not promote indirect calls.
/// Instead, we will mark GUIDs that needs to be annotated to the function.		/// Instead, we will mark GUIDs that needs to be annotated to the function.
bool IsThinLTOPreLink;		bool IsThinLTOPreLink;
		/// Flag indicating if the pass is invoked in ThinLTO compile phase.
		///
		/// If the function profiles with and without context are split, in thinlto
		/// postlink phase, only profiles with context will be read.
		bool IsThinLTOPostLink;
		hoy: I'm wondering if we can just use one field for the phase it is currently in. We may also want to check it for fullLTO in the future.
		wmi (author): Good point. Will change it.
		wmi (author): Sent out an NFC change in https://reviews.llvm.org/D94613 as preparation for the change here.

/// Profile Summary Info computed from sample profile.		/// Profile Summary Info computed from sample profile.
ProfileSummaryInfo *PSI = nullptr;		ProfileSummaryInfo *PSI = nullptr;

/// Profle Symbol list tells whether a function name appears in the binary		/// Profle Symbol list tells whether a function name appears in the binary
/// used to generate the current profile.		/// used to generate the current profile.
std::unique_ptr<ProfileSymbolList> PSL;		std::unique_ptr<ProfileSymbolList> PSL;

Show All 37 Lines
};		};

class SampleProfileLoaderLegacyPass : public ModulePass {		class SampleProfileLoaderLegacyPass : public ModulePass {
public:		public:
// Class identification, replacement for typeinfo		// Class identification, replacement for typeinfo
static char ID;		static char ID;

SampleProfileLoaderLegacyPass(StringRef Name = SampleProfileFile,		SampleProfileLoaderLegacyPass(StringRef Name = SampleProfileFile,
bool IsThinLTOPreLink = false)		bool IsThinLTOPreLink = false,
		bool IsThinLTOPostLink = false)
: ModulePass(ID), SampleLoader(		: ModulePass(ID), SampleLoader(
Name, SampleProfileRemappingFile, IsThinLTOPreLink,		Name, SampleProfileRemappingFile, IsThinLTOPreLink,
		IsThinLTOPostLink,
[&](Function &F) -> AssumptionCache & {		[&](Function &F) -> AssumptionCache & {
return ACT->getAssumptionCache(F);		return ACT->getAssumptionCache(F);
},		},
[&](Function &F) -> TargetTransformInfo & {		[&](Function &F) -> TargetTransformInfo & {
return TTIWP->getTTI(F);		return TTIWP->getTTI(F);
},		},
[&](Function &F) -> TargetLibraryInfo & {		[&](Function &F) -> TargetLibraryInfo & {
return TLIWP->getTLI(F);		return TLIWP->getTLI(F);
▲ Show 20 Lines • Show All 1,420 Lines • ▼ Show 20 Lines	bool SampleProfileLoader::doInitialization(Module &M,
auto ReaderOrErr =		auto ReaderOrErr =
SampleProfileReader::create(Filename, Ctx, RemappingFilename);		SampleProfileReader::create(Filename, Ctx, RemappingFilename);
if (std::error_code EC = ReaderOrErr.getError()) {		if (std::error_code EC = ReaderOrErr.getError()) {
std::string Msg = "Could not open profile: " + EC.message();		std::string Msg = "Could not open profile: " + EC.message();
Ctx.diagnose(DiagnosticInfoSampleProfile(Filename, Msg));		Ctx.diagnose(DiagnosticInfoSampleProfile(Filename, Msg));
return false;		return false;
}		}
Reader = std::move(ReaderOrErr.get());		Reader = std::move(ReaderOrErr.get());
		Reader->setSkipNoContextProf(IsThinLTOPostLink);
Reader->collectFuncsFrom(M);		Reader->collectFuncsFrom(M);
ProfileIsValid = (Reader->read() == sampleprof_error::success);		ProfileIsValid = (Reader->read() == sampleprof_error::success);
PSL = Reader->getProfileSymbolList();		PSL = Reader->getProfileSymbolList();

// While profile-sample-accurate is on, ignore symbol list.		// While profile-sample-accurate is on, ignore symbol list.
ProfAccForSymsInList =		ProfAccForSymsInList =
ProfileAccurateForSymsInList && PSL && !ProfileSampleAccurate;		ProfileAccurateForSymsInList && PSL && !ProfileSampleAccurate;
if (ProfAccForSymsInList) {		if (ProfAccForSymsInList) {
▲ Show 20 Lines • Show All 149 Lines • ▼ Show 20 Lines	if (ProfAccForSymsInList) {
// imprecise debug information, or the callsites are all cold individually		// imprecise debug information, or the callsites are all cold individually
// but not cold accumulatively...), so the outline function showing up as		// but not cold accumulatively...), so the outline function showing up as
// cold in sampled binary will actually not be cold after current build.		// cold in sampled binary will actually not be cold after current build.
StringRef CanonName = FunctionSamples::getCanonicalFnName(F);		StringRef CanonName = FunctionSamples::getCanonicalFnName(F);
if (NamesInProfile.count(CanonName))		if (NamesInProfile.count(CanonName))
initialEntryCount = -1;		initialEntryCount = -1;
}		}

		// Initialize entry count when the function has no existing entry
		// count value.
		if (!F.getEntryCount().hasValue())
F.setEntryCount(ProfileCount(initialEntryCount, Function::PCT_Real));		F.setEntryCount(ProfileCount(initialEntryCount, Function::PCT_Real));
		hoy: `F` may get a chance to update its entry count with a post-inline count during top-down processing with the default layout. Maybe put this under the check against `SkipNoContextProf`?
		wmi (author): I think that in any case we don't want to reinitialize F's entry count to initialEntryCount if it already has a valid value. Even if SkipNoContextProf is false, in the LTO/ThinLTO postlink phase it is possible that the emitAnnotations function doesn't change anything that could affect profile annotation (i.e., the variable Changed in emitAnnotations is false). So if a function that got a valid entry count in the prelink phase is reinitialized to -1 before entering emitAnnotations in postlink, emitAnnotations may not be able to update it with a valid entry count again. In addition, if F has a valid entry count when entering emitAnnotations, emitAnnotations can still update it without problem.
		hoy: Oh yeah, the entry count can be set in `emitAnnotations`. Thanks for pointing it out. I was wondering if `F`'s profile can change due to the counts returning from its inliner in the postlink phase, so that it may need an update in postlink even though it doesn't have context samples itself. That might happen with fullLTO, but with thinLTO the returned counts are already less accurate since this cannot be done across threads.
		wenlei:
		> I was wondering if F's profile can change due to the counts returning from its inliner in postlink phase, thus it may need an update in postlink though itself doesn't have context samples.
		I think in this case `emitAnnotations` should still be the right place for the update to happen, though we don't capture such a profile merge as a "change" today, so `emitAnnotations` won't be triggered.
std::unique_ptr<OptimizationRemarkEmitter> OwnedORE;		std::unique_ptr<OptimizationRemarkEmitter> OwnedORE;
if (AM) {		if (AM) {
auto &FAM =		auto &FAM =
AM->getResult<FunctionAnalysisManagerModuleProxy>(*F.getParent())		AM->getResult<FunctionAnalysisManagerModuleProxy>(*F.getParent())
.getManager();		.getManager();
ORE = &FAM.getResult<OptimizationRemarkEmitterAnalysis>(F);		ORE = &FAM.getResult<OptimizationRemarkEmitterAnalysis>(F);
} else {		} else {
OwnedORE = std::make_unique<OptimizationRemarkEmitter>(&F);		OwnedORE = std::make_unique<OptimizationRemarkEmitter>(&F);
Show All 24 Lines	PreservedAnalyses SampleProfileLoaderPass::run(Module &M,
auto GetTLI = [&](Function &F) -> const TargetLibraryInfo & {		auto GetTLI = [&](Function &F) -> const TargetLibraryInfo & {
return FAM.getResult<TargetLibraryAnalysis>(F);		return FAM.getResult<TargetLibraryAnalysis>(F);
};		};

SampleProfileLoader SampleLoader(		SampleProfileLoader SampleLoader(
ProfileFileName.empty() ? SampleProfileFile : ProfileFileName,		ProfileFileName.empty() ? SampleProfileFile : ProfileFileName,
ProfileRemappingFileName.empty() ? SampleProfileRemappingFile		ProfileRemappingFileName.empty() ? SampleProfileRemappingFile
: ProfileRemappingFileName,		: ProfileRemappingFileName,
IsThinLTOPreLink, GetAssumptionCache, GetTTI, GetTLI);		IsThinLTOPreLink, IsThinLTOPostLink, GetAssumptionCache, GetTTI, GetTLI);

if (!SampleLoader.doInitialization(M, &FAM))		if (!SampleLoader.doInitialization(M, &FAM))
return PreservedAnalyses::all();		return PreservedAnalyses::all();

ProfileSummaryInfo *PSI = &AM.getResult<ProfileSummaryAnalysis>(M);		ProfileSummaryInfo *PSI = &AM.getResult<ProfileSummaryAnalysis>(M);
CallGraph &CG = AM.getResult<CallGraphAnalysis>(M);		CallGraph &CG = AM.getResult<CallGraphAnalysis>(M);
if (!SampleLoader.runOnModule(M, &AM, PSI, &CG))		if (!SampleLoader.runOnModule(M, &AM, PSI, &CG))
return PreservedAnalyses::all();		return PreservedAnalyses::all();

return PreservedAnalyses::none();		return PreservedAnalyses::none();
}		}
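
The entry-count guard discussed in the inline comments above (initialize only when no value exists yet) can be shown with a toy model. This is a sketch under assumptions: `FunctionSketch` and `initEntryCount` are hypothetical stand-ins, not `llvm::Function` or its `getEntryCount`/`setEntryCount` API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a function carrying an optional entry count.
struct FunctionSketch {
  bool HasEntryCount = false;
  int64_t EntryCount = 0;
};

// Sets the initial entry count only when the function has none, so a value
// annotated earlier (for example in prelink) is not overwritten.
inline void initEntryCount(FunctionSketch &F, int64_t InitialEntryCount) {
  if (!F.HasEntryCount) {
    F.EntryCount = InitialEntryCount;
    F.HasEntryCount = true;
  }
  assert(F.HasEntryCount);
}
```

In this model a function that already carries a count keeps it when the guard runs again with -1, while a function with no count receives the initial value.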

llvm/test/Transforms/SampleProfile/Inputs/ctxsplit.extbinary.afdo

This binary file was added.

llvm/test/Transforms/SampleProfile/ctxsplit.ll

This file was added.

				; Check the nonflattened part of the ctxsplit profile will be read in thinlto
				; postlink phase while flattened part of the ctxsplit profile will not be read.
				; RUN: opt < %s -passes='thinlto<O2>' -pgo-kind=pgo-sample-use-pipeline -profile-file=%S/Inputs/ctxsplit.extbinary.afdo -S | FileCheck %s --check-prefix=POSTLINK
				;
				wenlei: I noticed the input is a binary profile. Is the new layout of the extended binary profile still two-way convertible with the text profile? A text profile works better for small test cases?
				wmi (author): Yes, I can start with a text profile, convert it to an extbinary profile, then run the test. Then we need another option in llvm-profdata to switch the default layout to the ctxsplit layout. Currently I didn't add the option because I felt there were few cases using ctxsplit, and adding another option to llvm-profdata may not be worth it considering it already has many options. Going forward, whether we need an option in llvm-profdata for every change in the extbinary format is a question.
				wenlei: Ok, makes sense if your create_llvm_prof only generates extbin profiles today. This is not critical.
				> Then we need another option in llvm-profdata to switch the default layout to ctxsplit Layout.
				This is one step further. If we just want to have the ctx layout, you could let create_llvm_prof produce a text profile directly, or use llvm-profgen to convert extbinary with ctx layout to text with ctx layout? llvm-profgen doesn't have to support the conversion between the two layouts, which would need an extra option. I guess what I missed is that there's not going to be a text representation of the ctx layout, i.e. there's not going to be a 1:1 mapping and round-trip conversion between text and extbin for the ctx layout?
				wmi (author):
				> I guess what I missed is that there's not going to be a text representation of the ctx layout, i.e. there's not going to be 1:1 mapping and round trip conversion between text and extbin for ctx layout?
				Right. There is not going to be a 1:1 mapping and round-trip conversion between the extbin and text formats. We can convert a ctxlayout extbinary profile to a text profile using llvm-profdata, but we cannot go the other way without adding an option, because the default layout in the extbinary format is not ctxlayout. This means that currently, to create a test using a ctxlayout extbinary profile, we need to change a little code and rebuild llvm-profdata before we can generate the profile. That is inconvenient, but it is a tradeoff in order to have fewer options in llvm-profdata. In the test, I want to use a ctxlayout extbinary profile instead of a text profile converted from it because I want to verify that in the thinlto postlink phase the compiler won't read the part without context. That cannot be achieved with a text profile.
				wenlei: Got it, thanks for explaining.
				; Check both the flattened and nonflattened parts of the ctxsplit profile will
				; be read in thinlto prelink phase.
				; RUN: opt < %s -passes='thinlto-pre-link<O2>' -pgo-kind=pgo-sample-use-pipeline -profile-file=%S/Inputs/ctxsplit.extbinary.afdo -S | FileCheck %s --check-prefix=PRELINK
				;
				; Check both the flattened and nonflattened parts of the ctxsplit profile will
				; be read in non-thinlto mode.
				; RUN: opt < %s -passes='default<O2>' -pgo-kind=pgo-sample-use-pipeline -profile-file=%S/Inputs/ctxsplit.extbinary.afdo -S | FileCheck %s --check-prefix=NOTHINLTO

				; POSTLINK: define dso_local i32 @goo() {{.*}} !prof ![[ENTRY1:[0-9]+]] {
				; POSTLINK: define dso_local i32 @foo() {{.*}} !prof ![[ENTRY2:[0-9]+]] {
				; POSTLINK: ![[ENTRY1]] = !{!"function_entry_count", i64 1001}
				; POSTLINK: ![[ENTRY2]] = !{!"function_entry_count", i64 -1}
				; PRELINK: define dso_local i32 @goo() {{.*}} !prof ![[ENTRY1:[0-9]+]] {
				; PRELINK: define dso_local i32 @foo() {{.*}} !prof ![[ENTRY2:[0-9]+]] {
				; PRELINK: ![[ENTRY1]] = !{!"function_entry_count", i64 1001}
				; PRELINK: ![[ENTRY2]] = !{!"function_entry_count", i64 3001}
				; NOTHINLTO: define dso_local i32 @goo() {{.*}} !prof ![[ENTRY1:[0-9]+]] {
				; NOTHINLTO: define dso_local i32 @foo() {{.*}} !prof ![[ENTRY2:[0-9]+]] {
				; NOTHINLTO: ![[ENTRY1]] = !{!"function_entry_count", i64 1001}
				; NOTHINLTO: ![[ENTRY2]] = !{!"function_entry_count", i64 3001}

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				; Function Attrs: norecurse nounwind readnone uwtable
				define dso_local i32 @goo() #0 !dbg !10 {
				entry:
				ret i32 -1, !dbg !11
				}

				; Function Attrs: norecurse nounwind readnone uwtable
				define dso_local i32 @foo() #0 !dbg !7 {
				entry:
				ret i32 -1, !dbg !9
				}

				attributes #0 = { "use-sample-profile" }

				!llvm.dbg.cu = !{!0}
				!llvm.module.flags = !{!3, !4, !5}
				!llvm.ident = !{!6}

				!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 8.0.0 (trunk 345241)", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !2, nameTableKind: None)
				!1 = !DIFile(filename: "a.c", directory: "")
				!2 = !{}
				!3 = !{i32 2, !"Dwarf Version", i32 4}
				!4 = !{i32 2, !"Debug Info Version", i32 3}
				!5 = !{i32 1, !"wchar_size", i32 4}
				!6 = !{!"clang version 8.0.0 (trunk 345241)"}
				!7 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 1, type: !8, isLocal: false, isDefinition: true, scopeLine: 1, isOptimized: true, unit: !0, retainedNodes: !2)
				!8 = !DISubroutineType(types: !2)
				!9 = !DILocation(line: 2, column: 3, scope: !7)
				!10 = distinct !DISubprogram(name: "goo", scope: !1, file: !1, line: 8, type: !8, isLocal: false, isDefinition: true, scopeLine: 8, isOptimized: true, unit: !0, retainedNodes: !2)
				!11 = !DILocation(line: 10, column: 3, scope: !10)


Diff 315866
